Uploaded image for project: 'SnappyData'
  1. SnappyData
  2. SNAP-1753

TPCH Q19 execution performance degraded in 0.9.

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Done
    • Priority: Highest
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 1.0
    • Component/s: Perf
    • Labels:

      Description

      While comparing TPCH query execution performance it is observed that query 19;s performance degraded drastically in 0.9 release.

      Cluster Configuration:
      4 AWS m4.10x large instance : 40 core 160GB ram
      1lead, 1 locator on 1 aws instance, 3 server on 3 separate aws instances

      Version to compare:
      A> 0.8 with persistence with heap of 140GB
      B> 0.9 with persistence with heap of 140GB
      C> 0.9 with persistence with heap of 40GB and memory-size=100GB

      Q19 performance
      A B C
      1387 28569 25643

      A : Q19 Plan

      CollectAggregate SnappyHashAggregate(keys=[], modes=Final, functions=[sum((L_EXTENDEDPRICE#609 * (1.0 - L_DISCOUNT#610)))])
      +- *SnappyHashAggregate(keys=[], modes=Partial, functions=[partial_sum((L_EXTENDEDPRICE#609 * (1.0 - L_DISCOUNT#610)))], output=[sum#6274])
         +- *Project [L_EXTENDEDPRICE#609, L_DISCOUNT#610]
            +- *SortMergeJoin [L_PARTKEY#605], [P_PARTKEY#1176], Inner, (((((((P_BRAND#1179 = Brand#33) && P_CONTAINER#1182 IN (SM CASE,SM BOX,SM PACK,SM PKG)) && (L_QUANTITY#608 >= 2.0)) && (L_QUANTITY#608 <= 12.0)) && (P_SIZE#1181 <= 5)) || (((((P_BRAND#1179 = Brand#14) && P_CONTAINER#1182 IN (MED BAG,MED BOX,MED PKG,MED PACK)) && (L_QUANTITY#608 >= 18.0)) && (L_QUANTITY#608 <= 28.0)) && (P_SIZE#1181 <= 10))) || (((((P_BRAND#1179 = Brand#25) && P_CONTAINER#1182 IN (LG CASE,LG BOX,LG PACK,LG PKG)) && (L_QUANTITY#608 >= 27.0)) && (L_QUANTITY#608 <= 37.0)) && (P_SIZE#1181 <= 15)))
               :- *Sort [L_PARTKEY#605 ASC], false, 0
               :  +- Exchange hashpartitioning(L_PARTKEY#605, 223)
               :     +- *Project [L_PARTKEY#605, L_QUANTITY#608, L_EXTENDEDPRICE#609, L_DISCOUNT#610]
               :        +- *Filter (L_SHIPMODE#618 IN (AIR,AIR REG) && (L_SHIPINSTRUCT#617 = DELIVER IN PERSON))
               :           +- *Partitioned Scan ColumnFormatRelation[APP.LINEITEM] , Requested Columns = [L_PARTKEY#605,L_QUANTITY#608,L_DISCOUNT#610,L_SHIPINSTRUCT#617,L_EXTENDEDPRICE#609,L_SHIPMODE#618] partitionColumns = [l_orderkey#604L] numBuckets= 223 numPartitions= 223
               +- *Sort [P_PARTKEY#1176 ASC], false, 0
                  +- *Filter (P_SIZE#1181 >= 1)
                     +- *Partitioned Scan ColumnFormatRelation[APP.PART] , Requested Columns = [P_PARTKEY#1176,P_BRAND#1179,P_SIZE#1181,P_CONTAINER#1182] partitionColumns = [p_partkey#1176] numBuckets= 223 numPartitions= 223
      
      

      B: Q19 Plan

      CollectAggregate SnappyHashAggregate(keys=[], modes=Final, functions=[sum((L_EXTENDEDPRICE#667 * (1.0 - L_DISCOUNT#668)))])
      +- *CachedPlanHelper
         +- *SnappyHashAggregate(keys=[], modes=Partial, functions=[partial_sum((L_EXTENDEDPRICE#667 * (1.0 - L_DISCOUNT#668)))], output=[sum#4820])
            +- *Project [L_EXTENDEDPRICE#667, L_DISCOUNT#668]
               +- *SortMergeJoin [L_PARTKEY#663], [P_PARTKEY#847], Inner, ((((((((((P_BRAND#850 = Brand#11) && P_CONTAINER#853 IN (SM CASE,SM BOX,SM PACK,SM PKG)) && (L_QUANTITY#666 >= DynamicExpression(cast(8 as double)))) && (L_QUANTITY#666 <= DynamicExpression(cast((8 + 10) as double)))) && (P_SIZE#852 >= 1)) && (P_SIZE#852 <= 5)) && L_SHIPMODE#676 IN (AIR,AIR REG)) && (L_SHIPINSTRUCT#675 = DELIVER IN PERSON)) || ((((((((P_BRAND#850 = Brand#55) && P_CONTAINER#853 IN (MED BAG,MED BOX,MED PKG,MED PACK)) && (L_QUANTITY#666 >= DynamicExpression(cast(13 as double)))) && (L_QUANTITY#666 <= DynamicExpression(cast((13 + 10) as double)))) && (P_SIZE#852 >= 1)) && (P_SIZE#852 <= 10)) && L_SHIPMODE#676 IN (AIR,AIR REG)) && (L_SHIPINSTRUCT#675 = DELIVER IN PERSON))) || ((((((((P_BRAND#850 = Brand#12) && P_CONTAINER#853 IN (LG CASE,LG BOX,LG PACK,LG PKG)) && (L_QUANTITY#666 >= DynamicExpression(cast(21 as double)))) && (L_QUANTITY#666 <= DynamicExpression(cast((21 + 10) as double)))) && (P_SIZE#852 >= 1)) && (P_SIZE#852 <= 15)) && L_SHIPMODE#676 IN (AIR,AIR REG)) && (L_SHIPINSTRUCT#675 = DELIVER IN PERSON)))
                  :- *CachedPlanHelper
                  :  +- *Sort [L_PARTKEY#663 ASC], false, 0
                  :     +- Exchange hashpartitioning(L_PARTKEY#663, 223)
                  :        +- *CachedPlanHelper
                  :           +- *Partitioned Scan ColumnFormatRelation[APP.LINEITEM] , Requested Columns = [L_PARTKEY#663,L_QUANTITY#666,L_EXTENDEDPRICE#667,L_DISCOUNT#668,L_SHIPINSTRUCT#675,L_SHIPMODE#676] partitionColumns = [l_orderkey#662L] numBuckets= 223 numPartitions= 223
                  +- *CachedPlanHelper
                     +- *Sort [P_PARTKEY#847 ASC], false, 0
                        +- *Partitioned Scan ColumnFormatRelation[APP.PART] , Requested Columns = [P_PARTKEY#847,P_BRAND#850,P_SIZE#852,P_CONTAINER#853] partitionColumns = [p_partkey#847] numBuckets= 223 numPartitions= 223
      
      

      C: Q19 Plan

      CollectAggregate SnappyHashAggregate(keys=[], modes=Final, functions=[sum((L_EXTENDEDPRICE#667 * (1.0 - L_DISCOUNT#668)))])
      +- *CachedPlanHelper
         +- *SnappyHashAggregate(keys=[], modes=Partial, functions=[partial_sum((L_EXTENDEDPRICE#667 * (1.0 - L_DISCOUNT#668)))], output=[sum#4820])
            +- *Project [L_EXTENDEDPRICE#667, L_DISCOUNT#668]
               +- *SortMergeJoin [L_PARTKEY#663], [P_PARTKEY#847], Inner, ((((((((((P_BRAND#850 = Brand#43) && P_CONTAINER#853 IN (SM CASE,SM BOX,SM PACK,SM PKG)) && (L_QUANTITY#666 >= DynamicExpression(cast(2 as double)))) && (L_QUANTITY#666 <= DynamicExpression(cast((2 + 10) as double)))) && (P_SIZE#852 >= 1)) && (P_SIZE#852 <= 5)) && L_SHIPMODE#676 IN (AIR,AIR REG)) && (L_SHIPINSTRUCT#675 = DELIVER IN PERSON)) || ((((((((P_BRAND#850 = Brand#12) && P_CONTAINER#853 IN (MED BAG,MED BOX,MED PKG,MED PACK)) && (L_QUANTITY#666 >= DynamicExpression(cast(13 as double)))) && (L_QUANTITY#666 <= DynamicExpression(cast((13 + 10) as double)))) && (P_SIZE#852 >= 1)) && (P_SIZE#852 <= 10)) && L_SHIPMODE#676 IN (AIR,AIR REG)) && (L_SHIPINSTRUCT#675 = DELIVER IN PERSON))) || ((((((((P_BRAND#850 = Brand#55) && P_CONTAINER#853 IN (LG CASE,LG BOX,LG PACK,LG PKG)) && (L_QUANTITY#666 >= DynamicExpression(cast(26 as double)))) && (L_QUANTITY#666 <= DynamicExpression(cast((26 + 10) as double)))) && (P_SIZE#852 >= 1)) && (P_SIZE#852 <= 15)) && L_SHIPMODE#676 IN (AIR,AIR REG)) && (L_SHIPINSTRUCT#675 = DELIVER IN PERSON)))
                  :- *CachedPlanHelper
                  :  +- *Sort [L_PARTKEY#663 ASC], false, 0
                  :     +- Exchange hashpartitioning(L_PARTKEY#663, 223)
                  :        +- *CachedPlanHelper
                  :           +- *Partitioned Scan ColumnFormatRelation[APP.LINEITEM] , Requested Columns = [L_PARTKEY#663,L_QUANTITY#666,L_EXTENDEDPRICE#667,L_DISCOUNT#668,L_SHIPINSTRUCT#675,L_SHIPMODE#676] partitionColumns = [l_orderkey#662L] numBuckets= 223 numPartitions= 223
                  +- *CachedPlanHelper
                     +- *Sort [P_PARTKEY#847 ASC], false, 0
                        +- *Partitioned Scan ColumnFormatRelation[APP.PART] , Requested Columns = [P_PARTKEY#847,P_BRAND#850,P_SIZE#852,P_CONTAINER#853] partitionColumns = [p_partkey#847] numBuckets= 223 numPartitions= 223
      
      

      0.9 Plan clearly shows, all filters are pushed in SortMergeJoin State i.e. filtering is delayed

        Attachments

          Activity

            People

            • Assignee:
              sprabhu Shyja Prabhu (Inactive)
              Reporter:
              kbachhav Kishor Bachhav (Inactive)
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: