  1. Apache Drill
  2. DRILL-1150

Sub-optimal expression pushdown for slightly modified version of Tpch 19

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: None
    • Labels: None

      Description

      A slightly modified version of TPCH 19, called 19_1 in the TestTpchDistributed JUnit test suite, produces the following plan on the latest master, version 699851b. The plan shows several expressions pushed into the Project just above the Lineitem scan (00-07). Ideally these expressions would be evaluated after the join, since there is no need to evaluate an expression for a row that the join eliminates. Also note the two consecutive Projects (00-06 and 00-08), which in this plan sit above the Part scan; these should have been merged into one.

      00-00 Screen
      00-01 StreamAgg(group=[{}], revenue=[SUM($0)])
      00-02 Project($f0=[*($2, -(1, $3))])
      00-03 SelectionVectorRemover
      00-04 Filter(condition=OR(AND(=($15, 'Brand#41'), OR(=($14, 'SM CASE'), =($14, 'SM BOX'), =($14, 'SM PACK'), =($14, 'SM PKG')), $4, $5, $16, $17, OR(=($0, 'AIR'), =($0, 'AIR REG')), =($6, 'DELIVER IN PERSON')), AND(=($18, 'Brand#13'), OR(=($14, 'MED BAG'), =($14, 'MED BOX'), =($14, 'MED PKG'), =($14, 'MED PACK')), $7, $8, $19, $20, OR(=($0, 'AIR'), =($0, 'AIR REG')), =($9, 'DELIVER IN PERSON')), AND(=($21, 'Brand#55'), OR(=($14, 'LG CASE'), =($14, 'LG BOX'), =($14, 'LG PACK'), =($14, 'LG PKG')), $10, $11, $22, $23, OR(=($0, 'AIR'), =($0, 'AIR REG')), =($12, 'DELIVER IN PERSON'))))
      00-05 HashJoin(condition=[=($1, $13)], joinType=[inner])
      00-07 Project(l_shipmode=[$5], l_partkey=[$4], l_extendedprice=[$3], l_discount=[$1], $f7=[>=($2, 2)], $f8=[<=($2, +(2, 10))], $f9=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f10=[>=($2, 14)], $f11=[<=($2, +(14, 10))], $f12=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f13=[>=($2, 23)], $f14=[<=($2, +(23, 10))], $f15=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"])
      00-09 ProducerConsumer
      00-11 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/lineitem.parquet]], selectionRoot=/tpch/lineitem.parquet, columns=[SchemaPath [`l_shipmode`], SchemaPath [`l_partkey`], SchemaPath [`l_extendedprice`], SchemaPath [`l_discount`], SchemaPath [`l_quantity`], SchemaPath [`l_shipinstruct`]]]])
      00-06 Project(p_partkey=[$0], p_container=[$1], $f5=[$2], $f6=[$3], $f70=[$4], $f80=[$5], $f90=[$6], $f100=[$7], $f110=[$8], $f120=[$9], $f130=[$10])
      00-08 Project(p_partkey=[$2], p_container=[$3], $f5=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f6=[>=($0, 1)], $f7=[<=($0, 5)], $f8=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f9=[>=($0, 1)], $f10=[<=($0, 10)], $f11=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f12=[>=($0, 1)], $f13=[<=($0, 15)])
      00-10 ProducerConsumer
      00-12 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/part.parquet]], selectionRoot=/tpch/part.parquet, columns=[SchemaPath [`p_partkey`], SchemaPath [`p_container`], SchemaPath [`p_brand`], SchemaPath [`p_size`]]]])
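
The two concerns in the description can be illustrated with a toy model. The sketch below is not Drill code: the row data, the `hash_join` helper, the `CountingPredicate` wrapper, and `merge_projects` are all hypothetical names invented for illustration. It assumes an inner join in which each lineitem row matches at most one part row, and it only merges Projects whose upper operator consists of plain column references, as 00-06 does.

```python
# Toy model (NOT Drill code): all names and data here are hypothetical.

def hash_join(left, right, left_key, right_key):
    """Inner hash join of two lists of dicts on the given key names."""
    table = {}
    for r in right:
        table.setdefault(r[right_key], []).append(r)
    return [{**l, **r} for l in left for r in table.get(l[left_key], [])]


class CountingPredicate:
    """Wraps a predicate and records how many times it is evaluated."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0

    def __call__(self, row):
        self.calls += 1
        return self.fn(row)


# Made-up miniature tables: 6 lineitem rows, only 2 of which find a part.
lineitem = [{"l_partkey": k, "l_quantity": q}
            for k, q in [(1, 3), (2, 8), (3, 20), (4, 1), (5, 30), (6, 12)]]
part = [{"p_partkey": 2}, {"p_partkey": 6}]

# Strategy A: compute the expression in a Project below the join,
# so it runs once per scanned row.
below = CountingPredicate(lambda row: row["l_quantity"] >= 2)
projected = [{**row, "f7": below(row)} for row in lineitem]
result_a = [row for row in hash_join(projected, part, "l_partkey", "p_partkey")
            if row["f7"]]

# Strategy B: join first, then evaluate only for rows that survived the join.
above = CountingPredicate(lambda row: row["l_quantity"] >= 2)
result_b = [row for row in hash_join(lineitem, part, "l_partkey", "p_partkey")
            if above(row)]

print("evaluations below join:", below.calls)
print("evaluations above join:", above.calls)


def merge_projects(upper, lower):
    """Compose two stacked projections into one.

    lower: output name -> expression over the scan columns (ordered dict).
    upper: output name -> index into lower's outputs (pure column references).
    """
    lower_exprs = list(lower.values())
    return {name: lower_exprs[idx] for name, idx in upper.items()}


# Abbreviated stand-ins for operators 00-08 (lower) and 00-06 (upper).
lower = {"p_partkey": "$2", "p_container": "$3", "$f5": "CAST($1):CHAR(8)",
         "$f6": ">=($0, 1)", "$f7": "<=($0, 5)"}
upper = {"p_partkey": 0, "p_container": 1, "$f5": 2, "$f6": 3, "$f70": 4}
merged = merge_projects(upper, lower)
print(merged)
```

In this toy data both strategies keep the same joined rows, but the pushed-down version evaluates the expression for every scanned row while the post-join version evaluates it only for join survivors. If a join multiplied rows instead of filtering them, the comparison could go the other way, which is why this is a cost decision for the planner rather than a fixed rule.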

            People

            • Assignee: DrillCommitter
            • Reporter: Aman Sinha (amansinha100)