[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Apache Impala (Incubating) Query Engine Frontend 0.1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[WARNING] Could not transfer metadata com.cloudera.cdh:cdh-root:5.10.0-SNAPSHOT/maven-metadata.xml from/to ${distMgmtSnapshotsId} (${distMgmtSnapshotsUrl}): No connector available to access repository ${distMgmtSnapshotsId} (${distMgmtSnapshotsUrl}) of type default using the available factories WagonRepositoryConnectorFactory
[INFO]
[INFO] --- jacoco-maven-plugin:0.7.6.201602180812:prepare-agent (prepare-jacoco-agent) @ impala-frontend ---
[INFO] Skipping JaCoCo execution because property jacoco.skip is set.
[INFO] surefireJacocoArg set to empty
[INFO]
[INFO] --- cup-maven-plugin:1.6-cdh:generate (cup) @ impala-frontend ---
[INFO] CUP: Processing 1 cup files
Warning : Terminal "EMPTY_IDENT" was declared but never used
Warning : Terminal "UNEXPECTED_CHAR" was declared but never used
------- CUP v0.11a czt01 beta Parser Generation Summary -------
0 errors and 2 warnings
213 terminals, 189 non-terminals, and 725 productions declared,
producing 1349 unique parse states.
2 terminals declared but not used.
0 non-terminals declared but not used.
0 productions never reduced.
0 conflicts detected (0 expected).
Code written to "SqlParser.java", and "SqlParserSymbols.java".
---------------------------------------------------- (v0.11a czt01 beta)
[WARNING] /home/lv/i1/fe/src/main/cup/sql-parser.cup [0:0]: Terminal "EMPTY_IDENT" was declared but never used
[WARNING] /home/lv/i1/fe/src/main/cup/sql-parser.cup [0:0]: Terminal "UNEXPECTED_CHAR" was declared but never used
[INFO] CUP: generated /home/lv/i1/fe/target/generated-sources/cup/org/apache/impala/analysis/SqlParser.java
[INFO] CUP: generated /home/lv/i1/fe/target/generated-sources/cup/org/apache/impala/analysis/SqlParserSymbols.java
[INFO]
[INFO] --- maven-jflex-plugin:1.4.3:generate (jflex) @ impala-frontend ---
[INFO] SqlScanner.java is up to date.
[INFO]
[INFO] --- build-helper-maven-plugin:1.5:add-source (add-source) @ impala-frontend ---
[INFO] Source directory: /home/lv/i1/fe/generated-sources/gen-java added.
[INFO] Source directory: /home/lv/i1/fe/target/generated-sources/cup added.
[INFO]
[INFO] --- maven-resources-plugin:2.3:resources (default-resources) @ impala-frontend ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/lv/i1/fe/src/main/resources
[INFO] Copying 0 resource
[INFO]
[INFO] --- maven-compiler-plugin:3.3:compile (default-compile) @ impala-frontend ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 666 source files to /home/lv/i1/fe/target/classes
[WARNING] /home/lv/i1/fe/src/main/java/org/apache/impala/util/UnsafeUtil.java:[24,16] sun.misc.Unsafe is internal proprietary API and may be removed in a future release
[WARNING] /home/lv/i1/fe/src/main/java/org/apache/impala/util/UnsafeUtil.java:[33,23] sun.misc.Unsafe is internal proprietary API and may be removed in a future release
[WARNING] /home/lv/i1/fe/src/main/java/org/apache/impala/util/UnsafeUtil.java:[40,15] sun.misc.Unsafe is internal proprietary API and may be removed in a future release
[WARNING] /home/lv/i1/fe/src/main/java/org/apache/impala/util/UnsafeUtil.java:[45,25] sun.misc.Unsafe is internal proprietary API and may be removed in a future release
[INFO] /home/lv/i1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: Some input files use or override a deprecated API.
[INFO] /home/lv/i1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: Recompile with -Xlint:deprecation for details.
[INFO] /home/lv/i1/fe/generated-sources/gen-java/org/apache/impala/thrift/TDistributeByRangeParam.java: Some input files use unchecked or unsafe operations.
[INFO] /home/lv/i1/fe/generated-sources/gen-java/org/apache/impala/thrift/TDistributeByRangeParam.java: Recompile with -Xlint:unchecked for details.
[INFO]
[INFO] --- maven-resources-plugin:2.3:testResources (default-testResources) @ impala-frontend ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 30 resources [INFO] [INFO] --- maven-compiler-plugin:3.3:testCompile (default-testCompile) @ impala-frontend --- [INFO] Nothing to compile - all classes are up to date [INFO] [INFO] --- maven-surefire-plugin:2.18:test (default-test) @ impala-frontend --- [INFO] Surefire report directory: /home/lv/i1/logs/fe_tests ------------------------------------------------------- T E S T S ------------------------------------------------------- Running org.apache.impala.planner.PlannerTest Tests run: 44, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 42.116 sec <<< FAILURE! - in org.apache.impala.planner.PlannerTest testTpchNested(org.apache.impala.planner.PlannerTest) Time elapsed: 2.322 sec <<< FAILURE! java.lang.AssertionError: Section PLAN of query: select s_name, count(*) as numwait from supplier s, customer c, c.c_orders o, o.o_lineitems l1, region.r_nations n where s_suppkey = l1.l_suppkey and o_orderstatus = 'F' and l1.l_receiptdate > l1.l_commitdate and exists ( select * from o.o_lineitems l2 where l2.l_suppkey <> l1.l_suppkey ) and not exists ( select * from o.o_lineitems l3 where l3.l_suppkey <> l1.l_suppkey and l3.l_receiptdate > l3.l_commitdate ) and s_nationkey = n_nationkey and n_name = 'SAUDI ARABIA' group by s_name order by numwait desc, s_name limit 100 Actual does not match expected result: 20:TOP-N [LIMIT=100] | order by: count(*) DESC, s_name ASC | 19:AGGREGATE [FINALIZE] | output: count(*) | group by: s_name | 18:SUBPLAN | |--16:NESTED LOOP JOIN [RIGHT ANTI JOIN] | | join predicates: l3.l_suppkey != l1.l_suppkey | | | |--15:NESTED LOOP JOIN [RIGHT SEMI JOIN] | | | join predicates: l2.l_suppkey != l1.l_suppkey | | | | | |--12:SINGULAR ROW SRC | | | | | 13:UNNEST [o.o_lineitems l2] | | | 14:UNNEST [o.o_lineitems l3] | 17:HASH JOIN [INNER JOIN] | hash predicates: s_nationkey = n_nationkey | runtime filters: RF000 <- n_nationkey | |--10:SCAN HDFS [tpch_nested_parquet.region.r_nations n] | partitions=1/1 files=1 size=3.24KB | predicates: 
n_name = 'SAUDI ARABIA' | 11:HASH JOIN [INNER JOIN] | hash predicates: l1.l_suppkey = s_suppkey | |--00:SCAN HDFS [tpch_nested_parquet.supplier s] | partitions=1/1 files=1 size=43.00MB | runtime filters: RF000 -> s_nationkey | 02:SUBPLAN | |--09:NESTED LOOP JOIN [CROSS JOIN] | | | |--03:SINGULAR ROW SRC | | | 05:SUBPLAN | | | |--08:NESTED LOOP JOIN [CROSS JOIN] | | | | | |--06:SINGULAR ROW SRC | | | | | 07:UNNEST [o.o_lineitems l1] | | | 04:UNNEST [c.c_orders o] | 01:SCAN HDFS [tpch_nested_parquet.customer c] partitions=1/1 files=4 size=292.35MB predicates: !empty(c.c_orders) predicates on o: o_orderstatus = 'F', !empty(o.o_lineitems) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ predicates on l1: l1.l_receiptdate > l1.l_commitdate predicates on l3: l3.l_receiptdate > l3.l_commitdate Expected: 20:TOP-N [LIMIT=100] | order by: count(*) DESC, s_name ASC | 19:AGGREGATE [FINALIZE] | output: count(*) | group by: s_name | 18:SUBPLAN | |--16:NESTED LOOP JOIN [RIGHT ANTI JOIN] | | join predicates: l3.l_suppkey != l1.l_suppkey | | | |--15:NESTED LOOP JOIN [RIGHT SEMI JOIN] | | | join predicates: l2.l_suppkey != l1.l_suppkey | | | | | |--12:SINGULAR ROW SRC | | | | | 13:UNNEST [o.o_lineitems l2] | | | 14:UNNEST [o.o_lineitems l3] | 17:HASH JOIN [INNER JOIN] | hash predicates: s_nationkey = n_nationkey | runtime filters: RF000 <- n_nationkey | |--10:SCAN HDFS [tpch_nested_parquet.region.r_nations n] | partitions=1/1 files=1 size=4.18KB | predicates: n_name = 'SAUDI ARABIA' | 11:HASH JOIN [INNER JOIN] | hash predicates: l1.l_suppkey = s_suppkey | |--00:SCAN HDFS [tpch_nested_parquet.supplier s] | partitions=1/1 files=1 size=111.08MB | runtime filters: RF000 -> s_nationkey | 02:SUBPLAN | |--09:NESTED LOOP JOIN [CROSS JOIN] | | | |--03:SINGULAR ROW SRC | | | 05:SUBPLAN | | | |--08:NESTED LOOP JOIN [CROSS JOIN] | | | | | |--06:SINGULAR ROW SRC | | | | | 07:UNNEST [o.o_lineitems l1] | | | 04:UNNEST [c.c_orders o] | 01:SCAN HDFS [tpch_nested_parquet.customer c] 
partitions=1/1 files=4 size=577.87MB predicates: !empty(c.c_orders) predicates on o: !empty(o.o_lineitems), o_orderstatus = 'F' predicates on l1: l1.l_receiptdate > l1.l_commitdate predicates on l3: l3.l_receiptdate > l3.l_commitdate Verbose plan: F00:PLAN FRAGMENT [UNPARTITIONED] 20:TOP-N [LIMIT=100] | order by: count(*) DESC, s_name ASC | hosts=3 per-host-mem=unavailable | tuple-ids=10 row-size=42B cardinality=100 | 19:AGGREGATE [FINALIZE] | output: count(*) | group by: s_name | hosts=3 per-host-mem=unavailable | tuple-ids=9 row-size=42B cardinality=9965 | 18:SUBPLAN | hosts=3 per-host-mem=unavailable | tuple-ids=3,2,1,0,4 row-size=182B cardinality=15000000 | |--16:NESTED LOOP JOIN [RIGHT ANTI JOIN] | | join predicates: l3.l_suppkey != l1.l_suppkey | | hosts=3 per-host-mem=unavailable | | tuple-ids=3,2,1,0,4 row-size=182B cardinality=1 | | | |--15:NESTED LOOP JOIN [RIGHT SEMI JOIN] | | | join predicates: l2.l_suppkey != l1.l_suppkey | | | hosts=3 per-host-mem=unavailable | | | tuple-ids=3,2,1,0,4 row-size=182B cardinality=1 | | | | | |--12:SINGULAR ROW SRC | | | parent-subplan=18 | | | hosts=3 per-host-mem=unavailable | | | tuple-ids=3,2,1,0,4 row-size=182B cardinality=1 | | | | | 13:UNNEST [o.o_lineitems l2] | | parent-subplan=18 | | hosts=3 per-host-mem=unavailable | | tuple-ids=5 row-size=8B cardinality=10 | | | 14:UNNEST [o.o_lineitems l3] | parent-subplan=18 | hosts=3 per-host-mem=unavailable | tuple-ids=7 row-size=40B cardinality=10 | 17:HASH JOIN [INNER JOIN] | hash predicates: s_nationkey = n_nationkey | runtime filters: RF000 <- n_nationkey | hosts=3 per-host-mem=unavailable | tuple-ids=3,2,1,0,4 row-size=182B cardinality=15000000 | |--10:SCAN HDFS [tpch_nested_parquet.region.r_nations n] | partitions=1/1 files=1 size=3.24KB | predicates: n_name = 'SAUDI ARABIA' | table stats: 5 rows total | column stats: all | hosts=1 per-host-mem=unavailable | tuple-ids=4 row-size=18B cardinality=5 | 11:HASH JOIN [INNER JOIN] | hash predicates: l1.l_suppkey = s_suppkey 
| hosts=3 per-host-mem=unavailable | tuple-ids=3,2,1,0 row-size=164B cardinality=15000000 | |--00:SCAN HDFS [tpch_nested_parquet.supplier s] | partitions=1/1 files=1 size=43.00MB | runtime filters: RF000 -> s_nationkey | table stats: 10000 rows total | column stats: all | hosts=1 per-host-mem=unavailable | tuple-ids=0 row-size=44B cardinality=10000 | 02:SUBPLAN | hosts=3 per-host-mem=unavailable | tuple-ids=3,2,1 row-size=120B cardinality=15000000 | |--09:NESTED LOOP JOIN [CROSS JOIN] | | hosts=3 per-host-mem=unavailable | | tuple-ids=3,2,1 row-size=120B cardinality=100 | | | |--03:SINGULAR ROW SRC | | parent-subplan=02 | | hosts=3 per-host-mem=unavailable | | tuple-ids=1 row-size=16B cardinality=1 | | | 05:SUBPLAN | | hosts=3 per-host-mem=unavailable | | tuple-ids=3,2 row-size=104B cardinality=100 | | | |--08:NESTED LOOP JOIN [CROSS JOIN] | | | hosts=3 per-host-mem=unavailable | | | tuple-ids=3,2 row-size=104B cardinality=10 | | | | | |--06:SINGULAR ROW SRC | | | parent-subplan=05 | | | hosts=3 per-host-mem=unavailable | | | tuple-ids=2 row-size=64B cardinality=1 | | | | | 07:UNNEST [o.o_lineitems l1] | | parent-subplan=05 | | hosts=3 per-host-mem=unavailable | | tuple-ids=3 row-size=0B cardinality=10 | | | 04:UNNEST [c.c_orders o] | parent-subplan=02 | hosts=3 per-host-mem=unavailable | tuple-ids=2 row-size=0B cardinality=10 | 01:SCAN HDFS [tpch_nested_parquet.customer c] partitions=1/1 files=4 size=292.35MB predicates: !empty(c.c_orders) predicates on o: o_orderstatus = 'F', !empty(o.o_lineitems) predicates on l1: l1.l_receiptdate > l1.l_commitdate predicates on l3: l3.l_receiptdate > l3.l_commitdate table stats: 150000 rows total column stats: unavailable hosts=3 per-host-mem=unavailable tuple-ids=1 row-size=16B cardinality=150000 Section DISTRIBUTEDPLAN of query: select s_name, count(*) as numwait from supplier s, customer c, c.c_orders o, o.o_lineitems l1, region.r_nations n where s_suppkey = l1.l_suppkey and o_orderstatus = 'F' and l1.l_receiptdate > 
l1.l_commitdate and exists ( select * from o.o_lineitems l2 where l2.l_suppkey <> l1.l_suppkey ) and not exists ( select * from o.o_lineitems l3 where l3.l_suppkey <> l1.l_suppkey and l3.l_receiptdate > l3.l_commitdate ) and s_nationkey = n_nationkey and n_name = 'SAUDI ARABIA' group by s_name order by numwait desc, s_name limit 100 Actual does not match expected result: 25:MERGING-EXCHANGE [UNPARTITIONED] | order by: count(*) DESC, s_name ASC | limit: 100 | 20:TOP-N [LIMIT=100] | order by: count(*) DESC, s_name ASC | 24:AGGREGATE [FINALIZE] | output: count:merge(*) | group by: s_name | 23:EXCHANGE [HASH(s_name)] | 19:AGGREGATE [STREAMING] | output: count(*) | group by: s_name | 18:SUBPLAN | |--16:NESTED LOOP JOIN [RIGHT ANTI JOIN] | | join predicates: l3.l_suppkey != l1.l_suppkey | | | |--15:NESTED LOOP JOIN [RIGHT SEMI JOIN] | | | join predicates: l2.l_suppkey != l1.l_suppkey | | | | | |--12:SINGULAR ROW SRC | | | | | 13:UNNEST [o.o_lineitems l2] | | | 14:UNNEST [o.o_lineitems l3] | 17:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: s_nationkey = n_nationkey | runtime filters: RF000 <- n_nationkey | |--22:EXCHANGE [BROADCAST] | | | 10:SCAN HDFS [tpch_nested_parquet.region.r_nations n] | partitions=1/1 files=1 size=3.24KB | predicates: n_name = 'SAUDI ARABIA' | 11:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: l1.l_suppkey = s_suppkey | |--21:EXCHANGE [BROADCAST] | | | 00:SCAN HDFS [tpch_nested_parquet.supplier s] | partitions=1/1 files=1 size=43.00MB | runtime filters: RF000 -> s_nationkey | 02:SUBPLAN | |--09:NESTED LOOP JOIN [CROSS JOIN] | | | |--03:SINGULAR ROW SRC | | | 05:SUBPLAN | | | |--08:NESTED LOOP JOIN [CROSS JOIN] | | | | | |--06:SINGULAR ROW SRC | | | | | 07:UNNEST [o.o_lineitems l1] | | | 04:UNNEST [c.c_orders o] | 01:SCAN HDFS [tpch_nested_parquet.customer c] partitions=1/1 files=4 size=292.35MB predicates: !empty(c.c_orders) predicates on o: o_orderstatus = 'F', !empty(o.o_lineitems) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ predicates on l1: l1.l_receiptdate > l1.l_commitdate predicates on l3: l3.l_receiptdate > l3.l_commitdate Expected: 25:MERGING-EXCHANGE [UNPARTITIONED] | order by: count(*) DESC, s_name ASC | limit: 100 | 20:TOP-N [LIMIT=100] | order by: count(*) DESC, s_name ASC | 24:AGGREGATE [FINALIZE] | output: count:merge(*) | group by: s_name | 23:EXCHANGE [HASH(s_name)] | 19:AGGREGATE [STREAMING] | output: count(*) | group by: s_name | 18:SUBPLAN | |--16:NESTED LOOP JOIN [RIGHT ANTI JOIN] | | join predicates: l3.l_suppkey != l1.l_suppkey | | | |--15:NESTED LOOP JOIN [RIGHT SEMI JOIN] | | | join predicates: l2.l_suppkey != l1.l_suppkey | | | | | |--12:SINGULAR ROW SRC | | | | | 13:UNNEST [o.o_lineitems l2] | | | 14:UNNEST [o.o_lineitems l3] | 17:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: s_nationkey = n_nationkey | runtime filters: RF000 <- n_nationkey | |--22:EXCHANGE [BROADCAST] | | | 10:SCAN HDFS [tpch_nested_parquet.region.r_nations n] | partitions=1/1 files=1 size=4.18KB | predicates: n_name = 'SAUDI ARABIA' | 11:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: l1.l_suppkey = s_suppkey | |--21:EXCHANGE [BROADCAST] | | | 00:SCAN HDFS [tpch_nested_parquet.supplier s] | partitions=1/1 files=1 size=111.08MB | runtime filters: RF000 -> s_nationkey | 02:SUBPLAN | |--09:NESTED LOOP JOIN [CROSS JOIN] | | | |--03:SINGULAR ROW SRC | | | 05:SUBPLAN | | | |--08:NESTED LOOP JOIN [CROSS JOIN] | | | | | |--06:SINGULAR ROW SRC | | | | | 07:UNNEST [o.o_lineitems l1] | | | 04:UNNEST [c.c_orders o] | 01:SCAN HDFS [tpch_nested_parquet.customer c] partitions=1/1 files=4 size=577.87MB predicates: !empty(c.c_orders) predicates on o: !empty(o.o_lineitems), o_orderstatus = 'F' predicates on l1: l1.l_receiptdate > l1.l_commitdate predicates on l3: l3.l_receiptdate > l3.l_commitdate Verbose plan: F04:PLAN FRAGMENT [UNPARTITIONED] 25:MERGING-EXCHANGE [UNPARTITIONED] order by: count(*) DESC, s_name ASC limit: 100 hosts=3 
per-host-mem=unavailable tuple-ids=10 row-size=42B cardinality=100 F03:PLAN FRAGMENT [HASH(s_name)] DATASTREAM SINK [FRAGMENT=F04, EXCHANGE=25, UNPARTITIONED] 20:TOP-N [LIMIT=100] | order by: count(*) DESC, s_name ASC | hosts=3 per-host-mem=4.10KB | tuple-ids=10 row-size=42B cardinality=100 | 24:AGGREGATE [FINALIZE] | output: count:merge(*) | group by: s_name | hosts=3 per-host-mem=10.00MB | tuple-ids=9 row-size=42B cardinality=9965 | 23:EXCHANGE [HASH(s_name)] hosts=3 per-host-mem=0B tuple-ids=9 row-size=42B cardinality=9965 F00:PLAN FRAGMENT [RANDOM] DATASTREAM SINK [FRAGMENT=F03, EXCHANGE=23, HASH(s_name)] 19:AGGREGATE [STREAMING] | output: count(*) | group by: s_name | hosts=3 per-host-mem=10.00MB | tuple-ids=9 row-size=42B cardinality=9965 | 18:SUBPLAN | hosts=3 per-host-mem=0B | tuple-ids=3,2,1,0,4 row-size=182B cardinality=15000000 | |--16:NESTED LOOP JOIN [RIGHT ANTI JOIN] | | join predicates: l3.l_suppkey != l1.l_suppkey | | hosts=3 per-host-mem=182B | | tuple-ids=3,2,1,0,4 row-size=182B cardinality=1 | | | |--15:NESTED LOOP JOIN [RIGHT SEMI JOIN] | | | join predicates: l2.l_suppkey != l1.l_suppkey | | | hosts=3 per-host-mem=182B | | | tuple-ids=3,2,1,0,4 row-size=182B cardinality=1 | | | | | |--12:SINGULAR ROW SRC | | | parent-subplan=18 | | | hosts=3 per-host-mem=0B | | | tuple-ids=3,2,1,0,4 row-size=182B cardinality=1 | | | | | 13:UNNEST [o.o_lineitems l2] | | parent-subplan=18 | | hosts=3 per-host-mem=0B | | tuple-ids=5 row-size=8B cardinality=10 | | | 14:UNNEST [o.o_lineitems l3] | parent-subplan=18 | hosts=3 per-host-mem=0B | tuple-ids=7 row-size=40B cardinality=10 | 17:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: s_nationkey = n_nationkey | runtime filters: RF000 <- n_nationkey | hosts=3 per-host-mem=100B | tuple-ids=3,2,1,0,4 row-size=182B cardinality=15000000 | |--22:EXCHANGE [BROADCAST] | hosts=1 per-host-mem=0B | tuple-ids=4 row-size=18B cardinality=5 | 11:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: l1.l_suppkey = s_suppkey | 
hosts=3 per-host-mem=472.66KB | tuple-ids=3,2,1,0 row-size=164B cardinality=15000000 | |--21:EXCHANGE [BROADCAST] | hosts=1 per-host-mem=0B | tuple-ids=0 row-size=44B cardinality=10000 | 02:SUBPLAN | hosts=3 per-host-mem=0B | tuple-ids=3,2,1 row-size=120B cardinality=15000000 | |--09:NESTED LOOP JOIN [CROSS JOIN] | | hosts=3 per-host-mem=16B | | tuple-ids=3,2,1 row-size=120B cardinality=100 | | | |--03:SINGULAR ROW SRC | | parent-subplan=02 | | hosts=3 per-host-mem=0B | | tuple-ids=1 row-size=16B cardinality=1 | | | 05:SUBPLAN | | hosts=3 per-host-mem=0B | | tuple-ids=3,2 row-size=104B cardinality=100 | | | |--08:NESTED LOOP JOIN [CROSS JOIN] | | | hosts=3 per-host-mem=64B | | | tuple-ids=3,2 row-size=104B cardinality=10 | | | | | |--06:SINGULAR ROW SRC | | | parent-subplan=05 | | | hosts=3 per-host-mem=0B | | | tuple-ids=2 row-size=64B cardinality=1 | | | | | 07:UNNEST [o.o_lineitems l1] | | parent-subplan=05 | | hosts=3 per-host-mem=0B | | tuple-ids=3 row-size=0B cardinality=10 | | | 04:UNNEST [c.c_orders o] | parent-subplan=02 | hosts=3 per-host-mem=0B | tuple-ids=2 row-size=0B cardinality=10 | 01:SCAN HDFS [tpch_nested_parquet.customer c, RANDOM] partitions=1/1 files=4 size=292.35MB predicates: !empty(c.c_orders) predicates on o: o_orderstatus = 'F', !empty(o.o_lineitems) predicates on l1: l1.l_receiptdate > l1.l_commitdate predicates on l3: l3.l_receiptdate > l3.l_commitdate table stats: 150000 rows total column stats: unavailable hosts=3 per-host-mem=88.00MB tuple-ids=1 row-size=16B cardinality=150000 F02:PLAN FRAGMENT [RANDOM] DATASTREAM SINK [FRAGMENT=F00, EXCHANGE=22, BROADCAST] 10:SCAN HDFS [tpch_nested_parquet.region.r_nations n, RANDOM] partitions=1/1 files=1 size=3.24KB predicates: n_name = 'SAUDI ARABIA' table stats: 5 rows total column stats: all hosts=1 per-host-mem=32.00MB tuple-ids=4 row-size=18B cardinality=5 F01:PLAN FRAGMENT [RANDOM] DATASTREAM SINK [FRAGMENT=F00, EXCHANGE=21, BROADCAST] 00:SCAN HDFS [tpch_nested_parquet.supplier s, RANDOM] 
partitions=1/1 files=1 size=43.00MB runtime filters: RF000 -> s_nationkey table stats: 10000 rows total column stats: all hosts=1 per-host-mem=168.00MB tuple-ids=0 row-size=44B cardinality=10000 at org.junit.Assert.fail(Assert.java:88) at org.apache.impala.planner.PlannerTestBase.runPlannerTestFile(PlannerTestBase.java:682) at org.apache.impala.planner.PlannerTestBase.runPlannerTestFile(PlannerTestBase.java:691) at org.apache.impala.planner.PlannerTest.testTpchNested(PlannerTest.java:199) testRuntimeFilterPropagation(org.apache.impala.planner.PlannerTest) Time elapsed: 0.497 sec <<< FAILURE! java.lang.AssertionError: Section PLAN of query: with big_six as ( select straight_join a.id from functional.alltypes a inner join functional.alltypes b on a.id = b.id and a.bigint_col = b.bigint_col and a.bool_col = b.bool_col and a.int_col = b.int_col and a.smallint_col = b.smallint_col and a.tinyint_col = b.tinyint_col ), small_two as ( select straight_join a.bool_col from functional.alltypes a inner join functional.alltypestiny b on a.id = b.id and a.bool_col = b.bool_col ), big_eight as ( select straight_join a.id from functional.alltypes a inner join functional.alltypes b on a.id = b.id and a.bool_col = b.bool_col and a.date_string_col = b.date_string_col and a.double_col = b.double_col and a.smallint_col = b.smallint_col and a.string_col = b.string_col and a.timestamp_col = b.timestamp_col and a.tinyint_col = b.tinyint_col ), small_four as ( select straight_join a.bool_col from functional.alltypes a inner join functional.alltypestiny b on a.id = b.id and a.bigint_col = b.bigint_col and a.bool_col = b.bool_col and a.double_col = b.double_col and a.float_col = b.float_col and a.int_col = b.int_col and a.smallint_col = b.smallint_col and a.tinyint_col = b.tinyint_col ), big_one as ( select straight_join a.id from functional.alltypes a inner join functional.alltypes b on a.id = b.id ), nan as ( with zero_card as ( select straight_join b.id, b.int_col from (values(1 id) limit 
0) a inner join functional.alltypes b on a.id = b.id ) select straight_join 1 from zero_card z inner join functional.alltypestiny x on x.id = z.id ), small_six as ( select straight_join a.bool_col from functional.alltypes a inner join functional.alltypestiny b on a.id = b.id and a.bigint_col = b.bigint_col and a.bool_col = b.bool_col and a.int_col = b.int_col and a.smallint_col = b.smallint_col and a.tinyint_col = b.tinyint_col ), big_three as ( select straight_join a.id from functional.alltypes a inner join functional.alltypes b on a.id = b.id and a.bool_col = b.bool_col and a.tinyint_col = b.tinyint_col ), small_four_2 as ( select straight_join a.bool_col from functional.alltypes a inner join functional.alltypestiny b on a.id = b.id and a.bigint_col = b.bigint_col and a.bool_col = b.bool_col and a.double_col = b.double_col and a.float_col = b.float_col and a.int_col = b.int_col and a.smallint_col = b.smallint_col and a.tinyint_col = b.tinyint_col ) select straight_join 1 from big_six inner join small_two inner join big_eight inner join small_four inner join big_one inner join nan inner join small_six inner join big_three inner join small_four_2 Actual does not match expected result: 36:NESTED LOOP JOIN [CROSS JOIN] | |--28:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.double_col = b.double_col, a.float_col = b.float_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | | |--27:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 26:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 35:NESTED LOOP JOIN [CROSS JOIN] | |--25:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.id = b.id, a.tinyint_col = b.tinyint_col | | | |--24:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | | 23:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB 
| 34:NESTED LOOP JOIN [CROSS JOIN] | |--22:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | | |--21:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 20:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 33:NESTED LOOP JOIN [CROSS JOIN] | |--19:HASH JOIN [INNER JOIN] | | hash predicates: b.id = x.id | | | |--18:SCAN HDFS [functional.alltypestiny x] | | partitions=4/4 files=4 size=460B | | | 17:HASH JOIN [INNER JOIN] | | hash predicates: id = b.id | | | |--16:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | | 15:EMPTYSET | 32:NESTED LOOP JOIN [CROSS JOIN] | |--14:HASH JOIN [INNER JOIN] | | hash predicates: a.id = b.id | | | |--13:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | | 12:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 31:NESTED LOOP JOIN [CROSS JOIN] | |--11:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.double_col = b.double_col, a.float_col = b.float_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | runtime filters: RF017 <- b.bool_col, RF016 <- b.bigint_col, RF019 <- b.float_col, RF018 <- b.double_col, RF021 <- b.int_col, RF020 <- b.id, RF023 <- b.tinyint_col, RF022 <- b.smallint_col ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | | | |--10:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 09:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | runtime filters: RF017 -> a.bool_col, RF016 -> a.bigint_col, RF019 -> a.float_col, RF018 -> 
a.double_col, RF021 -> a.int_col, RF020 -> a.id, RF023 -> a.tinyint_col, RF022 -> a.smallint_col | 30:NESTED LOOP JOIN [CROSS JOIN] | |--08:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.double_col = b.double_col, a.id = b.id, a.smallint_col = b.smallint_col, a.timestamp_col = b.timestamp_col, a.tinyint_col = b.tinyint_col, a.string_col = b.string_col, a.date_string_col = b.date_string_col | | | |--07:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | | 06:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 29:NESTED LOOP JOIN [CROSS JOIN] | |--05:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.id = b.id | | runtime filters: RF006 <- b.bool_col, RF007 <- b.id | | | |--04:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 03:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | runtime filters: RF006 -> a.bool_col, RF007 -> a.id | 02:HASH JOIN [INNER JOIN] | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | |--01:SCAN HDFS [functional.alltypes b] | partitions=24/24 files=24 size=478.45KB | 00:SCAN HDFS [functional.alltypes a] partitions=24/24 files=24 size=478.45KB Expected: 36:NESTED LOOP JOIN [CROSS JOIN] | |--28:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.double_col = b.double_col, a.float_col = b.float_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | | |--27:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 26:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 35:NESTED LOOP JOIN [CROSS JOIN] | |--25:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.id = b.id, a.tinyint_col = b.tinyint_col | | | 
|--24:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | | 23:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 34:NESTED LOOP JOIN [CROSS JOIN] | |--22:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | | |--21:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 20:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 33:NESTED LOOP JOIN [CROSS JOIN] | |--19:HASH JOIN [INNER JOIN] | | hash predicates: b.id = x.id | | | |--18:SCAN HDFS [functional.alltypestiny x] | | partitions=4/4 files=4 size=460B | | | 17:HASH JOIN [INNER JOIN] | | hash predicates: id = b.id | | | |--16:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | | 15:EMPTYSET | 32:NESTED LOOP JOIN [CROSS JOIN] | |--14:HASH JOIN [INNER JOIN] | | hash predicates: a.id = b.id | | | |--13:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | | 12:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 31:NESTED LOOP JOIN [CROSS JOIN] | |--11:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.double_col = b.double_col, a.float_col = b.float_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | runtime filters: RF016 <- b.bigint_col, RF017 <- b.bool_col, RF018 <- b.double_col, RF019 <- b.float_col, RF020 <- b.id, RF021 <- b.int_col, RF022 <- b.smallint_col, RF023 <- b.tinyint_col | | | |--10:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 09:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | runtime filters: RF016 -> a.bigint_col, RF017 -> a.bool_col, RF018 -> a.double_col, RF019 -> a.float_col, RF020 -> a.id, 
RF021 -> a.int_col, RF022 -> a.smallint_col, RF023 -> a.tinyint_col | 30:NESTED LOOP JOIN [CROSS JOIN] | |--08:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.double_col = b.double_col, a.id = b.id, a.smallint_col = b.smallint_col, a.timestamp_col = b.timestamp_col, a.tinyint_col = b.tinyint_col, a.string_col = b.string_col, a.date_string_col = b.date_string_col | | | |--07:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | | 06:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 29:NESTED LOOP JOIN [CROSS JOIN] | |--05:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.id = b.id | | runtime filters: RF006 <- b.bool_col, RF007 <- b.id | | | |--04:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 03:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | runtime filters: RF006 -> a.bool_col, RF007 -> a.id | 02:HASH JOIN [INNER JOIN] | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | |--01:SCAN HDFS [functional.alltypes b] | partitions=24/24 files=24 size=478.45KB | 00:SCAN HDFS [functional.alltypes a] partitions=24/24 files=24 size=478.45KB Verbose plan: F00:PLAN FRAGMENT [UNPARTITIONED] 36:NESTED LOOP JOIN [CROSS JOIN] | hosts=3 per-host-mem=unavailable | tuple-ids=0,1,3,4,6,7,9,10,12,13,15,17,19,21,22,24,25,27,28 row-size=393B cardinality=0 | |--28:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.double_col = b.double_col, a.float_col = b.float_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | hosts=3 per-host-mem=unavailable | | tuple-ids=27,28 row-size=64B cardinality=8 | | | |--27:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | table stats: 8 rows total 
| | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=28 row-size=32B cardinality=8 | | | 26:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=27 row-size=32B cardinality=7300 | 35:NESTED LOOP JOIN [CROSS JOIN] | hosts=3 per-host-mem=unavailable | tuple-ids=0,1,3,4,6,7,9,10,12,13,15,17,19,21,22,24,25 row-size=329B cardinality=0 | |--25:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.id = b.id, a.tinyint_col = b.tinyint_col | | hosts=3 per-host-mem=unavailable | | tuple-ids=24,25 row-size=12B cardinality=7300 | | | |--24:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | table stats: 7300 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=25 row-size=6B cardinality=7300 | | | 23:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=24 row-size=6B cardinality=7300 | 34:NESTED LOOP JOIN [CROSS JOIN] | hosts=3 per-host-mem=unavailable | tuple-ids=0,1,3,4,6,7,9,10,12,13,15,17,19,21,22 row-size=317B cardinality=0 | |--22:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | hosts=3 per-host-mem=unavailable | | tuple-ids=21,22 row-size=40B cardinality=8 | | | |--21:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | table stats: 8 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=22 row-size=20B cardinality=8 | | | 20:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=21 row-size=20B cardinality=7300 | 
33:NESTED LOOP JOIN [CROSS JOIN] | hosts=3 per-host-mem=unavailable | tuple-ids=0,1,3,4,6,7,9,10,12,13,15,17,19 row-size=277B cardinality=0 | |--19:HASH JOIN [INNER JOIN] | | hash predicates: b.id = x.id | | hosts=1 per-host-mem=unavailable | | tuple-ids=15,17,19 row-size=9B cardinality=0 | | | |--18:SCAN HDFS [functional.alltypestiny x] | | partitions=4/4 files=4 size=460B | | table stats: 8 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=19 row-size=4B cardinality=8 | | | 17:HASH JOIN [INNER JOIN] | | hash predicates: id = b.id | | hosts=1 per-host-mem=unavailable | | tuple-ids=15,17 row-size=5B cardinality=0 | | | |--16:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | table stats: 7300 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=17 row-size=4B cardinality=7300 | | | 15:EMPTYSET | hosts=1 per-host-mem=0B | tuple-ids=15 row-size=0B cardinality=0 | 32:NESTED LOOP JOIN [CROSS JOIN] | hosts=3 per-host-mem=unavailable | tuple-ids=0,1,3,4,6,7,9,10,12,13 row-size=268B cardinality=24897088000000 | |--14:HASH JOIN [INNER JOIN] | | hash predicates: a.id = b.id | | hosts=3 per-host-mem=unavailable | | tuple-ids=12,13 row-size=8B cardinality=7300 | | | |--13:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | table stats: 7300 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=13 row-size=4B cardinality=7300 | | | 12:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=12 row-size=4B cardinality=7300 | 31:NESTED LOOP JOIN [CROSS JOIN] | hosts=3 per-host-mem=unavailable | tuple-ids=0,1,3,4,6,7,9,10 row-size=260B cardinality=3410560000 | |--11:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.double_col = b.double_col, a.float_col = 
b.float_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | runtime filters: RF017 <- b.bool_col, RF016 <- b.bigint_col, RF019 <- b.float_col, RF018 <- b.double_col, RF021 <- b.int_col, RF020 <- b.id, RF023 <- b.tinyint_col, RF022 <- b.smallint_col | | hosts=3 per-host-mem=unavailable | | tuple-ids=9,10 row-size=64B cardinality=8 | | | |--10:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | table stats: 8 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=10 row-size=32B cardinality=8 | | | 09:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | runtime filters: RF017 -> a.bool_col, RF016 -> a.bigint_col, RF019 -> a.float_col, RF018 -> a.double_col, RF021 -> a.int_col, RF020 -> a.id, RF023 -> a.tinyint_col, RF022 -> a.smallint_col | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=9 row-size=32B cardinality=7300 | 30:NESTED LOOP JOIN [CROSS JOIN] | hosts=3 per-host-mem=unavailable | tuple-ids=0,1,3,4,6,7 row-size=196B cardinality=426320000 | |--08:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.double_col = b.double_col, a.id = b.id, a.smallint_col = b.smallint_col, a.timestamp_col = b.timestamp_col, a.tinyint_col = b.tinyint_col, a.string_col = b.string_col, a.date_string_col = b.date_string_col | | hosts=3 per-host-mem=unavailable | | tuple-ids=6,7 row-size=146B cardinality=7300 | | | |--07:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | table stats: 7300 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=7 row-size=73B cardinality=7300 | | | 06:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=6 row-size=73B cardinality=7300 | 29:NESTED LOOP JOIN [CROSS JOIN] | 
hosts=3 per-host-mem=unavailable | tuple-ids=0,1,3,4 row-size=50B cardinality=58400 | |--05:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.id = b.id | | runtime filters: RF006 <- b.bool_col, RF007 <- b.id | | hosts=3 per-host-mem=unavailable | | tuple-ids=3,4 row-size=10B cardinality=8 | | | |--04:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | table stats: 8 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=4 row-size=5B cardinality=8 | | | 03:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | runtime filters: RF006 -> a.bool_col, RF007 -> a.id | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=3 row-size=5B cardinality=7300 | 02:HASH JOIN [INNER JOIN] | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | hosts=3 per-host-mem=unavailable | tuple-ids=0,1 row-size=40B cardinality=7300 | |--01:SCAN HDFS [functional.alltypes b] | partitions=24/24 files=24 size=478.45KB | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=1 row-size=20B cardinality=7300 | 00:SCAN HDFS [functional.alltypes a] partitions=24/24 files=24 size=478.45KB table stats: 7300 rows total column stats: all hosts=3 per-host-mem=unavailable tuple-ids=0 row-size=20B cardinality=7300 at org.junit.Assert.fail(Assert.java:88) at org.apache.impala.planner.PlannerTestBase.runPlannerTestFile(PlannerTestBase.java:682) at org.apache.impala.planner.PlannerTestBase.runPlannerTestFile(PlannerTestBase.java:646) at org.apache.impala.planner.PlannerTest.testRuntimeFilterPropagation(PlannerTest.java:243) testLineage(org.apache.impala.planner.PlannerTest) Time elapsed: 0.918 sec <<< FAILURE! 
java.lang.AssertionError: section LINEAGE of query: select * from ( select tinyint_col + int_col x from functional.alltypes union all select sum(bigint_col) y from (select bigint_col from functional.alltypes) v1) v2 Output: {"queryText":"select * from (\n select tinyint_col + int_col x from functional.alltypes\n union all\n select sum(bigint_col) y from (select bigint_col from functional.alltypes) v1) v2","hash":"25456c60a2e874a20732f42c7af27553","user":"lv","timestamp":1475939633,"edges":[{"sources":[1,2,3],"targets":[0],"edgeType":"PROJECTION"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"x"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.alltypes.int_col"},{"id":2,"vertexType":"COLUMN","vertexId":"functional.alltypes.tinyint_col"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.alltypes.bigint_col"}]} Expected: { "queryText":"select * from (\n select tinyint_col + int_col x from functional.alltypes\n union all\n select sum(bigint_col) y from (select bigint_col from functional.alltypes) v1) v2", "hash":"25456c60a2e874a20732f42c7af27553", "user":"dev", "timestamp":1446159271, "edges":[ { "sources":[ 1, 2, 3 ], "targets":[ 0 ], "edgeType":"PROJECTION" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"x" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.alltypes.int_col" }, { "id":2, "vertexType":"COLUMN", "vertexId":"functional.alltypes.bigint_col" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.alltypes.tinyint_col" } ] } section LINEAGE of query: select sum(a.tinyint_col) over (partition by a.smallint_col order by a.id), count(b.string_col), b.timestamp_col from functional.alltypes a join functional.alltypessmall b on (a.id = b.id) where a.year = 2010 and b.float_col > 0 group by a.tinyint_col, a.smallint_col, a.id, b.string_col, b.timestamp_col, b.bigint_col having count(a.int_col) > 10 order by b.bigint_col limit 10 Output: {"queryText":"select sum(a.tinyint_col) over (partition by a.smallint_col order by 
a.id),\n count(b.string_col), b.timestamp_col\nfrom functional.alltypes a join functional.alltypessmall b on (a.id = b.id)\nwhere a.year = 2010 and b.float_col > 0\ngroup by a.tinyint_col, a.smallint_col, a.id, b.string_col, b.timestamp_col, b.bigint_col\nhaving count(a.int_col) > 10\norder by b.bigint_col limit 10","hash":"e0309eeff9811f53c82657d62c1e04eb","user":"lv","timestamp":1475939633,"edges":[{"sources":[1],"targets":[0],"edgeType":"PROJECTION"},{"sources":[2,3],"targets":[0],"edgeType":"PREDICATE"},{"sources":[5],"targets":[4],"edgeType":"PROJECTION"},{"sources":[7],"targets":[6],"edgeType":"PROJECTION"},{"sources":[1,2,3,5,7,8,9,10,11,12],"targets":[0,4,6],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"sum(a.tinyint_col) OVER(...)"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.alltypes.tinyint_col"},{"id":2,"vertexType":"COLUMN","vertexId":"functional.alltypes.id"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.alltypes.smallint_col"},{"id":4,"vertexType":"COLUMN","vertexId":"count(b.string_col)"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.string_col"},{"id":6,"vertexType":"COLUMN","vertexId":"timestamp_col"},{"id":7,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.timestamp_col"},{"id":8,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.id"},{"id":9,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.float_col"},{"id":10,"vertexType":"COLUMN","vertexId":"functional.alltypes.int_col"},{"id":11,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.bigint_col"},{"id":12,"vertexType":"COLUMN","vertexId":"functional.alltypes.year"}]} Expected: { "queryText":"select sum(a.tinyint_col) over (partition by a.smallint_col order by a.id),\n count(b.string_col), b.timestamp_col\nfrom functional.alltypes a join functional.alltypessmall b on (a.id = b.id)\nwhere a.year = 2010 and b.float_col > 0\ngroup by a.tinyint_col, a.smallint_col, a.id, b.string_col, 
b.timestamp_col, b.bigint_col\nhaving count(a.int_col) > 10\norder by b.bigint_col limit 10", "hash":"e0309eeff9811f53c82657d62c1e04eb", "user":"dev", "timestamp":1446159271, "edges":[ { "sources":[ 1 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 2, 3 ], "targets":[ 0 ], "edgeType":"PREDICATE" }, { "sources":[ 5 ], "targets":[ 4 ], "edgeType":"PROJECTION" }, { "sources":[ 7 ], "targets":[ 6 ], "edgeType":"PROJECTION" }, { "sources":[ 1, 2, 3, 5, 7, 8, 9, 10, 11, 12 ], "targets":[ 0, 4, 6 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"sum(a.tinyint_col) OVER(...)" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.alltypes.tinyint_col" }, { "id":2, "vertexType":"COLUMN", "vertexId":"functional.alltypes.smallint_col" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.alltypes.id" }, { "id":4, "vertexType":"COLUMN", "vertexId":"count(b.string_col)" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.string_col" }, { "id":6, "vertexType":"COLUMN", "vertexId":"timestamp_col" }, { "id":7, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.timestamp_col" }, { "id":8, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.id" }, { "id":9, "vertexType":"COLUMN", "vertexId":"functional.alltypes.int_col" }, { "id":10, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.bigint_col" }, { "id":11, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.float_col" }, { "id":12, "vertexType":"COLUMN", "vertexId":"functional.alltypes.year" } ] } section LINEAGE of query: select x.smallint_col, x.id, x.tinyint_col, c.id, x.int_col, x.float_col, c.string_col from functional.alltypessmall c join ( select a.smallint_col smallint_col, a.tinyint_col tinyint_col, a.day day, a.int_col int_col, a.month month, b.float_col float_col, b.id id from ( select * from functional.alltypesagg a where month=1 ) a join functional.alltypessmall b on (a.smallint_col = b.id) ) x on 
(x.tinyint_col = c.id) where x.day=1 and x.int_col > 899 and x.float_col > 4.5 and c.string_col < '7' and x.int_col + x.float_col + cast(c.string_col as float) < 1000 Output: {"queryText":"select x.smallint_col, x.id, x.tinyint_col, c.id, x.int_col, x.float_col, c.string_col\nfrom functional.alltypessmall c\njoin (\n select a.smallint_col smallint_col, a.tinyint_col tinyint_col, a.day day,\n a.int_col int_col, a.month month, b.float_col float_col, b.id id\n from ( select * from functional.alltypesagg a where month=1 ) a\n join functional.alltypessmall b on (a.smallint_col = b.id)\n ) x on (x.tinyint_col = c.id)\nwhere x.day=1\nand x.int_col > 899\nand x.float_col > 4.5\nand c.string_col < '7'\nand x.int_col + x.float_col + cast(c.string_col as float) < 1000","hash":"4edf165aed5982ede63f7c91074f4b44","user":"lv","timestamp":1475939633,"edges":[{"sources":[1],"targets":[0],"edgeType":"PROJECTION"},{"sources":[3],"targets":[2],"edgeType":"PROJECTION"},{"sources":[5],"targets":[4],"edgeType":"PROJECTION"},{"sources":[3],"targets":[2],"edgeType":"PROJECTION"},{"sources":[7],"targets":[6],"edgeType":"PROJECTION"},{"sources":[9],"targets":[8],"edgeType":"PROJECTION"},{"sources":[11],"targets":[10],"edgeType":"PROJECTION"},{"sources":[1,3,5,7,9,11,12,13],"targets":[0,2,4,6,8,10],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"smallint_col"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.smallint_col"},{"id":2,"vertexType":"COLUMN","vertexId":"id"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.id"},{"id":4,"vertexType":"COLUMN","vertexId":"tinyint_col"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.tinyint_col"},{"id":6,"vertexType":"COLUMN","vertexId":"int_col"},{"id":7,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.int_col"},{"id":8,"vertexType":"COLUMN","vertexId":"float_col"},{"id":9,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.float_col"},{"id":10,"vertexType
":"COLUMN","vertexId":"string_col"},{"id":11,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.string_col"},{"id":12,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.month"},{"id":13,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.day"}]} Expected: { "queryText":"select x.smallint_col, x.id, x.tinyint_col, c.id, x.int_col, x.float_col, c.string_col\nfrom functional.alltypessmall c\njoin (\n select a.smallint_col smallint_col, a.tinyint_col tinyint_col, a.day day,\n a.int_col int_col, a.month month, b.float_col float_col, b.id id\n from ( select * from functional.alltypesagg a where month=1 ) a\n join functional.alltypessmall b on (a.smallint_col = b.id)\n ) x on (x.tinyint_col = c.id)\nwhere x.day=1\nand x.int_col > 899\nand x.float_col > 4.5\nand c.string_col < '7'\nand x.int_col + x.float_col + cast(c.string_col as float) < 1000", "hash":"4edf165aed5982ede63f7c91074f4b44", "user":"dev", "timestamp":1446159272, "edges":[ { "sources":[ 1 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 3 ], "targets":[ 2 ], "edgeType":"PROJECTION" }, { "sources":[ 5 ], "targets":[ 4 ], "edgeType":"PROJECTION" }, { "sources":[ 3 ], "targets":[ 2 ], "edgeType":"PROJECTION" }, { "sources":[ 7 ], "targets":[ 6 ], "edgeType":"PROJECTION" }, { "sources":[ 9 ], "targets":[ 8 ], "edgeType":"PROJECTION" }, { "sources":[ 11 ], "targets":[ 10 ], "edgeType":"PROJECTION" }, { "sources":[ 1, 3, 5, 7, 9, 11, 12, 13 ], "targets":[ 0, 2, 4, 6, 8, 10 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"smallint_col" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.smallint_col" }, { "id":2, "vertexType":"COLUMN", "vertexId":"id" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.id" }, { "id":4, "vertexType":"COLUMN", "vertexId":"tinyint_col" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.tinyint_col" }, { "id":6, "vertexType":"COLUMN", "vertexId":"int_col" }, 
{ "id":7, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.int_col" }, { "id":8, "vertexType":"COLUMN", "vertexId":"float_col" }, { "id":9, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.float_col" }, { "id":10, "vertexType":"COLUMN", "vertexId":"string_col" }, { "id":11, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.string_col" }, { "id":12, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.day" }, { "id":13, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.month" } ] } section LINEAGE of query: select int_col + 1, tinyint_col - 1 from functional.alltypes a where a.int_col < (select max(int_col) from functional.alltypesagg g where g.bool_col = true) and a.bigint_col > 10 Output: {"queryText":"select int_col + 1, tinyint_col - 1\nfrom functional.alltypes a\nwhere a.int_col <\n (select max(int_col) from functional.alltypesagg g where g.bool_col = true)\nand a.bigint_col > 10","hash":"5e6227f323793ea4441e2a3119af2f09","user":"lv","timestamp":1475939633,"edges":[{"sources":[1],"targets":[0],"edgeType":"PROJECTION"},{"sources":[3],"targets":[2],"edgeType":"PROJECTION"},{"sources":[1,4,5,6],"targets":[0,2],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"int_col + 1"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.alltypes.int_col"},{"id":2,"vertexType":"COLUMN","vertexId":"tinyint_col - 1"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.alltypes.tinyint_col"},{"id":4,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.int_col"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.bool_col"},{"id":6,"vertexType":"COLUMN","vertexId":"functional.alltypes.bigint_col"}]} Expected: { "queryText":"select int_col + 1, tinyint_col - 1\nfrom functional.alltypes a\nwhere a.int_col <\n (select max(int_col) from functional.alltypesagg g where g.bool_col = true)\nand a.bigint_col > 10", "hash":"5e6227f323793ea4441e2a3119af2f09", "user":"dev", "timestamp":1446159272, 
"edges":[ { "sources":[ 1 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 3 ], "targets":[ 2 ], "edgeType":"PROJECTION" }, { "sources":[ 1, 4, 5, 6 ], "targets":[ 0, 2 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"int_col + 1" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.alltypes.int_col" }, { "id":2, "vertexType":"COLUMN", "vertexId":"tinyint_col - 1" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.alltypes.tinyint_col" }, { "id":4, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.int_col" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.alltypes.bigint_col" }, { "id":6, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.bool_col" } ] } section LINEAGE of query: select lead(a) over (partition by b order by c) from (select lead(id) over (partition by int_col order by bigint_col) as a, max(id) over (partition by tinyint_col order by int_col) as b, min(int_col) over (partition by string_col order by bool_col) as c from functional.alltypes) v Output: {"queryText":"select lead(a) over (partition by b order by c)\nfrom\n (select lead(id) over (partition by int_col order by bigint_col) as a,\n max(id) over (partition by tinyint_col order by int_col) as b,\n min(int_col) over (partition by string_col order by bool_col) as c\n from functional.alltypes) v","hash":"aa95e5e6f39fc80bb3c318a2515dc77d","user":"lv","timestamp":1475939634,"edges":[{"sources":[1],"targets":[0],"edgeType":"PROJECTION"},{"sources":[1,2,3,4,5,6],"targets":[0],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"lead(a, 1, NULL) 
OVER(...)"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.alltypes.id"},{"id":2,"vertexType":"COLUMN","vertexId":"functional.alltypes.int_col"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.alltypes.tinyint_col"},{"id":4,"vertexType":"COLUMN","vertexId":"functional.alltypes.bigint_col"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.alltypes.string_col"},{"id":6,"vertexType":"COLUMN","vertexId":"functional.alltypes.bool_col"}]} Expected: { "queryText":"select lead(a) over (partition by b order by c)\nfrom\n (select lead(id) over (partition by int_col order by bigint_col) as a,\n max(id) over (partition by tinyint_col order by int_col) as b,\n min(int_col) over (partition by string_col order by bool_col) as c\n from functional.alltypes) v", "hash":"aa95e5e6f39fc80bb3c318a2515dc77d", "user":"dev", "timestamp":1446159272, "edges":[ { "sources":[ 1 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 1, 2, 3, 4, 5, 6 ], "targets":[ 0 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"lead(a, 1, NULL) OVER(...)" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.alltypes.id" }, { "id":2, "vertexType":"COLUMN", "vertexId":"functional.alltypes.int_col" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.alltypes.bool_col" }, { "id":4, "vertexType":"COLUMN", "vertexId":"functional.alltypes.string_col" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.alltypes.bigint_col" }, { "id":6, "vertexType":"COLUMN", "vertexId":"functional.alltypes.tinyint_col" } ] } section LINEAGE of query: create view test_view_lineage (a1, a2, a3, a4, a5, a6, a7) as select x.smallint_col, x.id, x.tinyint_col, c.id, x.int_col, x.float_col, c.string_col from functional.alltypessmall c join ( select a.smallint_col smallint_col, a.tinyint_col tinyint_col, a.day day, a.int_col int_col, a.month month, b.float_col float_col, b.id id from ( select * from functional.alltypesagg a where month=1 ) a join 
functional.alltypessmall b on (a.smallint_col = b.id) ) x on (x.tinyint_col = c.id) where x.day=1 and x.int_col > 899 and x.float_col > 4.5 and c.string_col < '7' and x.int_col + x.float_col + cast(c.string_col as float) < 1000 Output: {"queryText":"create view test_view_lineage (a1, a2, a3, a4, a5, a6, a7) as\n select x.smallint_col, x.id, x.tinyint_col, c.id, x.int_col, x.float_col, c.string_col\n from functional.alltypessmall c\n join (\n select a.smallint_col smallint_col, a.tinyint_col tinyint_col, a.day day,\n a.int_col int_col, a.month month, b.float_col float_col, b.id id\n from ( select * from functional.alltypesagg a where month=1 ) a\n join functional.alltypessmall b on (a.smallint_col = b.id)\n ) x on (x.tinyint_col = c.id)\n where x.day=1\n and x.int_col > 899\n and x.float_col > 4.5\n and c.string_col < '7'\n and x.int_col + x.float_col + cast(c.string_col as float) < 1000","hash":"ffbe643df8f26e92907fb45de1aeda36","user":"lv","timestamp":1475939634,"edges":[{"sources":[1],"targets":[0],"edgeType":"PROJECTION"},{"sources":[3],"targets":[2],"edgeType":"PROJECTION"},{"sources":[5],"targets":[4],"edgeType":"PROJECTION"},{"sources":[3],"targets":[6],"edgeType":"PROJECTION"},{"sources":[8],"targets":[7],"edgeType":"PROJECTION"},{"sources":[10],"targets":[9],"edgeType":"PROJECTION"},{"sources":[12],"targets":[11],"edgeType":"PROJECTION"},{"sources":[1,3,5,8,10,12,13,14],"targets":[0,2,4,6,7,9,11],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"default.test_view_lineage.a1"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.smallint_col"},{"id":2,"vertexType":"COLUMN","vertexId":"default.test_view_lineage.a2"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.id"},{"id":4,"vertexType":"COLUMN","vertexId":"default.test_view_lineage.a3"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.tinyint_col"},{"id":6,"vertexType":"COLUMN","vertexId":"default.test_view_lineage.a4"},{"id":7,"ver
texType":"COLUMN","vertexId":"default.test_view_lineage.a5"},{"id":8,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.int_col"},{"id":9,"vertexType":"COLUMN","vertexId":"default.test_view_lineage.a6"},{"id":10,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.float_col"},{"id":11,"vertexType":"COLUMN","vertexId":"default.test_view_lineage.a7"},{"id":12,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.string_col"},{"id":13,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.month"},{"id":14,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.day"}]} Expected: { "queryText":"create view test_view_lineage (a1, a2, a3, a4, a5, a6, a7) as\n select x.smallint_col, x.id, x.tinyint_col, c.id, x.int_col, x.float_col, c.string_col\n from functional.alltypessmall c\n join (\n select a.smallint_col smallint_col, a.tinyint_col tinyint_col, a.day day,\n a.int_col int_col, a.month month, b.float_col float_col, b.id id\n from ( select * from functional.alltypesagg a where month=1 ) a\n join functional.alltypessmall b on (a.smallint_col = b.id)\n ) x on (x.tinyint_col = c.id)\n where x.day=1\n and x.int_col > 899\n and x.float_col > 4.5\n and c.string_col < '7'\n and x.int_col + x.float_col + cast(c.string_col as float) < 1000", "hash":"ffbe643df8f26e92907fb45de1aeda36", "user":"dev", "timestamp":1446159272, "edges":[ { "sources":[ 1 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 3 ], "targets":[ 2 ], "edgeType":"PROJECTION" }, { "sources":[ 5 ], "targets":[ 4 ], "edgeType":"PROJECTION" }, { "sources":[ 3 ], "targets":[ 6 ], "edgeType":"PROJECTION" }, { "sources":[ 8 ], "targets":[ 7 ], "edgeType":"PROJECTION" }, { "sources":[ 10 ], "targets":[ 9 ], "edgeType":"PROJECTION" }, { "sources":[ 12 ], "targets":[ 11 ], "edgeType":"PROJECTION" }, { "sources":[ 1, 3, 5, 8, 10, 12, 13, 14 ], "targets":[ 0, 2, 4, 6, 7, 9, 11 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.a1" 
}, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.smallint_col" }, { "id":2, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.a2" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.id" }, { "id":4, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.a3" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.tinyint_col" }, { "id":6, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.a4" }, { "id":7, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.a5" }, { "id":8, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.int_col" }, { "id":9, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.a6" }, { "id":10, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.float_col" }, { "id":11, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.a7" }, { "id":12, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.string_col" }, { "id":13, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.day" }, { "id":14, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.month" } ] } section LINEAGE of query: create view test_view_lineage as select * from ( select sum(a.tinyint_col) over (partition by a.smallint_col order by a.id), count(b.string_col), b.timestamp_col from functional.alltypes a join functional.alltypessmall b on (a.id = b.id) where a.year = 2010 and b.float_col > 0 group by a.tinyint_col, a.smallint_col, a.id, b.string_col, b.timestamp_col, b.bigint_col having count(a.int_col) > 10 order by b.bigint_col limit 10) t Output: {"queryText":"create view test_view_lineage as\n select * from (\n select sum(a.tinyint_col) over (partition by a.smallint_col order by a.id),\n count(b.string_col), b.timestamp_col\n from functional.alltypes a join functional.alltypessmall b on (a.id = b.id)\n where a.year = 2010 and b.float_col > 0\n group by a.tinyint_col, a.smallint_col, a.id, b.string_col, b.timestamp_col, b.bigint_col\n having 
count(a.int_col) > 10\n order by b.bigint_col limit 10) t","hash":"d4b9e2d63548088f911816b2ae29d7c2","user":"lv","timestamp":1475939634,"edges":[{"sources":[1],"targets":[0],"edgeType":"PROJECTION"},{"sources":[2,3],"targets":[0],"edgeType":"PREDICATE"},{"sources":[5],"targets":[4],"edgeType":"PROJECTION"},{"sources":[7],"targets":[6],"edgeType":"PROJECTION"},{"sources":[1,2,3,5,7,8,9,10,11,12],"targets":[0,4,6],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"default.test_view_lineage._c0"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.alltypes.tinyint_col"},{"id":2,"vertexType":"COLUMN","vertexId":"functional.alltypes.id"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.alltypes.smallint_col"},{"id":4,"vertexType":"COLUMN","vertexId":"default.test_view_lineage._c1"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.string_col"},{"id":6,"vertexType":"COLUMN","vertexId":"default.test_view_lineage.timestamp_col"},{"id":7,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.timestamp_col"},{"id":8,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.id"},{"id":9,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.float_col"},{"id":10,"vertexType":"COLUMN","vertexId":"functional.alltypes.int_col"},{"id":11,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.bigint_col"},{"id":12,"vertexType":"COLUMN","vertexId":"functional.alltypes.year"}]} Expected: { "queryText":"create view test_view_lineage as\n select * from (\n select sum(a.tinyint_col) over (partition by a.smallint_col order by a.id),\n count(b.string_col), b.timestamp_col\n from functional.alltypes a join functional.alltypessmall b on (a.id = b.id)\n where a.year = 2010 and b.float_col > 0\n group by a.tinyint_col, a.smallint_col, a.id, b.string_col, b.timestamp_col, b.bigint_col\n having count(a.int_col) > 10\n order by b.bigint_col limit 10) t", "hash":"d4b9e2d63548088f911816b2ae29d7c2", "user":"dev", "timestamp":1446159272, 
"edges":[ { "sources":[ 1 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 2, 3 ], "targets":[ 0 ], "edgeType":"PREDICATE" }, { "sources":[ 5 ], "targets":[ 4 ], "edgeType":"PROJECTION" }, { "sources":[ 7 ], "targets":[ 6 ], "edgeType":"PROJECTION" }, { "sources":[ 1, 2, 3, 5, 7, 8, 9, 10, 11, 12 ], "targets":[ 0, 4, 6 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage._c0" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.alltypes.tinyint_col" }, { "id":2, "vertexType":"COLUMN", "vertexId":"functional.alltypes.smallint_col" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.alltypes.id" }, { "id":4, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage._c1" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.string_col" }, { "id":6, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.timestamp_col" }, { "id":7, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.timestamp_col" }, { "id":8, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.id" }, { "id":9, "vertexType":"COLUMN", "vertexId":"functional.alltypes.int_col" }, { "id":10, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.bigint_col" }, { "id":11, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.float_col" }, { "id":12, "vertexType":"COLUMN", "vertexId":"functional.alltypes.year" } ] } section LINEAGE of query: select * from ( select int_struct_col.f1 + int_struct_col.f2 x from functional.allcomplextypes where year = 2000 order by nested_struct_col.f2.f12.f21 limit 10 union all select sum(f1) y from (select complex_struct_col.f1 f1 from functional.allcomplextypes group by 1) v1) v2 Output: {"queryText":"select * from (\n select int_struct_col.f1 + int_struct_col.f2 x from functional.allcomplextypes\n where year = 2000\n order by nested_struct_col.f2.f12.f21 limit 10\n union all\n select sum(f1) y from\n (select complex_struct_col.f1 f1 
from functional.allcomplextypes\n group by 1) v1) v2","hash":"4fb3ceddbf596097335af607d528f5a7","user":"lv","timestamp":1475939634,"edges":[{"sources":[1,2,3],"targets":[0],"edgeType":"PROJECTION"},{"sources":[4,5],"targets":[0],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"x"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.complex_struct_col.f1"},{"id":2,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.int_struct_col.f2"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.int_struct_col.f1"},{"id":4,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.nested_struct_col.f2.f12.f21"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.year"}]} Expected: { "queryText":"select * from (\n select int_struct_col.f1 + int_struct_col.f2 x from functional.allcomplextypes\n where year = 2000\n order by nested_struct_col.f2.f12.f21 limit 10\n union all\n select sum(f1) y from\n (select complex_struct_col.f1 f1 from functional.allcomplextypes\n group by 1) v1) v2", "hash":"4fb3ceddbf596097335af607d528f5a7", "user":"dev", "timestamp":1446159272, "edges":[ { "sources":[ 1, 2, 3 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 4, 5 ], "targets":[ 0 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"x" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.int_struct_col.f1" }, { "id":2, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.int_struct_col.f2" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.complex_struct_col.f1" }, { "id":4, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.nested_struct_col.f2.f12.f21" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.year" } ] } section LINEAGE of query: select * from functional.allcomplextypes t, t.int_array_col a, t.struct_map_col m where a.item = m.f1 Output: 
{"queryText":"select * from functional.allcomplextypes t, t.int_array_col a, t.struct_map_col m\n where a.item = m.f1","hash":"1b0db371b32e90d33629ed7779332cf7","user":"lv","timestamp":1475939634,"edges":[{"sources":[1],"targets":[0],"edgeType":"PROJECTION"},{"sources":[3],"targets":[2],"edgeType":"PROJECTION"},{"sources":[5],"targets":[4],"edgeType":"PROJECTION"},{"sources":[7],"targets":[6],"edgeType":"PROJECTION"},{"sources":[9],"targets":[8],"edgeType":"PROJECTION"},{"sources":[11],"targets":[10],"edgeType":"PROJECTION"},{"sources":[13],"targets":[12],"edgeType":"PROJECTION"},{"sources":[7,11,14,15],"targets":[0,2,4,6,8,10,12],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"id"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.id"},{"id":2,"vertexType":"COLUMN","vertexId":"year"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.year"},{"id":4,"vertexType":"COLUMN","vertexId":"month"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.month"},{"id":6,"vertexType":"COLUMN","vertexId":"item"},{"id":7,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.int_array_col.item"},{"id":8,"vertexType":"COLUMN","vertexId":"key"},{"id":9,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.struct_map_col.key"},{"id":10,"vertexType":"COLUMN","vertexId":"f1"},{"id":11,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.struct_map_col.value.f1"},{"id":12,"vertexType":"COLUMN","vertexId":"f2"},{"id":13,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.struct_map_col.value.f2"},{"id":14,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.struct_map_col"},{"id":15,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.int_array_col"}]} Expected: { "queryText":"select * from functional.allcomplextypes t, t.int_array_col a, t.struct_map_col m\n where a.item = m.f1", "hash":"1b0db371b32e90d33629ed7779332cf7", "user":"dev", 
"timestamp":1446159272, "edges":[ { "sources":[ 1 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 3 ], "targets":[ 2 ], "edgeType":"PROJECTION" }, { "sources":[ 5 ], "targets":[ 4 ], "edgeType":"PROJECTION" }, { "sources":[ 7 ], "targets":[ 6 ], "edgeType":"PROJECTION" }, { "sources":[ 9 ], "targets":[ 8 ], "edgeType":"PROJECTION" }, { "sources":[ 11 ], "targets":[ 10 ], "edgeType":"PROJECTION" }, { "sources":[ 13 ], "targets":[ 12 ], "edgeType":"PROJECTION" }, { "sources":[ 7, 11, 14, 15 ], "targets":[ 0, 2, 4, 6, 8, 10, 12 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"id" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.id" }, { "id":2, "vertexType":"COLUMN", "vertexId":"year" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.year" }, { "id":4, "vertexType":"COLUMN", "vertexId":"month" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.month" }, { "id":6, "vertexType":"COLUMN", "vertexId":"item" }, { "id":7, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.int_array_col.item" }, { "id":8, "vertexType":"COLUMN", "vertexId":"key" }, { "id":9, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.struct_map_col.key" }, { "id":10, "vertexType":"COLUMN", "vertexId":"f1" }, { "id":11, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.struct_map_col.value.f1" }, { "id":12, "vertexType":"COLUMN", "vertexId":"f2" }, { "id":13, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.struct_map_col.value.f2" }, { "id":14, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.int_array_col" }, { "id":15, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.struct_map_col" } ] } section LINEAGE of query: select a + b as ab, c, d, e from functional.allcomplextypes t, (select sum(item) a from t.int_array_col where item < 10) v1, (select count(f1) b from t.struct_map_col group by key) v2, 
(select avg(value) over(partition by key) c from t.map_map_col.value) v3, (select item d from t.int_array_col union all select value from t.int_map_col) v4, (select f21 e from t.complex_nested_struct_col.f2.f12 order by key limit 10) v5 Output: {"queryText":"select a + b as ab, c, d, e from functional.allcomplextypes t,\n (select sum(item) a from t.int_array_col\n where item < 10) v1,\n (select count(f1) b from t.struct_map_col\n group by key) v2,\n (select avg(value) over(partition by key) c from t.map_map_col.value) v3,\n (select item d from t.int_array_col\n union all\n select value from t.int_map_col) v4,\n (select f21 e from t.complex_nested_struct_col.f2.f12 order by key limit 10) v5","hash":"4affc0d1e384475d1ff2fc2e19643064","user":"lv","timestamp":1475939634,"edges":[{"sources":[1,2],"targets":[0],"edgeType":"PROJECTION"},{"sources":[4],"targets":[3],"edgeType":"PROJECTION"},{"sources":[5],"targets":[3],"edgeType":"PREDICATE"},{"sources":[2,7],"targets":[6],"edgeType":"PROJECTION"},{"sources":[9],"targets":[8],"edgeType":"PROJECTION"},{"sources":[2,10,11],"targets":[0,3,6,8],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"ab"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.struct_map_col.value.f1"},{"id":2,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.int_array_col.item"},{"id":3,"vertexType":"COLUMN","vertexId":"c"},{"id":4,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.map_map_col.value.value"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.map_map_col.value.key"},{"id":6,"vertexType":"COLUMN","vertexId":"d"},{"id":7,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.int_map_col.value"},{"id":8,"vertexType":"COLUMN","vertexId":"e"},{"id":9,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.complex_nested_struct_col.f2.item.f12.value.f21"},{"id":10,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.complex_nested_struct_col.f
2.item.f12.key"},{"id":11,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.struct_map_col.key"}]} Expected: { "queryText":"select a + b as ab, c, d, e from functional.allcomplextypes t,\n (select sum(item) a from t.int_array_col\n where item < 10) v1,\n (select count(f1) b from t.struct_map_col\n group by key) v2,\n (select avg(value) over(partition by key) c from t.map_map_col.value) v3,\n (select item d from t.int_array_col\n union all\n select value from t.int_map_col) v4,\n (select f21 e from t.complex_nested_struct_col.f2.f12 order by key limit 10) v5", "hash":"4affc0d1e384475d1ff2fc2e19643064", "user":"dev", "timestamp":1446159272, "edges":[ { "sources":[ 1, 2 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 4 ], "targets":[ 3 ], "edgeType":"PROJECTION" }, { "sources":[ 5 ], "targets":[ 3 ], "edgeType":"PREDICATE" }, { "sources":[ 1, 7 ], "targets":[ 6 ], "edgeType":"PROJECTION" }, { "sources":[ 9 ], "targets":[ 8 ], "edgeType":"PROJECTION" }, { "sources":[ 1, 10, 11 ], "targets":[ 0, 3, 6, 8 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"ab" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.int_array_col.item" }, { "id":2, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.struct_map_col.value.f1" }, { "id":3, "vertexType":"COLUMN", "vertexId":"c" }, { "id":4, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.map_map_col.value.value" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.map_map_col.value.key" }, { "id":6, "vertexType":"COLUMN", "vertexId":"d" }, { "id":7, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.int_map_col.value" }, { "id":8, "vertexType":"COLUMN", "vertexId":"e" }, { "id":9, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.complex_nested_struct_col.f2.item.f12.value.f21" }, { "id":10, "vertexType":"COLUMN", 
"vertexId":"functional.allcomplextypes.complex_nested_struct_col.f2.item.f12.key" }, { "id":11, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.struct_map_col.key" } ] } at org.junit.Assert.fail(Assert.java:88) at org.apache.impala.planner.PlannerTestBase.runPlannerTestFile(PlannerTestBase.java:682) at org.apache.impala.planner.PlannerTestBase.runPlannerTestFile(PlannerTestBase.java:687) at org.apache.impala.planner.PlannerTest.testLineage(PlannerTest.java:171) Results : Failed tests: PlannerTest.testLineage:171->PlannerTestBase.runPlannerTestFile:687->PlannerTestBase.runPlannerTestFile:682 section LINEAGE of query: select * from ( select tinyint_col + int_col x from functional.alltypes union all select sum(bigint_col) y from (select bigint_col from functional.alltypes) v1) v2 Output: {"queryText":"select * from (\n select tinyint_col + int_col x from functional.alltypes\n union all\n select sum(bigint_col) y from (select bigint_col from functional.alltypes) v1) v2","hash":"25456c60a2e874a20732f42c7af27553","user":"lv","timestamp":1475939633,"edges":[{"sources":[1,2,3],"targets":[0],"edgeType":"PROJECTION"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"x"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.alltypes.int_col"},{"id":2,"vertexType":"COLUMN","vertexId":"functional.alltypes.tinyint_col"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.alltypes.bigint_col"}]} Expected: { "queryText":"select * from (\n select tinyint_col + int_col x from functional.alltypes\n union all\n select sum(bigint_col) y from (select bigint_col from functional.alltypes) v1) v2", "hash":"25456c60a2e874a20732f42c7af27553", "user":"dev", "timestamp":1446159271, "edges":[ { "sources":[ 1, 2, 3 ], "targets":[ 0 ], "edgeType":"PROJECTION" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"x" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.alltypes.int_col" }, { "id":2, "vertexType":"COLUMN", "vertexId":"functional.alltypes.bigint_col" 
}, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.alltypes.tinyint_col" } ] } section LINEAGE of query: select sum(a.tinyint_col) over (partition by a.smallint_col order by a.id), count(b.string_col), b.timestamp_col from functional.alltypes a join functional.alltypessmall b on (a.id = b.id) where a.year = 2010 and b.float_col > 0 group by a.tinyint_col, a.smallint_col, a.id, b.string_col, b.timestamp_col, b.bigint_col having count(a.int_col) > 10 order by b.bigint_col limit 10 Output: {"queryText":"select sum(a.tinyint_col) over (partition by a.smallint_col order by a.id),\n count(b.string_col), b.timestamp_col\nfrom functional.alltypes a join functional.alltypessmall b on (a.id = b.id)\nwhere a.year = 2010 and b.float_col > 0\ngroup by a.tinyint_col, a.smallint_col, a.id, b.string_col, b.timestamp_col, b.bigint_col\nhaving count(a.int_col) > 10\norder by b.bigint_col limit 10","hash":"e0309eeff9811f53c82657d62c1e04eb","user":"lv","timestamp":1475939633,"edges":[{"sources":[1],"targets":[0],"edgeType":"PROJECTION"},{"sources":[2,3],"targets":[0],"edgeType":"PREDICATE"},{"sources":[5],"targets":[4],"edgeType":"PROJECTION"},{"sources":[7],"targets":[6],"edgeType":"PROJECTION"},{"sources":[1,2,3,5,7,8,9,10,11,12],"targets":[0,4,6],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"sum(a.tinyint_col) 
OVER(...)"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.alltypes.tinyint_col"},{"id":2,"vertexType":"COLUMN","vertexId":"functional.alltypes.id"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.alltypes.smallint_col"},{"id":4,"vertexType":"COLUMN","vertexId":"count(b.string_col)"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.string_col"},{"id":6,"vertexType":"COLUMN","vertexId":"timestamp_col"},{"id":7,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.timestamp_col"},{"id":8,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.id"},{"id":9,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.float_col"},{"id":10,"vertexType":"COLUMN","vertexId":"functional.alltypes.int_col"},{"id":11,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.bigint_col"},{"id":12,"vertexType":"COLUMN","vertexId":"functional.alltypes.year"}]} Expected: { "queryText":"select sum(a.tinyint_col) over (partition by a.smallint_col order by a.id),\n count(b.string_col), b.timestamp_col\nfrom functional.alltypes a join functional.alltypessmall b on (a.id = b.id)\nwhere a.year = 2010 and b.float_col > 0\ngroup by a.tinyint_col, a.smallint_col, a.id, b.string_col, b.timestamp_col, b.bigint_col\nhaving count(a.int_col) > 10\norder by b.bigint_col limit 10", "hash":"e0309eeff9811f53c82657d62c1e04eb", "user":"dev", "timestamp":1446159271, "edges":[ { "sources":[ 1 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 2, 3 ], "targets":[ 0 ], "edgeType":"PREDICATE" }, { "sources":[ 5 ], "targets":[ 4 ], "edgeType":"PROJECTION" }, { "sources":[ 7 ], "targets":[ 6 ], "edgeType":"PROJECTION" }, { "sources":[ 1, 2, 3, 5, 7, 8, 9, 10, 11, 12 ], "targets":[ 0, 4, 6 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"sum(a.tinyint_col) OVER(...)" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.alltypes.tinyint_col" }, { "id":2, "vertexType":"COLUMN", 
"vertexId":"functional.alltypes.smallint_col" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.alltypes.id" }, { "id":4, "vertexType":"COLUMN", "vertexId":"count(b.string_col)" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.string_col" }, { "id":6, "vertexType":"COLUMN", "vertexId":"timestamp_col" }, { "id":7, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.timestamp_col" }, { "id":8, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.id" }, { "id":9, "vertexType":"COLUMN", "vertexId":"functional.alltypes.int_col" }, { "id":10, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.bigint_col" }, { "id":11, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.float_col" }, { "id":12, "vertexType":"COLUMN", "vertexId":"functional.alltypes.year" } ] } section LINEAGE of query: select x.smallint_col, x.id, x.tinyint_col, c.id, x.int_col, x.float_col, c.string_col from functional.alltypessmall c join ( select a.smallint_col smallint_col, a.tinyint_col tinyint_col, a.day day, a.int_col int_col, a.month month, b.float_col float_col, b.id id from ( select * from functional.alltypesagg a where month=1 ) a join functional.alltypessmall b on (a.smallint_col = b.id) ) x on (x.tinyint_col = c.id) where x.day=1 and x.int_col > 899 and x.float_col > 4.5 and c.string_col < '7' and x.int_col + x.float_col + cast(c.string_col as float) < 1000 Output: {"queryText":"select x.smallint_col, x.id, x.tinyint_col, c.id, x.int_col, x.float_col, c.string_col\nfrom functional.alltypessmall c\njoin (\n select a.smallint_col smallint_col, a.tinyint_col tinyint_col, a.day day,\n a.int_col int_col, a.month month, b.float_col float_col, b.id id\n from ( select * from functional.alltypesagg a where month=1 ) a\n join functional.alltypessmall b on (a.smallint_col = b.id)\n ) x on (x.tinyint_col = c.id)\nwhere x.day=1\nand x.int_col > 899\nand x.float_col > 4.5\nand c.string_col < '7'\nand x.int_col + x.float_col + 
cast(c.string_col as float) < 1000","hash":"4edf165aed5982ede63f7c91074f4b44","user":"lv","timestamp":1475939633,"edges":[{"sources":[1],"targets":[0],"edgeType":"PROJECTION"},{"sources":[3],"targets":[2],"edgeType":"PROJECTION"},{"sources":[5],"targets":[4],"edgeType":"PROJECTION"},{"sources":[3],"targets":[2],"edgeType":"PROJECTION"},{"sources":[7],"targets":[6],"edgeType":"PROJECTION"},{"sources":[9],"targets":[8],"edgeType":"PROJECTION"},{"sources":[11],"targets":[10],"edgeType":"PROJECTION"},{"sources":[1,3,5,7,9,11,12,13],"targets":[0,2,4,6,8,10],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"smallint_col"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.smallint_col"},{"id":2,"vertexType":"COLUMN","vertexId":"id"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.id"},{"id":4,"vertexType":"COLUMN","vertexId":"tinyint_col"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.tinyint_col"},{"id":6,"vertexType":"COLUMN","vertexId":"int_col"},{"id":7,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.int_col"},{"id":8,"vertexType":"COLUMN","vertexId":"float_col"},{"id":9,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.float_col"},{"id":10,"vertexType":"COLUMN","vertexId":"string_col"},{"id":11,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.string_col"},{"id":12,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.month"},{"id":13,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.day"}]} Expected: { "queryText":"select x.smallint_col, x.id, x.tinyint_col, c.id, x.int_col, x.float_col, c.string_col\nfrom functional.alltypessmall c\njoin (\n select a.smallint_col smallint_col, a.tinyint_col tinyint_col, a.day day,\n a.int_col int_col, a.month month, b.float_col float_col, b.id id\n from ( select * from functional.alltypesagg a where month=1 ) a\n join functional.alltypessmall b on (a.smallint_col = b.id)\n ) x on (x.tinyint_col = c.id)\nwhere 
x.day=1\nand x.int_col > 899\nand x.float_col > 4.5\nand c.string_col < '7'\nand x.int_col + x.float_col + cast(c.string_col as float) < 1000", "hash":"4edf165aed5982ede63f7c91074f4b44", "user":"dev", "timestamp":1446159272, "edges":[ { "sources":[ 1 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 3 ], "targets":[ 2 ], "edgeType":"PROJECTION" }, { "sources":[ 5 ], "targets":[ 4 ], "edgeType":"PROJECTION" }, { "sources":[ 3 ], "targets":[ 2 ], "edgeType":"PROJECTION" }, { "sources":[ 7 ], "targets":[ 6 ], "edgeType":"PROJECTION" }, { "sources":[ 9 ], "targets":[ 8 ], "edgeType":"PROJECTION" }, { "sources":[ 11 ], "targets":[ 10 ], "edgeType":"PROJECTION" }, { "sources":[ 1, 3, 5, 7, 9, 11, 12, 13 ], "targets":[ 0, 2, 4, 6, 8, 10 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"smallint_col" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.smallint_col" }, { "id":2, "vertexType":"COLUMN", "vertexId":"id" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.id" }, { "id":4, "vertexType":"COLUMN", "vertexId":"tinyint_col" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.tinyint_col" }, { "id":6, "vertexType":"COLUMN", "vertexId":"int_col" }, { "id":7, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.int_col" }, { "id":8, "vertexType":"COLUMN", "vertexId":"float_col" }, { "id":9, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.float_col" }, { "id":10, "vertexType":"COLUMN", "vertexId":"string_col" }, { "id":11, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.string_col" }, { "id":12, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.day" }, { "id":13, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.month" } ] } section LINEAGE of query: select int_col + 1, tinyint_col - 1 from functional.alltypes a where a.int_col < (select max(int_col) from functional.alltypesagg g where g.bool_col = true) and 
a.bigint_col > 10 Output: {"queryText":"select int_col + 1, tinyint_col - 1\nfrom functional.alltypes a\nwhere a.int_col <\n (select max(int_col) from functional.alltypesagg g where g.bool_col = true)\nand a.bigint_col > 10","hash":"5e6227f323793ea4441e2a3119af2f09","user":"lv","timestamp":1475939633,"edges":[{"sources":[1],"targets":[0],"edgeType":"PROJECTION"},{"sources":[3],"targets":[2],"edgeType":"PROJECTION"},{"sources":[1,4,5,6],"targets":[0,2],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"int_col + 1"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.alltypes.int_col"},{"id":2,"vertexType":"COLUMN","vertexId":"tinyint_col - 1"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.alltypes.tinyint_col"},{"id":4,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.int_col"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.bool_col"},{"id":6,"vertexType":"COLUMN","vertexId":"functional.alltypes.bigint_col"}]} Expected: { "queryText":"select int_col + 1, tinyint_col - 1\nfrom functional.alltypes a\nwhere a.int_col <\n (select max(int_col) from functional.alltypesagg g where g.bool_col = true)\nand a.bigint_col > 10", "hash":"5e6227f323793ea4441e2a3119af2f09", "user":"dev", "timestamp":1446159272, "edges":[ { "sources":[ 1 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 3 ], "targets":[ 2 ], "edgeType":"PROJECTION" }, { "sources":[ 1, 4, 5, 6 ], "targets":[ 0, 2 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"int_col + 1" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.alltypes.int_col" }, { "id":2, "vertexType":"COLUMN", "vertexId":"tinyint_col - 1" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.alltypes.tinyint_col" }, { "id":4, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.int_col" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.alltypes.bigint_col" }, { "id":6, "vertexType":"COLUMN", 
"vertexId":"functional.alltypesagg.bool_col" } ] } section LINEAGE of query: select lead(a) over (partition by b order by c) from (select lead(id) over (partition by int_col order by bigint_col) as a, max(id) over (partition by tinyint_col order by int_col) as b, min(int_col) over (partition by string_col order by bool_col) as c from functional.alltypes) v Output: {"queryText":"select lead(a) over (partition by b order by c)\nfrom\n (select lead(id) over (partition by int_col order by bigint_col) as a,\n max(id) over (partition by tinyint_col order by int_col) as b,\n min(int_col) over (partition by string_col order by bool_col) as c\n from functional.alltypes) v","hash":"aa95e5e6f39fc80bb3c318a2515dc77d","user":"lv","timestamp":1475939634,"edges":[{"sources":[1],"targets":[0],"edgeType":"PROJECTION"},{"sources":[1,2,3,4,5,6],"targets":[0],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"lead(a, 1, NULL) OVER(...)"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.alltypes.id"},{"id":2,"vertexType":"COLUMN","vertexId":"functional.alltypes.int_col"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.alltypes.tinyint_col"},{"id":4,"vertexType":"COLUMN","vertexId":"functional.alltypes.bigint_col"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.alltypes.string_col"},{"id":6,"vertexType":"COLUMN","vertexId":"functional.alltypes.bool_col"}]} Expected: { "queryText":"select lead(a) over (partition by b order by c)\nfrom\n (select lead(id) over (partition by int_col order by bigint_col) as a,\n max(id) over (partition by tinyint_col order by int_col) as b,\n min(int_col) over (partition by string_col order by bool_col) as c\n from functional.alltypes) v", "hash":"aa95e5e6f39fc80bb3c318a2515dc77d", "user":"dev", "timestamp":1446159272, "edges":[ { "sources":[ 1 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 1, 2, 3, 4, 5, 6 ], "targets":[ 0 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", 
"vertexId":"lead(a, 1, NULL) OVER(...)" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.alltypes.id" }, { "id":2, "vertexType":"COLUMN", "vertexId":"functional.alltypes.int_col" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.alltypes.bool_col" }, { "id":4, "vertexType":"COLUMN", "vertexId":"functional.alltypes.string_col" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.alltypes.bigint_col" }, { "id":6, "vertexType":"COLUMN", "vertexId":"functional.alltypes.tinyint_col" } ] } section LINEAGE of query: create view test_view_lineage (a1, a2, a3, a4, a5, a6, a7) as select x.smallint_col, x.id, x.tinyint_col, c.id, x.int_col, x.float_col, c.string_col from functional.alltypessmall c join ( select a.smallint_col smallint_col, a.tinyint_col tinyint_col, a.day day, a.int_col int_col, a.month month, b.float_col float_col, b.id id from ( select * from functional.alltypesagg a where month=1 ) a join functional.alltypessmall b on (a.smallint_col = b.id) ) x on (x.tinyint_col = c.id) where x.day=1 and x.int_col > 899 and x.float_col > 4.5 and c.string_col < '7' and x.int_col + x.float_col + cast(c.string_col as float) < 1000 Output: {"queryText":"create view test_view_lineage (a1, a2, a3, a4, a5, a6, a7) as\n select x.smallint_col, x.id, x.tinyint_col, c.id, x.int_col, x.float_col, c.string_col\n from functional.alltypessmall c\n join (\n select a.smallint_col smallint_col, a.tinyint_col tinyint_col, a.day day,\n a.int_col int_col, a.month month, b.float_col float_col, b.id id\n from ( select * from functional.alltypesagg a where month=1 ) a\n join functional.alltypessmall b on (a.smallint_col = b.id)\n ) x on (x.tinyint_col = c.id)\n where x.day=1\n and x.int_col > 899\n and x.float_col > 4.5\n and c.string_col < '7'\n and x.int_col + x.float_col + cast(c.string_col as float) < 
1000","hash":"ffbe643df8f26e92907fb45de1aeda36","user":"lv","timestamp":1475939634,"edges":[{"sources":[1],"targets":[0],"edgeType":"PROJECTION"},{"sources":[3],"targets":[2],"edgeType":"PROJECTION"},{"sources":[5],"targets":[4],"edgeType":"PROJECTION"},{"sources":[3],"targets":[6],"edgeType":"PROJECTION"},{"sources":[8],"targets":[7],"edgeType":"PROJECTION"},{"sources":[10],"targets":[9],"edgeType":"PROJECTION"},{"sources":[12],"targets":[11],"edgeType":"PROJECTION"},{"sources":[1,3,5,8,10,12,13,14],"targets":[0,2,4,6,7,9,11],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"default.test_view_lineage.a1"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.smallint_col"},{"id":2,"vertexType":"COLUMN","vertexId":"default.test_view_lineage.a2"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.id"},{"id":4,"vertexType":"COLUMN","vertexId":"default.test_view_lineage.a3"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.tinyint_col"},{"id":6,"vertexType":"COLUMN","vertexId":"default.test_view_lineage.a4"},{"id":7,"vertexType":"COLUMN","vertexId":"default.test_view_lineage.a5"},{"id":8,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.int_col"},{"id":9,"vertexType":"COLUMN","vertexId":"default.test_view_lineage.a6"},{"id":10,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.float_col"},{"id":11,"vertexType":"COLUMN","vertexId":"default.test_view_lineage.a7"},{"id":12,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.string_col"},{"id":13,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.month"},{"id":14,"vertexType":"COLUMN","vertexId":"functional.alltypesagg.day"}]} Expected: { "queryText":"create view test_view_lineage (a1, a2, a3, a4, a5, a6, a7) as\n select x.smallint_col, x.id, x.tinyint_col, c.id, x.int_col, x.float_col, c.string_col\n from functional.alltypessmall c\n join (\n select a.smallint_col smallint_col, a.tinyint_col tinyint_col, a.day day,\n 
a.int_col int_col, a.month month, b.float_col float_col, b.id id\n from ( select * from functional.alltypesagg a where month=1 ) a\n join functional.alltypessmall b on (a.smallint_col = b.id)\n ) x on (x.tinyint_col = c.id)\n where x.day=1\n and x.int_col > 899\n and x.float_col > 4.5\n and c.string_col < '7'\n and x.int_col + x.float_col + cast(c.string_col as float) < 1000", "hash":"ffbe643df8f26e92907fb45de1aeda36", "user":"dev", "timestamp":1446159272, "edges":[ { "sources":[ 1 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 3 ], "targets":[ 2 ], "edgeType":"PROJECTION" }, { "sources":[ 5 ], "targets":[ 4 ], "edgeType":"PROJECTION" }, { "sources":[ 3 ], "targets":[ 6 ], "edgeType":"PROJECTION" }, { "sources":[ 8 ], "targets":[ 7 ], "edgeType":"PROJECTION" }, { "sources":[ 10 ], "targets":[ 9 ], "edgeType":"PROJECTION" }, { "sources":[ 12 ], "targets":[ 11 ], "edgeType":"PROJECTION" }, { "sources":[ 1, 3, 5, 8, 10, 12, 13, 14 ], "targets":[ 0, 2, 4, 6, 7, 9, 11 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.a1" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.smallint_col" }, { "id":2, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.a2" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.id" }, { "id":4, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.a3" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.tinyint_col" }, { "id":6, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.a4" }, { "id":7, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.a5" }, { "id":8, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.int_col" }, { "id":9, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.a6" }, { "id":10, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.float_col" }, { "id":11, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.a7" }, { 
"id":12, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.string_col" }, { "id":13, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.day" }, { "id":14, "vertexType":"COLUMN", "vertexId":"functional.alltypesagg.month" } ] } section LINEAGE of query: create view test_view_lineage as select * from ( select sum(a.tinyint_col) over (partition by a.smallint_col order by a.id), count(b.string_col), b.timestamp_col from functional.alltypes a join functional.alltypessmall b on (a.id = b.id) where a.year = 2010 and b.float_col > 0 group by a.tinyint_col, a.smallint_col, a.id, b.string_col, b.timestamp_col, b.bigint_col having count(a.int_col) > 10 order by b.bigint_col limit 10) t Output: {"queryText":"create view test_view_lineage as\n select * from (\n select sum(a.tinyint_col) over (partition by a.smallint_col order by a.id),\n count(b.string_col), b.timestamp_col\n from functional.alltypes a join functional.alltypessmall b on (a.id = b.id)\n where a.year = 2010 and b.float_col > 0\n group by a.tinyint_col, a.smallint_col, a.id, b.string_col, b.timestamp_col, b.bigint_col\n having count(a.int_col) > 10\n order by b.bigint_col limit 10) 
t","hash":"d4b9e2d63548088f911816b2ae29d7c2","user":"lv","timestamp":1475939634,"edges":[{"sources":[1],"targets":[0],"edgeType":"PROJECTION"},{"sources":[2,3],"targets":[0],"edgeType":"PREDICATE"},{"sources":[5],"targets":[4],"edgeType":"PROJECTION"},{"sources":[7],"targets":[6],"edgeType":"PROJECTION"},{"sources":[1,2,3,5,7,8,9,10,11,12],"targets":[0,4,6],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"default.test_view_lineage._c0"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.alltypes.tinyint_col"},{"id":2,"vertexType":"COLUMN","vertexId":"functional.alltypes.id"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.alltypes.smallint_col"},{"id":4,"vertexType":"COLUMN","vertexId":"default.test_view_lineage._c1"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.string_col"},{"id":6,"vertexType":"COLUMN","vertexId":"default.test_view_lineage.timestamp_col"},{"id":7,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.timestamp_col"},{"id":8,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.id"},{"id":9,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.float_col"},{"id":10,"vertexType":"COLUMN","vertexId":"functional.alltypes.int_col"},{"id":11,"vertexType":"COLUMN","vertexId":"functional.alltypessmall.bigint_col"},{"id":12,"vertexType":"COLUMN","vertexId":"functional.alltypes.year"}]} Expected: { "queryText":"create view test_view_lineage as\n select * from (\n select sum(a.tinyint_col) over (partition by a.smallint_col order by a.id),\n count(b.string_col), b.timestamp_col\n from functional.alltypes a join functional.alltypessmall b on (a.id = b.id)\n where a.year = 2010 and b.float_col > 0\n group by a.tinyint_col, a.smallint_col, a.id, b.string_col, b.timestamp_col, b.bigint_col\n having count(a.int_col) > 10\n order by b.bigint_col limit 10) t", "hash":"d4b9e2d63548088f911816b2ae29d7c2", "user":"dev", "timestamp":1446159272, "edges":[ { "sources":[ 1 ], "targets":[ 0 ], 
"edgeType":"PROJECTION" }, { "sources":[ 2, 3 ], "targets":[ 0 ], "edgeType":"PREDICATE" }, { "sources":[ 5 ], "targets":[ 4 ], "edgeType":"PROJECTION" }, { "sources":[ 7 ], "targets":[ 6 ], "edgeType":"PROJECTION" }, { "sources":[ 1, 2, 3, 5, 7, 8, 9, 10, 11, 12 ], "targets":[ 0, 4, 6 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage._c0" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.alltypes.tinyint_col" }, { "id":2, "vertexType":"COLUMN", "vertexId":"functional.alltypes.smallint_col" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.alltypes.id" }, { "id":4, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage._c1" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.string_col" }, { "id":6, "vertexType":"COLUMN", "vertexId":"default.test_view_lineage.timestamp_col" }, { "id":7, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.timestamp_col" }, { "id":8, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.id" }, { "id":9, "vertexType":"COLUMN", "vertexId":"functional.alltypes.int_col" }, { "id":10, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.bigint_col" }, { "id":11, "vertexType":"COLUMN", "vertexId":"functional.alltypessmall.float_col" }, { "id":12, "vertexType":"COLUMN", "vertexId":"functional.alltypes.year" } ] } section LINEAGE of query: select * from ( select int_struct_col.f1 + int_struct_col.f2 x from functional.allcomplextypes where year = 2000 order by nested_struct_col.f2.f12.f21 limit 10 union all select sum(f1) y from (select complex_struct_col.f1 f1 from functional.allcomplextypes group by 1) v1) v2 Output: {"queryText":"select * from (\n select int_struct_col.f1 + int_struct_col.f2 x from functional.allcomplextypes\n where year = 2000\n order by nested_struct_col.f2.f12.f21 limit 10\n union all\n select sum(f1) y from\n (select complex_struct_col.f1 f1 from functional.allcomplextypes\n group by 1) v1) 
v2","hash":"4fb3ceddbf596097335af607d528f5a7","user":"lv","timestamp":1475939634,"edges":[{"sources":[1,2,3],"targets":[0],"edgeType":"PROJECTION"},{"sources":[4,5],"targets":[0],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"x"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.complex_struct_col.f1"},{"id":2,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.int_struct_col.f2"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.int_struct_col.f1"},{"id":4,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.nested_struct_col.f2.f12.f21"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.year"}]} Expected: { "queryText":"select * from (\n select int_struct_col.f1 + int_struct_col.f2 x from functional.allcomplextypes\n where year = 2000\n order by nested_struct_col.f2.f12.f21 limit 10\n union all\n select sum(f1) y from\n (select complex_struct_col.f1 f1 from functional.allcomplextypes\n group by 1) v1) v2", "hash":"4fb3ceddbf596097335af607d528f5a7", "user":"dev", "timestamp":1446159272, "edges":[ { "sources":[ 1, 2, 3 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 4, 5 ], "targets":[ 0 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"x" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.int_struct_col.f1" }, { "id":2, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.int_struct_col.f2" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.complex_struct_col.f1" }, { "id":4, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.nested_struct_col.f2.f12.f21" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.year" } ] } section LINEAGE of query: select * from functional.allcomplextypes t, t.int_array_col a, t.struct_map_col m where a.item = m.f1 Output: {"queryText":"select * from functional.allcomplextypes t, 
t.int_array_col a, t.struct_map_col m\n where a.item = m.f1","hash":"1b0db371b32e90d33629ed7779332cf7","user":"lv","timestamp":1475939634,"edges":[{"sources":[1],"targets":[0],"edgeType":"PROJECTION"},{"sources":[3],"targets":[2],"edgeType":"PROJECTION"},{"sources":[5],"targets":[4],"edgeType":"PROJECTION"},{"sources":[7],"targets":[6],"edgeType":"PROJECTION"},{"sources":[9],"targets":[8],"edgeType":"PROJECTION"},{"sources":[11],"targets":[10],"edgeType":"PROJECTION"},{"sources":[13],"targets":[12],"edgeType":"PROJECTION"},{"sources":[7,11,14,15],"targets":[0,2,4,6,8,10,12],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"id"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.id"},{"id":2,"vertexType":"COLUMN","vertexId":"year"},{"id":3,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.year"},{"id":4,"vertexType":"COLUMN","vertexId":"month"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.month"},{"id":6,"vertexType":"COLUMN","vertexId":"item"},{"id":7,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.int_array_col.item"},{"id":8,"vertexType":"COLUMN","vertexId":"key"},{"id":9,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.struct_map_col.key"},{"id":10,"vertexType":"COLUMN","vertexId":"f1"},{"id":11,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.struct_map_col.value.f1"},{"id":12,"vertexType":"COLUMN","vertexId":"f2"},{"id":13,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.struct_map_col.value.f2"},{"id":14,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.struct_map_col"},{"id":15,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.int_array_col"}]} Expected: { "queryText":"select * from functional.allcomplextypes t, t.int_array_col a, t.struct_map_col m\n where a.item = m.f1", "hash":"1b0db371b32e90d33629ed7779332cf7", "user":"dev", "timestamp":1446159272, "edges":[ { "sources":[ 1 ], "targets":[ 0 ], 
"edgeType":"PROJECTION" }, { "sources":[ 3 ], "targets":[ 2 ], "edgeType":"PROJECTION" }, { "sources":[ 5 ], "targets":[ 4 ], "edgeType":"PROJECTION" }, { "sources":[ 7 ], "targets":[ 6 ], "edgeType":"PROJECTION" }, { "sources":[ 9 ], "targets":[ 8 ], "edgeType":"PROJECTION" }, { "sources":[ 11 ], "targets":[ 10 ], "edgeType":"PROJECTION" }, { "sources":[ 13 ], "targets":[ 12 ], "edgeType":"PROJECTION" }, { "sources":[ 7, 11, 14, 15 ], "targets":[ 0, 2, 4, 6, 8, 10, 12 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"id" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.id" }, { "id":2, "vertexType":"COLUMN", "vertexId":"year" }, { "id":3, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.year" }, { "id":4, "vertexType":"COLUMN", "vertexId":"month" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.month" }, { "id":6, "vertexType":"COLUMN", "vertexId":"item" }, { "id":7, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.int_array_col.item" }, { "id":8, "vertexType":"COLUMN", "vertexId":"key" }, { "id":9, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.struct_map_col.key" }, { "id":10, "vertexType":"COLUMN", "vertexId":"f1" }, { "id":11, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.struct_map_col.value.f1" }, { "id":12, "vertexType":"COLUMN", "vertexId":"f2" }, { "id":13, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.struct_map_col.value.f2" }, { "id":14, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.int_array_col" }, { "id":15, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.struct_map_col" } ] } section LINEAGE of query: select a + b as ab, c, d, e from functional.allcomplextypes t, (select sum(item) a from t.int_array_col where item < 10) v1, (select count(f1) b from t.struct_map_col group by key) v2, (select avg(value) over(partition by key) c from t.map_map_col.value) v3, 
(select item d from t.int_array_col union all select value from t.int_map_col) v4, (select f21 e from t.complex_nested_struct_col.f2.f12 order by key limit 10) v5 Output: {"queryText":"select a + b as ab, c, d, e from functional.allcomplextypes t,\n (select sum(item) a from t.int_array_col\n where item < 10) v1,\n (select count(f1) b from t.struct_map_col\n group by key) v2,\n (select avg(value) over(partition by key) c from t.map_map_col.value) v3,\n (select item d from t.int_array_col\n union all\n select value from t.int_map_col) v4,\n (select f21 e from t.complex_nested_struct_col.f2.f12 order by key limit 10) v5","hash":"4affc0d1e384475d1ff2fc2e19643064","user":"lv","timestamp":1475939634,"edges":[{"sources":[1,2],"targets":[0],"edgeType":"PROJECTION"},{"sources":[4],"targets":[3],"edgeType":"PROJECTION"},{"sources":[5],"targets":[3],"edgeType":"PREDICATE"},{"sources":[2,7],"targets":[6],"edgeType":"PROJECTION"},{"sources":[9],"targets":[8],"edgeType":"PROJECTION"},{"sources":[2,10,11],"targets":[0,3,6,8],"edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"ab"},{"id":1,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.struct_map_col.value.f1"},{"id":2,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.int_array_col.item"},{"id":3,"vertexType":"COLUMN","vertexId":"c"},{"id":4,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.map_map_col.value.value"},{"id":5,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.map_map_col.value.key"},{"id":6,"vertexType":"COLUMN","vertexId":"d"},{"id":7,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.int_map_col.value"},{"id":8,"vertexType":"COLUMN","vertexId":"e"},{"id":9,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.complex_nested_struct_col.f2.item.f12.value.f21"},{"id":10,"vertexType":"COLUMN","vertexId":"functional.allcomplextypes.complex_nested_struct_col.f2.item.f12.key"},{"id":11,"vertexType":"COLUMN","vertexId":"functional.all
complextypes.struct_map_col.key"}]} Expected: { "queryText":"select a + b as ab, c, d, e from functional.allcomplextypes t,\n (select sum(item) a from t.int_array_col\n where item < 10) v1,\n (select count(f1) b from t.struct_map_col\n group by key) v2,\n (select avg(value) over(partition by key) c from t.map_map_col.value) v3,\n (select item d from t.int_array_col\n union all\n select value from t.int_map_col) v4,\n (select f21 e from t.complex_nested_struct_col.f2.f12 order by key limit 10) v5", "hash":"4affc0d1e384475d1ff2fc2e19643064", "user":"dev", "timestamp":1446159272, "edges":[ { "sources":[ 1, 2 ], "targets":[ 0 ], "edgeType":"PROJECTION" }, { "sources":[ 4 ], "targets":[ 3 ], "edgeType":"PROJECTION" }, { "sources":[ 5 ], "targets":[ 3 ], "edgeType":"PREDICATE" }, { "sources":[ 1, 7 ], "targets":[ 6 ], "edgeType":"PROJECTION" }, { "sources":[ 9 ], "targets":[ 8 ], "edgeType":"PROJECTION" }, { "sources":[ 1, 10, 11 ], "targets":[ 0, 3, 6, 8 ], "edgeType":"PREDICATE" } ], "vertices":[ { "id":0, "vertexType":"COLUMN", "vertexId":"ab" }, { "id":1, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.int_array_col.item" }, { "id":2, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.struct_map_col.value.f1" }, { "id":3, "vertexType":"COLUMN", "vertexId":"c" }, { "id":4, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.map_map_col.value.value" }, { "id":5, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.map_map_col.value.key" }, { "id":6, "vertexType":"COLUMN", "vertexId":"d" }, { "id":7, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.int_map_col.value" }, { "id":8, "vertexType":"COLUMN", "vertexId":"e" }, { "id":9, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.complex_nested_struct_col.f2.item.f12.value.f21" }, { "id":10, "vertexType":"COLUMN", "vertexId":"functional.allcomplextypes.complex_nested_struct_col.f2.item.f12.key" }, { "id":11, "vertexType":"COLUMN", 
"vertexId":"functional.allcomplextypes.struct_map_col.key" } ] } PlannerTest.testRuntimeFilterPropagation:243->PlannerTestBase.runPlannerTestFile:646->PlannerTestBase.runPlannerTestFile:682 Section PLAN of query: with big_six as ( select straight_join a.id from functional.alltypes a inner join functional.alltypes b on a.id = b.id and a.bigint_col = b.bigint_col and a.bool_col = b.bool_col and a.int_col = b.int_col and a.smallint_col = b.smallint_col and a.tinyint_col = b.tinyint_col ), small_two as ( select straight_join a.bool_col from functional.alltypes a inner join functional.alltypestiny b on a.id = b.id and a.bool_col = b.bool_col ), big_eight as ( select straight_join a.id from functional.alltypes a inner join functional.alltypes b on a.id = b.id and a.bool_col = b.bool_col and a.date_string_col = b.date_string_col and a.double_col = b.double_col and a.smallint_col = b.smallint_col and a.string_col = b.string_col and a.timestamp_col = b.timestamp_col and a.tinyint_col = b.tinyint_col ), small_four as ( select straight_join a.bool_col from functional.alltypes a inner join functional.alltypestiny b on a.id = b.id and a.bigint_col = b.bigint_col and a.bool_col = b.bool_col and a.double_col = b.double_col and a.float_col = b.float_col and a.int_col = b.int_col and a.smallint_col = b.smallint_col and a.tinyint_col = b.tinyint_col ), big_one as ( select straight_join a.id from functional.alltypes a inner join functional.alltypes b on a.id = b.id ), nan as ( with zero_card as ( select straight_join b.id, b.int_col from (values(1 id) limit 0) a inner join functional.alltypes b on a.id = b.id ) select straight_join 1 from zero_card z inner join functional.alltypestiny x on x.id = z.id ), small_six as ( select straight_join a.bool_col from functional.alltypes a inner join functional.alltypestiny b on a.id = b.id and a.bigint_col = b.bigint_col and a.bool_col = b.bool_col and a.int_col = b.int_col and a.smallint_col = b.smallint_col and a.tinyint_col = b.tinyint_col ), 
big_three as ( select straight_join a.id from functional.alltypes a inner join functional.alltypes b on a.id = b.id and a.bool_col = b.bool_col and a.tinyint_col = b.tinyint_col ), small_four_2 as ( select straight_join a.bool_col from functional.alltypes a inner join functional.alltypestiny b on a.id = b.id and a.bigint_col = b.bigint_col and a.bool_col = b.bool_col and a.double_col = b.double_col and a.float_col = b.float_col and a.int_col = b.int_col and a.smallint_col = b.smallint_col and a.tinyint_col = b.tinyint_col ) select straight_join 1 from big_six inner join small_two inner join big_eight inner join small_four inner join big_one inner join nan inner join small_six inner join big_three inner join small_four_2 Actual does not match expected result: 36:NESTED LOOP JOIN [CROSS JOIN] | |--28:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.double_col = b.double_col, a.float_col = b.float_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | | |--27:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 26:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 35:NESTED LOOP JOIN [CROSS JOIN] | |--25:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.id = b.id, a.tinyint_col = b.tinyint_col | | | |--24:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | | 23:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 34:NESTED LOOP JOIN [CROSS JOIN] | |--22:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | | |--21:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 20:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 33:NESTED LOOP JOIN 
[CROSS JOIN] | |--19:HASH JOIN [INNER JOIN] | | hash predicates: b.id = x.id | | | |--18:SCAN HDFS [functional.alltypestiny x] | | partitions=4/4 files=4 size=460B | | | 17:HASH JOIN [INNER JOIN] | | hash predicates: id = b.id | | | |--16:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | | 15:EMPTYSET | 32:NESTED LOOP JOIN [CROSS JOIN] | |--14:HASH JOIN [INNER JOIN] | | hash predicates: a.id = b.id | | | |--13:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | | 12:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 31:NESTED LOOP JOIN [CROSS JOIN] | |--11:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.double_col = b.double_col, a.float_col = b.float_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | runtime filters: RF017 <- b.bool_col, RF016 <- b.bigint_col, RF019 <- b.float_col, RF018 <- b.double_col, RF021 <- b.int_col, RF020 <- b.id, RF023 <- b.tinyint_col, RF022 <- b.smallint_col ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | | | |--10:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 09:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | runtime filters: RF017 -> a.bool_col, RF016 -> a.bigint_col, RF019 -> a.float_col, RF018 -> a.double_col, RF021 -> a.int_col, RF020 -> a.id, RF023 -> a.tinyint_col, RF022 -> a.smallint_col | 30:NESTED LOOP JOIN [CROSS JOIN] | |--08:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.double_col = b.double_col, a.id = b.id, a.smallint_col = b.smallint_col, a.timestamp_col = b.timestamp_col, a.tinyint_col = b.tinyint_col, a.string_col = b.string_col, a.date_string_col = b.date_string_col | | | |--07:SCAN HDFS 
[functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | | 06:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 29:NESTED LOOP JOIN [CROSS JOIN] | |--05:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.id = b.id | | runtime filters: RF006 <- b.bool_col, RF007 <- b.id | | | |--04:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 03:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | runtime filters: RF006 -> a.bool_col, RF007 -> a.id | 02:HASH JOIN [INNER JOIN] | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | |--01:SCAN HDFS [functional.alltypes b] | partitions=24/24 files=24 size=478.45KB | 00:SCAN HDFS [functional.alltypes a] partitions=24/24 files=24 size=478.45KB Expected: 36:NESTED LOOP JOIN [CROSS JOIN] | |--28:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.double_col = b.double_col, a.float_col = b.float_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | | |--27:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 26:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 35:NESTED LOOP JOIN [CROSS JOIN] | |--25:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.id = b.id, a.tinyint_col = b.tinyint_col | | | |--24:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | | 23:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 34:NESTED LOOP JOIN [CROSS JOIN] | |--22:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | | |--21:SCAN HDFS 
[functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 20:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 33:NESTED LOOP JOIN [CROSS JOIN] | |--19:HASH JOIN [INNER JOIN] | | hash predicates: b.id = x.id | | | |--18:SCAN HDFS [functional.alltypestiny x] | | partitions=4/4 files=4 size=460B | | | 17:HASH JOIN [INNER JOIN] | | hash predicates: id = b.id | | | |--16:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | | 15:EMPTYSET | 32:NESTED LOOP JOIN [CROSS JOIN] | |--14:HASH JOIN [INNER JOIN] | | hash predicates: a.id = b.id | | | |--13:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | | 12:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 31:NESTED LOOP JOIN [CROSS JOIN] | |--11:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.double_col = b.double_col, a.float_col = b.float_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | runtime filters: RF016 <- b.bigint_col, RF017 <- b.bool_col, RF018 <- b.double_col, RF019 <- b.float_col, RF020 <- b.id, RF021 <- b.int_col, RF022 <- b.smallint_col, RF023 <- b.tinyint_col | | | |--10:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 09:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | runtime filters: RF016 -> a.bigint_col, RF017 -> a.bool_col, RF018 -> a.double_col, RF019 -> a.float_col, RF020 -> a.id, RF021 -> a.int_col, RF022 -> a.smallint_col, RF023 -> a.tinyint_col | 30:NESTED LOOP JOIN [CROSS JOIN] | |--08:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.double_col = b.double_col, a.id = b.id, a.smallint_col = b.smallint_col, a.timestamp_col = b.timestamp_col, a.tinyint_col = b.tinyint_col, a.string_col = b.string_col, a.date_string_col = b.date_string_col | | | |--07:SCAN HDFS [functional.alltypes b] 
| | partitions=24/24 files=24 size=478.45KB | | | 06:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | 29:NESTED LOOP JOIN [CROSS JOIN] | |--05:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.id = b.id | | runtime filters: RF006 <- b.bool_col, RF007 <- b.id | | | |--04:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | | 03:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | runtime filters: RF006 -> a.bool_col, RF007 -> a.id | 02:HASH JOIN [INNER JOIN] | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | |--01:SCAN HDFS [functional.alltypes b] | partitions=24/24 files=24 size=478.45KB | 00:SCAN HDFS [functional.alltypes a] partitions=24/24 files=24 size=478.45KB Verbose plan: F00:PLAN FRAGMENT [UNPARTITIONED] 36:NESTED LOOP JOIN [CROSS JOIN] | hosts=3 per-host-mem=unavailable | tuple-ids=0,1,3,4,6,7,9,10,12,13,15,17,19,21,22,24,25,27,28 row-size=393B cardinality=0 | |--28:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.double_col = b.double_col, a.float_col = b.float_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | hosts=3 per-host-mem=unavailable | | tuple-ids=27,28 row-size=64B cardinality=8 | | | |--27:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | table stats: 8 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=28 row-size=32B cardinality=8 | | | 26:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=27 row-size=32B cardinality=7300 | 35:NESTED LOOP JOIN [CROSS JOIN] | hosts=3 per-host-mem=unavailable | 
tuple-ids=0,1,3,4,6,7,9,10,12,13,15,17,19,21,22,24,25 row-size=329B cardinality=0 | |--25:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.id = b.id, a.tinyint_col = b.tinyint_col | | hosts=3 per-host-mem=unavailable | | tuple-ids=24,25 row-size=12B cardinality=7300 | | | |--24:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | table stats: 7300 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=25 row-size=6B cardinality=7300 | | | 23:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=24 row-size=6B cardinality=7300 | 34:NESTED LOOP JOIN [CROSS JOIN] | hosts=3 per-host-mem=unavailable | tuple-ids=0,1,3,4,6,7,9,10,12,13,15,17,19,21,22 row-size=317B cardinality=0 | |--22:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | hosts=3 per-host-mem=unavailable | | tuple-ids=21,22 row-size=40B cardinality=8 | | | |--21:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | table stats: 8 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=22 row-size=20B cardinality=8 | | | 20:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=21 row-size=20B cardinality=7300 | 33:NESTED LOOP JOIN [CROSS JOIN] | hosts=3 per-host-mem=unavailable | tuple-ids=0,1,3,4,6,7,9,10,12,13,15,17,19 row-size=277B cardinality=0 | |--19:HASH JOIN [INNER JOIN] | | hash predicates: b.id = x.id | | hosts=1 per-host-mem=unavailable | | tuple-ids=15,17,19 row-size=9B cardinality=0 | | | |--18:SCAN HDFS [functional.alltypestiny x] | | partitions=4/4 files=4 size=460B | | table 
stats: 8 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=19 row-size=4B cardinality=8 | | | 17:HASH JOIN [INNER JOIN] | | hash predicates: id = b.id | | hosts=1 per-host-mem=unavailable | | tuple-ids=15,17 row-size=5B cardinality=0 | | | |--16:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | table stats: 7300 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=17 row-size=4B cardinality=7300 | | | 15:EMPTYSET | hosts=1 per-host-mem=0B | tuple-ids=15 row-size=0B cardinality=0 | 32:NESTED LOOP JOIN [CROSS JOIN] | hosts=3 per-host-mem=unavailable | tuple-ids=0,1,3,4,6,7,9,10,12,13 row-size=268B cardinality=24897088000000 | |--14:HASH JOIN [INNER JOIN] | | hash predicates: a.id = b.id | | hosts=3 per-host-mem=unavailable | | tuple-ids=12,13 row-size=8B cardinality=7300 | | | |--13:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | table stats: 7300 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=13 row-size=4B cardinality=7300 | | | 12:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=12 row-size=4B cardinality=7300 | 31:NESTED LOOP JOIN [CROSS JOIN] | hosts=3 per-host-mem=unavailable | tuple-ids=0,1,3,4,6,7,9,10 row-size=260B cardinality=3410560000 | |--11:HASH JOIN [INNER JOIN] | | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.double_col = b.double_col, a.float_col = b.float_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | | runtime filters: RF017 <- b.bool_col, RF016 <- b.bigint_col, RF019 <- b.float_col, RF018 <- b.double_col, RF021 <- b.int_col, RF020 <- b.id, RF023 <- b.tinyint_col, RF022 <- b.smallint_col | | hosts=3 per-host-mem=unavailable | | tuple-ids=9,10 row-size=64B cardinality=8 | | | 
|--10:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B | | table stats: 8 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=10 row-size=32B cardinality=8 | | | 09:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | runtime filters: RF017 -> a.bool_col, RF016 -> a.bigint_col, RF019 -> a.float_col, RF018 -> a.double_col, RF021 -> a.int_col, RF020 -> a.id, RF023 -> a.tinyint_col, RF022 -> a.smallint_col | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=9 row-size=32B cardinality=7300 | 30:NESTED LOOP JOIN [CROSS JOIN] | hosts=3 per-host-mem=unavailable | tuple-ids=0,1,3,4,6,7 row-size=196B cardinality=426320000 | |--08:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.double_col = b.double_col, a.id = b.id, a.smallint_col = b.smallint_col, a.timestamp_col = b.timestamp_col, a.tinyint_col = b.tinyint_col, a.string_col = b.string_col, a.date_string_col = b.date_string_col | | hosts=3 per-host-mem=unavailable | | tuple-ids=6,7 row-size=146B cardinality=7300 | | | |--07:SCAN HDFS [functional.alltypes b] | | partitions=24/24 files=24 size=478.45KB | | table stats: 7300 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=7 row-size=73B cardinality=7300 | | | 06:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=6 row-size=73B cardinality=7300 | 29:NESTED LOOP JOIN [CROSS JOIN] | hosts=3 per-host-mem=unavailable | tuple-ids=0,1,3,4 row-size=50B cardinality=58400 | |--05:HASH JOIN [INNER JOIN] | | hash predicates: a.bool_col = b.bool_col, a.id = b.id | | runtime filters: RF006 <- b.bool_col, RF007 <- b.id | | hosts=3 per-host-mem=unavailable | | tuple-ids=3,4 row-size=10B cardinality=8 | | | |--04:SCAN HDFS [functional.alltypestiny b] | | partitions=4/4 files=4 size=460B 
| | table stats: 8 rows total | | column stats: all | | hosts=3 per-host-mem=unavailable | | tuple-ids=4 row-size=5B cardinality=8 | | | 03:SCAN HDFS [functional.alltypes a] | partitions=24/24 files=24 size=478.45KB | runtime filters: RF006 -> a.bool_col, RF007 -> a.id | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=3 row-size=5B cardinality=7300 | 02:HASH JOIN [INNER JOIN] | hash predicates: a.bigint_col = b.bigint_col, a.bool_col = b.bool_col, a.id = b.id, a.int_col = b.int_col, a.smallint_col = b.smallint_col, a.tinyint_col = b.tinyint_col | hosts=3 per-host-mem=unavailable | tuple-ids=0,1 row-size=40B cardinality=7300 | |--01:SCAN HDFS [functional.alltypes b] | partitions=24/24 files=24 size=478.45KB | table stats: 7300 rows total | column stats: all | hosts=3 per-host-mem=unavailable | tuple-ids=1 row-size=20B cardinality=7300 | 00:SCAN HDFS [functional.alltypes a] partitions=24/24 files=24 size=478.45KB table stats: 7300 rows total column stats: all hosts=3 per-host-mem=unavailable tuple-ids=0 row-size=20B cardinality=7300 PlannerTest.testTpchNested:199->PlannerTestBase.runPlannerTestFile:691->PlannerTestBase.runPlannerTestFile:682 Section PLAN of query: select s_name, count(*) as numwait from supplier s, customer c, c.c_orders o, o.o_lineitems l1, region.r_nations n where s_suppkey = l1.l_suppkey and o_orderstatus = 'F' and l1.l_receiptdate > l1.l_commitdate and exists ( select * from o.o_lineitems l2 where l2.l_suppkey <> l1.l_suppkey ) and not exists ( select * from o.o_lineitems l3 where l3.l_suppkey <> l1.l_suppkey and l3.l_receiptdate > l3.l_commitdate ) and s_nationkey = n_nationkey and n_name = 'SAUDI ARABIA' group by s_name order by numwait desc, s_name limit 100 Actual does not match expected result: 20:TOP-N [LIMIT=100] | order by: count(*) DESC, s_name ASC | 19:AGGREGATE [FINALIZE] | output: count(*) | group by: s_name | 18:SUBPLAN | |--16:NESTED LOOP JOIN [RIGHT ANTI JOIN] | | join predicates: 
l3.l_suppkey != l1.l_suppkey | | | |--15:NESTED LOOP JOIN [RIGHT SEMI JOIN] | | | join predicates: l2.l_suppkey != l1.l_suppkey | | | | | |--12:SINGULAR ROW SRC | | | | | 13:UNNEST [o.o_lineitems l2] | | | 14:UNNEST [o.o_lineitems l3] | 17:HASH JOIN [INNER JOIN] | hash predicates: s_nationkey = n_nationkey | runtime filters: RF000 <- n_nationkey | |--10:SCAN HDFS [tpch_nested_parquet.region.r_nations n] | partitions=1/1 files=1 size=3.24KB | predicates: n_name = 'SAUDI ARABIA' | 11:HASH JOIN [INNER JOIN] | hash predicates: l1.l_suppkey = s_suppkey | |--00:SCAN HDFS [tpch_nested_parquet.supplier s] | partitions=1/1 files=1 size=43.00MB | runtime filters: RF000 -> s_nationkey | 02:SUBPLAN | |--09:NESTED LOOP JOIN [CROSS JOIN] | | | |--03:SINGULAR ROW SRC | | | 05:SUBPLAN | | | |--08:NESTED LOOP JOIN [CROSS JOIN] | | | | | |--06:SINGULAR ROW SRC | | | | | 07:UNNEST [o.o_lineitems l1] | | | 04:UNNEST [c.c_orders o] | 01:SCAN HDFS [tpch_nested_parquet.customer c] partitions=1/1 files=4 size=292.35MB predicates: !empty(c.c_orders) predicates on o: o_orderstatus = 'F', !empty(o.o_lineitems) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ predicates on l1: l1.l_receiptdate > l1.l_commitdate predicates on l3: l3.l_receiptdate > l3.l_commitdate Expected: 20:TOP-N [LIMIT=100] | order by: count(*) DESC, s_name ASC | 19:AGGREGATE [FINALIZE] | output: count(*) | group by: s_name | 18:SUBPLAN | |--16:NESTED LOOP JOIN [RIGHT ANTI JOIN] | | join predicates: l3.l_suppkey != l1.l_suppkey | | | |--15:NESTED LOOP JOIN [RIGHT SEMI JOIN] | | | join predicates: l2.l_suppkey != l1.l_suppkey | | | | | |--12:SINGULAR ROW SRC | | | | | 13:UNNEST [o.o_lineitems l2] | | | 14:UNNEST [o.o_lineitems l3] | 17:HASH JOIN [INNER JOIN] | hash predicates: s_nationkey = n_nationkey | runtime filters: RF000 <- n_nationkey | |--10:SCAN HDFS [tpch_nested_parquet.region.r_nations n] | partitions=1/1 files=1 size=4.18KB | predicates: n_name = 'SAUDI ARABIA' | 11:HASH JOIN [INNER JOIN] | hash 
predicates: l1.l_suppkey = s_suppkey | |--00:SCAN HDFS [tpch_nested_parquet.supplier s] | partitions=1/1 files=1 size=111.08MB | runtime filters: RF000 -> s_nationkey | 02:SUBPLAN | |--09:NESTED LOOP JOIN [CROSS JOIN] | | | |--03:SINGULAR ROW SRC | | | 05:SUBPLAN | | | |--08:NESTED LOOP JOIN [CROSS JOIN] | | | | | |--06:SINGULAR ROW SRC | | | | | 07:UNNEST [o.o_lineitems l1] | | | 04:UNNEST [c.c_orders o] | 01:SCAN HDFS [tpch_nested_parquet.customer c] partitions=1/1 files=4 size=577.87MB predicates: !empty(c.c_orders) predicates on o: !empty(o.o_lineitems), o_orderstatus = 'F' predicates on l1: l1.l_receiptdate > l1.l_commitdate predicates on l3: l3.l_receiptdate > l3.l_commitdate Verbose plan: F00:PLAN FRAGMENT [UNPARTITIONED] 20:TOP-N [LIMIT=100] | order by: count(*) DESC, s_name ASC | hosts=3 per-host-mem=unavailable | tuple-ids=10 row-size=42B cardinality=100 | 19:AGGREGATE [FINALIZE] | output: count(*) | group by: s_name | hosts=3 per-host-mem=unavailable | tuple-ids=9 row-size=42B cardinality=9965 | 18:SUBPLAN | hosts=3 per-host-mem=unavailable | tuple-ids=3,2,1,0,4 row-size=182B cardinality=15000000 | |--16:NESTED LOOP JOIN [RIGHT ANTI JOIN] | | join predicates: l3.l_suppkey != l1.l_suppkey | | hosts=3 per-host-mem=unavailable | | tuple-ids=3,2,1,0,4 row-size=182B cardinality=1 | | | |--15:NESTED LOOP JOIN [RIGHT SEMI JOIN] | | | join predicates: l2.l_suppkey != l1.l_suppkey | | | hosts=3 per-host-mem=unavailable | | | tuple-ids=3,2,1,0,4 row-size=182B cardinality=1 | | | | | |--12:SINGULAR ROW SRC | | | parent-subplan=18 | | | hosts=3 per-host-mem=unavailable | | | tuple-ids=3,2,1,0,4 row-size=182B cardinality=1 | | | | | 13:UNNEST [o.o_lineitems l2] | | parent-subplan=18 | | hosts=3 per-host-mem=unavailable | | tuple-ids=5 row-size=8B cardinality=10 | | | 14:UNNEST [o.o_lineitems l3] | parent-subplan=18 | hosts=3 per-host-mem=unavailable | tuple-ids=7 row-size=40B cardinality=10 | 17:HASH JOIN [INNER JOIN] | hash predicates: s_nationkey = n_nationkey | 
runtime filters: RF000 <- n_nationkey | hosts=3 per-host-mem=unavailable | tuple-ids=3,2,1,0,4 row-size=182B cardinality=15000000 | |--10:SCAN HDFS [tpch_nested_parquet.region.r_nations n] | partitions=1/1 files=1 size=3.24KB | predicates: n_name = 'SAUDI ARABIA' | table stats: 5 rows total | column stats: all | hosts=1 per-host-mem=unavailable | tuple-ids=4 row-size=18B cardinality=5 | 11:HASH JOIN [INNER JOIN] | hash predicates: l1.l_suppkey = s_suppkey | hosts=3 per-host-mem=unavailable | tuple-ids=3,2,1,0 row-size=164B cardinality=15000000 | |--00:SCAN HDFS [tpch_nested_parquet.supplier s] | partitions=1/1 files=1 size=43.00MB | runtime filters: RF000 -> s_nationkey | table stats: 10000 rows total | column stats: all | hosts=1 per-host-mem=unavailable | tuple-ids=0 row-size=44B cardinality=10000 | 02:SUBPLAN | hosts=3 per-host-mem=unavailable | tuple-ids=3,2,1 row-size=120B cardinality=15000000 | |--09:NESTED LOOP JOIN [CROSS JOIN] | | hosts=3 per-host-mem=unavailable | | tuple-ids=3,2,1 row-size=120B cardinality=100 | | | |--03:SINGULAR ROW SRC | | parent-subplan=02 | | hosts=3 per-host-mem=unavailable | | tuple-ids=1 row-size=16B cardinality=1 | | | 05:SUBPLAN | | hosts=3 per-host-mem=unavailable | | tuple-ids=3,2 row-size=104B cardinality=100 | | | |--08:NESTED LOOP JOIN [CROSS JOIN] | | | hosts=3 per-host-mem=unavailable | | | tuple-ids=3,2 row-size=104B cardinality=10 | | | | | |--06:SINGULAR ROW SRC | | | parent-subplan=05 | | | hosts=3 per-host-mem=unavailable | | | tuple-ids=2 row-size=64B cardinality=1 | | | | | 07:UNNEST [o.o_lineitems l1] | | parent-subplan=05 | | hosts=3 per-host-mem=unavailable | | tuple-ids=3 row-size=0B cardinality=10 | | | 04:UNNEST [c.c_orders o] | parent-subplan=02 | hosts=3 per-host-mem=unavailable | tuple-ids=2 row-size=0B cardinality=10 | 01:SCAN HDFS [tpch_nested_parquet.customer c] partitions=1/1 files=4 size=292.35MB predicates: !empty(c.c_orders) predicates on o: o_orderstatus = 'F', !empty(o.o_lineitems) predicates on 
l1: l1.l_receiptdate > l1.l_commitdate predicates on l3: l3.l_receiptdate > l3.l_commitdate table stats: 150000 rows total column stats: unavailable hosts=3 per-host-mem=unavailable tuple-ids=1 row-size=16B cardinality=150000 Section DISTRIBUTEDPLAN of query: select s_name, count(*) as numwait from supplier s, customer c, c.c_orders o, o.o_lineitems l1, region.r_nations n where s_suppkey = l1.l_suppkey and o_orderstatus = 'F' and l1.l_receiptdate > l1.l_commitdate and exists ( select * from o.o_lineitems l2 where l2.l_suppkey <> l1.l_suppkey ) and not exists ( select * from o.o_lineitems l3 where l3.l_suppkey <> l1.l_suppkey and l3.l_receiptdate > l3.l_commitdate ) and s_nationkey = n_nationkey and n_name = 'SAUDI ARABIA' group by s_name order by numwait desc, s_name limit 100 Actual does not match expected result: 25:MERGING-EXCHANGE [UNPARTITIONED] | order by: count(*) DESC, s_name ASC | limit: 100 | 20:TOP-N [LIMIT=100] | order by: count(*) DESC, s_name ASC | 24:AGGREGATE [FINALIZE] | output: count:merge(*) | group by: s_name | 23:EXCHANGE [HASH(s_name)] | 19:AGGREGATE [STREAMING] | output: count(*) | group by: s_name | 18:SUBPLAN | |--16:NESTED LOOP JOIN [RIGHT ANTI JOIN] | | join predicates: l3.l_suppkey != l1.l_suppkey | | | |--15:NESTED LOOP JOIN [RIGHT SEMI JOIN] | | | join predicates: l2.l_suppkey != l1.l_suppkey | | | | | |--12:SINGULAR ROW SRC | | | | | 13:UNNEST [o.o_lineitems l2] | | | 14:UNNEST [o.o_lineitems l3] | 17:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: s_nationkey = n_nationkey | runtime filters: RF000 <- n_nationkey | |--22:EXCHANGE [BROADCAST] | | | 10:SCAN HDFS [tpch_nested_parquet.region.r_nations n] | partitions=1/1 files=1 size=3.24KB | predicates: n_name = 'SAUDI ARABIA' | 11:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: l1.l_suppkey = s_suppkey | |--21:EXCHANGE [BROADCAST] | | | 00:SCAN HDFS [tpch_nested_parquet.supplier s] | partitions=1/1 files=1 size=43.00MB | runtime filters: RF000 -> s_nationkey | 02:SUBPLAN | 
|--09:NESTED LOOP JOIN [CROSS JOIN] | | | |--03:SINGULAR ROW SRC | | | 05:SUBPLAN | | | |--08:NESTED LOOP JOIN [CROSS JOIN] | | | | | |--06:SINGULAR ROW SRC | | | | | 07:UNNEST [o.o_lineitems l1] | | | 04:UNNEST [c.c_orders o] | 01:SCAN HDFS [tpch_nested_parquet.customer c] partitions=1/1 files=4 size=292.35MB predicates: !empty(c.c_orders) predicates on o: o_orderstatus = 'F', !empty(o.o_lineitems) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ predicates on l1: l1.l_receiptdate > l1.l_commitdate predicates on l3: l3.l_receiptdate > l3.l_commitdate Expected: 25:MERGING-EXCHANGE [UNPARTITIONED] | order by: count(*) DESC, s_name ASC | limit: 100 | 20:TOP-N [LIMIT=100] | order by: count(*) DESC, s_name ASC | 24:AGGREGATE [FINALIZE] | output: count:merge(*) | group by: s_name | 23:EXCHANGE [HASH(s_name)] | 19:AGGREGATE [STREAMING] | output: count(*) | group by: s_name | 18:SUBPLAN | |--16:NESTED LOOP JOIN [RIGHT ANTI JOIN] | | join predicates: l3.l_suppkey != l1.l_suppkey | | | |--15:NESTED LOOP JOIN [RIGHT SEMI JOIN] | | | join predicates: l2.l_suppkey != l1.l_suppkey | | | | | |--12:SINGULAR ROW SRC | | | | | 13:UNNEST [o.o_lineitems l2] | | | 14:UNNEST [o.o_lineitems l3] | 17:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: s_nationkey = n_nationkey | runtime filters: RF000 <- n_nationkey | |--22:EXCHANGE [BROADCAST] | | | 10:SCAN HDFS [tpch_nested_parquet.region.r_nations n] | partitions=1/1 files=1 size=4.18KB | predicates: n_name = 'SAUDI ARABIA' | 11:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: l1.l_suppkey = s_suppkey | |--21:EXCHANGE [BROADCAST] | | | 00:SCAN HDFS [tpch_nested_parquet.supplier s] | partitions=1/1 files=1 size=111.08MB | runtime filters: RF000 -> s_nationkey | 02:SUBPLAN | |--09:NESTED LOOP JOIN [CROSS JOIN] | | | |--03:SINGULAR ROW SRC | | | 05:SUBPLAN | | | |--08:NESTED LOOP JOIN [CROSS JOIN] | | | | | |--06:SINGULAR ROW SRC | | | | | 07:UNNEST [o.o_lineitems l1] | | | 04:UNNEST [c.c_orders o] | 01:SCAN HDFS 
[tpch_nested_parquet.customer c] partitions=1/1 files=4 size=577.87MB predicates: !empty(c.c_orders) predicates on o: !empty(o.o_lineitems), o_orderstatus = 'F' predicates on l1: l1.l_receiptdate > l1.l_commitdate predicates on l3: l3.l_receiptdate > l3.l_commitdate Verbose plan: F04:PLAN FRAGMENT [UNPARTITIONED] 25:MERGING-EXCHANGE [UNPARTITIONED] order by: count(*) DESC, s_name ASC limit: 100 hosts=3 per-host-mem=unavailable tuple-ids=10 row-size=42B cardinality=100 F03:PLAN FRAGMENT [HASH(s_name)] DATASTREAM SINK [FRAGMENT=F04, EXCHANGE=25, UNPARTITIONED] 20:TOP-N [LIMIT=100] | order by: count(*) DESC, s_name ASC | hosts=3 per-host-mem=4.10KB | tuple-ids=10 row-size=42B cardinality=100 | 24:AGGREGATE [FINALIZE] | output: count:merge(*) | group by: s_name | hosts=3 per-host-mem=10.00MB | tuple-ids=9 row-size=42B cardinality=9965 | 23:EXCHANGE [HASH(s_name)] hosts=3 per-host-mem=0B tuple-ids=9 row-size=42B cardinality=9965 F00:PLAN FRAGMENT [RANDOM] DATASTREAM SINK [FRAGMENT=F03, EXCHANGE=23, HASH(s_name)] 19:AGGREGATE [STREAMING] | output: count(*) | group by: s_name | hosts=3 per-host-mem=10.00MB | tuple-ids=9 row-size=42B cardinality=9965 | 18:SUBPLAN | hosts=3 per-host-mem=0B | tuple-ids=3,2,1,0,4 row-size=182B cardinality=15000000 | |--16:NESTED LOOP JOIN [RIGHT ANTI JOIN] | | join predicates: l3.l_suppkey != l1.l_suppkey | | hosts=3 per-host-mem=182B | | tuple-ids=3,2,1,0,4 row-size=182B cardinality=1 | | | |--15:NESTED LOOP JOIN [RIGHT SEMI JOIN] | | | join predicates: l2.l_suppkey != l1.l_suppkey | | | hosts=3 per-host-mem=182B | | | tuple-ids=3,2,1,0,4 row-size=182B cardinality=1 | | | | | |--12:SINGULAR ROW SRC | | | parent-subplan=18 | | | hosts=3 per-host-mem=0B | | | tuple-ids=3,2,1,0,4 row-size=182B cardinality=1 | | | | | 13:UNNEST [o.o_lineitems l2] | | parent-subplan=18 | | hosts=3 per-host-mem=0B | | tuple-ids=5 row-size=8B cardinality=10 | | | 14:UNNEST [o.o_lineitems l3] | parent-subplan=18 | hosts=3 per-host-mem=0B | tuple-ids=7 row-size=40B 
cardinality=10 | 17:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: s_nationkey = n_nationkey | runtime filters: RF000 <- n_nationkey | hosts=3 per-host-mem=100B | tuple-ids=3,2,1,0,4 row-size=182B cardinality=15000000 | |--22:EXCHANGE [BROADCAST] | hosts=1 per-host-mem=0B | tuple-ids=4 row-size=18B cardinality=5 | 11:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: l1.l_suppkey = s_suppkey | hosts=3 per-host-mem=472.66KB | tuple-ids=3,2,1,0 row-size=164B cardinality=15000000 | |--21:EXCHANGE [BROADCAST] | hosts=1 per-host-mem=0B | tuple-ids=0 row-size=44B cardinality=10000 | 02:SUBPLAN | hosts=3 per-host-mem=0B | tuple-ids=3,2,1 row-size=120B cardinality=15000000 | |--09:NESTED LOOP JOIN [CROSS JOIN] | | hosts=3 per-host-mem=16B | | tuple-ids=3,2,1 row-size=120B cardinality=100 | | | |--03:SINGULAR ROW SRC | | parent-subplan=02 | | hosts=3 per-host-mem=0B | | tuple-ids=1 row-size=16B cardinality=1 | | | 05:SUBPLAN | | hosts=3 per-host-mem=0B | | tuple-ids=3,2 row-size=104B cardinality=100 | | | |--08:NESTED LOOP JOIN [CROSS JOIN] | | | hosts=3 per-host-mem=64B | | | tuple-ids=3,2 row-size=104B cardinality=10 | | | | | |--06:SINGULAR ROW SRC | | | parent-subplan=05 | | | hosts=3 per-host-mem=0B | | | tuple-ids=2 row-size=64B cardinality=1 | | | | | 07:UNNEST [o.o_lineitems l1] | | parent-subplan=05 | | hosts=3 per-host-mem=0B | | tuple-ids=3 row-size=0B cardinality=10 | | | 04:UNNEST [c.c_orders o] | parent-subplan=02 | hosts=3 per-host-mem=0B | tuple-ids=2 row-size=0B cardinality=10 | 01:SCAN HDFS [tpch_nested_parquet.customer c, RANDOM] partitions=1/1 files=4 size=292.35MB predicates: !empty(c.c_orders) predicates on o: o_orderstatus = 'F', !empty(o.o_lineitems) predicates on l1: l1.l_receiptdate > l1.l_commitdate predicates on l3: l3.l_receiptdate > l3.l_commitdate table stats: 150000 rows total column stats: unavailable hosts=3 per-host-mem=88.00MB tuple-ids=1 row-size=16B cardinality=150000 F02:PLAN FRAGMENT [RANDOM] DATASTREAM SINK [FRAGMENT=F00, 
EXCHANGE=22, BROADCAST]
  10:SCAN HDFS [tpch_nested_parquet.region.r_nations n, RANDOM]
     partitions=1/1 files=1 size=3.24KB
     predicates: n_name = 'SAUDI ARABIA'
     table stats: 5 rows total
     column stats: all
     hosts=1 per-host-mem=32.00MB
     tuple-ids=4 row-size=18B cardinality=5

F01:PLAN FRAGMENT [RANDOM]
  DATASTREAM SINK [FRAGMENT=F00, EXCHANGE=21, BROADCAST]
  00:SCAN HDFS [tpch_nested_parquet.supplier s, RANDOM]
     partitions=1/1 files=1 size=43.00MB
     runtime filters: RF000 -> s_nationkey
     table stats: 10000 rows total
     column stats: all
     hosts=1 per-host-mem=168.00MB
     tuple-ids=0 row-size=44B cardinality=10000

Tests run: 44, Failures: 3, Errors: 0, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 55.616s
[INFO] Finished at: Sat Oct 08 08:14:05 PDT 2016
[INFO] Final Memory: 76M/1311M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.18:test (default-test) on project impala-frontend: There are test failures.
[ERROR]
[ERROR] Please refer to /home/lv/i1/logs/fe_tests for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException