Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Description
For a query that uses the DISTINCT operator, the group-by is always executed as the last step of the query plan. Consider the following example query.
default> select distinct a.col3 from test as a left outer join lineitem b on a.col1 = b.l_orderkey order by a.col3;
The plan for this query is:
GROUP_BY(5)(col3)
  => target list: default.a.col3 (TEXT)
  => out schema:{(1) default.a.col3 (TEXT)}
  => in schema:{(1) default.a.col3 (TEXT)}
   SORT(3)
     => Sort Keys: default.a.col3 (TEXT) (asc)
      JOIN(7)(LEFT_OUTER)
        => Join Cond: default.a.col1 (INT4) = default.b.l_orderkey (INT4)
        => target list: default.a.col3 (TEXT)
        => out schema: {(1) default.a.col3 (TEXT)}
        => in schema: {(3) default.a.col3 (TEXT), default.a.col1 (INT4), default.b.l_orderkey (INT4)}
         SCAN(1) on default.lineitem_large as b
           => target list: default.b.l_orderkey (INT4)
           => out schema: {(1) default.b.l_orderkey (INT4)}
           => in schema: {(16) default.b.l_orderkey (INT4), default.b.l_partkey (INT4), default.b.l_suppkey (INT4), default.b.l_linenumber (INT4), default.b.l_quantity (FLOAT8), default.b.l_extendedprice (FLOAT8), default.b.l_discount (FLOAT8), default.b.l_tax (FLOAT8), default.b.l_returnflag (TEXT), default.b.l_linestatus (TEXT), default.b.l_shipdate (TEXT), default.b.l_commitdate (TEXT), default.b.l_receiptdate (TEXT), default.b.l_shipinstruct (TEXT), default.b.l_shipmode (TEXT), default.b.l_comment (TEXT)}
         PARTITIONS_SCAN(8) on default.testbroadcastmulticolumnpartitiontable as a
           => target list: default.a.col3 (TEXT), default.a.col1 (INT4)
           => num of filtered paths: 3
           => out schema: {(2) default.a.col3 (TEXT), default.a.col1 (INT4)}
           => in schema: {(2) default.a.col1 (INT4), default.a.col2 (FLOAT4)}
           => 0: hdfs://localhost:52705/tajo/warehouse/default/testbroadcastmulticolumnpartitiontable/col3=01/col4=1996
           => 1: hdfs://localhost:52705/tajo/warehouse/default/testbroadcastmulticolumnpartitiontable/col3=10/col4=1993
           => 2: hdfs://localhost:52705/tajo/warehouse/default/testbroadcastmulticolumnpartitiontable/col3=12/col4=1996
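The query has no GROUP BY clause, yet the plan is rooted at a GROUP_BY(5) node on col3. That node stands in for the DISTINCT: removing duplicates from a column is equivalent to grouping by it. The following is a minimal sketch of that equivalence. It uses Python's built-in sqlite3 module rather than Tajo, and the table contents are made up for illustration, so it says nothing about how Tajo orders or executes the operators.

```python
import sqlite3

# Hypothetical stand-in tables and rows; this is NOT the data from the issue.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (col1 INTEGER, col3 TEXT);
    CREATE TABLE lineitem (l_orderkey INTEGER);
    INSERT INTO test VALUES (1, '01'), (1, '01'), (2, '10'), (3, '12'), (4, '10');
    INSERT INTO lineitem VALUES (1), (1), (2);
""")

# The query from the description, unchanged.
distinct_sql = """
    SELECT DISTINCT a.col3
    FROM test AS a LEFT OUTER JOIN lineitem b ON a.col1 = b.l_orderkey
    ORDER BY a.col3
"""

# The same query with DISTINCT rewritten as an explicit GROUP BY on col3.
group_by_sql = """
    SELECT a.col3
    FROM test AS a LEFT OUTER JOIN lineitem b ON a.col1 = b.l_orderkey
    GROUP BY a.col3
    ORDER BY a.col3
"""

distinct_rows = conn.execute(distinct_sql).fetchall()
group_by_rows = conn.execute(group_by_sql).fetchall()

# Both forms return the same deduplicated, sorted col3 values.
print(distinct_rows == group_by_rows)
```

Both statements return one row per distinct col3 value, which is why a planner can represent DISTINCT with a group-by node.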