Description
Follow-up of HIVE-13068.
When we create a RS (ReduceSink) operator with constant expressions as keys/values and immediately afterwards create a SEL (Select) operator that backtracks the expressions from the RS, we currently create column references for all the keys/values automatically, including the constant ones.
Before, we could rely on Hive ConstantPropagate to propagate the constants to the SEL. However, after HIVE-13068, Hive ConstantPropagate is no longer exercised when CBO has optimized the plan. Thus, when we create the SEL operator, we can simply create a constant expression instead of a reference for each constant key/value.
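A minimal sketch of the proposed behavior, written against hypothetical stand-in classes rather than Hive's actual descriptor and operator classes. The names `BacktrackSketch`, `Expr`, `Constant`, `ColumnRef`, and `backtrackKeys` are assumptions made for illustration only and only key expressions are modeled. The idea is that, while building the SEL expressions from the RS keys, a constant key is carried over as a constant and every other key becomes a `KEY.reducesinkkeyN` reference.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BacktrackSketch {
    // Hypothetical stand-ins for expression descriptors; these are NOT Hive's real classes.
    public interface Expr {}

    public static final class Constant implements Expr {
        final Object value;
        final String type;
        public Constant(Object value, String type) { this.value = value; this.type = type; }
        @Override public String toString() { return value + " (type: " + type + ")"; }
    }

    public static final class ColumnRef implements Expr {
        final String name;
        final String type;
        public ColumnRef(String name, String type) { this.name = name; this.type = type; }
        @Override public String toString() { return name + " (type: " + type + ")"; }
    }

    // Builds the SEL expressions that backtrack the RS key expressions.
    // A constant RS key is emitted as a constant; any other key becomes a
    // reference to the corresponding reduce-sink key column.
    public static List<Expr> backtrackKeys(List<Expr> rsKeys) {
        List<Expr> selExprs = new ArrayList<>();
        for (int i = 0; i < rsKeys.size(); i++) {
            Expr key = rsKeys.get(i);
            if (key instanceof Constant) {
                selExprs.add(key); // keep the constant instead of referencing it
            } else {
                ColumnRef col = (ColumnRef) key;
                selExprs.add(new ColumnRef("KEY.reducesinkkey" + i, col.type));
            }
        }
        return selExprs;
    }

    public static void main(String[] args) {
        // First two RS keys from the vector_coalesce example: a null constant and _col1.
        List<Expr> rsKeys = Arrays.asList(
            new Constant(null, "double"),
            new ColumnRef("_col1", "string"));
        for (Expr e : backtrackKeys(rsKeys)) {
            System.out.println(e);
        }
    }
}
```

In this sketch the constant object is reused as-is; a real implementation would likely need to copy the expression and keep the output column names consistent with the RS schema.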
Ex. ql/src/test/results/clientpositive/vector_coalesce.q.out
EXPLAIN SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, cstring1, cint, cfloat, csmallint) as c FROM alltypesorc WHERE (cdouble IS NULL) ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c LIMIT 10
Plan:
EXPLAIN SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, cstring1, cint, cfloat, csmallint) as c FROM alltypesorc WHERE (cdouble IS NULL) ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c LIMIT 10
POSTHOOK: type: QUERY
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1
STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: alltypesorc
            Statistics: Num rows: 12288 Data size: 2641964 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: cdouble is null (type: boolean)
              Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: cstring1 (type: string), cint (type: int), cfloat (type: float), csmallint (type: smallint), COALESCE(null,cstring1,cint,cfloat,csmallint) (type: string)
                outputColumnNames: _col1, _col2, _col3, _col4, _col5
                Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: null (type: double), _col1 (type: string), _col2 (type: int), _col3 (type: float), _col4 (type: smallint), _col5 (type: string)
                  sort order: ++++++
                  Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE Column stats: NONE
                  TopN Hash Memory Usage: 0.1
      Execution mode: vectorized
      Reduce Operator Tree:
        Select Operator
          expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey1 (type: string), KEY.reducesinkkey2 (type: int), KEY.reducesinkkey3 (type: float), KEY.reducesinkkey4 (type: smallint), KEY.reducesinkkey5 (type: string)
          outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
          Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE Column stats: NONE
          Limit
            Number of rows: 10
            Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE Column stats: NONE
            File Output Operator
              compressed: false
              Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE Column stats: NONE
              table:
                  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
  Stage: Stage-0
    Fetch Operator
      limit: 10
      Processor Tree:
        ListSink
Issue Links
- relates to HIVE-13068 Disable Hive ConstantPropagate optimizer when CBO has optimized the plan II (Closed)