Details
-
Sub-task
-
Status: Resolved
-
Major
-
Resolution: Not A Problem
-
1.4.0
-
None
-
None
Description
Hive's parser ignores Window spec when a distinct aggregation is used.
scala> Seq((1, 2, 3)).toDF("i", "j", "k").registerTempTable("t") scala> sql("select count(distinct j) over (partition by i) from t").explain(true) == Parsed Logical Plan == 'Project [UnresolvedAlias CountDistinct UnresolvedAttribute [j] ] 'UnresolvedRelation [t], None == Analyzed Logical Plan == _c0: bigint Aggregate [COUNT(DISTINCT j#23) AS _c0#27L] Subquery t Project [_1#19 AS i#22,_2#20 AS j#23,_3#21 AS k#24] LocalRelation [_1#19,_2#20,_3#21], [[1,2,3]] == Optimized Logical Plan == Aggregate [COUNT(DISTINCT j#23) AS _c0#27L] LocalRelation [j#23], [[2]] == Physical Plan == GeneratedAggregate false, [CombineAndCount(partialSets#28) AS _c0#27L], false Exchange SinglePartition GeneratedAggregate true, [AddToHashSet(j#23) AS partialSets#28], false LocalTableScan [j#23], [[2]] Code Generation: true == RDD == scala> sql("select count(j) over (partition by i) from t").explain(true) == Parsed Logical Plan == 'Project [UnresolvedAlias WindowExpression UnresolvedWindowFunction count UnresolvedAttribute [j] WindowSpecDefinition UnspecifiedFrame UnresolvedAttribute [i] ] 'UnresolvedRelation [t], None == Analyzed Logical Plan == _c0: bigint Project [_c0#31L] Project [j#23,i#22,_c0#31L,_c0#31L] Window [j#23,i#22], [HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFCount(j#23) WindowSpecDefinition ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING AS _c0#31L], WindowSpecDefinition ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING Project [j#23,i#22] Subquery t Project [_1#19 AS i#22,_2#20 AS j#23,_3#21 AS k#24] LocalRelation [_1#19,_2#20,_3#21], [[1,2,3]] == Optimized Logical Plan == Project [_c0#31L] Window [j#23,i#22], [HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFCount(j#23) WindowSpecDefinition ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING AS _c0#31L], WindowSpecDefinition ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING LocalRelation [j#23,i#22], [[2,1]] == Physical Plan == Project [_c0#31L] Window [j#23,i#22], [HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFCount(j#23) WindowSpecDefinition ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING AS _c0#31L], WindowSpecDefinition ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ExternalSort [i#22 ASC], false Exchange (HashPartitioning 200) LocalTableScan [j#23,i#22], [[2,1]] Code Generation: true == RDD ==
Attachments
Issue Links
- is blocked by
-
HIVE-10141 count(distinct) not supported in Windowing function
- Resolved
- relates to
-
SPARK-10746 count ( distinct columnref) over () returns wrong result set
- Resolved
- links to