Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
-
Reviewed
Description
The following query can give wrong results when CBO is on:
select * from ( select part1,randum123 from (SELECT *, cast(rand() as double) AS randum123 FROM testA where part1='CA' and part2 = 'ABC') a where randum123 <= 0.5) s where s.randum123 > 0.25 limit 20; The plan of the query is as follows: STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: testa Statistics: Num rows: 2 Data size: 4580 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: ((rand() <= 0.5D) and (rand() > 0.25D)) (type: boolean) Statistics: Num rows: 1 Data size: 2290 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: 'CA' (type: string), rand() (type: double) outputColumnNames: _col0, _col1 Statistics: Num rows: 1 Data size: 2290 Basic stats: COMPLETE Column stats: NONE Limit Number of rows: 20 Statistics: Num rows: 1 Data size: 2290 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 1 Data size: 2290 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: 20 Processor Tree: ListSink
The relevant part in the plan is the filter:
Filter Operator
predicate: ((rand() <= 0.5D) and (rand() > 0.25D)) (type: boolean)
The predicates randum123 <= 0.5 and s.randum123 > 0.25 were pushed down. And randum123 was resolved to rand(). This is bad because it will result in invocation of rand() two times and rand() UDF is non-deterministic. Both the rand calls can generate values that can satisfy the predicates independently, but not together, whereas the original intention of the query is to give results when rand falls between 0.25 and 0.5.
A sample result:
CA 0.9191984370369802 CA 0.397933021566812
where the condition was not satisfied.
Attachments
Attachments
Issue Links
- links to