Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
select ss_sold_date_sk from store_sales, date_dim where date_dim.d_year in (1998,1998+1,1998+2) and store_sales.ss_sold_date_sk = date_dim.d_date_sk limit 100;
select distinct ss_sold_date_sk from store_sales, date_dim where date_dim.d_year in (1998,1998+1,1998+2) and store_sales.ss_sold_date_sk = date_dim.d_date_sk limit 100;
Queries like the above generate a large number of map tasks. Currently, these tasks do not bail out once enough rows have been generated to satisfy the limit.
It would be good to use ObjectCache to retain the number of records generated so far. LimitOperator/VectorLimitOperator could then bail out in later tasks during the operator's init phase itself.
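The idea can be sketched roughly as follows. This is a minimal illustration only, not Hive's implementation: SharedCounterCache is a hypothetical stand-in for ObjectCache, LimitSketch is a simplified stand-in for LimitOperator, and the cache key "query1-limit-op" is made up. It assumes that tasks running in the same process see the same cached counter.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for an object cache shared by tasks in one process. */
final class SharedCounterCache {
    private static final ConcurrentHashMap<String, AtomicLong> CACHE =
        new ConcurrentHashMap<>();

    /** Returns the counter registered under the key, creating it on first use. */
    static AtomicLong counterFor(String key) {
        return CACHE.computeIfAbsent(key, k -> new AtomicLong());
    }

    static void clear() {
        CACHE.clear();
    }
}

/** Simplified limit operator (an assumption-laden sketch, not Hive's LimitOperator). */
final class LimitSketch {
    private final long limit;
    private final AtomicLong globalCount; // rows produced so far by all tasks sharing the key
    private boolean done;

    LimitSketch(String cacheKey, long limit) {
        this.limit = limit;
        this.globalCount = SharedCounterCache.counterFor(cacheKey);
    }

    /** Init phase: returns false if earlier tasks already produced enough rows. */
    boolean init() {
        done = globalCount.get() >= limit;
        return !done;
    }

    /** Returns true if the row is forwarded, false once the limit is reached. */
    boolean process(Object row) {
        if (done) {
            return false;
        }
        if (globalCount.incrementAndGet() > limit) {
            done = true;
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        LimitSketch first = new LimitSketch("query1-limit-op", 100);
        System.out.println("first task runs: " + first.init());
        for (int i = 0; i < 1000; i++) {
            first.process(i);
        }
        // A later task sharing the same counter can skip its work in init.
        LimitSketch later = new LimitSketch("query1-limit-op", 100);
        System.out.println("later task runs: " + later.init());
    }
}
```

In this sketch the counter only helps tasks that share a process, so it acts as an early-exit optimization; a final limit applied downstream would still be needed for correctness. The key would also need to be unique per limit operator so that unrelated limits do not share a counter.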
Issue Links
- causes
  - HIVE-28281 Hive produces incorrect result when a vertex contains multiple LimitOperator (Open)
  - HIVE-25751 Ignore exceptions related to interruption when the limit is reached (Closed)
- is blocked by
  - TEZ-4238 Check null mrReader in MRInput.close (Resolved)
- relates to
  - HIVE-25274 TestLimitOperator fails if default engine is Tez (Closed)
- links to