Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
git.commit.id.abbrev=3d0b4b0
TPCDS Query 84:
SELECT c_customer_id AS customer_id,
       c_last_name || ', ' || c_first_name AS customername
FROM customer,
     customer_address,
     customer_demographics,
     household_demographics,
     income_band,
     store_returns
WHERE ca_city = 'Green Acres'
  AND c_current_addr_sk = ca_address_sk
  AND ib_lower_bound >= 54986
  AND ib_upper_bound <= 54986 + 50000
  AND ib_income_band_sk = hd_income_band_sk
  AND cd_demo_sk = c_current_cdemo_sk
  AND hd_demo_sk = c_current_hdemo_sk
  AND sr_cdemo_sk = cd_demo_sk
ORDER BY c_customer_id
LIMIT 100;
Execution times:
- Hive Plugin: 12.34 seconds
- Hive Native Reader: 360.866 seconds
- DFS Parquet Reader: 84.3 seconds
Note: These data sets were generated by Hive, and the underlying Parquet files have more than one row group (household_demographics has ~8000 row groups).
The data files are larger than 10 MB, so I cannot attach them here. Reach out to me if you need anything else.
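For anyone reproducing the timings, the following is a minimal sketch of how the two Hive-plugin runs could be switched. It assumes the option named in the linked issue (DRILL-4309) is toggled per session with Drill's ALTER SESSION statement; the exact commands used for the measurements above were not recorded here.

```sql
-- Assumption: run from a Drill session, with the Hive tables reachable
-- through the Hive storage plugin.

-- Hive Plugin run: read the tables through the regular Hive reader.
ALTER SESSION SET `store.hive.optimize_scan_with_native_readers` = false;
-- ... run TPCDS Query 84 here ...

-- Hive Native Reader run: let Drill use its native Parquet reader
-- for the Hive-managed Parquet tables.
ALTER SESSION SET `store.hive.optimize_scan_with_native_readers` = true;
-- ... run TPCDS Query 84 again ...
```

The DFS Parquet Reader run would instead query the same Parquet files directly through a file-system storage plugin, so this option does not apply to it.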
Issue Links
- blocks
- DRILL-4309 Make this option store.hive.optimize_scan_with_native_readers=true default (Open)