Description
Now that multi-table insertion is committed to the branch, we should enable the related qtests.
Here is a list of qfiles that should be activated (some of them may already be activated).
The list may not be comprehensive.
add_part_multiple.q auto_smb_mapjoin_14.q bucket5.q column_access_stats.q date_udf.q groupby10.q groupby11.q groupby3_map_multi_distinct.q groupby3_map.q groupby3_map_skew.q groupby3_noskew_multi_distinct.q groupby3_noskew.q groupby7_map_multi_single_reducer.q groupby7_map.q groupby7_map_skew.q groupby7_noskew_multi_single_reducer.q groupby7_noskew.q groupby7.q groupby8_map.q groupby8_map_skew.q groupby8_noskew.q groupby8.q groupby9.q groupby_complex_types_multi_single_reducer.q groupby_complex_types.q groupby_cube1.q groupby_map_ppr_multi_distinct.q groupby_map_ppr.q groupby_multi_insert_common_distinct.q groupby_multi_single_reducer2.q groupby_multi_single_reducer3.q groupby_multi_single_reducer.q groupby_position.q groupby_ppr.q groupby_rollup1.q groupby_sort_1_23.q groupby_sort_1.q groupby_sort_skew_1_23.q infer_bucket_sort_multi_insert.q innerjoin.q input12_hadoop20.q input12.q input13.q input14.q input17.q input18.q input1_limit.q input_part2.q insert_into3.q join_nullsafe.q load_dyn_part8.q metadata_only_queries_with_filters.q multigroupby_singlemr.q multi_insert_gby2.q multi_insert_gby3.q multi_insert_gby.q multi_insert_lateral_view.q multi_insert_move_tasks_share_dependencies.q multi_insert.q parallel.q partition_date2.q pcr.q ppd_multi_insert.q ppd_transform.q smb_mapjoin_11.q smb_mapjoin_12.q smb_mapjoin_13.q smb_mapjoin_15.q smb_mapjoin_16.q stats4.q subquery_multiinsert.q table_access_keys_stats.q tez_dml.q udaf_percentile_approx_20.q udaf_percentile_approx_23.q union17.q union18.q union19.q
Some tests cannot be enabled right now, for various reasons:
1. ForwardOperator Issue, including
groupby7_noskew_multi_single_reducer.q groupby8_map.q groupby8_map_skew.q groupby8_noskew.q groupby8.q groupby9.q groupby10.q groupby_multi_insert_common_distinct.q union17.q
Reason: currently, if the node to break in the operator tree is a ForwardOperator, we simply do nothing. However, we may have the following case:
        ...
        RS_0
         |
        FOR
         |
        /  \
    GBY_1  GBY_2
      |      |
     ...    ...
      |      |
    RS_1   RS_2
      |      |
     ...    ...
      |      |
    FS_1   FS_2
which may result in:
       RW
      /  \
    RW    RW
and because of the issues in HIVE-7731 and HIVE-8118, both downstream branches will receive duplicate (identical) inputs.
2. Stats issue, including:
bucket5.q infer_bucket_sort_multi_insert.q stats4.q smb_mapjoin_13.q smb_mapjoin_15.q
Reason: In these tests, I get a diff error because numRows and rawDataSize are -1 where positive values are expected. I don't think this is related to multi-insertion.
3. Join/SMB Join Issue, including
auto_smb_mapjoin_14.q auto_sortmerge_join_13.q smb_mapjoin_11.q smb_mapjoin_12.q smb_mapjoin_13.q smb_mapjoin_15.q smb_mapjoin_16.q
Reason: These tests failed either with an exception or with a diff. I think it's because SMB Join (HIVE-8202) isn't supported right now.
4. Result doesn't match, including
groupby3_map_skew.q groupby_map_ppr_multi_distinct.q groupby_complex_types_multi_single_reducer.q groupby_map_ppr.q partition_date2.q udaf_percentile_approx_23.q
Reason: The results from these tests differ from MR's. For instance, groupby3_map_skew.q failed with this diff:
< 130091.0 260.182 256.10355987055016 98.0 0.0 142.92680950752379 143.06995106518903 20428.07288 20469.0109
---
> 130091.0 260.182 256.10355987055016 98.0 0.0 142.9268095075238 143.06995106518906 20428.07288 20469.0109
I don't know why this happens, but I think these failures may not be related to multi-insertion.
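One possible explanation, offered only as an assumption and not confirmed anywhere in this issue: the two result rows in the diff differ only in the last digits of two double columns, which is the pattern you get when the same floating-point values are accumulated in a different order (for example, if rows reach the aggregation in a different order than under MR). The sketch below is not Hive code; `sum_in_order` and the sample values are hypothetical and only illustrate that double addition is not associative.

```python
import math


def sum_in_order(values):
    """Accumulate doubles left to right, like a simple running-sum aggregator."""
    total = 0.0
    for v in values:
        total += v
    return total


# Hypothetical inputs; the real test data is not reproduced here.
values = [0.1, 0.2, 0.3]

forward = sum_in_order(values)             # (0.1 + 0.2) + 0.3
backward = sum_in_order(reversed(values))  # (0.3 + 0.2) + 0.1

# Same inputs, different order: the sums differ in the last bits only.
print(forward == backward)
print(math.isclose(forward, backward, rel_tol=1e-12))

# The two values from the diff above agree within a tiny relative tolerance.
print(math.isclose(142.92680950752379, 142.9268095075238, rel_tol=1e-12))
```

If this is indeed the cause, comparing these columns with a tolerance (or rounding them in the query) would be one way to make the expected output stable. That is a suggestion to verify, not a confirmed fix.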