Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
IMPALA-11384 reveals an issue in testdata/bin/create-load-data.sh.
if [[ $SKIP_METADATA_LOAD -eq 1 ]]; then
  # Tests depend on the kudu data being clean, so load the data from scratch.
  # This is only necessary if this is not a full dataload, because a full dataload
  # already loads Kudu functional and TPC-H tables from scratch.
  run-step-backgroundable "Loading Kudu functional" load-kudu.log \
      load-data "functional-query" "core" "kudu/none/none" force
  run-step-backgroundable "Loading Kudu TPCH" load-kudu-tpch.log \
      load-data "tpch" "core" "kudu/none/none" force
fi
run-step-backgroundable "Loading Hive UDFs" build-and-copy-hive-udfs.log \
    build-and-copy-hive-udfs
If $SKIP_METADATA_LOAD is set to 1, all three steps ("Loading Kudu functional", "Loading Kudu TPCH", and "Loading Hive UDFs") run in parallel in the background. The last step, "Loading Hive UDFs", seemingly overwrites the Thrift-generated Python code under shell/gen-py/hive_metastore/ and shell/gen-py/beeswaxd/ while the other two are still running. This in turn causes sporadic Python errors when the two Kudu background steps invoke bin/load-data.py. Adding run-step-wait-all after the Kudu data loading seems to fix the issue.
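The ordering that the proposed fix enforces can be illustrated with a minimal, self-contained sketch. The helpers run_step_bg and wait_all_steps below are simplified, hypothetical stand-ins for run-step-backgroundable and run-step-wait-all, not the actual Impala implementations, and the sleep commands stand in for the real load and build steps.

```shell
#!/bin/sh
# Hypothetical stand-ins for the run-step helpers; NOT the Impala implementations.
EVENTS=$(mktemp)

# Run a command in the background and record when it starts and finishes.
run_step_bg() {
  label=$1; shift
  (
    echo "start $label" >> "$EVENTS"
    "$@"
    echo "end $label" >> "$EVENTS"
  ) &
}

# Block until every background step started so far has exited.
wait_all_steps() {
  wait
}

SKIP_METADATA_LOAD=1

if [ "$SKIP_METADATA_LOAD" -eq 1 ]; then
  run_step_bg "kudu-functional" sleep 1   # placeholder for the Kudu functional load
  run_step_bg "kudu-tpch" sleep 1         # placeholder for the Kudu TPC-H load
  # Barrier: let both Kudu loads finish before anything else can touch
  # files they depend on.
  wait_all_steps
fi
run_step_bg "hive-udfs" sleep 1           # placeholder for the Hive UDF build
wait_all_steps

cat "$EVENTS"
```

In this sketch the two Kudu steps still run concurrently with each other, but the "start hive-udfs" line is always logged after both Kudu "end" lines. Removing the first wait_all_steps call allows all three steps to overlap, which is the situation described above.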