IMPALA-11415

Add run-step-wait-all after loading Kudu data

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • Impala 4.2.0

    Description

      IMPALA-11384 reveals an issue in testdata/bin/create-load-data.sh.

      if [[ $SKIP_METADATA_LOAD -eq 1 ]]; then
        # Tests depend on the kudu data being clean, so load the data from scratch.
        # This is only necessary if this is not a full dataload, because a full dataload
        # already loads Kudu functional and TPC-H tables from scratch.
        run-step-backgroundable "Loading Kudu functional" load-kudu.log \
              load-data "functional-query" "core" "kudu/none/none" force
        run-step-backgroundable "Loading Kudu TPCH" load-kudu-tpch.log \
              load-data "tpch" "core" "kudu/none/none" force
      fi
      run-step-backgroundable "Loading Hive UDFs" build-and-copy-hive-udfs.log \
          build-and-copy-hive-udfs 

      If $SKIP_METADATA_LOAD is 1, all three steps ("Loading Kudu functional", "Loading Kudu TPCH", and "Loading Hive UDFs") run in parallel in the background. The last of these, "Loading Hive UDFs", appears to overwrite the Thrift-generated Python code under shell/gen-py/hive_metastore/ and shell/gen-py/beeswaxd/. This in turn causes sporadic Python errors when the two Kudu background steps invoke bin/load-data.py. Adding run-step-wait-all after the Kudu data loading seems to fix the issue.
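
      The ordering this fix aims for can be sketched in a small standalone script. This is only an illustration: run_step_backgroundable and run_step_wait_all below are hypothetical stand-ins for the real run-step-backgroundable and run-step-wait-all helpers (whose implementations are not shown in this issue), the sleep/true commands replace the actual load steps, and placing the wait inside the if block is an assumption about where the call would go.

```shell
# Hypothetical stand-ins for run-step-backgroundable / run-step-wait-all.
# The real Impala helpers may behave differently; this only models ordering.
ORDER_LOG="$(mktemp)"
PIDS=""

run_step_backgroundable() {
  # $1 = step description; remaining arguments = command to run in background.
  step_msg="$1"
  shift
  ( "$@"; echo "finished: $step_msg" >> "$ORDER_LOG" ) &
  PIDS="$PIDS $!"
}

run_step_wait_all() {
  # Block until every background step started so far has exited.
  for pid in $PIDS; do
    wait "$pid"
  done
  PIDS=""
}

SKIP_METADATA_LOAD=1

if [ "$SKIP_METADATA_LOAD" -eq 1 ]; then
  # Placeholder commands stand in for the two Kudu load-data invocations.
  run_step_backgroundable "Loading Kudu functional" sleep 1
  run_step_backgroundable "Loading Kudu TPCH" sleep 1
  # Barrier: let both Kudu loads finish before any later step can touch
  # files they depend on (assumed placement of the new wait call).
  run_step_wait_all
fi
# Placeholder for build-and-copy-hive-udfs; it now starts after the Kudu steps.
run_step_backgroundable "Loading Hive UDFs" true
run_step_wait_all

cat "$ORDER_LOG"
```

      With the barrier in place, the "Loading Hive UDFs" line is always the last one written to the log; the relative order of the two Kudu lines is not fixed, since those steps still run concurrently with each other.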

          People

            Assignee: Unassigned
            Reporter: Riza Suminto (rizaon)