Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-4259

Allow building Impala without thirdparty/ test dependencies

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Infrastructure
    • Labels:

      Description

      Currently the Impala build process does not distinguish well between build dependencies and test dependencies. In particular, it assumes the presence of a number of Hadoop components under thirdparty that are only needed for the test cluster.

      I did a quick experiment to determine what the true build dependencies are. I was able to get Impala to build with only a few files under thirdparty:

      $ find thirdparty/ -type f
      thirdparty/hadoop-2.6.0-cdh5.10.0-SNAPSHOT/lib/native/libhdfs.so.0.0.0
      thirdparty/hadoop-2.6.0-cdh5.10.0-SNAPSHOT/lib/native/libhdfs.so
      thirdparty/hadoop-2.6.0-cdh5.10.0-SNAPSHOT/lib/native/libhdfs.a
      thirdparty/hadoop-2.6.0-cdh5.10.0-SNAPSHOT/include/hdfs.h
      thirdparty/hive-1.1.0-cdh5.10.0-SNAPSHOT/src/metastore/if/hive_metastore.thrift
      

      I had to comment out a few lines of test setup code that are executed unconditionally:

      $ git diff
      diff --git a/buildall.sh b/buildall.sh
      index a71c521..28f3786 100755
      --- a/buildall.sh
      +++ b/buildall.sh
      @@ -278,13 +278,16 @@ fi
       
       if [ -e "$HADOOP_LZO"/build/native/Linux-*-*/lib/libgplcompression.so ]
       then
      -  cp "$HADOOP_LZO"/build/native/Linux-*-*/lib/libgplcompression.* "$HADOOP_HOME/lib/native"
      +  # TODO: should only need to do this if testing
      +  # cp "$HADOOP_LZO"/build/native/Linux-*-*/lib/libgplcompression.* "$HADOOP_HOME/lib/native"
      +  :
       else
         echo "No hadoop-lzo found"
       fi
       
       # Stop any running Impala services.
      -"${IMPALA_HOME}/bin/start-impala-cluster.py" --kill --force
      +# TODO: should only need to do this if testing
      +#"${IMPALA_HOME}/bin/start-impala-cluster.py" --kill --force
       
       if [[ "$CLEAN_ACTION" -eq 1 || "$FORMAT_METASTORE" -eq 1 || "$FORMAT_CLUSTER" -eq 1 ||
              "$FORMAT_SENTRY_POLICY_DB" -eq 1 || -n "$METASTORE_SNAPSHOT_FILE" ]]
      @@ -306,7 +309,8 @@ if [[ "$FORMAT_METASTORE" -eq 1 && -z "$METASTORE_SNAPSHOT_FILE" ]]; then
       fi
       
       # Generate the Hadoop configs needed by Impala
      -"${IMPALA_HOME}/bin/create-test-configuration.sh" ${CREATE_TEST_CONFIG_ARGS}
      +# TODO: should only do this if testing
      +# "${IMPALA_HOME}/bin/create-test-configuration.sh" ${CREATE_TEST_CONFIG_ARGS}
       
       # If a metastore snapshot exists, load it.
       if [ "$METASTORE_SNAPSHOT_FILE" ]; then
      

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                tarmstrong Tim Armstrong
                Reporter:
                tarmstrong Tim Armstrong
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: