
Allow building Impala without thirdparty/ test dependencies

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Infrastructure
    • Labels:

      Description

      Currently the Impala build process does not distinguish well between build dependencies and test dependencies. In particular, it assumes the presence of a number of Hadoop components under thirdparty that are only needed for the test cluster.

      I did a quick experiment to determine what the true build dependencies are. I was able to get Impala to build with only a few files under thirdparty:

      $ find thirdparty/ -type f
      thirdparty/hadoop-2.6.0-cdh5.10.0-SNAPSHOT/lib/native/libhdfs.so.0.0.0
      thirdparty/hadoop-2.6.0-cdh5.10.0-SNAPSHOT/lib/native/libhdfs.so
      thirdparty/hadoop-2.6.0-cdh5.10.0-SNAPSHOT/lib/native/libhdfs.a
      thirdparty/hadoop-2.6.0-cdh5.10.0-SNAPSHOT/include/hdfs.h
      thirdparty/hive-1.1.0-cdh5.10.0-SNAPSHOT/src/metastore/if/hive_metastore.thrift
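
      A minimal sketch of how that list could be turned into a pre-build check. The file list is copied from the find output above; the helper name check_build_deps and its directory argument are hypothetical and are not part of the Impala scripts.

      ```shell
      # Files that the experiment above found sufficient for a build-only run.
      # Paths are relative to the thirdparty/ directory.
      REQUIRED_BUILD_FILES="
      hadoop-2.6.0-cdh5.10.0-SNAPSHOT/lib/native/libhdfs.so.0.0.0
      hadoop-2.6.0-cdh5.10.0-SNAPSHOT/lib/native/libhdfs.so
      hadoop-2.6.0-cdh5.10.0-SNAPSHOT/lib/native/libhdfs.a
      hadoop-2.6.0-cdh5.10.0-SNAPSHOT/include/hdfs.h
      hive-1.1.0-cdh5.10.0-SNAPSHOT/src/metastore/if/hive_metastore.thrift
      "

      # Hypothetical helper: print every missing file to stderr and return
      # non-zero if at least one is absent. The body runs in a subshell so its
      # variables do not leak into the caller.
      check_build_deps() (
        thirdparty_dir="${1:-thirdparty}"
        missing=0
        for f in $REQUIRED_BUILD_FILES; do
          if [ ! -e "${thirdparty_dir}/${f}" ]; then
            echo "missing build dependency: ${thirdparty_dir}/${f}" >&2
            missing=$((missing + 1))
          fi
        done
        [ "$missing" -eq 0 ]
      )
      ```

      A build script could call check_build_deps thirdparty before compiling and stop early if it fails.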
      

      I had to comment out a few lines of test setup code that are executed unconditionally:

      $ git diff
      diff --git a/buildall.sh b/buildall.sh
      index a71c521..28f3786 100755
      --- a/buildall.sh
      +++ b/buildall.sh
      @@ -278,13 +278,16 @@ fi
       
       if [ -e "$HADOOP_LZO"/build/native/Linux-*-*/lib/libgplcompression.so ]
       then
      -  cp "$HADOOP_LZO"/build/native/Linux-*-*/lib/libgplcompression.* "$HADOOP_HOME/lib/native"
      +  # TODO: should only need to do this if testing
      +  # cp "$HADOOP_LZO"/build/native/Linux-*-*/lib/libgplcompression.* "$HADOOP_HOME/lib/native"
      +  :
       else
         echo "No hadoop-lzo found"
       fi
       
       # Stop any running Impala services.
      -"${IMPALA_HOME}/bin/start-impala-cluster.py" --kill --force
      +# TODO: should only need to do this if testing
      +#"${IMPALA_HOME}/bin/start-impala-cluster.py" --kill --force
       
       if [[ "$CLEAN_ACTION" -eq 1 || "$FORMAT_METASTORE" -eq 1 || "$FORMAT_CLUSTER" -eq 1 ||
              "$FORMAT_SENTRY_POLICY_DB" -eq 1 || -n "$METASTORE_SNAPSHOT_FILE" ]]
      @@ -306,7 +309,8 @@ if [[ "$FORMAT_METASTORE" -eq 1 && -z "$METASTORE_SNAPSHOT_FILE" ]]; then
       fi
       
       # Generate the Hadoop configs needed by Impala
      -"${IMPALA_HOME}/bin/create-test-configuration.sh" ${CREATE_TEST_CONFIG_ARGS}
      +# TODO: should only do this if testing
      +# "${IMPALA_HOME}/bin/create-test-configuration.sh" ${CREATE_TEST_CONFIG_ARGS}
       
       # If a metastore snapshot exists, load it.
       if [ "$METASTORE_SNAPSHOT_FILE" ]; then
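
      Rather than commenting the steps out, they could be guarded by a flag. The following is only a sketch of that idea and not the change that was eventually committed; the variable NEED_TEST_CLUSTER and the wrapper run_if_testing are hypothetical names that do not appear in buildall.sh.

      ```shell
      # Hypothetical flag: 1 when the test cluster is needed, 0 for a build-only
      # run. Defaults to 0 if the caller has not set it.
      : "${NEED_TEST_CLUSTER:=0}"

      # Run the given command only when the test cluster is needed; otherwise
      # report that the step was skipped and return success.
      run_if_testing() {
        if [ "${NEED_TEST_CLUSTER}" -eq 1 ]; then
          "$@"
        else
          echo "Skipping (build-only): $*"
        fi
      }

      # The steps commented out in the diff above would then become, for example:
      #   run_if_testing "${IMPALA_HOME}/bin/start-impala-cluster.py" --kill --force
      #   run_if_testing "${IMPALA_HOME}/bin/create-test-configuration.sh" ${CREATE_TEST_CONFIG_ARGS}
      ```

      Keeping the guard in one wrapper means each test-only step is marked where it is invoked, and a build-only run logs what it skipped.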
      

          Activity

          tarmstrong Tim Armstrong added a comment -

          IMPALA-4259: build Impala without any test cluster setup.

          The main outcome of this change is to avoid making unnecessary
          modifications to the Impala or other source trees when we don't need the
          test cluster.

          To achieve that, this refactors the script to make the flow easier
          to understand and makes it more consistent which build steps are
          executed in which modes.

          Change-Id: I429da7bc6681b16c07fe58bb3efac6d1a8579137
          Reviewed-on: http://gerrit.cloudera.org:8080/4685
          Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
          Tested-by: Internal Jenkins


            People

            • Assignee: tarmstrong Tim Armstrong
            • Reporter: tarmstrong Tim Armstrong