Hadoop Common / HADOOP-10115

Exclude duplicate jars in hadoop package under different component's lib

    Details

    • Hadoop Flags:
      Incompatible change
    • Release Note:
      Jars in the various subproject lib directories are now de-duplicated against Hadoop common. Users who interact directly with those directories must be sure to pull in common's dependencies as well.
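For users who assemble a classpath by hand, the release note implies that common's directories must be listed alongside the component's own. A minimal sketch, assuming the share/hadoop/<component> layout named in this issue; `build_classpath` is a hypothetical helper, not something shipped with Hadoop:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a Java classpath for one component that also
# includes the common jars the component's lib directory no longer carries.
build_classpath() {
  local hadoop_home=$1
  local component=$2
  # "dir/*" is the JVM's classpath wildcard; it is quoted here so the shell
  # does not expand it.
  printf '%s:%s:%s:%s\n' \
    "$hadoop_home/share/hadoop/common/*" \
    "$hadoop_home/share/hadoop/common/lib/*" \
    "$hadoop_home/share/hadoop/$component/*" \
    "$hadoop_home/share/hadoop/$component/lib/*"
}

# Example (the install path is an assumption):
# java -cp "$(build_classpath /opt/hadoop hdfs)" ...
```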

      Description

      In the Hadoop package distribution, more than 90% of the jars are duplicated in multiple places.
      For example, almost all jars in share/hadoop/hdfs/lib are already present in share/hadoop/common/lib.

      The same holds for every other lib directory under share.

      All of these directories are added to the classpath of every daemon process anyway.

      So, to reduce the distribution size and the classpath overhead, remove the duplicate jars from the distribution.
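One way to check the duplication claim against an unpacked distribution is to list the jars that appear in both lib directories. This is a rough sketch; `list_duplicate_jars` is a hypothetical helper and compares file names only, not contents:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of every jar in a component's lib
# directory that also exists (by file name) in common's lib directory.
list_duplicate_jars() {
  local common_lib=$1
  local component_lib=$2
  local jar name
  for jar in "$component_lib"/*.jar; do
    [ -e "$jar" ] || continue      # the glob matched nothing
    name=${jar##*/}                # strip the directory part
    if [ -e "$common_lib/$name" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Example, run from the root of an unpacked distribution:
# list_duplicate_jars share/hadoop/common/lib share/hadoop/hdfs/lib
```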

      1. HADOOP-10115.patch
        5 kB
        Vinayakumar B
      2. HADOOP-10115.patch
        4 kB
        Vinayakumar B
      3. HADOOP-10115.patch
        5 kB
        Vinayakumar B
      4. HADOOP-10115-004.patch
        5 kB
        Vinayakumar B
      5. HADOOP-10115-005.patch
        6 kB
        Vinayakumar B
      6. HADOOP-10115-006.patch
        6 kB
        Vinayakumar B
      7. HADOOP-10115-007.patch
        6 kB
        Vinayakumar B

          Activity

          vinayrpet Vinayakumar B added a comment -

          Uploading a patch that checks for duplicate libs. A jar is copied only if it is not already present.
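The copy-only-if-absent idea can be sketched as follows. This is not the code from the attached patch; `copy_jar_if_absent` and its three arguments are hypothetical, and the lookup matches on file name only:

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy src_jar into dest_dir unless a file with the same
# name already exists somewhere under dist_root.
copy_jar_if_absent() {
  local src_jar=$1
  local dest_dir=$2
  local dist_root=$3
  local name=${src_jar##*/}
  # find treats the -name argument as a glob pattern; plain jar names are
  # assumed here.
  if [ -n "$(find "$dist_root" -type f -name "$name" | head -n 1)" ]; then
    return 0                       # duplicate found: skip the copy
  fi
  mkdir -p "$dest_dir"
  cp "$src_jar" "$dest_dir/"
}
```

A second call with the same jar name becomes a no-op, which is what removes the duplicates from the later lib directories.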

          vinayrpet Vinayakumar B added a comment -

          small correction

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12614561/HADOOP-10115.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-dist.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3297//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3297//console

          This message is automatically generated.

          vinayrpet Vinayakumar B added a comment -

          Updated

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12616682/HADOOP-10115.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-dist.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3326//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3326//console

          This message is automatically generated.

          vinayrpet Vinayakumar B added a comment -

          Hi all,
          Could someone take a look at the patch?
          Thanks in advance.

          vinayrpet Vinayakumar B added a comment -

          Rebased the patch and included KMS too.

          The distribution shrinks from 176 MB to 136 MB in my Windows build.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12687686/HADOOP-10115-004.patch
          against trunk revision e996a1b.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-dist.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5287//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5287//console

          This message is automatically generated.

          aw Allen Wittenauer added a comment -

          Keeping in mind I'm not a Maven expert by any stretch, at what point should dist-layout-stitching.sh just be something in dev-support/ rather than having a long string of shell inside the pom.xml file?

          Running the entire shell code through shellcheck, we get quite a few warnings and errors. It'd be good to get these fixed. There are lots of SC2086 errors all over the place (unquoted variables holding paths, which will break if those paths contain IFS characters or other metachars); too many to list here. Ignoring those, the big ones that stick out to me:

          In /tmp/2 line 16:
                                  local count=`find $dir -iname $file|wc -l`
                                              ^-- SC2006: Use $(..) instead of deprecated `..`
          
          In /tmp/2 line 25:
                                    if [[ $srcName != *.jar ]] || [ `findFileInDir $srcName` -eq "0" ]; then
                                                                    ^-- SC2046: Quote this to prevent word splitting.
                                                                    ^-- SC2006: Use $(..) instead of deprecated `..`
          
          In /tmp/2 line 31:
                                    for child in `ls $src`; do
                                                 ^-- SC2045: Iterating over ls output is fragile. Use globs.
                                                 ^-- SC2006: Use $(..) instead of deprecated `..`
          
          In /tmp/2 line 48:
                                    for child in `ls $src`; do
                                                 ^-- SC2045: Iterating over ls output is fragile. Use globs.
                                                 ^-- SC2006: Use $(..) instead of deprecated `..`
          
          

          Bonus points for switching all the [ ... ] tests to [[ ... ]] pairs.

          Actually running the existing patch throws an error:

               [exec] ./dist-layout-stitching.sh: line 81: //: is a directory
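
The shellcheck findings above (SC2006, SC2045, SC2086) come down to three habits: use $(...) instead of backticks, quote every expansion, and iterate over a glob instead of parsing ls. A small sketch, with `count_matches` and `list_children` as hypothetical stand-ins rather than the functions from the patch:

```shell
#!/usr/bin/env bash
# Hypothetical helper: count files under dir whose name matches file,
# case-insensitively.
count_matches() {
  local dir=$1
  local file=$2
  local count
  # $(...) replaces backticks (SC2006); quoting avoids word splitting (SC2086).
  count=$(find "$dir" -iname "$file" | wc -l)
  echo "$((count))"                # arithmetic expansion drops wc's padding
}

# Hypothetical helper: print the name of each entry directly under src.
list_children() {
  local src=$1
  local child
  # A glob replaces the fragile `ls $src` loop (SC2045).
  for child in "$src"/*; do
    [ -e "$child" ] || continue    # empty directory: the glob stays literal
    printf '%s\n' "${child##*/}"
  done
}
```

Both helpers keep working when the directory name contains spaces, which is the case the unquoted originals would break on.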
          
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12687686/HADOOP-10115-004.patch
          against trunk revision 7ce3c76.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-dist.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5884//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5884//console

          This message is automatically generated.

          vinayrpet Vinayakumar B added a comment -

          Here is the modified code.
          Fixed the SC2086 errors in the potentially problematic areas.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12703348/HADOOP-10115-005.patch
          against trunk revision dfd8da7.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-dist.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5886//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5886//console

          This message is automatically generated.

          aw Allen Wittenauer added a comment - - edited

          It's interesting that test-patch passed while this patch failed on my machine... bizarre!

          In any case, both the copyIfNotExists and copy functions need some surgery now that the fragile ls has been fixed. $child is a full path now, so we need to do something with it. I'm thinking the easiest is to do something like:

                                  local dir
                                  local child
                                  if [ -d "$src" ]; then
                                    for dir in "$src"/* ;
                                    do
                                      child=${dir/${src}\/}
          

          in both functions. This will restore $child to the bare directory name in a (I think) metachar-safe way, as well as properly localizing those vars!

          Also, this line:

          // we need to copy httpfs and kms as is
          

          needs to have the // replaced with #. This is shell code, after all.

          Thanks! This is looking good!

          aw Allen Wittenauer added a comment -

          BTW, for other committers watching this, my tests using the tarball on trunk say that this de-dupe works. At least, I'm able to launch all the daemons.

          After Vinayakumar B finishes putting up with my shell nitpicky-ness, I'm pretty much a +1 and will commit to trunk unless someone has an objection...

          vinayrpet Vinayakumar B added a comment -

          Thanks Allen Wittenauer
          Updated the patch.

          for childPath in "$src"/* ; do
          child=$(basename "$childPath");

          This extracts $child from the absolute path for the comparisons in both copy() and copyIfNotExists(). Hopefully everything else works the same as before.

          // we need to copy httpfs and kms as is

          That was a miss. Thanks for catching it.
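
The two ways of recovering the child name discussed in this thread can be compared side by side. A small sketch; the `src` path is a made-up example that contains a space on purpose:

```shell
#!/usr/bin/env bash
src="/tmp/example src"             # hypothetical source directory
childPath="$src/lib"               # what a "$src"/* glob would yield

# External command, as used in the updated patch:
via_basename=$(basename "$childPath")

# Pure-shell alternative: strip everything up to the last slash.
# Equivalent to basename as long as the path has no trailing slash.
via_expansion=${childPath##*/}

printf '%s\n%s\n' "$via_basename" "$via_expansion"
```

The expansion form avoids forking a process per child, which only matters when a directory has many entries.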

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12703363/HADOOP-10115-006.patch
          against trunk revision 5578e22.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-dist.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5887//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5887//console

          This message is automatically generated.

          busbey Sean Busbey added a comment -

          One possible gap caused by just skipping the jars (rather than symlinking) is that if folks rely on the directory layout at deployment time to grab needed jars they might miss out. Presumably they're already grabbing the common share dir though?

          Keeping in mind I'm not a Maven expert by any stretch, at what point should dist-layout-stitching.sh just be something in dev-support/ rather than having a long string of shell inside the pom.xml file?

          IMHO, we should do this sooner rather than later. One good reason to do it as a follow-on is that we could switch to using a Maven assembly instead of a shell script.

          +                      # Shellcheck SC2086
          +                      ROOT=$(cd ../..;pwd)
          

          Could we use a maven variable for this instead of cd/pwd?

          +                      run copy "$ROOT"/hadoop-common-project/hadoop-common/target/hadoop-common-${project.version} .
          +                      run copy "$ROOT"/hadoop-common-project/hadoop-nfs/target/hadoop-nfs-${project.version} .
          +                      run copy "$ROOT"/hadoop-hdfs-project/hadoop-hdfs/target/hadoop-hdfs-${project.version} .
          +                      run copy "$ROOT"/hadoop-hdfs-project/hadoop-hdfs-nfs/target/hadoop-hdfs-nfs-${project.version} .
          +                      run copy "$ROOT"/hadoop-yarn-project/target/hadoop-yarn-project-${project.version} .
          +                      run copy "$ROOT"/hadoop-mapreduce-project/target/hadoop-mapreduce-${project.version} .
          +                      run copy "$ROOT"/hadoop-tools/hadoop-tools-dist/target/hadoop-tools-dist-${project.version} .
          

          Could you add a comment here that it's important we process the hadoop-common project first, so that common always has all the dependencies it declares?

          Should YARN get processed before the NFS projects?

          aw Allen Wittenauer added a comment -

          One possible gap caused by just skipping the jars (rather than symlinking) is that if folks rely on the directory layout at deployment time to grab needed jars they might miss out. Presumably they're already grabbing the common share dir though?

          If you symlink, is there actually any benefit? It shrinks the distribution size, sure, but I suspect the JVM won't resolve the link to a degree that it realizes it is the same jar. Also, given that, e.g., HDFS requires common, if folks are only grabbing the HDFS deps and not the common deps, they are doing Bad Things (tm). But if we only commit this to trunk, it's even less of a concern.

          One good reason to do it as a follow-on is that we could switch to using an maven assembly instead of a shell script.

          I'm inclined to commit this now and fix this up either as a maven assembly or a separate script as a separate JIRA under the guiding principle of "don't let best stop better." I don't think there is any real question of whether or not this is better than what is currently there. Best might end up being more subjective and take longer.

          (the two code comments)

          Yes, probably a good idea.

          Should the yarn get processed before the NFS projects?

          I'm not sure if it matters much.

          busbey Sean Busbey added a comment -

          One good reason to do it as a follow-on is that we could switch to using an maven assembly instead of a shell script.

          I'm inclined to commit this now and fix this up either as a maven assembly or a separate script as a separate JIRA under the guiding principle of "don't let best stop better." I don't think there is any real question of whether or not this is better than what is currently there. Best might end up being more subjective and take longer.

          +1

          vinayrpet Vinayakumar B added a comment -

          Updated the patch.

          Could we use a maven variable for this instead of cd/pwd?

          Yes, done. Used as

          ROOT=$(cd "${project.build.directory}"/../..;pwd)

          Could you add a comment here that it's important we process the hadoop-common project first, so that common always has all the dependencies it declares?

          done

          Should the yarn get processed before the NFS projects?

          The NFS projects depend only on common and hdfs, respectively, and their jars are copied into the common and hdfs directories themselves. So the order in which they are processed will not affect the Yarn projects much.
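For readers following the thread, the copy-only-if-absent idea discussed above can be sketched roughly as below. This is a minimal illustration and not the script committed in hadoop-dist/pom.xml: the function name `copy_unique_jars` and its three directory arguments are hypothetical, and matching duplicates by file name alone is an assumption.

```shell
# Hypothetical sketch, not the committed hadoop-dist script.
# Copies jars from a component's lib directory into the distribution,
# skipping any jar whose file name already exists in common's lib directory.
copy_unique_jars() {
  local src_lib="$1"     # component's source lib directory (assumed layout)
  local dest_lib="$2"    # component's lib directory inside the distribution
  local common_lib="$3"  # common's lib directory inside the distribution
  local jar
  local name

  mkdir -p "${dest_lib}"
  for jar in "${src_lib}"/*.jar; do
    # If the glob matched nothing, the pattern itself is returned; skip it.
    [ -e "${jar}" ] || continue
    name=$(basename "${jar}")
    # A jar with the same file name is already shipped by common: skip it.
    if [ -e "${common_lib}/${name}" ]; then
      continue
    fi
    cp "${jar}" "${dest_lib}/"
  done
}
```

With a helper like this, common would have to be processed first, passing its own lib directory as both destination and reference, so that it always keeps every dependency it declares; the other components are then copied against it.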

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12703582/HADOOP-10115-007.patch
          against trunk revision 82db334.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-dist.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5901//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5901//console

          This message is automatically generated.

          busbey Sean Busbey added a comment -

          +1 lgtm for trunk. I have some nits, but none of them matter enough to hold things up.

          aw Allen Wittenauer added a comment -

          +1 committed to trunk.

          Thanks everyone!

          vinayrpet Vinayakumar B added a comment -

          Thanks Allen Wittenauer and Sean Busbey.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #7295 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7295/)
          HADOOP-10115. Exclude duplicate jars in hadoop package under different component's lib (Vinayakumar B via aw) (aw: rev 47f7f18d4cc9145607ef3dfb70aa88748cd9dbec)

          • hadoop-dist/pom.xml
          • hadoop-common-project/hadoop-common/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #128 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/128/)
          HADOOP-10115. Exclude duplicate jars in hadoop package under different component's lib (Vinayakumar B via aw) (aw: rev 47f7f18d4cc9145607ef3dfb70aa88748cd9dbec)

          • hadoop-dist/pom.xml
          • hadoop-common-project/hadoop-common/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #862 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/862/)
          HADOOP-10115. Exclude duplicate jars in hadoop package under different component's lib (Vinayakumar B via aw) (aw: rev 47f7f18d4cc9145607ef3dfb70aa88748cd9dbec)

          • hadoop-dist/pom.xml
          • hadoop-common-project/hadoop-common/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2060 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2060/)
          HADOOP-10115. Exclude duplicate jars in hadoop package under different component's lib (Vinayakumar B via aw) (aw: rev 47f7f18d4cc9145607ef3dfb70aa88748cd9dbec)

          • hadoop-dist/pom.xml
          • hadoop-common-project/hadoop-common/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #119 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/119/)
          HADOOP-10115. Exclude duplicate jars in hadoop package under different component's lib (Vinayakumar B via aw) (aw: rev 47f7f18d4cc9145607ef3dfb70aa88748cd9dbec)

          • hadoop-dist/pom.xml
          • hadoop-common-project/hadoop-common/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #128 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/128/)
          HADOOP-10115. Exclude duplicate jars in hadoop package under different component's lib (Vinayakumar B via aw) (aw: rev 47f7f18d4cc9145607ef3dfb70aa88748cd9dbec)

          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-dist/pom.xml
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2078 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2078/)
          HADOOP-10115. Exclude duplicate jars in hadoop package under different component's lib (Vinayakumar B via aw) (aw: rev 47f7f18d4cc9145607ef3dfb70aa88748cd9dbec)

          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-dist/pom.xml

            People

            • Assignee: Vinayakumar B
            • Reporter: Vinayakumar B
            • Votes: 0
            • Watchers: 14
