Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.23.0
    • Component/s: build
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      We are now able to publish Hadoop artifacts to the Maven repository successfully (HADOOP-6382).
      Drawbacks with the current approach:

      • Ivy is used for dependency management, via ivy.xml
      • maven-ant-tasks is used to publish artifacts to the Maven repository
      • POM files are not generated dynamically

      To address this, I propose we use Maven to build hadoop-common, which would let us manage dependencies, publish artifacts, and keep a single XML file (the POM) for both dependency management and artifact publishing.

      I would like a branch created to work on mavenizing Hadoop Common.
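
      To make the proposal concrete, here is a minimal sketch of what such a single POM could look like. The coordinates, the sample dependency, and the repository id/URL are illustrative placeholders, not the actual hadoop-common POM:

      <project xmlns="http://maven.apache.org/POM/4.0.0">
        <modelVersion>4.0.0</modelVersion>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>0.23.0-SNAPSHOT</version>

        <!-- dependencies declared once here; no separate ivy.xml needed -->
        <dependencies>
          <dependency>
            <groupId>commons-logging</groupId>
            <artifactId>commons-logging</artifactId>
            <version>1.1.1</version>
          </dependency>
        </dependencies>

        <!-- publishing target used by 'mvn deploy'; no maven-ant-tasks needed -->
        <distributionManagement>
          <snapshotRepository>
            <id>apache.snapshots.https</id>
            <url>https://repository.apache.org/content/repositories/snapshots</url>
          </snapshotRepository>
        </distributionManagement>
      </project>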

      1. build.png
        153 kB
        Lars Francke
      2. hadoop-commons-maven.patch
        26 kB
        Alejandro Abdelnur
      3. mvn-layout.sh
        1 kB
        Alejandro Abdelnur
      4. mvn-layout.sh
        1 kB
        Alejandro Abdelnur
      5. HADOOP-6671.patch
        38 kB
        Alejandro Abdelnur
      6. HADOOP-6671b.patch
        40 kB
        Alejandro Abdelnur
      7. HADOOP-6671c.patch
        42 kB
        Alejandro Abdelnur
      8. HADOOP-6671d.patch
        39 kB
        Alejandro Abdelnur
      9. mvn-layout2.sh
        2 kB
        Alejandro Abdelnur
      10. mvn-layout2.sh
        2 kB
        Owen O'Malley
      11. HADOOP-6671-e.patch
        160 kB
        Alejandro Abdelnur
      12. mvn-layout-e.sh
        2 kB
        Alejandro Abdelnur
      13. HADOOP-6671-cross-project-HDFS.patch
        2 kB
        Tom White
      14. HADOOP-6671-f.patch
        155 kB
        Alejandro Abdelnur
      15. mvn-layout-f.sh
        2 kB
        Alejandro Abdelnur
      16. HADOOP-6671-g.patch
        159 kB
        Alejandro Abdelnur
      17. HADOOP-6671-h.patch
        161 kB
        Alejandro Abdelnur
      18. common-mvn-layout-i.sh
        2 kB
        Alejandro Abdelnur
      19. HADOOP-6671-i.patch
        173 kB
        Alejandro Abdelnur
      20. HADOOP-6671-j.patch
        184 kB
        Alejandro Abdelnur
      21. mvn-layout-k.sh
        4 kB
        Alejandro Abdelnur
      22. HADOOP-6671-k.sh
        205 kB
        Alejandro Abdelnur
      23. mvn-layout-l.sh
        4 kB
        Alejandro Abdelnur
      24. HADOOP-6671-l.patch
        212 kB
        Alejandro Abdelnur
      25. mvn-layout-m.sh
        4 kB
        Alejandro Abdelnur
      26. HADOOP-6671-m.patch
        213 kB
        Alejandro Abdelnur
      27. mvn-layout-n.sh
        4 kB
        Alejandro Abdelnur
      28. HADOOP-6671-n.patch
        214 kB
        Alejandro Abdelnur
      29. mvn-layout-o.sh
        4 kB
        Alejandro Abdelnur
      30. HADOOP-6671-o.patch
        214 kB
        Alejandro Abdelnur
      31. mvn-layout-p.sh
        4 kB
        Alejandro Abdelnur
      32. HADOOP-6671-p.patch
        214 kB
        Alejandro Abdelnur
      33. mvn-layout-q.sh
        4 kB
        Alejandro Abdelnur
      34. HADOOP-6671-q.patch
        217 kB
        Alejandro Abdelnur
      35. HADOOP-6671-AA.patch
        279 kB
        Alejandro Abdelnur
      36. mvn-layout-AA.sh
        4 kB
        Alejandro Abdelnur
      37. HADOOP-6671-AB.patch
        229 kB
        Alejandro Abdelnur
      38. mvn-layout-AB.sh
        3 kB
        Alejandro Abdelnur
      39. HADOOP-6671-AC.patch
        238 kB
        Alejandro Abdelnur
      40. HADOOP-6671-AC.sh
        4 kB
        Alejandro Abdelnur
      41. HADOOP-6671-AD.patch
        237 kB
        Alejandro Abdelnur
      42. HADOOP-6671-AD.sh
        4 kB
        Alejandro Abdelnur

        Issue Links

          Activity

          Tsz Wo Nicholas Sze added a comment -

          Alejandro, thanks for the prompt response. Chris is going to fix it.

          Chris Nauroth added a comment -

          Nicholas, thank you for helping me track down the change. Alejandro, thank you for the quick response. I have changes in progress in this area for HADOOP-8957, and it's easy enough for me to just roll this one-line change in with my patch.

          Alejandro Abdelnur added a comment -

          (hit enter too soon) Good catch, thanks. Are you following up with a JIRA/patch?

          Alejandro Abdelnur added a comment -

          Nicholas, it definitely is; I guess it happened while doing a find-and-replace on some test case paths.

          Tsz Wo Nicholas Sze added a comment -
          diff --git hadoop-common/src/main/java/org/apache/hadoop/fs/AbstractFileSystem.java hadoop-common/src/main/java/org/apache/hadoop/fs/AbstractFileSystem.java
          index de72eee..f4632f3 100644
          --- hadoop-common/src/main/java/org/apache/hadoop/fs/AbstractFileSystem.java
          +++ hadoop-common/src/main/java/org/apache/hadoop/fs/AbstractFileSystem.java
          @@ -91,7 +91,7 @@ public abstract class AbstractFileSystem {
               StringTokenizer tokens = new StringTokenizer(src, Path.SEPARATOR);
               while(tokens.hasMoreTokens()) {
                 String element = tokens.nextToken();
          -      if (element.equals("..") || 
          +      if (element.equals("target/generated-sources") ||
                     element.equals(".")  ||
                     (element.indexOf(":") >= 0)) {
                   return false;
          

          Why replace ".." with "target/generated-sources"? Was this done by mistake?

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #746 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/746/)
          HADOOP-7536. Correct the dependency version regressions introduced in HADOOP-6671. Contributed by Alejandro Abdelnur.

          tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1158347
          Files :

          • /hadoop/common/trunk/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-project/pom.xml
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #751 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/751/)
          Reinstate common/src/test/bin (used by externals) which was mistakenly deleted in HADOOP-6671 commit.

          tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1153197
          Files :

          • /hadoop/common/trunk/common/src
          • /hadoop/common/trunk/common/src/test/bin
          • /hadoop/common/trunk/common/src/test
          • /hadoop/common/trunk/common
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #760 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/760/)
          Reinstate common/src/test/bin (used by externals) which was mistakenly deleted in HADOOP-6671 commit.

          tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1153197
          Files :

          • /hadoop/common/trunk/common/src
          • /hadoop/common/trunk/common/src/test/bin
          • /hadoop/common/trunk/common/src/test
          • /hadoop/common/trunk/common
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-1073-branch #23 (See https://builds.apache.org/job/Hadoop-Hdfs-1073-branch/23/)
          Reinstate common/src/test/bin (used by externals) which was mistakenly deleted in HADOOP-6671 commit.

          tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1153197
          Files :

          • /hadoop/common/trunk/common/src
          • /hadoop/common/trunk/common/src/test/bin
          • /hadoop/common/trunk/common/src/test
          • /hadoop/common/trunk/common
          Uma Maheswara Rao G added a comment -

          Yes Alejandro,
          I have verified it and posted the comments in HADOOP-7528.

          --Thanks

          Alejandro Abdelnur added a comment -

          Uma, HADOOP-7528 has a patch avail already, would you give it a try?

          Uma Maheswara Rao G added a comment -

          Thanks Alejandro, for raising the separate JIRA.

          Alejandro Abdelnur added a comment -

          @Uma, HADOOP-7528

          Tsz Wo Nicholas Sze added a comment -

          Thanks Alejandro. It works.

          Uma Maheswara Rao G added a comment -

          Hi Alejandro & Tom,

          Can you please check my comments above?
          Please correct me if I am doing something wrong.

          Tom White added a comment -

          BTW I've just opened HADOOP-7525 to simplify the script.

          Alejandro Abdelnur added a comment -

          Nicholas,

          It is in the how-to-contribute wiki, section 'testing your patch'

          export MAVEN_HOME=...
          dev-support/test-patch.sh DEVELOPER \
            /path/to/my.patch \
            /tmp \
            svn \
            grep \
            patch \
            $FINDBUGS_HOME \
            $FORREST_HOME \
            `pwd`
          
          Tsz Wo Nicholas Sze added a comment -

          How do we run test-patch in common?

          Uma Maheswara Rao G added a comment -

          A small correction to the comment above:

          Has anyone tried to build Hadoop on Windows 7?

          This question is about the mavenized builds (latest changes).

          --thanks

          Uma Maheswara Rao G added a comment -

          Has anyone tried to build Hadoop on Windows 7?

          I checked out the latest Hadoop trunk code and ran the command below on Windows, from the root folder:
          >mvn clean install -DskipTests

          But I could not build the project.

          Below is the relevant output.

          [INFO]
          [INFO] — maven-enforcer-plugin:1.0:enforce (clean) @ hadoop-project —
          [WARNING] Rule 2: org.apache.maven.plugins.enforcer.RequireOS failed with message:
          OS Arch: x86 Family: windows Name: windows 7 Version: 6.1 is not allowed by Family=unix
          [INFO] ------------------------------------------------------------------------
          [INFO] Reactor Summary:
          [INFO]
          [INFO] Apache Hadoop Project POM ......................... FAILURE [0.552s]
          [INFO] Apache Hadoop Assemblies .......................... SKIPPED
          [INFO] Apache Hadoop Annotations ......................... SKIPPED
          [INFO] Apache Hadoop Common .............................. SKIPPED
          [INFO] Apache Hadoop Main ................................ SKIPPED
          [INFO] ------------------------------------------------------------------------
          [INFO] BUILD FAILURE

          Workaround:
          I removed the below tag from <root>/pom.xml, <root>/hadoop-assemblies/pom.xml and <root>/hadoop-project/pom.xml:

          <Rules>
          ......
          .....
          <requireOS>
          <family>unix</family>
          </requireOS>

          </Rules>

          After removing this I could proceed to some extent, but still did not get a successful build.

          It started failing in hadoop-annotations.

          Trace:

          [ERROR] symbol : variable Standard
          [ERROR] location: class org.apache.hadoop.classification.tools.ExcludePrivateAnnotationsStandardDoclet
          [ERROR] \Hadoop_common\hadoop-annotations\src\main\java\org\apache\hadoop\classification\tools\ExcludePrivateAnnotationsJDiffDoclet.java:[36,11] cannot find symbol
          [ERROR] symbol : variable LanguageVersion
          [ERROR] location: class org.apache.hadoop.classification.tools.ExcludePrivateAnnotationsJDiffDoclet
          [ERROR] \Hadoop_common\hadoop-annotations\src\main\java\org\apache\hadoop\classification\tools\ExcludePrivateAnnotationsJDiffDoclet.java:[42,16] cannot access com.sun.javadoc.Doclet
          [ERROR] class file for com.sun.javadoc.Doclet not found
          [ERROR] return JDiff.start(RootDocProcessor.process(root));
          [ERROR] -> [Help 1]
          [ERROR]

          It looks like tools.jar is not available on the classpath.

          The JDK lib directory has tools.jar, but the build is not picking it up.
          I tried -DskipTests, but that did not help.

          Workaround:
          Finally, after adding the below entry to hadoop-annotations/pom.xml:

          <dependencies>
          ................
          ................
          <dependency>
            <groupId>jdk.tools</groupId>
            <artifactId>jdk.tools</artifactId>
            <version>1.6</version>
            <scope>system</scope>
            <systemPath>${env.JAVA_HOME}/lib/tools.jar</systemPath>
          </dependency>
          </dependencies>

          I could see the build succeed:

          [INFO]
          [INFO] Apache Hadoop Project POM ......................... SUCCESS [0.749s]
          [INFO] Apache Hadoop Assemblies .......................... SUCCESS [0.833s]
          [INFO] Apache Hadoop Annotations ......................... SUCCESS [0.809s]
          [INFO] Apache Hadoop Common .............................. SUCCESS [33.658s]
          [INFO] Apache Hadoop Main ................................ SUCCESS [0.030s]
          [INFO] ------------------------------------------------------------------------
          [INFO] BUILD SUCCESS
          [INFO] ------------------------------------------------------------------------

          Are there any alternative solutions to make the build work on Windows as well, or do we need to incorporate these changes?
          Please correct me if I am doing something wrong.
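
          One possible refinement, sketched here only as an assumption (not something verified on this build): keep the tools.jar dependency out of the default build and attach it through an OS-activated profile, so the Unix build stays untouched. The profile id is just a placeholder:

          <profiles>
            <profile>
              <id>windows-jdk-tools</id>
              <activation>
                <os>
                  <!-- only added when building on a Windows machine -->
                  <family>windows</family>
                </os>
              </activation>
              <dependencies>
                <dependency>
                  <groupId>jdk.tools</groupId>
                  <artifactId>jdk.tools</artifactId>
                  <version>1.6</version>
                  <scope>system</scope>
                  <systemPath>${env.JAVA_HOME}/lib/tools.jar</systemPath>
                </dependency>
              </dependencies>
            </profile>
          </profiles>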

          Alejandro Abdelnur added a comment -

          just filed HADOOP-7520 and uploaded a patch for it (hadoop-main should be skipped from deployment).
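
          For reference, a minimal sketch of how a module can be skipped from deployment (whether HADOOP-7520 uses exactly this configuration is not shown here): set the deploy plugin's skip flag in that module's pom.xml.

          <build>
            <plugins>
              <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-deploy-plugin</artifactId>
                <configuration>
                  <!-- do not deploy this module's artifact -->
                  <skip>true</skip>
                </configuration>
              </plugin>
            </plugins>
          </build>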

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #703 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/703/)
          Reinstate common/src/test/bin (used by externals) which was mistakenly deleted in HADOOP-6671 commit.

          tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1153197
          Files :

          • /hadoop/common/trunk/common/src
          • /hadoop/common/trunk/common/src/test/bin
          • /hadoop/common/trunk/common/src/test
          • /hadoop/common/trunk/common
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #812 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/812/)
          Reinstate common/src/test/bin (used by externals) which was mistakenly deleted in HADOOP-6671 commit.

          tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1153197
          Files :

          • /hadoop/common/trunk/common/src
          • /hadoop/common/trunk/common/src/test/bin
          • /hadoop/common/trunk/common/src/test
          • /hadoop/common/trunk/common
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #738 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/738/)
          Reinstate common/src/test/bin (used by externals) which was mistakenly deleted in HADOOP-6671 commit.

          tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1153197
          Files :

          • /hadoop/common/trunk/common/src
          • /hadoop/common/trunk/common/src/test/bin
          • /hadoop/common/trunk/common/src/test
          • /hadoop/common/trunk/common
          Giridharan Kesavan added a comment -

          executing mvn deploy at the trunk level fails

          trunk $ mvn clean deploy -DskipTests
          ......
          [INFO] --- maven-deploy-plugin:2.5:deploy (default-deploy) @ hadoop-main ---
          [INFO] ------------------------------------------------------------------------
          [INFO] Reactor Summary:
          [INFO] 
          [INFO] Apache Hadoop Project POM ......................... SUCCESS [2.569s]
          [INFO] Apache Hadoop Assemblies .......................... SUCCESS [2.285s]
          [INFO] Apache Hadoop Annotations ......................... SUCCESS [2.036s]
          [INFO] Apache Hadoop Common .............................. SUCCESS [42.847s]
          [INFO] Apache Hadoop Main ................................ FAILURE [0.014s]
          [INFO] ------------------------------------------------------------------------
          [INFO] BUILD FAILURE
          [INFO] ------------------------------------------------------------------------
          [INFO] Total time: 50.009s
          [INFO] Finished at: Fri Aug 05 07:08:20 UTC 2011
          [INFO] Final Memory: 42M/590M
          [INFO] ------------------------------------------------------------------------
          [ERROR] Failed to execute goal org.apache.maven.plugins:maven-deploy-plugin:2.5:deploy (default-deploy) on project hadoop-main: Deployment failed: repository element was not specified in the POM inside distributionManagement element or in -DaltDeploymentRepository=id::layout::url parameter -> [Help 1]
          

          Could you please take a look?

          Tsz Wo Nicholas Sze added a comment -

          Thanks, Aaron.

          Aaron T. Myers added a comment -

          mvn eclipse:eclipse

          Tsz Wo Nicholas Sze added a comment -

          How do we build eclipse project now?

          Tsz Wo Nicholas Sze added a comment -

          Hey Tom, your command worked from the top-level directory. Thanks!

          Tom White added a comment -

          It looks like you might not be running from the top-level (i.e. the directory containing hadoop-common, hdfs, mapreduce etc). Can you try from there? Thanks, Tom.

          Tsz Wo Nicholas Sze added a comment -

          Hi Tom, it still did not work.

          szetszwo hadoop-common$mvn clean install -DskipTests
          [INFO] Scanning for projects...
          [ERROR] The build could not read 1 project -> [Help 1]
          [ERROR]   
          [ERROR]   The project org.apache.hadoop:hadoop-common:0.23.0-SNAPSHOT (/Users/szetszwo/hadoop/t2/hadoop-common/pom.xml) has 1 error
          [ERROR]     Non-resolvable parent POM: Could not find artifact org.apache.hadoop:hadoop-project:pom:0.23.0-SNAPSHOT and 'parent.relativePath' points at wrong local POM @ line 17, column 11 -> [Help 2]
          [ERROR] 
          [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
          [ERROR] Re-run Maven using the -X switch to enable full debug logging.
          [ERROR] 
          [ERROR] For more information about the errors and possible solutions, please read the following articles:
          [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
          [ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/UnresolvableModelException
          
          Tom White added a comment -

          Hi Nicholas,

          Try building common with mvn clean install -DskipTests, then HDFS with ant veryclean compile -Dresolvers=internal.

          I'll see if I can get the hadoop-annotations jar published to the Apache snapshot repo so the first step isn't necessary.

          Tsz Wo Nicholas Sze added a comment -

          Hi, HDFS cannot be compiled. It failed with

          [ivy:resolve] 		::::::::::::::::::::::::::::::::::::::::::::::
          [ivy:resolve] 		::          UNRESOLVED DEPENDENCIES         ::
          [ivy:resolve] 		::::::::::::::::::::::::::::::::::::::::::::::
          [ivy:resolve] 		:: org.apache.hadoop#hadoop-annotations;0.23.0-SNAPSHOT: not found
          

          Could you take a look?

          Tom White added a comment -

          I've committed this. Thanks, Alejandro!

          Eric Yang added a comment -

          'mvn package -Pbintar' creates the new layout. I've just done what trunk does today.

          I see, thanks for the pointer. +1 on HADOOP-6671-AD.patch and HADOOP-6671-AD.sh.

          Alejandro Abdelnur added a comment -

          Eric,

          Unless I'm misunderstanding something, on current trunk 'ant tar' creates the old-fashioned layout, and it is 'ant binary' that creates the new layout. With this patch, 'mvn package -Ptar' creates the old-fashioned layout and 'mvn package -Pbintar' creates the new layout. I've just done what trunk does today.

          Eric Yang added a comment -

          Alejandro,

          The only concern is that the directory structure layout is a regression from HADOOP-6255. HADOOP-7411 does not seem to be the JIRA that addresses the file structure layout change. Could you make the file structure layout the same as current trunk in this JIRA? It would make life easier for people who have already switched to the new layout. Much appreciated.

          Tom White added a comment -

          +1

          I've tried out all the aspects of building common using Maven and it works well. I successfully managed to do cross-project builds for HDFS (HDFS-2196) and MapReduce (MAPREDUCE-2741). I created Jenkins jobs for building the tarball (https://builds.apache.org/job/Hadoop-Common-trunk-maven/) and test-patch (https://builds.apache.org/view/G-L/view/Hadoop/job/PreCommit-HADOOP-Build-maven/) and am happy that these can be switched over when this code goes in. (Note that they are not running at the moment since the Jenkins Hadoop machines are down.)

          I've added the Maven equivalents to http://wiki.apache.org/hadoop/HowToContribute so it's easy for folks to see how to do the common operations (they are in the BUILDING.txt file in this patch too).

          Alejandro Abdelnur added a comment -

          integrating Tom's request regarding hadoop-common/test/bin

          Tom White added a comment -

          There's a small issue with moving hadoop-common/src/test/bin to dev-support since it is used as an svn:externals definition from HDFS and MapReduce. I suggest that we leave the bin directory there as well as creating the new dev-support directory, since the test-patch.sh script has been modified to work for Maven, so we need both. When all three projects have been Mavenized then we'll be able to remove the bin directory completely (and the svn:externals).

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12488040/HADOOP-6671-AC.sh
          against trunk revision 1151594.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/776//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          Attached patch integrates Eric's suggestions:

          • Using the package phase with a profile to create the TAR.
          • There are 2 TAR profiles, one is 'tar' and the other is 'bintar'. '-Ptar' produces the same TAR that 'ant tar' does, and '-Pbintar' produces the same TAR that 'ant binary' does.
          • 'mvn install/deploy' pushes 4 JARs: hadoop-common, hadoop-common-tests, hadoop-common-sources and hadoop-common-test-sources.
          • The 'dev' dir has been renamed to 'dev-support'

          At this point all feedback has been integrated into the patch. Rebasing requires constant manual resolutions because this patch touches the layout.

          It would be great if we could get this committed (for developers it is working, and there is also a patch for the HDFS Ant build that makes it work with the Mavenized common).

          I'm currently working on HDFS mavenization and should have a first drop later today.

          Alejandro Abdelnur added a comment -

          Eric,

          I'm already using the trick of binding plugins and leveraging their declaration order, as you describe. The limitation is that you cannot have the same plugin invoked twice in a single phase (antrun in our case) with a second plugin (assembly) sandwiched between the invocations of the first.

          Maybe we could use pre-package phase to put the SO files/links in place. I'll give it a try.

          Eric Yang added a comment -

          Package Phase

          There is a trick in Maven. Multiple plugins can be bound to the same phase, i.e. <phase>pre-package</phase> for both the assembly plugin and the ant plugin. They will be executed in the order in which they are declared in the pom.xml. Hence, it may be possible to have pre-package run ant and then assembly, and then <phase>package</phase> run the third antrun plugin execution.
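
          For illustration, a minimal sketch of that ordering trick, assuming a Maven version that runs same-phase executions in the order they are declared in the POM (the execution ids, the echo target, and the descriptor path are placeholders):

          <build>
            <plugins>
              <!-- declared first, so it runs first within the package phase -->
              <plugin>
                <artifactId>maven-antrun-plugin</artifactId>
                <executions>
                  <execution>
                    <id>before-assembly</id>
                    <phase>package</phase>
                    <goals><goal>run</goal></goals>
                    <configuration>
                      <target><echo message="runs before the assembly"/></target>
                    </configuration>
                  </execution>
                </executions>
              </plugin>
              <!-- declared second, so it runs after the antrun execution above -->
              <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <executions>
                  <execution>
                    <id>make-tar</id>
                    <phase>package</phase>
                    <goals><goal>single</goal></goals>
                    <configuration>
                      <descriptors>
                        <descriptor>src/main/assembly/tar.xml</descriptor>
                      </descriptors>
                    </configuration>
                  </execution>
                </executions>
              </plugin>
            </plugins>
          </build>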

          dev-support works for me.

          If we can make the directory layout like the current trunk, then it makes life easier for me with HADOOP-7411. Thanks

          Alejandro Abdelnur added a comment -

          Hi Eric,

          Thanks for testing it out and the feedback.

          Regarding the layout

          I thought the layout you are describing only applied to RPM/DEB. I'll update it accordingly.

          Regarding package assembly in package phase

          I agree the 'verify' phase is not the right place; still, the problem I'm facing is that as part of the TAR creation we need to run an antrun task before assembly:single and an antrun task after assembly:single. This is needed to preserve the symlinks of the SOs both on copying and on tarring (the assembly plugin converts symlinks into copied files both on copy and on tar).

          One solution for this would be to move the assembly work to a separate Maven module (common-distro), as I had earlier. As this module would have no job other than creating the TAR (RPM/DEB), we can use intermediate phases.

          The use of this module by the reactor (in the parent POM) would be in a profile, so it would only kick in when doing -Dtar, -Drpm or -Ddeb.
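
          A minimal sketch of that wiring, assuming a property-activated profile in the parent POM (the module and property names are placeholders):

          <profiles>
            <profile>
              <id>distro</id>
              <activation>
                <property>
                  <name>tar</name>
                </property>
              </activation>
              <modules>
                <!-- only added to the reactor when -Dtar is given -->
                <module>hadoop-common-distro</module>
              </modules>
            </profile>
          </profiles>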

          Regarding your tar profile paragraph

          The idea is that the hadoop-tar descriptor will be generic for common/hdfs/mapred and any other downstream project that wants to use a similar layout.

          The hadoop-project module is a POM-only project, thus we cannot put the assemblies/annotations in there. In addition, it would break things in Maven if there were classes/resources in there, as they would not be compiled as part of the child modules, nor could they be added as dependencies. Thus, they are separate modules (which eventually could be in a dev-support profile and released independently if we don't want to build them as part of the normal Hadoop build).

          Regarding the 'dev' directory

          I've struggled with the name for a bit; it was bin, then maven, and now dev. I don't like it very much either. patch-build seems a bit odd as well; how about dev-support?

          Regarding the forward porting of the RPM/DEB packaging

          My idea is to first get developers onto Maven (thus this patch, followed by HDFS and MapReduce mavenization).

          And then continue with packaging, i.e. HADOOP-7411 would take care of the RPM/DEB of common.

          I was hoping I would get some help from you here.

          Thoughts?

          Thanks.

          Alejandro

          Eric Yang added a comment -

          The build structure is coming along nicely. The tarball structure is based on the structure prior to HADOOP-6255. Could we change the file system layout to the new structure?

          In brief, the file structure layout should look like:

          $PREFIX/bin/hadoop
          $PREFIX/sbin/hadoop-daemon*.sh
                      /[start|stop]-*.sh
                      /hadoop-setup*.sh
                      /hadoop-create-user.sh
          $PREFIX/libexec/hadoop-config.sh
          $PREFIX/etc/hadoop/*.xml
          $PREFIX/lib/*.[so|dylib]
          $PREFIX/share/hadoop/common/hadoop*.jar
          $PREFIX/share/hadoop/common/lib/*.jar
          $PREFIX/share/doc/hadoop/common/*.txt
          

          In addition, could we have the package assembly run in the package phase? Then we could use the pre-integration-test phase to deploy a real cluster and the integration-test phase to run multi-node tests. The verify phase should be used to check the integration-test results and generate Clover reports.

          Shouldn't the tar profile use the property ${project.artifactId} for the descriptorRef, so that downstream modules can reuse this profile? It would be cleaner if hadoop-assemblies and hadoop-annotations were part of hadoop-project. The same applies to the src and tar profiles, so that the submodules can reuse the same profile code.

          The dev directory could use a better name, like patch-build. It may be alarming for an old-school sysadmin to see a dev directory here. It would also be really great if you could forward-port the rpm/deb packaging targets to Maven profiles and invoke the maven-ant-plugin. Thanks
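
          As a rough sketch of the kind of descriptor reuse being suggested, assuming the shared assembly descriptors live in a hadoop-assemblies artifact added to the assembly plugin as a plugin dependency (the descriptor name and coordinates are placeholders):

          <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <dependencies>
              <dependency>
                <!-- module holding the shared assembly descriptors -->
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-assemblies</artifactId>
                <version>${project.version}</version>
              </dependency>
            </dependencies>
            <configuration>
              <descriptorRefs>
                <!-- shared descriptor id; could be derived from ${project.artifactId} -->
                <descriptorRef>hadoop-tar</descriptorRef>
              </descriptorRefs>
            </configuration>
          </plugin>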

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12487223/mvn-layout-AB.sh
          against trunk revision 1148933.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/755//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          Fileset 'AB': minor fixes in test-patch.sh, and renaming of the devbin/ and hadoop-common/maven/ dirs to 'dev/' (for development), as they contain script files useful for development.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12487088/mvn-layout-AA.sh
          against trunk revision 1147971.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/751//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          Script and patch 'AA'

          Integrating all Eric's comments. The layout is now:

          -- pom.xml (hadoop reactor pom)
           |
           |-devbin/ (test-patch scripts)
           |
           |-hadoop-project/pom.xml (project defaults/dependencies)
           |
           |-hadoop-annotations/pom.xml (doclet and javadoc annotations)
           |
           |-hadoop-assemblies/pom.xml (reusable assemblies)
           |
           |-hadoop-common/pom.xml (common)
          

          I've named the modules 'hadoop-...' because both IntelliJ and Eclipse use the artifactId of the POM to display the project structure, and it will be less confusing to developers if what the IDE shows and the directory structure are the same.

          The common build commands are:

          * Clean                     : mvn clean
          * Compile                   : mvn compile [-Pnative]
          * Run tests                 : mvn test [-Pnative]
          * Create JAR                : mvn package
          * Run findbugs              : mvn compile findbugs:findbugs
          * Run checkstyle            : mvn compile checkstyle:checkstyle
          * Install JAR in M2 cache   : mvn install
          * Deploy JAR to Maven repo  : mvn deploy
          * Run clover                : mvn test -Pclover [-DcloverLicenseLocation=${user.name}/.clover.license]
          * Run Rat                   : mvn apache-rat:check
          * Build javadocs            : mvn javadocs:javadocs
          * Build TAR                 : mvn verify -Ptar[,docs][,src][,native]
          

          The 'verify' phase is used to create the TAR. This is done this way because we need to run some scripts before and after the assembly:single goal (I've spent all morning trying to define a custom lifecycle with a 'distro' phase but I couldn't get it to work; this could be a later improvement).

          Full details on the Maven build commands can be found in the hadoop-common/BUILDING.txt

          Tom, now I would need your help to update the test-patch script and the jenkins job.

          Eric Yang added a comment -

          Why do common-distro, hdfs-distro, and mapreduce-distro have to be separated? This can be defined as part of the super POM or parent POM to inherit the same build structure across modules. For long-term maintainability, it would be more ideal to define the profile once and inherit it from the parent module. If hdfs-distro and mapreduce-distro are laid out as separate submodules, the code sharing does not take place.

          mvn clean package -Pjavadoc,docs,tar,rpm 
          

          This would be more human-friendly than having to remember plugin-specific commands.

          Assembly is not a phase, it's a goal; there is no predefined phase called assembly. For the distro module, it makes sense to wire the package phase to the assembly goal. It is easy to bind the antrun plugin to the package phase in a profile, and to define multiple profiles for Debian and RPM packages.

              <profile>
                <id>deb</id>
                <build>
                  <plugins>
                    <plugin>
                      <artifactId>maven-antrun-plugin</artifactId>
                      <version>1.6</version>
                      <executions>
                        <execution>
                          <id>build-deb</id>
                          <phase>package</phase>
                          <configuration>
                            <target>
                              <property name="artifactId" value="${project.artifactId}" />
                              <ant antfile="${basedir}/src/packages/build.xml">
                                <target name="package-deb"/>
                                <target name="package-conf-pseudo-deb"/>
                              </ant>
                            </target>
                          </configuration>
                          <goals>
                            <goal>run</goal>
                          </goals>
                        </execution>
                      </executions>
                    </plugin>
                  </plugins>
                </build>
              </profile>
          

          Although the actual heavy lifting is done in Ant to work around bugs in the Maven jdeb and rpm plugins, it is nicer to have a globally structured pattern for locating src/packages/build.xml in each submodule.

          As long as the same profile code doesn't exist in common-distro, hdfs-distro, and mapreduce-distro, then I am fine with the proposal.

          Alejandro Abdelnur added a comment -

          Do you mean having one assembly descriptor for TAR, one for DEB and one for RPM; and then reusing them for common, hdfs and mapreduce? If so, this is possible (http://maven.apache.org/plugins/maven-assembly-plugin/examples/sharing-descriptors.html).
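
          A rough sketch of that sharing pattern (the hadoop-assemblies artifactId and the 'hadoop-tar' descriptor ref are illustrative placeholders, not final names): the descriptors are packaged in a small JAR, and each distro module adds it as a plugin dependency and references a descriptor by ref:

              <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <dependencies>
                  <dependency>
                    <groupId>org.apache.hadoop</groupId>
                    <artifactId>hadoop-assemblies</artifactId>
                    <version>${project.version}</version>
                  </dependency>
                </dependencies>
                <executions>
                  <execution>
                    <id>dist</id>
                    <phase>package</phase>
                    <goals>
                      <goal>single</goal>
                    </goals>
                    <configuration>
                      <descriptorRefs>
                        <descriptorRef>hadoop-tar</descriptorRef>
                      </descriptorRefs>
                    </configuration>
                  </execution>
                </executions>
              </plugin>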

          And you'd still have a common-distro, an hdfs-distro and a mapreduce-distro module.

          I'm OK with using profiles to decide which distro you want to build (TAR/RPM/DEB).

          However, I'm not OK with wiring the assembly to the package phase. I did exactly this in the first version of the Mavenization I posted, and (as mentioned in my previous post) I got into messy situations where lots of plugins were bound to non-standard phases in a particular order to ensure things fell into place.

          I'd rather use standard phase bindings for building a distro. For example:

          To build the TAR distro without documentation:

          $ mvn clean package assembly:single
          

          To build the TAR distro with documentation:

          $ mvn clean package site assembly:single
          

          To build the RPM use the -Prpm profile, and to build the Debian package use the -Pdebian profile. For example:

          $ mvn clean package site assembly:single -Prpm 
          

          And still, the open issue is how to get a custom lifecycle with phases like 'pre-assembly', 'assembly' & 'post-assembly', so that the antrun plugin can run before and after the assembly to take care of symlinks and to call the RPM/DEB generation tools.

          Eric Yang added a comment -

          The proposal sounds good, but I am still concerned about whether hdfs/mapreduce can use hadoop-common-distro's profile to build packages. An XML editor can help in visualizing the large structure. Having centralized definitions makes it easier to replicate properly downstream.

          The HBase Maven build produces a tarball by default; it is not a regression introduced recently. Using a profile can completely skip the package building, rather than initializing the module and then doing a no-op. The latter approach still costs a few seconds more.

          How about shortening hadoop-common-distro to hadoop-distro, and setting it as the last module to build? For developers, it would be ideal to compile and create jar files in the package phase, but not to build distro packages. This would ensure common remains a standalone module.

          The layout and building sequence would then be:

          trunk/pom.xml
          |-- hadoop-project (dependency management)
          |-- common
          |-- hdfs
          |-- mapreduce
          |-- hadoop-docs (javadoc, doclet)
          |-- hadoop-distro (packaging)
          

          By using this layout, common packaging steps can be consolidated, and without -Ptar the hadoop-distro module is a no-op. The same applies to hadoop-docs.

          Alejandro Abdelnur added a comment -

          NOTE: the times of the tar generation using Maven are high because of a nit in the Maven assembly plugin; I'm working on a tweak.

          Alejandro Abdelnur added a comment -

          @Eric,

          First of all, thanks for volunteering to tackle the Mavenization RPM/DEB.

          My initial approach to the patch was heavily based on profiles doing what you are suggesting. The end result was a very large POM for 'common' with profiles heavily relying on the order of the plugins to do the right thing (I had to define all the plugins, even if not used, in the main <build> to ensure the right order of execution when the profiles are active). The result was a POM that was difficult to follow and to update (I got bitten a few times while improving it).

          My second approach, the current one, is much cleaner in that regard. It fully leverages the Maven reactor, and build times are not affected. The following table shows the time taken by common build tasks:

          Build task           | Ant command                                  | Maven command                                            | Ant time | Maven time
          clean                | ant clean                                    | mvn clean                                                | 00:02    | 00:01 *
          clean compile        | ant clean compile                            | mvn clean compile                                        | 00:20    | 00:13 *
          clean test-compile   | ant clean test-compile                       | mvn clean test -DskipTests                               | 00:23    | 00:17 *
          clean 1 test         | ant clean test -Dtestcase=TestConfiguration  | mvn clean test -Dtest=TestConfiguration                  | 01:09    | 00:27 *
          <warm> 1 test        | ant test -Dtestcase=TestConfiguration        | mvn test -Dtest=TestConfiguration                        | 00:52    | 00:11 *
          clean jar test-jar   | ant clean jar jar-test                       | mvn clean package                                        | 00:28    | 00:23 *
          clean binary-tar     | ant clean binary                             | mvn clean package post-site -DskipTests                  | 00:59    | 00:46 *
          clean tar w/docs     | ant clean tar                                | mvn clean package post-site -DskipTests -Pdocs           | N/A      | 04:10
          clean tar w/docs/src | ant clean tar                                | mvn clean package post-site -DskipTests -Pdocs -Psource  | 01:34 *  | 05:18

          Of all these, IMO the most interesting improvement is running a single test (from scratch and with pre-compiled classes). This will be a huge improvement for development.

          That said, we could merge hadoop-docs into hadoop-common, using the 'site' phase to wire all documentation generation (I don't think this would complicate things too much).

          However, for TAR/RPM/DEB I would like to keep a separate module that kicks off the assembly plugin to generate the TAR/RPM/DEB. And there we could have profiles that build a TAR, an RPM and/or a DEB.

          Another benefit of this is that all scripts and related files would end up in the TAR/RPM/DEB module; the hadoop-common module only produces a JAR file.

          The layout would then be:

          trunk/pom.xml
          |
          |-- hadoop-annotations/pom.xml (javadoc annotations and doclet)
          |
          |-- hadoop-project/pom.xml (dependency management, extended by all other modules))
          |
          |-- common/pom.xml
          |      |
          |      |-- hadoop-common/pom.xml [clean, compile,package,install,deploy,site] (-Pnative)
          |      |
          |      |-- hadoop-common-distro/pom.xml [clean, assembly:single] (-Ptar -Prpm -Pdeb)
          |
          |-- hdfs
          |
          |-- mapreduce
          

          The [...] are the meaningful lifecycle phases.

          The (-P...) are the profiles each module would support.

          The only thing we have to sort out is how to wire the maven-antrun-plugin to run after the 'assembly:single' invocation. This is required to be able to invoke Unix tar to create the TAR in order to preserve the symlinks.
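
          One possible shape for that wiring, purely as a sketch (the phase, file names and the assembly step it follows are assumptions, not what the patch does today), is an antrun execution bound to a phase after the assembly that shells out to the system tar so symlinks survive:

              <plugin>
                <artifactId>maven-antrun-plugin</artifactId>
                <executions>
                  <execution>
                    <id>tar-with-symlinks</id>
                    <phase>verify</phase>
                    <goals>
                      <goal>run</goal>
                    </goals>
                    <configuration>
                      <target>
                        <!-- re-tar the exploded assembly directory with the system tar -->
                        <exec executable="tar" dir="${project.build.directory}" failonerror="true">
                          <arg value="czf"/>
                          <arg value="${project.artifactId}-${project.version}.tar.gz"/>
                          <arg value="${project.artifactId}-${project.version}"/>
                        </exec>
                      </target>
                    </configuration>
                  </execution>
                </executions>
              </plugin>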

          Would you be OK with this approach?

          Thoughts?

          PS: I'm somewhat familiar with HBase packaging, and the current overloading of Maven phases and profile usages makes things too slow (until not long ago, and I'm not sure if it's still the case, running 'mvn install' was generating the TAR).

          Eric Yang added a comment -

          The directory structure is organized by artifact file type. This will introduce more latency for building optional parts in the commit build. It would be better to use a profile to control the artifact output type. For example, docs, tar, rpm, and deb are just different profiles. This would shrink the directory structure to look like this:

          trunk/pom.xml
          |
          |-- common/pom.xml
          |         /src/packages/deb
          |         /src/packages/rpm
          |         /src/packages/tar
          |         /src/docs/site
          |         /src/docs/javadoc
          |
          |-- hdfs/dfsclient
          |       /namenode
          |       /datanode
          |
          |-- mapreduce
          |
          |-- hadoop
          

          Using profiles would ensure that package generation code and javadoc generation code are defined in one central place, reducing repetitive building/packaging code. HBase has profiles to control rpm/deb/tarball generation, and it may be useful as a reference. Here 'hadoop' is the aggregating submodule that copies artifacts from common, hdfs, and mapreduce to create a release. I will port the Ant rpm/deb work to Maven after your patch is committed.
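
          As a small sketch of that centralization (the plugin versions and the hadoop-project module name are illustrative assumptions), the shared parent POM would carry the plugin versions in pluginManagement so each submodule's profile only has to reference the plugin:

              <!-- hypothetical fragment of the shared parent POM (e.g. hadoop-project/pom.xml) -->
              <build>
                <pluginManagement>
                  <plugins>
                    <plugin>
                      <groupId>org.apache.maven.plugins</groupId>
                      <artifactId>maven-antrun-plugin</artifactId>
                      <version>1.6</version>
                    </plugin>
                    <plugin>
                      <groupId>org.apache.maven.plugins</groupId>
                      <artifactId>maven-assembly-plugin</artifactId>
                      <version>2.2.1</version>
                    </plugin>
                  </plugins>
                </pluginManagement>
              </build>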

          Alejandro Abdelnur added a comment -

          @Tom,

          Thanks.

          • On naming. I'd prefer keeping the module names and the artifact names the same because IDEs show the artifact name. When jumping from the IDE to the command line and vice versa it will be easier to find your way around.
          • On the cross-project build for HDFS and MapReduce. Is it possible to commit that patch independently of this one, or would that break the Ant builds? If the former, we should open HDFS and MR JIRAs and do it there. If the latter, we should add your patch to the main HADOOP-6671.
          • On running Hadoop from the tarball. Last time I tried to do that from committed trunk I failed miserably; all the startup scripts were pointing to funny places.
          • On updating HowToContribute, agreed. I've been keeping tabs on the BUILDING.txt file; once this patch is committed we can use it to update the wiki.

          How about the following names for the Module/Directory structure?

          trunk/pom.xml
          |
          |-- hadoop-annotations/pom.xml (renaming doclets)
          |
          |-- hadoop-project/pom.xml
          |
          |-- common/pom.xml
          |      |
          |      |-- hadoop-common/pom.xml
          |      |
          |      |-- hadoop-common-docs/pom.xml
          |      |
          |      |-- hadoop-common-tar/pom.xml
          |
          |-- hdfs
          |
          |-- mapreduce
          

          We'd have also a hadoop-common-rpm and hadoop-common-deb.

          Then hdfs/mapreduce would have a similar structure.

          And there would be a hadoop-contribs with a submodule for contrib project (which could have sub-sub-modules).

          Tom White added a comment -
          • On naming, I think the "hadoop-" prefix in module names is unnecessary. I would prefer common/common-main, etc.
          • I updated the pre-commit job for Maven to test that "bad" patch in HADOOP-7413. See the end of https://builds.apache.org/view/G-L/view/Hadoop/job/PreCommit-HADOOP-Build-maven/11/console. The only missing -1 is for the javadoc, which failed due to missing artifacts and can be fixed by adding a line calling "mvn install -DskipTests" in the section labelled "Pre-build trunk to verify trunk stability and javac warnings".
          • We need to make sure that cross project builds still work. When I tried doing "mvn install", then building HDFS I got an error which can be fixed with HADOOP-6671-cross-project-HDFS.patch, which I posted a while back. We should commit this, and do the same for MapReduce.
          • Have you checked that you can run Hadoop from a tarball built using Maven? (BTW https://builds.apache.org/job/Hadoop-Common-trunk-maven/ is building nightly tarballs using Maven.)
          • What needs doing to make the switchover smooth for developers? We should update http://wiki.apache.org/hadoop/HowToContribute (perhaps make a copy so it's available before the switch). Anything else?
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12486652/HADOOP-6671-q.patch
          against trunk revision 1146912.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 67 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/733//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          Attached patch is rebased to current trunk (SVN Revision: 1147263).

          It also includes fixes that get the test-patch script and Jenkins runs working (thanks Tom).

          Alejandro Abdelnur added a comment -

          @Tom - jenkins, thanks for testing the patch with Jenkins

          @Tom/@Giri - module names, what about the following?

          trunk/pom.xml
          |
          |-- hadoop-doclet/pom.xml
          |
          |-- hadoop-project/pom.xml
          |
          |-- common/pom.xml
          |      |
          |      |-- hadoop-common/pom.xml
          |      |
          |      |-- hadoop-common-docs/pom.xml
          |      |
          |      |-- hadoop-common-tar/pom.xml
          |
          |-- hdfs
          |
          |-- mapreduce
          

          @Tom - using Apache parent POM, I have not.

          Looking at it it seems like a good idea.

          Would you agree to do that as an improvement once we have all Hadoop mavenized?

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12486471/HADOOP-6671-p.patch
          against trunk revision 1146300.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 64 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/731//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          excluding the jdk.tools JAR from the generated TAR

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12486446/HADOOP-6671-o.patch
          against trunk revision 1146300.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 64 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/729//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          Fixes TAR name and tar path (it was using absolute paths inside)

          Tom White added a comment -

          I updated https://builds.apache.org/job/Hadoop-Common-trunk-maven/ to use patch 'n'.

          A couple of quick comments:

          • I agree with Giri that common/common-main (or even common/main) is a better convention than common-main/common.
          • Have you considered using the Apache parent POM? http://svn.apache.org/repos/asf/maven/pom/tags/maven-parent-9/pom.xml
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12486390/HADOOP-6671-n.patch
          against trunk revision 1146300.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 64 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/728//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          'n' version of script & patch, rebasing to current trunk as the removal of the Chinese docs was breaking the patch.

          Alejandro Abdelnur added a comment -

          Thanks for trying it out Giri.

          #1 - I'm thinking that, even if it is more verbose, the module dir names should match the module artifact names. That seems more intuitive to me; it would then be as in the second diagram in my previous comment.

          #2 - My idea is to use trunk/hadoop-root/pom.xml. I've run into problems in the past with some plugins when using as parent the same POM used by the reactor. Some plugins break unless the root-reactor pom is installed in .m2 cache.

          #3 - the problem here is that the reactor is not including the doclet module and you don't have the doclet artifact in your .m2 cache. Run 'mvn install' from trunk once; then you can work in a sub-module and the non-reactor modules will be picked up from the .m2 cache.

          Common depends on the doclet with 'provided' scope for javadoc/jdiff runs. Doing what is mentioned in #3 will take care of this.
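
          (For reference, a minimal sketch of that dependency as it would appear in hadoop-common's POM; the coordinates are the ones shown in the missing-artifact error Giri reported:)

              <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-doclet</artifactId>
                <version>0.23.0-SNAPSHOT</version>
                <scope>provided</scope>
              </dependency>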

          Note that once this is committed, nightly builds would push all module artifacts to the Maven snapshots repo and you'll be able to work from any module consuming the latest nightly artifacts (that are not resolved by the reactor).

          Giridharan Kesavan added a comment -

          thanks for the clarification Alejandro.
          You are right; common is just a folder. I missed deleting it.

          1) Most of the projects that I've seen use this convention:

          common/pom.xml
          ---common-main/pom.xml
          

          2) Can trunk/pom.xml be used for the module dependencies between common, hdfs & mapreduce?
          (Just wondering about the reason for root/pom.xml.)

          3) cd trunk/common-main/common
          mvn install - I get the following error

             [WARNING] The POM for org.apache.hadoop:hadoop-doclet:jar:0.23.0-SNAPSHOT is missing, no dependency information available
          [INFO] ------------------------------------------------------------------------
          [INFO] Reactor Summary:
          [INFO] 
          [INFO] Apache Hadoop Common .............................. FAILURE [0.446s]
          [INFO] Apache Hadoop Common Docs ......................... SKIPPED
          [INFO] Apache Hadoop Common Distro ....................... SKIPPED
          [INFO] Apache Hadoop Common Main ......................... SKIPPED
          [INFO] ------------------------------------------------------------------------
          [INFO] BUILD FAILURE
          [INFO] ------------------------------------------------------------------------
          [INFO] Total time: 0.683s
          [INFO] Finished at: Tue Jul 12 17:19:57 PDT 2011
          [INFO] Final Memory: 4M/81M
          [INFO] ------------------------------------------------------------------------
          [ERROR] Failed to execute goal on project hadoop-common: Could not resolve dependencies for project org.apache.hadoop:hadoop-common:jar:0.23.0-SNAPSHOT: Could not find artifact org.apache.hadoop:hadoop-doclet:jar:0.23.0-SNAPSHOT -> [Help 1]
          

          Not sure why common depends on the doclet;
          common depends on the doclet, which is not yet built/installed, and hence it fails (the scope is set to provided).
          I'm still testing the patch...

          Alejandro Abdelnur added a comment -

          Giri,

          Thanks for playing with the patch.

          Actually the layout after applying the patch (Maven module speaking) is:

          trunk/pom.xml
          |
          |-- doclet/pom.xml
          |
          |-- root/pom.xml
          |
          |-- common-main/pom.xml
          |      |
          |      |-- common/pom.xml
          |      |
          |      |-- docs/pom.xml
          |      |
          |      |-- distro/pom.xml
          |
          |-- hdfs
          |
          |-- mapreduce
          

          I assume you are testing it in an SVN checkout; the old common dir (empty) won't go away till you commit.

          The reason for calling the root Common 'common-main' and the child one 'common' is to be as close as possible to the artifact names (generated JARs).

          We could also go the way of using the full artifact names, then it could be:

          trunk/pom.xml
          |
          |-- hadoop-doclet/pom.xml
          |
          |-- hadoop-root/pom.xml
          |
          |-- common/pom.xml
          |      |
          |      |-- hadoop-common/pom.xml
          |      |
          |      |-- hadoop-common-docs/pom.xml
          |      |
          |      |-- hadoop-common-distro/pom.xml
          |
          |-- hdfs
          |
          |-- mapreduce
          

          This seemed too verbose to me; that is why I didn't go that route initially.

          Thoughts?

          Giridharan Kesavan added a comment -

          Applied patch version 'm' and here is how the layout looks:

          trunk/common
                common-main
                docklet
          

          Instead, the following layout would be a lot easier to understand and maintain:

          trunk/pom.xml
          |
          |---common/pom.xml
          |     |
          |     |---common-main/pom.xml
          |---hdfs/pom.xml
          |     |
          |     |---hdfs-main/pom.xml
          |     |---contrib-1/pom.xml
          |     |---contrib-2/pom.xml
          |---mapreduce/pom.xml
                |
                |---mapreduce-main/pom.xml
                |---contrib-1/pom.xml
                |---contrib-2/pom.xml 
          
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12486080/HADOOP-6671-m.patch
          against trunk revision 1144858.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 64 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/714//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          'm' patch, Apache Rat properly configured.

          test-patch.sh wired to use Maven, moved to trunk/bin

          Alejandro Abdelnur added a comment -

          As before, test-patch won't pass because a script to move files around has to be run before applying the patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12485798/HADOOP-6671-l.patch
          against trunk revision 1144043.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 62 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/712//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          script/patch l (L), rebased to SVN revision 1144555.

          Patch instructions remain the same, first run script on trunk then apply patch.

          The TAR by default does not include DOCS/SOURCE (it cuts 3 mins off the build, handy for development).

          The -Pdocs & -Psource profile options activate the generation and inclusion of docs and source.

          Full build instructions are in the common-main/BUILDING.txt file.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12485641/HADOOP-6671-k.sh
          against trunk revision 1143681.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 62 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/709//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          The attached patch works on top of revision 1143624.

          This is a complete rewrite of the patch doing things the Maven way. It is much
          simpler than my previous patch. There is a single user-activated profile used
          to do native compilation. The Maven Ant plugin is used only for things that are
          not possible to do with existing Maven plugins (generating Forrest documentation,
          generating package-info.java, and copying the SO files preserving symlinks).

          What is left:

          • rewire the patch & Jenkins scripts
          • tests that use AOP
          • tests that do fault injection
          • generating RPM/DEB packages

          For the tests I will need some help as I'm completely ignorant about them.

          I'll work on the patch & Jenkins scripts (Tom, help!!).

          At this point I would like committers to review this and, if OK, commit it, integrating
          the AOP and fault-injection tests, and RPM/DEB, as follow-up JIRAs.

          Patching instructions:

          (I've tested it both out of a GIT checkout and a SVN checkout)

          • Run the 'mvn-layout-k.sh' script. If using a SVN checkout use the 'svn' parameter
            (it will do SVN file operations). If using a GIT checkout use 'fs' parameter
            (it will do FS file operations).
          • Apply the 'HADOOP-6671-k.patch' patch.

          Notes:

          The trunk has a pom.xml file; at the moment it is only wired to build Hadoop common.
          Later hdfs, mapred and contribs will be wired in here.

          The trunk/root module contains the foundation pom.xml inherited by all Hadoop
          modules defining versions of plugins and dependencies as well as common properties.

          The trunk/doclet module builds the Hadoop doclet used when generating Javadocs
          and Jdiff reports. hdfs and mapred will also use this doclet.

          The trunk/common-main module is Hadoop common proper. It contains 3 sub-modules:
          trunk/common-main/common, trunk/common-main/docs, trunk/common-main/distro.

          The trunk/common-main/common module generates Hadoop Common JAR.

          The trunk/common-main/docs module generates Hadoop Common Forrest documentation.

          The trunk/common-main/distro module generates Hadoop Common tarball.

          The reason for renaming trunk/common to trunk/common-main was that having two
          nested modules called common (trunk/common/common) would be confusing. And I
          didn't want to change the name of the latter to something else as it is the
          one generating the hadoop-common JAR.

          For the RPM/DEB packages the idea is to use alternate assembly descriptors in the
          trunk/common-main/distro module that will generate the layout required by the
          packaging tools.

          There are Maven plugins that invoke RPM/DEB tools. We'll have to decide if we
          use those plugins or we just jump out of Maven to the RPM/DEB generation. I'll
          work with Giri to decide this as he did the original RPM/DEB generation.

          ==BUILDING.txt==

          ----------------------------------------------------------------------------------
          Requirements:
          
          * Unix System
          * JDK 1.6
          * Maven 3.0
          * Forrest 0.8 (if generating docs)
          * Findbugs 1.3.9 (if running findbugs)
          * Autotools (if compiling native code)
          * Internet connection for first build (to fetch all Maven and Hadoop dependencies)
          
          ----------------------------------------------------------------------------------
          Maven modules:
          
            hadoop (Main Hadoop project)
                   - root (bootstrap module with the parent pom inherited by all modules)
                          (all plugins & dependencies versions are defined here          )
                   - doclet (generates the Hadoop doclet used to generate the Javadocs)
                   - common-main (Hadoop common Main)
                                 - common (Java & Native code)
                                 - docs   (documentation)
                                 - distro (creates TAR)
          
          ----------------------------------------------------------------------------------
          Where to run Maven from?
          
            It can be run from any module. The only catch is that if not run from trunk
            all modules that are not part of the build run must be installed in the local
            Maven cache or available in a Maven repository.
          
          ----------------------------------------------------------------------------------
          Maven build goals:
          
           * Clean                     : mvn clean
           * Compile                   : mvn compile
           * Run tests                 : mvn test
           * Create JAR                : mvn package
           * Run findbugs              : mvn compile findbugs:findbugs
           * Run checkstyle            : mvn checkstyle:checkstyle
           * Install JAR in M2 cache   : mvn install
           * Deploy JAR to Maven repo  : mvn deploy
           * Run clover                : mvn clover:clover
           * Run Rat                   : mvn apache-rat:check
           * Build documentation       : mvn package site
           * Build TAR                 : mvn package post-site (*)
          
           Build options:
          
            * Use -Pnative  to compile/bundle native code
            * Use -DskipTests to skip tests when running the following Maven goals:
              'package',  'install'  or 'deploy'
            * Use -Dsnappy.prefix=(/usr/local) & -Dbundle.snappy=(false) to compile
              Snappy JNI bindings and to bundle Snappy SO files
          
           Tests options:
          
            * -Dtest=<TESTCLASSNAME>,....
            * -Dtest.exclude=<TESTCLASSNAME>
            * -Dtest.exclude.pattern=**/<TESTCLASSNAME1>.java,**/<TESTCLASSNAME2>.java
          
          
          [* piggybacking on the post-site lifecycle phase to wire the Ant & Assembly      ]
          [  plugins, in that order, to generate a TAR with symlinks. The issue is that the ]
          [  Assembly plugin on its own is not wired to a lifecycle phase where we could    ]
          [  hook the Ant plugin to copy the SO files preserving symlinks (the Assembly     ]
          [  plugin does not handle symlinks).                                              ]
          ----------------------------------------------------------------------------------
          
          Konstantin Boudnik added a comment -

           Can we include the Sonar stuff here as well? INFRA-3645 has an almost-working patch for HDFS which I am sure can be adapted to Common with ease. If there's no objection I can update the patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12483320/HADOOP-6671-j.patch
          against trunk revision 1137724.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 57 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/662//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

           Rebasing to trunk's HEAD (needed because of the new Snappy dependency).

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12482874/HADOOP-6671-i.patch
          against trunk revision 1136249.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 57 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/645//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          rebased to trunk's head.

          script & patch version 'i'

          test-patch.sh wired

          findbugs/clover/rat/checkstyle work. Refer to BUILDING.txt file for details.

          Tom White added a comment -

          I updated https://builds.apache.org/job/Hadoop-Common-trunk-maven/ to generate Clover reports.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12482623/HADOOP-6671-h.patch
          against trunk revision 1135820.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 38 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/633//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          Wiring Clover.

           The following property in the pom.xml may have to be set to the correct value on the Jenkins servers:
           <cloverLicenseLocation>${user.home}/.clover.license</cloverLicenseLocation>
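           For illustration, that property would typically sit in the pom's <properties> section and be consumed by the Clover plugin; a rough sketch (plugin version omitted, and the exact wiring in the patch is an assumption):

           <properties>
             <!-- Jenkins servers can override this to point at their Clover license -->
             <cloverLicenseLocation>${user.home}/.clover.license</cloverLicenseLocation>
           </properties>
           ...
           <plugin>
             <groupId>com.atlassian.maven.plugins</groupId>
             <artifactId>maven-clover2-plugin</artifactId>
             <configuration>
               <licenseLocation>${cloverLicenseLocation}</licenseLocation>
             </configuration>
           </plugin>

           It could also be overridden on the command line with -DcloverLicenseLocation=/path/to/clover.license.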

          Build commands avail:

          • Clean : mvn clean
          • Compile : mvn compile
          • Run tests : mvn test
          • Create JAR : mvn package
          • Run findbugs : mvn package -DrunFindbugs
          • Run checkstyle : mvn checkstyle:checkstyle
          • TAR wo/docs & wo/source : mvn package -DmakeTar
          • TAR w/docs & w/source : mvn package -DmakeTar -DgenerateDocs -DincludeSource
          • TAR w/source only : mvn assembly:single -Dsource
          • Install in local M2 cache : mvn install
          • Deploy to Maven repo : mvn deploy
          • Run clover : mvn test -DrunClover
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12482491/HADOOP-6671-g.patch
          against trunk revision 1135333.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 35 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/621//console

          This message is automatically generated.

          Aaron T. Myers added a comment -

          IMO it should be a separate JIRA

          Filed and posted a patch at: https://issues.apache.org/jira/browse/HADOOP-7389

          After some more thought, I ended up going a slightly different route than "create a cleanup method." Instead, a failure to find any groups for a user created by a call to UserGroupInformation.createUserForTesting(...) will result in a follow-up call to the original Groups implementation.

          Alejandro Abdelnur added a comment -

          @ATM, thanks.

          IMO it should be a separate JIRA

          Aaron T. Myers added a comment -

          I've figured out the issue. UserGroupInformation.createUserForTesting(...) replaces UserGroupInformation's static reference to groups with an instance of TestingGroups. Once this occurs in a JVM, no subsequent calls to UserGroupInformation.getGroupNames(...) will work for any real users on the system. The test case in question compares the groups of the real user running the tests with the groups of that same user as determined by the UserGroupInformation class.

          The explanation for why this fails under maven when run as part of the suite, but not under maven when run in isolation or under ant is that those must run the test cases in different orders.

          The solution is that tests which call createUserForTesting should reset the static reference when they complete. I can work on a patch for this. Tom/Alejandro - do you think this should be done as a separate JIRA? Or part of this one?

          Alejandro Abdelnur added a comment -

           @Tom: that explains it, thanks for the detective work. In my answer to Todd I meant test case; I didn't connect "per test" with "per test method".

           Then it makes sense to open a different JIRA to fix this test case, right? (I don't think it would be easy to get one JVM per test method, not to mention it would be far too slow.)

          Tom White added a comment -

          Per-test in Ant (and Maven) creates a new JVM per TestCase class (not per method). When I move testGetServerSideGroups() to be the last method in TestUserGroupInformation it consistently fails in Ant, which suggests that it relies on static state (as Todd suggested).
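           For reference, the per-test-class forking described above corresponds to surefire's forkMode setting; a minimal sketch (the value actually used by the patch is an assumption):

           <plugin>
             <groupId>org.apache.maven.plugins</groupId>
             <artifactId>maven-surefire-plugin</artifactId>
             <configuration>
               <!-- fork a fresh JVM for each test class so static state cannot leak across classes -->
               <forkMode>always</forkMode>
             </configuration>
           </plugin>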

          Alejandro Abdelnur added a comment -

           Handles TAR creation in the same way other Ant-delegated tasks are handled.

          Alejandro Abdelnur added a comment -

           @ATM: only when running the full test suite; running just that test, it passes
           @Todd: it is doing a per-test fork (that is the odd thing; at first I thought the same)

          Todd Lipcon added a comment -

          Is maven executing tests in a "per-test fork" mode? Or is it trying to execute them all in one JVM instantiation? UGI relies on some static state (the "login user") which might be busted if it runs in the same JVM as other test cases.

          Aaron T. Myers added a comment -

          @Alejandro: does this test fail for you when running the full test suite, but not when run in isolation? Or does this test not fail for you at all?

          Alejandro Abdelnur added a comment -

           Some additional info on the TestUserGroupInformation failure: if running the test alone (-Dtest=TestUserGroupInformation), the test passes (both on Mac and Linux).

          Alejandro Abdelnur added a comment -

           Patch rebased on top of the current (unsplit) trunk.

          new stuff:

          • findbugs wired
          • javadocs use the Hadoop doclet

          Still missing:

          • DEB/RPM generation
          Eric Charles added a comment -

          Hi Tom,
           Yes, it sounds like an environment issue, not a code issue.
           I'm used to Maven, but completely new to Hadoop, especially regarding the way Hadoop relies on operating system functions (I see many Runtime calls in the code).

           First of all, I will try to get the common project tests working in Eclipse. If I run "ant test", all tests are successful. But if I run them in Eclipse, about half are failing. I will post to the mailing list about this. I think the solution for this will also help fix the TestUserGroupInformation test (which works in Eclipse, but not on the command line).

          Tom White added a comment -

          Eric,

          I noticed the same test failures when running on Jenkins (https://builds.apache.org/job/Hadoop-Common-trunk-maven/8/), but I can't reproduce locally (on Mac), or on Jenkins with Ant (see https://builds.apache.org/job/Hadoop-Common-trunk/lastCompletedBuild/testReport/org.apache.hadoop.security/TestUserGroupInformation/testGetServerSideGroups/). Seems like some kind of environment issue.

          Eric Charles added a comment -

          Hi,
           The Jenkins hadoop-common-trunk job is blue (builds fine).
           I've downloaded https://builds.apache.org/job/Hadoop-Common-trunk-maven/13/artifact/trunk/target/hadoop-common-0.23.0-SNAPSHOT.tar.gz
           and imported it in Eclipse via m2eclipse. I still need to add 3 folders to my build path to have an operational project in Eclipse: target/generated-sources/avro-protocol, avro-schema and record-cc (maybe this can be solved with the build-helper-maven-plugin: http://www.sonatype.com/people/2008/05/adding-additional-source-folders-to-your-maven-build/).
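           A possible way to avoid adding those folders by hand, following the article linked above, is a build-helper-maven-plugin execution along these lines (a sketch only; the generated-sources paths are copied from above, everything else is an assumption):

           <plugin>
             <groupId>org.codehaus.mojo</groupId>
             <artifactId>build-helper-maven-plugin</artifactId>
             <executions>
               <execution>
                 <id>add-generated-sources</id>
                 <phase>generate-sources</phase>
                 <goals>
                   <goal>add-source</goal>
                 </goals>
                 <configuration>
                   <sources>
                     <!-- register the Avro/record generated code as extra source roots -->
                     <source>${project.build.directory}/generated-sources/avro-protocol</source>
                     <source>${project.build.directory}/generated-sources/avro-schema</source>
                     <source>${project.build.directory}/generated-sources/record-cc</source>
                   </sources>
                 </configuration>
               </execution>
             </executions>
           </plugin>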

           When I run "mvn test" in a shell, I get two failures in TestUserGroupInformation [1]; however, TestUserGroupInformation run from Eclipse works fine.
           It seems UserGroupInformation behaves differently in a shell than in Eclipse. Same issue if I run "mvn test -P os.mac" (I run on a Mac).

          [1]
          -------------------------------------------------------------------------------
          Test set: org.apache.hadoop.security.TestUserGroupInformation
          -------------------------------------------------------------------------------
          Tests run: 14, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 0.389 sec <<< FAILURE!
          testGetServerSideGroups(org.apache.hadoop.security.TestUserGroupInformation) Time elapsed: 0.029 sec <<< FAILURE!
          java.lang.AssertionError: expected:<13> but was:<0>
          at org.junit.Assert.fail(Assert.java:91)
          at org.junit.Assert.failNotEquals(Assert.java:645)
          at org.junit.Assert.assertEquals(Assert.java:126)
          at org.junit.Assert.assertEquals(Assert.java:470)
          at org.junit.Assert.assertEquals(Assert.java:454)
          at org.apache.hadoop.security.TestUserGroupInformation.testGetServerSideGroups(TestUserGroupInformation.java:97)
          ...
          testLogin(org.apache.hadoop.security.TestUserGroupInformation) Time elapsed: 0 sec <<< FAILURE!
          java.lang.AssertionError:
          at org.junit.Assert.fail(Assert.java:91)
          at org.junit.Assert.assertTrue(Assert.java:43)
          at org.junit.Assert.assertTrue(Assert.java:54)
          at org.apache.hadoop.security.TestUserGroupInformation.testLogin(TestUserGroupInformation.java:122)
          ...

          Tom White added a comment -

          A couple of other things I noticed:

          We need to use a custom doclet for Javadoc (to exclude private classes). Something like

          <doclet>org.apache.hadoop.classification.tools.ExcludePrivateAnnotationsStandardDoclet</doclet>
          <docletArtifact>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>0.23.0-SNAPSHOT</version>
          </docletArtifact>
          <useStandardDocletOptions>true</useStandardDocletOptions>
          

          The releaseaudit target equivalent is needed in Maven.
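           The releaseaudit equivalent would presumably come from the Apache Rat plugin; a minimal sketch of how it might be declared (version and the real exclude list are omitted and would need to be filled in):

           <plugin>
             <groupId>org.apache.rat</groupId>
             <artifactId>apache-rat-plugin</artifactId>
             <configuration>
               <excludes>
                 <!-- placeholder pattern for files that legitimately carry no license header -->
                 <exclude>**/*.txt</exclude>
               </excludes>
             </configuration>
           </plugin>

           With that in place, the report can be run with mvn apache-rat:check.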

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12482008/HADOOP-6671-cross-project-HDFS.patch
          against trunk revision 1133125.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/603//console

          This message is automatically generated.

          Tom White added a comment -

          I've set up a Jenkins job to build common artifacts using Maven: https://builds.apache.org/job/Hadoop-Common-trunk-maven/.

          It's building the same artifacts as https://builds.apache.org/job/Hadoop-Common-trunk/, including documentation and native libraries, and reports for compiler warnings, tests, FindBugs, and Checkstyle. The only missing report is for Clover which needs adding to the Maven build.

          Currently two tests are failing (https://builds.apache.org/job/Hadoop-Common-trunk-maven/8/) - I'm not sure why, as they pass for me locally using Maven, and on Hudson using Ant.

           I also tried a cross-project build using Maven for common and Ant for HDFS. I needed the attached patch to get the HDFS build to work - these are changes that are needed anyway but that we were getting away without when using Ivy. MapReduce will need similar changes.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481585/mvn-layout-e.sh
          against trunk revision 1132511.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 11 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/582//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          If folks agree I'd like to work with Tom in a new branch to get all the Jenkins hook scripts working.

          Alejandro Abdelnur added a comment -

          Patch rebased to HEAD of trunk (use the

          Now jdiff works (same as in Ant build)

          Build options in BUILDING.txt file.

          Tom White added a comment -

          The Jenkins scripts in http://svn.apache.org/repos/asf/hadoop/nightly/ will need updating too. To test the changes we could have a branch containing the Mavenized tree (i.e. with the patch from this issue applied), and a copy of the Jenkins nightly build job that uses a Maven version of the nightly script. For patch submission we can test the script manually rather than hooking it up to Jenkins across the board. We'd only commit this change when we are happy that the Jenkins jobs are working properly.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480329/mvn-layout2.sh
          against trunk revision 1127215.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 11 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/520//console

          This message is automatically generated.

          Owen O'Malley added a comment -

          You should modify the script to also delete the files and directories that are no longer needed using svn rm.

          Owen O'Malley added a comment -

          This version uses svn mv to modify the working directory.

          Owen O'Malley added a comment -

          The current patch doesn't apply. It is trying to modify maven/checkstyle.xml.

          Owen O'Malley added a comment -

          I don't think we should commit this until the jdiff is done.

          Tom White added a comment -

          We currently publish jdiff documentation as a part of a release (e.g. http://hadoop.apache.org/common/docs/r0.20.2/jdiff/changes.html). There's also HADOOP-7035 which refines this to publish changes categorized by API stability and compatibility (there is an example at http://people.apache.org/~tomwhite/HADOOP-7035/common/).

          HADOOP-7035 will include documenting the process for generating jdiff for a release, so I don't think that we need to get it integrated in Maven as a part of this issue. (If needed at a later point we could hook it into Maven by calling out to the script.) Does that sound reasonable?

          Todd Lipcon added a comment -

          Hey Alejandro. Yep, we do use jdiff in order to detect incompatible changes between releases. Tom is one of the experts there.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480296/mvn-layout2.sh
          against trunk revision 1126719.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 11 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/516//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          Integrated checkstyle and findbugs, added option to skipDocs.

           The only thing missing is JDiff integration (JDiff seems to be a dead project, and its Maven plugin does not seem to work properly).

          Is JDIFF required? Any other alternative?

          Maven goals:

          • clean
          • compile (-DcompileNative)
          • test (-DcompileNative, -Dtest=<TESTCLASS>, -DskipTests)
          • package (-DcompileNative, -DskipDocs, -DskipTests | -Dtest=<TESTCLASS>)
          • checkstyle:checkstyle
          • findbugs:findbugs
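           As a rough illustration of the checkstyle and findbugs wiring mentioned above (plugin coordinates only; versions, report configuration and the rules/exclude files are assumptions):

           <plugin>
             <groupId>org.codehaus.mojo</groupId>
             <artifactId>findbugs-maven-plugin</artifactId>
           </plugin>
           <plugin>
             <groupId>org.apache.maven.plugins</groupId>
             <artifactId>maven-checkstyle-plugin</artifactId>
             <configuration>
               <!-- reuse the checkstyle rules already shipped in the source tree -->
               <configLocation>checkstyle.xml</configLocation>
             </configuration>
           </plugin>

           With those declared, the checkstyle:checkstyle and findbugs:findbugs goals listed above produce the reports.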
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480134/HADOOP-6671c.patch
          against trunk revision 1126287.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 18 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/506//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          HADOOP-6671c.patch uses the maven native:javah mojo to generate the javah stuff.

           This solves the issue of needing tools.jar.

           I've also moved all the native autoreconf/configure/make steps to happen under the BUILD directory (target/). I had already done this for the documentation generation. This makes all files generated by the build end up under target/.

           The jdiff invocation is still missing.
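           A sketch of how the native:javah mojo might be configured (the class name and output directory below are placeholders for illustration, not the actual patch contents):

           <plugin>
             <groupId>org.codehaus.mojo</groupId>
             <artifactId>native-maven-plugin</artifactId>
             <executions>
               <execution>
                 <phase>compile</phase>
                 <goals>
                   <goal>javah</goal>
                 </goals>
                 <configuration>
                   <javahClassNames>
                     <!-- placeholder; the real list would name every class with native methods -->
                     <javahClassName>org.apache.hadoop.io.nativeio.NativeIO</javahClassName>
                   </javahClassNames>
                   <javahOutputDirectory>${project.build.directory}/native/javah</javahOutputDirectory>
                 </configuration>
               </execution>
             </executions>
           </plugin>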

          Alejandro Abdelnur added a comment -

          Todd,

          Thanks for the fast response and detailed review.

           I've taken care of all your comments except the one regarding tools.jar. I'm still trying to figure that out: the Ant <javah> is not finding tools.jar (for some reason the JRE is being used), so my current hack is still needed.

          The following maven options can be used to alter the build/tests:

          • -DskipTests
          • -DskipDocs
          • -DcompileNative
          • -Dtest=<TESTCLASS>
          Todd Lipcon added a comment -

          Cool stuff, Alejandro! A few things I noticed while trying this out locally:

          • nit: The project name is "Hadoop Common" rather than "Hadoop Commons"
          • I'm getting errors for our usage of the com.sun.javadoc stuff - eg:
            /home/todd/git/hadoop-common/src/main/java/org/apache/hadoop/classification/tools/ExcludePrivateAnnotationsJDiffDoclet.java:[42,16] cannot access com.sun.javadoc.Doclet
            class file for com.sun.javadoc.Doclet not found
                return JDiff.start(RootDocProcessor.process(root));
            /home/todd/git/hadoop-common/src/main/java/org/apache/hadoop/classification/tools/RootDocProcessor.java:[55,12] cannot find symbol
            

            Is this the bit where I need to add a softlink? Any workaround for this we can do local to the project?

          • "mvn test" seems to be dumping all the log to stdout. Is there a way to get it to log just to the files like ant used to?
          • is there an equivalent of our "bin-package" target which makes a tarball but doesn't build the forrest docs?
          • the assembly doesn't seem to build if I choose not to compile native:
            
            [INFO] ------------------------------------------------------------------------
            [ERROR] BUILD ERROR
            [INFO] ------------------------------------------------------------------------
            [INFO] An Ant BuildException has occured: /home/todd/git/hadoop-common/target/native/lib does not exist.
            
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12479868/HADOOP-6671.patch
          against trunk revision 1125051.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 18 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/488//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          How to apply the patch (I'm using GIT):

          • Checkout hadoop-common trunk
          • run the mvn-layout.sh script
           • run 'git add -u' and 'git add' for all new files/dirs (git will recognize the moved files as renames)
          • commit all changes as first commit (a separate JIRA could be created for this if you find it necessary)
          • apply the HADOOP-6671.patch
          • test all works as indicated below
          • commit

          It requires Maven 3

          Typical Maven goals:

          • mvn clean
          • mvn compile
          • mvn test
          • mvn package

          The last one generates the TARBALL. As with the Ant build, FORREST_HOME must be defined.

          Options:

          • -Dtest=<TestClass> to run a single test
          • -DskipTests to skip running the tests (package by default runs tests)
          • -Dcompile.native to compile the native code

           This patch makes minimal code changes: one line in a test class where 'build/' was hardcoded, and the fixFont script for the Chinese documentation.

           There are some testcases that write files in the current directory instead of under the build directory. Those should be fixed as part of a follow-up issue (keeping this one focused on just the mavenization).

          Native code compilation and Documentation generation are done using 2 auxiliary ant scripts invoked from Maven POM.
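           The auxiliary Ant scripts are presumably hooked in through the maven-antrun-plugin; a hedged sketch of that wiring (the phase and target name here are illustrative):

           <plugin>
             <groupId>org.apache.maven.plugins</groupId>
             <artifactId>maven-antrun-plugin</artifactId>
             <executions>
               <execution>
                 <id>compile-native</id>
                 <phase>compile</phase>
                 <goals>
                   <goal>run</goal>
                 </goals>
                 <configuration>
                   <target>
                     <!-- delegate native compilation to the auxiliary Ant build file -->
                     <ant antfile="${basedir}/native-build.xml" target="compile"/>
                   </target>
                 </configuration>
               </execution>
             </executions>
           </plugin>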

          As a follow up, when Mavenizing HDFS and MAPREDUCE we can introduce a parent POM file where all dependencies versions are commonly defined (same version for all).
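           For illustration, the parent POM idea boils down to a dependencyManagement section along these lines (the sample dependency and version are placeholders):

           <dependencyManagement>
             <dependencies>
               <!-- versions are pinned once here and inherited by common, hdfs and mapreduce -->
               <dependency>
                 <groupId>junit</groupId>
                 <artifactId>junit</artifactId>
                 <version>4.8.1</version>
               </dependency>
             </dependencies>
           </dependencyManagement>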

           A couple of JAR files (jdiff, aspectj-tools) are missing from the final TARBALL; I assume they were included in the Ant build due to an Ivy scope mistake.

          JDIFF/Clover/FindBugs/Cobertura/etc/etc can be easily integrated in the Maven POM. But again, I think this should be done incrementally on top of this.

          Konstantin Boudnik added a comment -

          Alejandro, I think mvn test-compile should do the same.

          Alejandro Abdelnur added a comment -

          Glad to see people are interested in this.

          Luke, thanks for the Eclipse info (I don't use Eclipse. I use IntelliJ, IntelliJ integrates nicely with Maven, you just open the POM as a project and you are done).

           JUnit execution should work fine if you first compile the test cases from Maven (running 'mvn test -DskipTests'); I do that trick for IntelliJ. And, if I'm not mistaken, you should be able to configure the IDE to do this automatically (or with an IDE click).

          Luke Lu added a comment -

          if the maven plug-in for Eclipse will just make this happen, that's great, please let me know.

           Confirmed. mvn eclipse:eclipse should suffice. If you use the m2eclipse IDE plugin then that step is not needed either. BTW, NetBeans supports Maven projects natively (without any plugins).

          Matt Foley added a comment -

          Alejandro, great to see someone working on this!

          Would you please add to your "still to be done" list –

          • Eclipse integration, especially:
            • Eclipse project files generation (hopefully more reliably, automatically, and with less maintenance effort than currently);
            • JUnit execution under the Eclipse-native "Java Builder" in the IDE (which works fine today as long as the dependencies and Build Path have been made available to Eclipse in the way it needs - which today usually requires changes to those Eclipse project files).

          If this was so obvious you forgot to mention it or if the maven plug-in for Eclipse will just make this happen, that's great, please let me know. But please do include this in the scope of work, because some of us are very dependent on the Eclipse IDE for our quality of life! Thanks.

          Alejandro Abdelnur added a comment -

          I spent a few hours playing with hadoop-common mavenization following the ideas in the JIRA.

          I've got it to a point that compiles java and native code, runs the tests (with native code if present) and generates the JAR.

           I've only had to modify one Java file: a testcase that had 'build/classes' hardcoded when looking for a properties file.

          I've had to move directories/files around to follow Maven dir layout.

          Still to be done is javadocs, documentation, packaging (that would be an assembly descriptor), wiring jdiff/clover/findbugs, and deployment configuration of artifacts (JARs/SOs).

          The end goal will be to generate a TARBALL identical to the one it is being generated today.

          I don't expect all that to be much work.

           Once hadoop-common is done, the same could be done in hadoop-hdfs and hadoop-mapreduce. Also the contrib/ stuff could go into its own Maven module. Finally, a root Maven project could be used to wire all the above projects together, and external dependency versions would be defined there in a dependencyManagement section.

          The good thing is that it is not required to do all at once, we can do common, then hdfs, then mapreduce and finally contrib.

          Attached you'll find a script that moves dir/files around to the maven expected locations and a patch containing the Maven pom.xml file, a native-build.xml (Ant file invoked from maven to build native code) and the 1 line change to a testcase.

          Instructions to test it:

          • checkout hadoop-common trunk
          • run the attached script from hadoop-common root dir
           • apply the patch
           • Use Maven 3 to build/test
            • -Dcompile.native enables native compilation

           IMPORTANT: I couldn't figure this out yet, but there is some issue with javah when invoked from Maven/Ant (javah is not found on Linux because Maven changes JAVA_HOME to the JRE location). TEMPORARY WORKAROUND: softlink JDK/lib/tools.jar in JRE/lib/ext/

          Before I continue working on this I want to know if folks would be OK with moving forward with this patch.

          Luke Lu added a comment -

          Since HADOOP-7106 is moving forward, we should consider converting all the Hadoop core projects (common, hdfs, mapreduce) into one single multi-module maven (2+) project. We can then easily do mvn install at the top level and test patches across the modules, or work in a module directory independently. This has many advantages over a bunch of disjoint ant/ivy projects. Though maven 2+ still has its issues, most can be resolved by using the right versions of plugins. Our recent effort with a major multi-module maven project worked out better than I expected. The more I work with both maven (2+) and ant/ivy, the more I prefer maven to ant/ivy (mostly because there are far fewer ways to shoot myself in the foot).
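
          As an illustrative sketch of that multi-module workflow, assuming a top-level aggregator POM with hypothetical module directories named hadoop-common, hadoop-hdfs and hadoop-mapreduce:

            # build and install every module from the top level
            mvn install -DskipTests

            # then iterate on a single module independently
            cd hadoop-common
            mvn test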

          Lars Francke added a comment -

          Perhaps this PNG is helpful for others as well. It visualizes the dependencies between all the ant targets.

          I'm not actively working on the Mavenization but spending an hour here and there on it when I have the time.

          Eli Collins added a comment -

          Hey Giri,

          Are you still working on this? This jira isn't assigned to you but I remember you were working on a patch at some point.

          Thanks,
          Eli

          Doug Cutting added a comment -

          > The directory moving/renaming unfortunately tends not to work too well with Subversion and branches (or I was doing it wrong) so I don't know how big the benefit would be to start a new one for doing this.

          Yes, if the branch is at all long-lived it will require lots of merges to keep it up to date, and such merges will not be easy.

          In my experience tree reorganizations are easier to develop as:

          • a shell script that contains a sequence of 'svn mv' commands
          • a patch file to be applied after the script has run

          For example, AVRO-163 was developed this way.
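
          A minimal sketch of that two-step approach; the file names below are illustrative, not actual attachments:

            # step 1: a shell script that is just a sequence of 'svn mv' commands reshaping the tree
            sh reorganize-layout.sh        # e.g. svn mv src/java src/main/java, etc.

            # step 2: an ordinary patch applied after the moves, reviewable as a normal diff
            patch -p0 < mavenize.patch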

          Paul Smith added a comment -

          Lars beat me to the jira comment, so I'll just say "Yeah, what Lars said".

          Happy to put my hand up to help. Rather than a branch, I'd say a simple script like the one in HBASE-2099 is easy for a reviewer to work with: it outlines the migration steps needed rather than some hideous patch to review.

          In regards to AllenW's comments on syncing things around, that is still possible: rather than syncing the .ivy directory, it's just a matter of ensuring the ~/.m2/repository directories are in sync, and then working in Maven's offline mode if that server doesn't have internet connectivity.
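
          A minimal sketch of that sync-then-offline workflow, assuming a hypothetical build host named buildhost with internal network access only:

            # copy the populated local Maven cache to the disconnected build machine
            rsync -a ~/.m2/repository/ buildhost:~/.m2/repository/

            # build there in offline mode so Maven never touches the network
            ssh buildhost 'cd hadoop-common && mvn -o install'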

          Yes, one has to get the POM right, and that does come with experience, so perhaps Lars and I can help here to get it off in the right direction and ease any potential pain. IntelliJ and Eclipse Maven support is now first-class, really; Ivy less so. For future modularization, the Maven migration will pay off: splitting out code into nice modular chunks involves much less work to keep the build system in sync.

          Anyway, happy to help out here too.

          Lars Francke added a comment -

          I don't have a lot of insight into the Hadoop Common project itself but I've done a lot of the work on the recent HBase transition together with Paul Smith and Kay Kay and would like to offer my help if needed.

          There are basically two ways to go: either shuffle around a lot of directories to conform to the standard Maven layout, or override the Maven defaults and keep the current layout. Both ways are actually very painful, but I prefer the first one. While that means a lot of work and a lot of swearing, it also means you'll get a layout consistent with most other Maven projects and the configuration is a lot easier. Some tools tend(ed) not to work properly with the non-standard layout (strictly speaking, these are bugs in the tools/plugins). You basically have to do a lot of work up front, but after that it shouldn't be too hard to maintain.

          The directory moving/renaming unfortunately tends not to work too well with Subversion and branches (or I was doing it wrong) so I don't know how big the benefit would be to start a new one for doing this.

          What Paul Smith has done in HBASE-2099 is to provide a script that contained all the necessary commands (svn mv...) to finish the move (there are a couple more changes in other tickets) and a patch containing the .pom files. This has been much improved since and we've learned a lot from it. We are in fact still tweaking it and will change the pom structure once again to rely on the common Apache parent pom in addition to a few smaller fixes that are still outstanding.

          I'd propose a structure similar to what HBase now has: a pom in the main directory with common information (packaging "pom") and then two modules ("core" and "contrib"). So the src/contrib directory would move to the top level, as would the docs directory (btw: I've never used Forrest with Maven, but if Ant can do it, Maven should be able to do it too). The final tarball, as currently created on trunk by ant tar, should be easily reproducible by Maven (I couldn't find where most of the contribs ended up; are those currently missing from the .tar?)

          Paul Smith has indicated interest in this, and I'd be interested and willing to help too if needed. But if you've already got someone to do the work, I'm not going to complain. It'd just be a shame to duplicate work here, as the moving around of stuff in SVN is painful work if that's the way you decide to go.

          E. Sammer added a comment -

          So much of the build system today "acts as" maven that it might as well be maven. Having some consistency in the way the individual projects are built would be extremely nice. Maven is as painful as every other build tool we've seen, but at least its pain comes in a consistent, known quantity. The tooling support tends to be better due to the predictable nature of the build process. The surrounding infrastructure (such as Hudson) is already mvn-aware. The main problem with mvn tends to be errant dependencies in poms, but this can be avoided (see the SpringSource Ivy / Maven repositories for an example of very well maintained metadata - http://www.springsource.com/repository/app/). +1

          Allen Wittenauer added a comment -

          Owen and Lee have told me that offline builds will have basically the same amount of pain that they do now. I'd be happier with less pain, but same is acceptable.

          Lee Tucker added a comment -

          What it does do is provide a plugin architecture for a whole range of tools instead of having to roll each one independently into the build.xml. It provides project structure and actually improves inter-project communication by managing transitive build dependencies. Given that we already rely on net access (and maven repositories) to pull things through Ivy for dependencies, I'm not exactly sure how Maven makes that any worse. You had to fill a cache somehow already.

          Giridharan Kesavan added a comment -

          @Allen
          you can use mvn to do offline builds by passing the -o argument

          @Steve
          Instead of maintaining both ivy.xml and pom.xml, having just the pom file reduces build script maintenance:
          the pom follows standard conventions for any kind of build task and is easy to maintain.

          Doug Cutting added a comment -

          > pom files are not generated dynamically

          HADOOP-6629 also addresses that, so that the pom is generated from ivy.xml.

          > have one single xml file(POM) for dependency management and artifact publishing.

          HADOOP-6407 & HADOOP-6629 together eliminate redundant XML files, since both the POM and eclipse files are then generated from ivy.xml.

          That said, I'd still be interested to see what a Maven version of the build looks like, whether it's considerably simpler to support all the aspects of Common's build (native code, fault-injection, etc.) with Maven than with Ant. My concerns about Maven are that while it might make it easier to build a simple Java-only project, it might be trickier to incorporate more complex, non-standard build features than it is with Ant. Common is complex enough to be a good test case for this.

          steve_l added a comment -

          I've had bad experiences with M2 in the past; these colour my opinions. I don't know how much maven2 has improved since then.

          What I do have to deal with on a regular basis, even today, is people who write POM files that get the dependencies correct for their own build and test, but screw up everyone else downstream. Recent logging JARs are an example. Accordingly, I view a POM file as an artifact for downstream users that you have to get right, not just some internal thing the way ivy and ant files can be today.

          This means that saying "we should move to maven to eliminate having both ivy.xml and POM files" isn't a good enough reason for me. If it improves testing or build times, or even just reduces ivy and ant XML maintenance costs, then yes; but not just "because you can".

          Allen Wittenauer added a comment -

          They don't have access to the Internet. They do have internal network access. And, no, I'm not willing to set up a service to build code. I'm sure Maven is a great idea if you have a ton of resources (mainly time and effort). I don't.

          Patrick Angeles added a comment -

          How do you scp without network access?

          Allen Wittenauer added a comment -

          No, they don't.

          I build on my desktop Mac, let ivy download all of its stuff, then scp .ivy2 and the other stuff that ivy pulls in to my real build machine. So your assumption is false.

          Patrick Angeles added a comment -

          Giri,

          I like the idea, and I'll pitch in however I can.

          @Allen

          You can very easily set up an internal proxy for Maven. I'm assuming your build machines have some kind of network access to get to the source code...

          Allen Wittenauer added a comment -

          My initial thought is that every time someone tinkers with either ivy or maven or the build process in general, it always seems to get worse and worse for those of us who build on non-Internet-connected machines. This sounds like another move towards more pain.


            People

            • Assignee:
              Alejandro Abdelnur
              Reporter:
              Giridharan Kesavan
            • Votes:
              0
              Watchers:
              25
