Hadoop Common / HADOOP-8009

Create hadoop-client and hadoop-minicluster artifacts for downstream projects

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 0.22.0, 0.23.0, 0.23.1, 0.24.0
    • Fix Version/s: 1.0.1, 0.23.1
    • Component/s: build
    • Labels:
      None
    • Release Note:
      Generate integration artifacts "org.apache.hadoop:hadoop-client" and "org.apache.hadoop:hadoop-minicluster" containing all the JARs needed to use the Hadoop client APIs and to run Hadoop MiniClusters, respectively. Push these artifacts to the Maven repository on mvn-deploy, along with the existing artifacts.

      Description

      Using Hadoop from projects like Pig/Hive/Sqoop/Flume/Oozie or any in-house system that interacts with Hadoop is quite challenging for the following reasons:

      • Different versions of Hadoop produce different artifacts: Before Hadoop 0.23 there was a single artifact, hadoop-core; starting with Hadoop 0.23 there are several (common, hdfs, mapred*, yarn*)
      • There are no 'client' artifacts: Current artifacts include all JARs needed to run the services, thus bringing into clients several JARs that are not used for job submission/monitoring (servlet, jsp, tomcat, jersey, etc.)
      • Doing testing on the client side is also quite challenging as more artifacts have to be included than the dependencies define: for example, the history-server artifact has to be explicitly included. If using Hadoop 1 artifacts, jersey-server has to be explicitly included.
      • 3rd party dependencies change in Hadoop from version to version: This complicates things for projects that have to deal with multiple versions of Hadoop, as their exclusion lists become a huge mix & match of artifacts from different Hadoop versions, and it may break things when a particular version of Hadoop requires a dependency that another version of Hadoop does not require.

      Because of this it would be quite convenient to have the following 'aggregator' artifacts:

      • org.apache.hadoop:hadoop-client : it includes all required JARs to use Hadoop client APIs (excluding all JARs that are not needed for it)
      • org.apache.hadoop:hadoop-minicluster : it includes all required JARs to run Hadoop Mini Clusters
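      As an illustration of how a downstream project might consume these two artifacts, a minimal pom.xml fragment is sketched below. The hadoop.version property name, its value 0.23.1 (taken from the Fix Version/s listed above), and the test scope on hadoop-minicluster are assumptions made for the example, not something this issue prescribes.

```xml
<!-- Sketch only: fragment of a downstream project's pom.xml. -->
<properties>
  <!-- Assumed property name; set it to the Hadoop release you build against. -->
  <hadoop.version>0.23.1</hadoop.version>
</properties>

<dependencies>
  <!-- Client-side APIs only (job submission/monitoring, HDFS access). -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${hadoop.version}</version>
  </dependency>

  <!-- Mini clusters for testcases; the test scope is an assumption. -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-minicluster</artifactId>
    <version>${hadoop.version}</version>
    <scope>test</scope>
  </dependency>
</dependencies>
```

      The intent is that the same two coordinates would work across Hadoop versions, so that only the version property changes.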

      These aggregator artifacts would be created for current branches under development (trunk, 0.22, 0.23, 1.0) and for released versions that are still in use.

      For branches under development, these artifacts would be generated as part of the build.

      For released versions we would have a special branch used only as a vehicle for publishing the corresponding 'aggregator' artifacts.

      1. HADOOP-8009.patch
        14 kB
        Alejandro Abdelnur
      2. HADOOP-8009-existing-releases.patch
        27 kB
        Alejandro Abdelnur
      3. HADOOP-8009-branch-1.patch
        14 kB
        Alejandro Abdelnur
      4. HADOOP-8009-branch-1-add.patch
        0.8 kB
        Matt Foley
      5. HADOOP-8009-branch-0_22.patch
        16 kB
        Alejandro Abdelnur

          Activity

          Alejandro Abdelnur added a comment - edited

          Patch creates 2 new Maven modules: hadoop-client and hadoop-minicluster (instead of hadoop-test as suggested in the original description)

          I've tested both with Oozie.

          The hadoop-client artifact may allow for a few more JAR exclusions, but that could be a later refinement.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12512710/HADOOP-8009.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/553//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          As usual, test-patch does not like my patches; the patch applies cleanly on trunk.

          Robert Joseph Evans added a comment -

          I like the patch, but I am a bit concerned about the scope of hadoop-client. There appears to be an effort underway to add other computing models on top of yarn as part of Hadoop itself; MPI with Hampster is the one that seems to be the furthest along. Would we add these in as well? If not, then I would prefer to see hadoop-client named something with mapreduce in it, because that is what this is really creating: a mapreduce client package for Pig, Hive and other mapreduce users.

          Other projects that just want to use YARN to write their own application master would not really want to use it, because they would be pulling in all of the mapreduce client as well. Also, what about a project that just wants to use HDFS? Would they want to pull in all of yarn and mapreduce? Perhaps providing consumers of other parts of hadoop with similar functionality is beyond the scope of this ticket, and if so that is fine. I just want to understand a little better what the intention is for this JIRA before I give it a +1.

          Alejandro Abdelnur added a comment -

          @Robert, the intention of this JIRA is to provide a simple and consistent way for downstream projects to use Apache Hadoop. Other projects may do the same, either by creating their client 'aggregator' artifact from scratch or by depending on this one and adding their additional dependencies (and excluding what they don't need).

          Regarding the name, while I'm not fixed on a particular name, I think that 'hadoop-client' makes sense as it is a client for the functionality that Apache Hadoop provides (HDFS, YARN & MAPRED for 0.23).
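          The second option mentioned above (depending on hadoop-client, adding extra dependencies, and excluding what is not needed) could look roughly like the following pom.xml sketch. The com.example and org.example coordinates, the artifact names under them, and the versions are hypothetical placeholders; which JARs hadoop-client actually pulls in depends on the Hadoop version.

```xml
<!-- Sketch only: a hypothetical downstream 'aggregator' POM. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates for the downstream aggregator. -->
  <groupId>com.example</groupId>
  <artifactId>myproject-hadoop-client</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>pom</packaging>

  <dependencies>
    <!-- Start from the Hadoop client aggregator... -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>0.23.1</version>
      <exclusions>
        <!-- ...exclude a transitive JAR this project does not need
             (placeholder coordinates, not a real Hadoop dependency). -->
        <exclusion>
          <groupId>org.example.unneeded</groupId>
          <artifactId>unneeded-lib</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

    <!-- ...and add the project's own extra client-side dependency
         (hypothetical artifact). -->
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>myproject-extras</artifactId>
      <version>1.0-SNAPSHOT</version>
    </dependency>
  </dependencies>
</project>
```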

          Robert Joseph Evans added a comment -

          OK so the intention is that if a downstream project as a client wants to consume only a subset of hadoop, then they will have to pull out those pieces themselves, but if they want it simple then they can use this dependency and have all of hadoop available for them.

          That seems fine to me, thanks for the clarification. Have you tested this with some of our largest customers (Hive, HBase, Pig, Oozie, etc.)? In theory, all you would have to do is check them out, update their pom.xml to point to these new packages instead of the long list of old ones, and they should still compile and still run.

          Alejandro Abdelnur added a comment -

          Robert, yes, I've tested both: hadoop-minicluster to run the Oozie testcases and hadoop-client to build Oozie. I then installed the Oozie built with it and successfully ran a workflow that submits an MR job.

          Robert Joseph Evans added a comment -

          OK then I am a +1 on the patch.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #1705 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1705/)
          HADOOP-8009. Create hadoop-client and hadoop-minicluster artifacts for downstream projects. (tucu)

          tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1239222
          Files :

          • /hadoop/common/trunk/hadoop-client
          • /hadoop/common/trunk/hadoop-client/pom.xml
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-minicluster
          • /hadoop/common/trunk/hadoop-minicluster/pom.xml
          • /hadoop/common/trunk/pom.xml

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #1634 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1634/)
          HADOOP-8009. Create hadoop-client and hadoop-minicluster artifacts for downstream projects. (tucu)

          tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1239222
          Files :

          • /hadoop/common/trunk/hadoop-client
          • /hadoop/common/trunk/hadoop-client/pom.xml
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-minicluster
          • /hadoop/common/trunk/hadoop-minicluster/pom.xml
          • /hadoop/common/trunk/pom.xml

          Alejandro Abdelnur added a comment -

          I've just committed this to trunk and branch-0.23.

          Now I'd like to get opinions on how we could do this for pre-0.23 releases.

          My thought would be to create a branch with a shell Maven project where we can add the hadoop-client and hadoop-minicluster artifacts for previous releases.

          Thoughts?

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Commit #452 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/452/)
          Merge -r 1239221:1239222 from trunk to branch. FIXES: HADOOP-8009

          tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1239231
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-client
          • /hadoop/common/branches/branch-0.23/hadoop-client/pom.xml
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-minicluster
          • /hadoop/common/branches/branch-0.23/hadoop-minicluster/pom.xml
          • /hadoop/common/branches/branch-0.23/pom.xml

          Hudson added a comment -

          Integrated in Hadoop-Common-0.23-Commit #462 (See https://builds.apache.org/job/Hadoop-Common-0.23-Commit/462/)
          Merge -r 1239221:1239222 from trunk to branch. FIXES: HADOOP-8009

          tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1239231
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-client
          • /hadoop/common/branches/branch-0.23/hadoop-client/pom.xml
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-minicluster
          • /hadoop/common/branches/branch-0.23/hadoop-minicluster/pom.xml
          • /hadoop/common/branches/branch-0.23/pom.xml

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #1649 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1649/)
          HADOOP-8009. Create hadoop-client and hadoop-minicluster artifacts for downstream projects. (tucu)

          tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1239222
          Files :

          • /hadoop/common/trunk/hadoop-client
          • /hadoop/common/trunk/hadoop-client/pom.xml
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-minicluster
          • /hadoop/common/trunk/hadoop-minicluster/pom.xml
          • /hadoop/common/trunk/pom.xml

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-0.23-Commit #475 (See https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/475/)
          Merge -r 1239221:1239222 from trunk to branch. FIXES: HADOOP-8009

          tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1239231
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-client
          • /hadoop/common/branches/branch-0.23/hadoop-client/pom.xml
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-minicluster
          • /hadoop/common/branches/branch-0.23/hadoop-minicluster/pom.xml
          • /hadoop/common/branches/branch-0.23/pom.xml

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #944 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/944/)
          HADOOP-8009. Create hadoop-client and hadoop-minicluster artifacts for downstream projects. (tucu)

          tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1239222
          Files :

          • /hadoop/common/trunk/hadoop-client
          • /hadoop/common/trunk/hadoop-client/pom.xml
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-minicluster
          • /hadoop/common/trunk/hadoop-minicluster/pom.xml
          • /hadoop/common/trunk/pom.xml

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Build #157 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/157/)
          Merge -r 1239221:1239222 from trunk to branch. FIXES: HADOOP-8009

          tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1239231
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-client
          • /hadoop/common/branches/branch-0.23/hadoop-client/pom.xml
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-minicluster
          • /hadoop/common/branches/branch-0.23/hadoop-minicluster/pom.xml
          • /hadoop/common/branches/branch-0.23/pom.xml

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-0.23-Build #179 (See https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/179/)
          Merge -r 1239221:1239222 from trunk to branch. FIXES: HADOOP-8009

          tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1239231
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-client
          • /hadoop/common/branches/branch-0.23/hadoop-client/pom.xml
          • /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-minicluster
          • /hadoop/common/branches/branch-0.23/hadoop-minicluster/pom.xml
          • /hadoop/common/branches/branch-0.23/pom.xml

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #977 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/977/)
          HADOOP-8009. Create hadoop-client and hadoop-minicluster artifacts for downstream projects. (tucu)

          tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1239222
          Files :

          • /hadoop/common/trunk/hadoop-client
          • /hadoop/common/trunk/hadoop-client/pom.xml
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-minicluster
          • /hadoop/common/trunk/hadoop-minicluster/pom.xml
          • /hadoop/common/trunk/pom.xml

          Alejandro Abdelnur added a comment -

          Patch 'HADOOP-8009-existing-releases.patch' creates the hadoop-client and hadoop-minicluster artifacts for Hadoop 0.22.0 and Hadoop 1.0.0 releases.

          I've done the patch as it would go on a new branch (i.e. hadoop-client-minicluster), as those 2 versions are already released. Also, they are Ant-based, so it would be a bit more difficult to integrate this there, as this is Maven.

          If Hadoop 0.22 and Hadoop 1.0 branches produce new releases, those releases should be responsible for publishing their hadoop-client and hadoop-minicluster artifacts to Maven repo.

          In other words, this should be a one-off thing for 0.22.0 and 1.0.0.

          The only thing we should have to do once this patch is committed is to run 'mvn deploy' to make the artifacts available in the Maven repo.

          I've tested them with Oozie.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12513012/HADOOP-8009-existing-releases.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/558//console

          This message is automatically generated.

          Robert Joseph Evans added a comment -

          I like the new path. My biggest concern with it is maintainability. If it is a separate branch, the release manager has to know about it, and I am afraid that it will be forgotten. 1.0 is not going away in the near future. I would want the 1.0 release manager to sign off on it being in a separate branch, or possibly on it just being a separate directory under the 1.0 line, where they can run a maven command to commit the changes.

          Alejandro Abdelnur added a comment -

          Agree 100%, this patch is for the already released 1.0.0 and 0.22.0. For minor releases, those branches should include this in their build.

          Robert Joseph Evans added a comment -

          OK then I am a +1 assuming that you also are going to come up with a patch to put them in as part of their build.

          Tom White added a comment -

          +1 for releasing these artifacts for active release branches. There are upcoming minor releases for the 1 and 23 branches (and possibly 22?), so it would be good to incorporate these artifacts into those releases if possible.

          Matt Foley added a comment -

          Hi, I'm okay in principle with adding something to branch-1.0 for this, but it doesn't look like either of the two patches currently posted is appropriate for adding directly to branch-1.0. I currently see the following issues:
          1. integration into the code branch
          1a) directory layout
          1b) ant vs maven
          2. need for top-level pom
          3. automatically picking up the version number so we don't have to edit it in every release

          To address 1a and 2, would the following make sense?

          • at the top level of the branch-1.0 code tree we add a directory named "integration-artifacts/".
          • under it, put "pom.xml", "hadoop-client/pom.xml", and "hadoop-minicluster/pom.xml"
            This avoids putting a "pom.xml" at the top level of branch-1.0, when the majority of 1.0 is built with ant not maven.

          To address 1b and 3,

          • provide an ant build rule in build.xml for target name "intartifactdeploy" which passes the build version number to an invocation of maven upon "integration-artifacts/pom.xml"
          • add "intartifactdeploy" as a dependency to the existing ant target "mvn-deploy"

          If this makes sense, and you are willing to put together the appropriate patch, I'll include it in branch-1.0 and branch-1.
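
          As a rough sketch of the proposal above (not the patch that was eventually committed), the Ant side might look like the following. Only the target names and the "integration-artifacts/pom.xml" location come from the comment; the `version` Ant property, the `hadoop.version` Maven property, and having `mvn` on the PATH are assumptions made for illustration.

          ```xml
          <!-- Hypothetical sketch: hand the Ant build version to a Maven run. -->
          <target name="intartifactdeploy">
            <!-- Assumes mvn is on the PATH and build.xml defines a "version" property. -->
            <exec executable="mvn" dir="${basedir}/integration-artifacts" failonerror="true">
              <arg value="-f"/>
              <arg value="pom.xml"/>
              <!-- "hadoop.version" is an assumed property name that the POMs would read. -->
              <arg value="-Dhadoop.version=${version}"/>
              <arg value="deploy"/>
            </exec>
          </target>

          <!-- The existing target would gain the new one as a dependency.
               Its other dependencies and its body are omitted here. -->
          <target name="mvn-deploy" depends="intartifactdeploy"/>
          ```

          Setting failonerror="true" makes a failed Maven run fail the Ant build, so a broken artifact deploy would not pass unnoticed.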

          Alejandro Abdelnur added a comment -

          Patch for branch-1. Instead of using Maven, I've just followed the same pattern used for the current POMs.

          I've tested installing and deploying to an alternate local repo.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12513642/HADOOP-8009-branch-1.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/570//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          Matt, does the patch for branch-1, keeping consistency with the current POM/Maven stuff, seem like a reasonable approach?

          Matt Foley added a comment -

          +1, lgtm. I tested this in the 1.0.1 build, and the results looked correct. Committing to branch-1 and branch-1.0. Thanks Alejandro!

          Alejandro Abdelnur added a comment -

          Thanks Matt. One more thing, we should make these artifacts avail for 1.0.0, no?

          Alejandro Abdelnur added a comment -

          Small correction for the release notes, the name of the test artifact is 'hadoop-minicluster' not 'hadoop-test'
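
          For illustration, a downstream project's POM might reference the test artifact as shown below. The group ID comes from the release note; the version (1.0.1, one of the fix versions) and the `test` scope are assumptions about a typical setup.

          ```xml
          <!-- Hypothetical downstream dependency on the MiniCluster artifact. -->
          <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-minicluster</artifactId>
            <version>1.0.1</version>
            <scope>test</scope>
          </dependency>
          ```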

          Matt Foley added a comment -

          There was a bug in the branch-1 patch. A couple lines were missing in the "sign" target in build.xml, resulting in hadoop-client.jar and hadoop-minicluster.jar not getting signed, and therefore failing the upload. This new patch adds on to the previous one, to resolve the problem.

          Matt Foley added a comment -

          Alejandro, please code review.

          Steve Loughran added a comment -

          Note that the class name of MiniDFSCluster moves between 0.20.x and 0.23.x. If the minicluster artifacts are to be used downstream, it'd be nice to have the original class still there (even if it's just a subclass of the now-moved class).

          Alejandro Abdelnur added a comment -

          @Matt, +1

          Alejandro Abdelnur added a comment -

          @Steve, I wonder how things work in Oozie then. I can compile and run test cases using 1.0.0 and 0.23.x without changing imports. What am I missing?

          Matt Foley added a comment -

          Thanks, Alejandro.

          Committed to branch-1.0 and branch-1.

          Konstantin Shvachko added a comment -

          Alejandro, if you could produce a 0.22 patch similar to the branch-1.0 one (ant based), I'd be glad to incorporate it in the current 0.22.1 branch and deploy client jars for 0.22.0 to maven. Would appreciate your help.

          Alejandro Abdelnur added a comment -

          @Konstantin, attached is a patch for the 0.22 branch. I've tested it by installing and deploying to a local repo. I think I've got the signing covered, but you may have to tweak that if it does not work.

          Also, on another note, 0.22 is not publishing the hadoop-streaming JAR to the maven repo; because of that, we cannot use 0.22 from Oozie. I'll open a JIRA for that.

          Thanks.

          Alejandro Abdelnur added a comment -

          JIRA for the missing hadoop-streaming in 0.22 is MAPREDUCE-3863


            People

            • Assignee:
              Alejandro Abdelnur
              Reporter:
              Alejandro Abdelnur
            • Votes:
              0
              Watchers:
              16
