Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 0.23.0, 0.24.0
    • Component/s: build
    • Labels:
      None
    • Environment:

      Java 6, Redhat 5.5

      Description

      The Hadoop release tarball contains both raw source and binaries. This leads users to use the release tarball as the base for applying patches and building a custom Hadoop. This is not the recommended way to develop Hadoop, because it produces a mixed development system in which processed files and raw source are hard to separate.

      To correct the problematic usage of the release tarball, the release build targets should be defined as follows:

      "ant source" generates the source release tarball.
      "ant binary" generates the binary release without source/javadoc jar files.
      "ant tar" generates a mirror of the binary release that also includes the source/javadoc jar files.

      Does this sound reasonable?
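
      The three proposed targets can be summarized in a small sketch. This is only an illustrative model, not part of the Hadoop build: the component labels ("raw-source", "binaries", "source-jar", "javadoc-jar") and the function name are hypothetical and were chosen just to show how the targets relate to each other.

```python
# Hypothetical model of the three proposed Ant targets.
# The component labels are illustrative; they are not identifiers
# taken from the Hadoop build files.
BINARY = frozenset({"binaries"})
SOURCE_AND_JAVADOC_JARS = frozenset({"source-jar", "javadoc-jar"})

RELEASE_TARGETS = {
    # "ant source": source release tarball only
    "source": frozenset({"raw-source"}),
    # "ant binary": binary release without source/javadoc jars
    "binary": BINARY,
    # "ant tar": mirror of the binary release plus source/javadoc jars
    "tar": BINARY | SOURCE_AND_JAVADOC_JARS,
}


def release_contents(target: str) -> frozenset:
    """Return the set of components the given target would package."""
    return RELEASE_TARGETS[target]
```

      Under this model, "tar" differs from "binary" only by the two jar files, and neither of them ships the raw source tree, which is the separation the proposal is after.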

        Activity

        Eric Yang added a comment -

        This has been resolved by the Hadoop Mavenization, which uses the src profile to create the source tarball and the dist profile to create the binary tarball.

        Eric Yang added a comment -

        Patch has become stale after Mavenization.

        Allen Wittenauer added a comment -

        FWIW, I'm not going to block this, but I still think it is going to lead to confusion, except for maybe the three people who debug production grids with eclipse.

        Eric Yang added a comment -

        Source jar + JMX + Eclipse combined can set up a debug environment when a production cluster is in trouble.

        Allen Wittenauer added a comment -

        Why would Eclipse users use the tarball? Besides, don't Eclipse users have other things they need to do before they can actually do things with Hadoop?

        Eric Yang added a comment -

        For developers, source jar files can help with debugging the application in Eclipse, where tar-packaged source can't.

        Allen Wittenauer added a comment -

        Sources are compressed to a jar file as $HADOOP_PREFIX/share/hadoop/hadoop-source-[version].jar, Javadoc is compressed as $HADOOP_PREFIX/share/javadoc/hadoop-javadoc-[version].jar

        Do we really want to use jar for these? This could lead to massive confusion. Besides, if these are part of the tarball distribution, the user clearly has tar available...

        Devaraj Das added a comment -

        I don't think this is a must-fix for 20.2xx.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12482496/HADOOP-7371.patch
        against trunk revision 1135333.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/623//console

        This message is automatically generated.

        Eric Yang added a comment -

        For "ant source"

        • Create a hadoop-source-[version].tar.gz in the build directory
        Eric Yang added a comment -

        For "ant tar":

        • Sources are compressed to a jar file as $HADOOP_PREFIX/share/hadoop/hadoop-source-[version].jar
        • Javadoc is compressed as $HADOOP_PREFIX/share/javadoc/hadoop-javadoc-[version].jar
        • Documents are relocated to $HADOOP_PREFIX/share/doc/hadoop
        • HADOOP_HOME is set to be the same as HADOOP_PREFIX
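
        As a rough illustration of the layout listed above, the paths can be derived from HADOOP_PREFIX and the version string. This is a sketch only: the function name, the dictionary keys, and the example prefix and version are assumptions made for illustration; only the path patterns come from the list above.

```python
from pathlib import PurePosixPath


def tar_layout(hadoop_prefix: str, version: str) -> dict:
    """Sketch of where "ant tar" would place artifacts (hypothetical helper)."""
    prefix = PurePosixPath(hadoop_prefix)
    share = prefix / "share"
    return {
        # Sources compressed into a jar under share/hadoop
        "source_jar": share / "hadoop" / f"hadoop-source-{version}.jar",
        # Javadoc compressed into a jar under share/javadoc
        "javadoc_jar": share / "javadoc" / f"hadoop-javadoc-{version}.jar",
        # Documents relocated to share/doc/hadoop
        "docs_dir": share / "doc" / "hadoop",
        # HADOOP_HOME is set to the same location as HADOOP_PREFIX
        "hadoop_home": prefix,
    }


# Example with an assumed prefix and version
layout = tar_layout("/opt/hadoop", "0.20.205")
```

        With the assumed values, layout["source_jar"] is /opt/hadoop/share/hadoop/hadoop-source-0.20.205.jar.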
        Eric Yang added a comment -

        Alejandro, the source tarball should contain the source ready to build, without the Hadoop jar files.

        Alejandro Abdelnur added a comment -

        Eric, should the source TAR be the source ready to build, or ...?

        Alejandro Abdelnur added a comment -

        Yes, we need HDFS & MR mavenized as well.

        Those would be done immediately after HADOOP-6671, first HDFS and then MR.

        Note that you'll be able to work on non-mavenized HDFS and MR using mavenized COMMON.

        Eli Collins added a comment -

        But this JIRA pertains to all the projects (we release one tarball, not three), not just common. I.e., we'd need the HDFS and MR side of HADOOP-6671 to use this here.

        Alejandro Abdelnur added a comment -

        Regarding Eli's comment: I'm wiring DEB/RPM into HADOOP-6671; once that is done, the Mavenization of hadoop-common would be equivalent to the Ant functionality.

        Alejandro Abdelnur added a comment -

        HADOOP-6671 (in trunk) will include options to generate source/binary/tar as described in this JIRA

        Eli Collins added a comment -

        Because of that, wouldn't it make sense to wait till Mavenization is in place for this?

        Is common/hdfs/mr mavenization coming soon? We shouldn't block this work on a much larger project.

        Eric Yang added a comment -

        If this can be done as part of HADOOP-6671, then please change this jira to 0.20.205. Otherwise, I will submit a patch for trunk and make another jira for 0.20.205.

        Alejandro Abdelnur added a comment -

        Got it, so this means this JIRA is not for trunk, right?

        Eric Yang added a comment -

        Because of that, wouldn't it make sense to wait till Mavenization is in place for this?

        Yes, for trunk it would. The Maven work is not going to be backported to the 0.20.205 branch, though, and this looks like a needed improvement for 0.20.205 if there is a plan to make future releases from the 0.20-security branch.

        Alejandro Abdelnur added a comment -

        Yes, it makes sense.

        As part of the Mavenization (HADOOP-6671) this would be trivial.

        Because of that, wouldn't it make sense to wait till Mavenization is in place for this?


          People

          • Assignee:
            Eric Yang
            Reporter:
            Eric Yang
          • Votes:
            0
            Watchers:
            9
