Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.2
    • Fix Version/s: 0.10.0
    • Component/s: build
    • Labels: None

      Description

      From HADOOP-356:
      >I note that the contrib packages are not included in distributions (the "tar" target).
      >They probably should be. Michel, would you like to modify "tar" to include the contrib code?
      >This should be done in a separate bug.

      OK.
      This packaging is done in target deploy-contrib.
      So I can just add a dependency to the top-level tar target:
      <!-- Make release tarball(s) -->
      <target name="tar" depends="package, deploy-contrib">
        <!-- existing tar tasks stay as they are -->
      </target>

      1. 371.patch (9 kB, Nigel Daley)
      2. build.tarcontrib.patch (0.6 kB, Michel Tourn)
      3. contribbuild.patch (2 kB, Nigel Daley)

        Activity

        Doug Cutting added a comment -

        I was thinking we should include the contrib jar files and documentation. The jar files and readme files could go in something like:

        contrib/
          streaming/
            streaming.jar
            readme.txt

        at the top level in the tar file.
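        A minimal sketch of that layout, assuming a hypothetical staging directory and tarball name (hadoop-sketch.tar.gz) and using empty placeholder files in place of the real jar and readme. It only illustrates where the files would sit inside the tar file; it is not the actual build logic.

```shell
#!/bin/sh
set -eu

# Hypothetical staging directory standing in for the release tree.
stage=$(mktemp -d)
mkdir -p "$stage/hadoop/contrib/streaming"

# Placeholder files; a real build would copy the built jar and its readme here.
: > "$stage/hadoop/contrib/streaming/streaming.jar"
: > "$stage/hadoop/contrib/streaming/readme.txt"

# Create the tarball, then list the entries that ended up under contrib/streaming/.
tar -czf "$stage/hadoop-sketch.tar.gz" -C "$stage" hadoop
listing=$(tar -tzf "$stage/hadoop-sketch.tar.gz" | grep 'contrib/streaming/' | sort)
printf '%s\n' "$listing"
```

        Listing the archive afterwards is a quick way to check that the contrib files landed in the tar file and not only in the build directory.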

        The javadoc could be built into the main javadoc tree, but in a separate section, like Lucene:

        http://lucene.apache.org/java/docs/api/overview-summary.html

        Does this sound reasonable?

        Michel Tourn added a comment -

        OK to keep docs and readme separate in the tarball package.

        But I thought that the way you get all the code on the CLASSPATH in bin/hadoop
        is by placing the contrib jars alongside the main hadoop.jar.

        Doug Cutting added a comment -

        > I thought that the way you get all the code on the CLASSPATH in bin/hadoop
        > is by placing the contrib jars alongside the main hadoop.jar.

        The contrib and example code should not be on the CLASSPATH by default, but rather should be run using the 'bin/hadoop jar ...' command.
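        As a rough illustration of the 'bin/hadoop jar ...' approach, the sketch below only prints the command line that would launch a contrib jar; it does not run Hadoop. The HADOOP_HOME default, the helper name contrib_cmd, and the jar path are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: print (rather than run) the command for launching a contrib jar.
# The /opt/hadoop default is an assumption for illustration.
HADOOP_HOME=${HADOOP_HOME:-/opt/hadoop}

# contrib_cmd is a hypothetical helper.
# $1 = jar path relative to HADOOP_HOME; remaining arguments are passed to the jar.
contrib_cmd() {
  jar=$1
  shift
  printf '%s' "$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/$jar"
  for arg in "$@"; do
    printf ' %s' "$arg"
  done
  printf '\n'
}

contrib_cmd contrib/streaming/hadoop-streaming.jar
```

        The point is that the jar is named explicitly on the command line, so nothing from contrib has to be on the default CLASSPATH.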

        Nigel Daley added a comment -

        The attached patch to build.xml does 2 things:

        1) Modifies the javadoc target to include the contrib javadoc along with the regular javadoc.
        The overview-summary.html is created to mimic Lucene's:
        http://lucene.apache.org/java/docs/api/overview-summary.html
        Suggestion: perhaps the contrib javadoc should be entirely separate from the core javadoc, say in contrib/docs?

        2) Modifies the package target to include the contrib jar files in
          contrib/streaming/hadoop-streaming.jar
          contrib/smallJobsBenchmark/MRBenchmark.jar
        Suggestion: perhaps the contrib jar files should instead be in either
          contrib/lib/hadoop-streaming.jar
          contrib/lib/MRBenchmark.jar
        or
          contrib/hadoop-streaming.jar
          contrib/MRBenchmark.jar

        Any thoughts on the above suggestions or other alternatives?

        Nigel Daley added a comment -

        I think I'll propose a different solution that expands a bit on the scope of this bug to include consolidating benchmarks and reordering javadoc:

        1) consolidate benchmarks into src/test/org/apache/hadoop/benchmark and include them in the test jar file. This includes:

        • src/contrib/smallJobsBenchmark
        • src/examples/org/apache/hadoop/examples/NNBench.java

        2) rename PiBenchmark example to PiEstimator and leave it in the examples package

        3) build hadoop-streaming.jar into a top-level extensions directory using the package target

        4) package the contrib javadoc into docs/contrib and package the example javadoc into docs/examples

        5) create a top-level index.html that points to

        • hadoop api javadoc
        • examples javadoc
        • contrib javadoc
        • existing docs/hadoop-default.html
        • man pages for our command line utilities (would need to write these)
        • whitepapers
        • hadoop wiki
        • etc.

        Comments?

        Doug Cutting added a comment -

        I believe smallJobsBenchmark was initially placed in contrib to keep it in a separate jar, not on the default classpath. So wherever it is moved, it should retain that property.

        I would prefer putting the contrib javadoc together with the main javadoc, as is done in Lucene:

        http://lucene.apache.org/java/docs/api/overview-summary.html

        As for a top-level index.html, should that really differ from site/index.html? That's included in releases and links to everything you mention, I think. Perhaps we should simply add an index.html with a meta-redirect to that page. Lucene does this:

        http://svn.apache.org/viewvc/lucene/java/trunk/index.html?view=markup

        (Like the last-modified date on that one?)

        Nigel Daley added a comment -

        Here's an updated proposal for restructuring the hadoop packaging:

        1) consolidate benchmarks into src/test/org/apache/hadoop and include them in the test jar file:

        • move src/contrib/smallJobsBenchmark to src/test/org/apache/hadoop/mapred
        • move src/examples/org/apache/hadoop/examples/NNBench.java to src/test/org/apache/hadoop/dfs

        2) rename PiBenchmark example to PiEstimator and leave it in the examples package

        3) build hadoop-streaming.jar into a top-level contrib directory using the package target

        4) package the contrib and examples javadoc with the core javadoc and provide an overview-summary.html like lucene's:
        http://lucene.apache.org/java/docs/api/overview-summary.html

        5) include the site docs into the top-level docs directory

        More comments?

        Nigel Daley added a comment -

        371.patch addresses all issues in my Dec 14 comment except one. I have not moved src/contrib/smallJobsBenchmark – I will open a new issue for this item. This patch also fixes some streaming javadoc comments that were causing warnings.

        Before committing 371.patch, you must perform the following svn commands:

        svn mv src/examples/org/apache/hadoop/examples/NNBench.java src/test/org/apache/hadoop/dfs/NNBench.java
        svn mv src/examples/org/apache/hadoop/examples/PiBenchmark.java src/examples/org/apache/hadoop/examples/PiEstimator.java

        The patch will then fix up their package and class names.
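        The ordering matters: the moves come first, then the patch. A dry-run sketch of that sequence follows. It only prints the steps unless RUN=1 is set, and it assumes that 371.patch sits in the current directory of a trunk checkout and applies with -p0; neither assumption is confirmed in this issue.

```shell
#!/bin/sh
set -eu

# Dry run by default: print each step. Set RUN=1 to execute the steps
# inside a Hadoop trunk checkout.
RUN=${RUN:-0}

# Print a command and, when RUN=1, execute it.
step() {
  echo "+ $*"
  if [ "$RUN" = 1 ]; then
    "$@"
  fi
}

plan() {
  ex=src/examples/org/apache/hadoop/examples
  # 1. Move the files first so that the patch finds them at their new paths.
  step svn mv "$ex/NNBench.java" src/test/org/apache/hadoop/dfs/NNBench.java
  step svn mv "$ex/PiBenchmark.java" "$ex/PiEstimator.java"
  # 2. Then apply the patch (the -p0 strip level is an assumption).
  step patch -p0 -i 371.patch
}

plan
```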

        Hadoop QA added a comment -

        -1, because the patch command could not apply the latest attachment (http://issues.apache.org/jira/secure/attachment/12348237/371.patch) as a patch to trunk revision r492365. Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.

        Nigel Daley added a comment -

        The -1 comment from Hadoop QA is expected as this patch cannot be applied until the noted svn mv commands are run.

        Doug Cutting added a comment -

        I just committed this. Thanks, Nigel!

        I made two small additional changes: I changed the links to the javadoc from api/ to api/index.html, so that they work offline. I also added a root index.html file that meta redirects to docs/index.html.
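        For readers unfamiliar with the technique, a root index.html that meta-redirects to docs/index.html could look roughly like the file generated below. This is a sketch written to a temporary directory, not the file that was actually committed; the markup around the meta tag is an assumption.

```shell
#!/bin/sh
set -eu

# Write the sketch into a scratch directory, not a real checkout.
out=$(mktemp -d)

cat > "$out/index.html" <<'EOF'
<html>
  <head>
    <!-- Send the browser to the site docs immediately. -->
    <meta http-equiv="refresh" content="0; url=docs/index.html">
  </head>
  <body>
    <!-- Fallback link for browsers that ignore the refresh. -->
    <a href="docs/index.html">docs/index.html</a>
  </body>
</html>
EOF

# Show the redirect line that was written.
grep 'http-equiv="refresh"' "$out/index.html"
```

        Because the target is a relative URL, the redirect also works when the release is browsed from the local file system.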


          People

          • Assignee: Nigel Daley
          • Reporter: Michel Tourn