Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.0.3
    • Fix Version/s: 2.0.3-alpha
    • Component/s: documentation
    • Labels:
      None
    • Target Version/s:
    • Tags:
      multipleoutputs, multipletestoutputformat, new api, lazyoutputformat

      Description

      In the new API, MultipleOutputs can segment output into directories. Call MultipleOutputs.write(KEYOUT key, VALUEOUT value, String baseOutputPath) in the Reducer to determine the output directory, and use LazyOutputFormat in the job-level configuration to suppress the normal output files, which would otherwise be created empty. For example, use LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class); instead of job.setOutputFormatClass(TextOutputFormat.class);

      This recreates the functionality that the old API provided through MultipleTextOutputFormat and similar classes.
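
As a minimal sketch of the baseOutputPath idea described above, the helper below turns a record key into a relative path such as "fruit/part". The name generateFileName is borrowed from the review comments on this issue; its body, the class name OutputPathSketch, and the sanitising rule are assumptions made for illustration, not the code from the committed javadoc. The Hadoop calls appear only in comments so that the block compiles without Hadoop on the classpath.

```java
import java.util.Locale;

public class OutputPathSketch {

    /**
     * Hypothetical helper: maps a record key to a baseOutputPath.
     * The directory part comes from the key; "part" is the file name prefix.
     */
    static String generateFileName(String key) {
        // Lower-case the key and replace anything that is not a letter,
        // digit, underscore or hyphen so the result is a safe path segment.
        String dir = key.trim()
                        .toLowerCase(Locale.ROOT)
                        .replaceAll("[^a-z0-9_-]", "_");
        if (dir.isEmpty()) {
            dir = "unknown"; // assumed fallback for blank keys
        }
        return dir + "/part";
    }

    public static void main(String[] args) {
        // In a Reducer this value would be passed as the third argument:
        //   mos.write(key, value, generateFileName(key.toString()));
        // where mos is a MultipleOutputs created in setup() and closed in cleanup().
        // In the job setup, the default output would be made lazy with:
        //   LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
        System.out.println(generateFileName("Fruit"));
        System.out.println(generateFileName("New York"));
    }
}
```

With a helper like this, records whose key is "Fruit" would be written under a fruit/ subdirectory of the job output directory, in files whose names start with "part".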

      1. MAPREDUCE-4616.patch (4 kB, Tony Burton)
      2. MAPREDUCE-4616.patch (5 kB, Tony Burton)

        Activity

        Arun C Murthy added a comment -

        Thanks for catching this Tony. Do you mind providing a patch to fix this? Thanks!

        http://wiki.apache.org/hadoop/HowToContribute if you need details. Thanks again.

        Tony Burton added a comment -

        patch file for Jira issue MAPREDUCE-4616

        Tony Burton added a comment -

        Documentation changes to describe how to use MultipleOutputs and LazyOutputFormat to mimic the behaviour of the now-deprecated MultipleTextOutputFormat (and similar classes).

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12544650/MAPREDUCE-4616.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2842//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2842//console

        This message is automatically generated.

        Harsh J added a comment -

        Thanks Tony.

        A few comments we may address before this gets committed (mostly nits and typos):

        • "Use in conjuction" -> "Useful when used in conjunction" perhaps?
        • "Use your own code in <code>generateFileName()</code>". Your code sample references a generateFileName method but doesn't show an implementation. Perhaps add in a sample implementation that returns "part", "foo" or whatever?
        • "in your Hadoop job task-level setup." -> Simpler to say "in your Job configuration."?
        Tony Burton added a comment -

        Further modifications to documentation as a result of feedback from committers.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12545883/MAPREDUCE-4616.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +0 tests included. The patch appears to be a documentation patch that doesn't require tests.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2863//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2863//console

        This message is automatically generated.

        Arun C Murthy added a comment -

        I just committed this. Thanks Tony!

        (Having trouble adding you to contributors list on jira. I'll fix the assignee asap! Sorry!)

        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #2908 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2908/)
        MAPREDUCE-4616. Improve javadoc for MultipleOutputs. Contributed by Tony Burton. (Revision 1397182)

        Result = SUCCESS
        acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1397182
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/LazyOutputFormat.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #2846 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2846/)
        MAPREDUCE-4616. Improve javadoc for MultipleOutputs. Contributed by Tony Burton. (Revision 1397182)

        Result = SUCCESS
        acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1397182
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/LazyOutputFormat.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #2871 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2871/)
        MAPREDUCE-4616. Improve javadoc for MultipleOutputs. Contributed by Tony Burton. (Revision 1397182)

        Result = FAILURE
        acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1397182
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/LazyOutputFormat.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #1193 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1193/)
        MAPREDUCE-4616. Improve javadoc for MultipleOutputs. Contributed by Tony Burton. (Revision 1397182)

        Result = SUCCESS
        acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1397182
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/LazyOutputFormat.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #1224 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1224/)
        MAPREDUCE-4616. Improve javadoc for MultipleOutputs. Contributed by Tony Burton. (Revision 1397182)

        Result = SUCCESS
        acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1397182
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/LazyOutputFormat.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.java
        Harsh J added a comment -

        Thanks Arun! And sorry I missed this after that review, Tony; my apologies.


          People

          • Assignee:
            Tony Burton
            Reporter:
            Tony Burton
          • Votes:
            0
            Watchers:
            6
