Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-1888

Streaming overrides user given output key and value types.

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.22.0
    • Component/s: contrib/streaming
    • Labels:
      None

      Description

      The following code in StreamJob.java overrides user given output key and value types.

          idResolver.resolve(conf.get(StreamJobConfig.MAP_OUTPUT,
              IdentifierResolver.TEXT_ID));
          conf.setClass(StreamJobConfig.MAP_OUTPUT_READER_CLASS,
            idResolver.getOutputReaderClass(), OutputReader.class);
          job.setMapOutputKeyClass(idResolver.getOutputKeyClass());
          job.setMapOutputValueClass(idResolver.getOutputValueClass());
          
          idResolver.resolve(conf.get(StreamJobConfig.REDUCE_OUTPUT,
              IdentifierResolver.TEXT_ID));
          conf.setClass(StreamJobConfig.REDUCE_OUTPUT_READER_CLASS,
            idResolver.getOutputReaderClass(), OutputReader.class);
          job.setOutputKeyClass(idResolver.getOutputKeyClass());
          job.setOutputValueClass(idResolver.getOutputValueClass());
      
      1. 1888.patch
        9 kB
        Ravi Gummadi
      2. 1888.v1.patch
        25 kB
        Ravi Gummadi
      3. 1888.v2.patch
        26 kB
        Ravi Gummadi
      4. 1888.v3.patch
        29 kB
        Ravi Gummadi
      5. 1888.v4.patch
        34 kB
        Ravi Gummadi

        Issue Links

          Activity

          Hide
          Ravi Gummadi added a comment -

          Attaching patch fixing the issue. Added testcase that validates the user set config properties mapreduce.map.output.key.class and mapreduce.output.key.class are honored in streaming jobs with java classes as mapper and reducer.

          Show
          Ravi Gummadi added a comment - Attaching patch fixing the issue. Added testcase that validates the user set config properties mapreduce.map.output.key.class and mapreduce.output.key.class are honored in streaming jobs with java classes as mapper and reducer.
          Hide
          Amareshwari Sriramadasu added a comment -

          Some comments on the patch:

          • Please don't change genArgs() in TestStreaming. You can override it in TestStreamingJavaTasks as done in other tests.
          • Is the Change in TestFileOutputFormat required?
          Show
          Amareshwari Sriramadasu added a comment - Some comments on the patch: Please don't change genArgs() in TestStreaming. You can override it in TestStreamingJavaTasks as done in other tests. Is the Change in TestFileOutputFormat required?
          Hide
          Ravi Gummadi added a comment -

          >> Please don't change genArgs() in TestStreaming. You can override it in TestStreamingJavaTasks as done in other tests.

          Hmm. That will be duplication of code. I want to avoid that. Modified all other testcases also that are derived from TestStreaming so that reuse of code is done(in my new patch).

          >> Is the Change in TestFileOutputFormat required?

          This is just a duplication of code of those 2 lines. Though unrelated to this JIRA, removed those 2 lines.

          Attaching new patch for trunk with the above mentioned changes.

          Show
          Ravi Gummadi added a comment - >> Please don't change genArgs() in TestStreaming. You can override it in TestStreamingJavaTasks as done in other tests. Hmm. That will be duplication of code. I want to avoid that. Modified all other testcases also that are derived from TestStreaming so that reuse of code is done(in my new patch). >> Is the Change in TestFileOutputFormat required? This is just a duplication of code of those 2 lines. Though unrelated to this JIRA, removed those 2 lines. Attaching new patch for trunk with the above mentioned changes.
          Hide
          Amareshwari Sriramadasu added a comment -

          Passing NONE for reducer command, still has problem with key, value types.

          Show
          Amareshwari Sriramadasu added a comment - Passing NONE for reducer command, still has problem with key, value types.
          Hide
          Ravi Gummadi added a comment -

          Attaching patch by fixing the Reducer NONE case.

          Show
          Ravi Gummadi added a comment - Attaching patch by fixing the Reducer NONE case.
          Hide
          Amareshwari Sriramadasu added a comment -

          +1 Patch looks fine.

          Show
          Amareshwari Sriramadasu added a comment - +1 Patch looks fine.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12448202/1888.v2.patch
          against trunk revision 958279.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 29 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/272/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/272/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/272/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/272/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12448202/1888.v2.patch against trunk revision 958279. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 29 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/272/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/272/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/272/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/272/console This message is automatically generated.
          Hide
          Amareshwari Sriramadasu added a comment -

          Canceling the patch to fix test failures.

          Show
          Amareshwari Sriramadasu added a comment - Canceling the patch to fix test failures.
          Hide
          Ravi Gummadi added a comment -

          TrApp.java when used as mapper should not expect mapreduce_job_output_key_class as Text and mapreduce_job_output_value_class as Text.
          Fixing TrApp.java so that it expects mapreduce_map_output_key_class and mapreduce_map_output_key_class.

          Show
          Ravi Gummadi added a comment - TrApp.java when used as mapper should not expect mapreduce_job_output_key_class as Text and mapreduce_job_output_value_class as Text. Fixing TrApp.java so that it expects mapreduce_map_output_key_class and mapreduce_map_output_key_class.
          Hide
          Ravi Gummadi added a comment -

          Attaching new patch fixing TrApp.

          Show
          Ravi Gummadi added a comment - Attaching new patch fixing TrApp.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12448337/1888.v3.patch
          against trunk revision 958902.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 35 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/274/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/274/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/274/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/274/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12448337/1888.v3.patch against trunk revision 958902. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 35 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/274/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/274/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/274/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/274/console This message is automatically generated.
          Hide
          Amareshwari Sriramadasu added a comment -

          +1 Latest patch looks fine.
          Test failure is because of MAPREDUCE-1834.

          Show
          Amareshwari Sriramadasu added a comment - +1 Latest patch looks fine. Test failure is because of MAPREDUCE-1834 .
          Hide
          Amareshwari Sriramadasu added a comment -

          I just committed this. Thanks Ravi !

          Show
          Amareshwari Sriramadasu added a comment - I just committed this. Thanks Ravi !
          Hide
          Amareshwari Sriramadasu added a comment -

          While working on MAPREDUCE-1122, i realized that one more corner case got missed here. For reducer=NONE, if the mapper is a command, then job output key and value types should be set from IO identifier.

          Show
          Amareshwari Sriramadasu added a comment - While working on MAPREDUCE-1122 , i realized that one more corner case got missed here. For reducer=NONE, if the mapper is a command, then job output key and value types should be set from IO identifier.
          Hide
          Amareshwari Sriramadasu added a comment -

          I just reverted the commit to fix the corner case.

          Show
          Amareshwari Sriramadasu added a comment - I just reverted the commit to fix the corner case.
          Hide
          Amareshwari Sriramadasu added a comment -

          Ravi, Can you add testcases for all combination of the following?
          1) Mapper is command or is a java mapper.
          2) Reducer is command or is java mapper or is None.
          You can rename the testcase TestStreamingOutputKeyValueTypes instead of TestStreamingJavaTasks.

          Show
          Amareshwari Sriramadasu added a comment - Ravi, Can you add testcases for all combination of the following? 1) Mapper is command or is a java mapper. 2) Reducer is command or is java mapper or is None. You can rename the testcase TestStreamingOutputKeyValueTypes instead of TestStreamingJavaTasks.
          Hide
          Ravi Gummadi added a comment -

          Attaching patch for trunk covering the case of reducer=NONE and mapper is a command.

          Show
          Ravi Gummadi added a comment - Attaching patch for trunk covering the case of reducer=NONE and mapper is a command.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12448506/1888.v4.patch
          against trunk revision 959867.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 35 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/591/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/591/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/591/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/591/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12448506/1888.v4.patch against trunk revision 959867. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 35 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/591/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/591/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/591/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/591/console This message is automatically generated.
          Hide
          Amareshwari Sriramadasu added a comment -

          Patch looks good.
          A couple of mumak tests failed with NoClassDefFoundError, they passed on my local machine. TestSimulatorDeterministicReplay failed because of MAPREDUCE-1834.
          Will check this in.

          Show
          Amareshwari Sriramadasu added a comment - Patch looks good. A couple of mumak tests failed with NoClassDefFoundError, they passed on my local machine. TestSimulatorDeterministicReplay failed because of MAPREDUCE-1834 . Will check this in.
          Hide
          Amareshwari Sriramadasu added a comment -

          I just committed this.
          Thanks Ravi!

          Show
          Amareshwari Sriramadasu added a comment - I just committed this. Thanks Ravi!
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #523 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/523/)

          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #523 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/523/ )

            People

            • Assignee:
              Ravi Gummadi
              Reporter:
              Amareshwari Sriramadasu
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development