Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.1
    • Fix Version/s: 0.22.0
    • Component/s: contrib/streaming
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Fixed an NPE in streaming that occurs when there is no input to reduce and the streaming reducer sends status updates by writing "reporter:status: xxx" statements to stderr.

      Description

      Some reduce tasks fail with following NPE
      java.lang.RuntimeException: java.lang.NullPointerException
      at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
      at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:540)
      at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:137)
      at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:474)
      at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:412)
      at org.apache.hadoop.mapred.Child.main(Child.java:159)
      Caused by: java.lang.NullPointerException
      at org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.setStatus(PipeMapRed.java:517)
      at org.apache.hadoop.streaming.PipeMapRed$MRErrorThread.run(PipeMapRed.java:449)

      1. 1813.patch
        7 kB
        Ravi Gummadi
      2. 1813.v1.2.patch
        13 kB
        Ravi Gummadi
      3. 1813.v1.3.patch
        31 kB
        Ravi Gummadi
      4. 1813.v1.4.patch
        32 kB
        Ravi Gummadi
      5. 1813.v1.patch
        12 kB
        Ravi Gummadi

        Issue Links

          Activity

          Hide
          Amareshwari Sriramadasu added a comment -

          The above NPE occurs when there is no input to reduce, but the reducer writes "reporter:status: xxx" statements to stderr.

          Show
          Amareshwari Sriramadasu added a comment - The above NPE occurs when there is no input to reduce, but the reducer writes "reporter:status: xxx" statements to stderr.
          Hide
          Todd Lipcon added a comment -

          I think this may be duplicate of MAPREDUCE-576 (solved by HADOOP-5623 backport). Could also be related to MAPREDUCE-1481

          Show
          Todd Lipcon added a comment - I think this may be duplicate of MAPREDUCE-576 (solved by HADOOP-5623 backport). Could also be related to MAPREDUCE-1481
          Hide
          Ravi Gummadi added a comment -

          This is not a duplicate of MAPREDUCE-576 and still exists after HADOOP-5623.
          The issue here is NPE is seen because error thread's reporter is null. In source code, this reporter is set only through startOutputThreads() which is called from reduce(). When the input to reducer is empty, reduce() is not called and thus reporter of error thread is not set — causing NPE.

          If the following perl script is reducer,

          #!/usr/bin/perl
          print STDERR "reporter:status: before\n";
          while(<STDIN>) {
            chomp;
          }
          print STDERR "reporter:status: after\n";
          

          the first print statement causes this NPE even with non-empty input to reducer. The second print statement causes this NPE if reducer's input is empty. Need to fix both the cases.

          The problem exists in current trunk also.

          Show
          Ravi Gummadi added a comment - This is not a duplicate of MAPREDUCE-576 and still exists after HADOOP-5623 . The issue here is NPE is seen because error thread's reporter is null. In source code, this reporter is set only through startOutputThreads() which is called from reduce(). When the input to reducer is empty, reduce() is not called and thus reporter of error thread is not set — causing NPE. If the following perl script is reducer, #!/usr/bin/perl print STDERR "reporter:status: before\n" ; while (<STDIN>) { chomp; } print STDERR "reporter:status: after\n" ; the first print statement causes this NPE even with non-empty input to reducer. The second print statement causes this NPE if reducer's input is empty. Need to fix both the cases. The problem exists in current trunk also.
          Hide
          Ravi Gummadi added a comment -

          Attaching patch for trunk fixing the issue.

          This patch sets the reporter of MRErrorThread(in PipeMapRed.java) to a dummy reporter at the beginning of MRErrorThread.run() method just to avoid NPE. The actual reporter will be later set properly as usual and reports as needed. In the interim(before the actual reporter is set), the dummy reporter will just ignore the reporting lines.

          This is the way MRErrorThread's reporter is set to dummy reporter in waitOutputThreads() for the case of empty input based reducer. See HADOOP-4620 for details of the same.
          But before control goes to waitOutputThreads(), if the MRErrorThread's reporter is accessed, we get NPE. So to avoid this NPE, am setting this dummy reporter.

          This patch adds testcases which tests the perl script given in earlier comment as mapper and as reducer. Both test cases (1) perl script as streaming job's mapper and (2) perl script as streaming based reducer
          ---- fail without the fix and pass with the fix.

          Please review and provide your comments.

          Show
          Ravi Gummadi added a comment - Attaching patch for trunk fixing the issue. This patch sets the reporter of MRErrorThread(in PipeMapRed.java) to a dummy reporter at the beginning of MRErrorThread.run() method just to avoid NPE. The actual reporter will be later set properly as usual and reports as needed. In the interim(before the actual reporter is set), the dummy reporter will just ignore the reporting lines. This is the way MRErrorThread's reporter is set to dummy reporter in waitOutputThreads() for the case of empty input based reducer. See HADOOP-4620 for details of the same. But before control goes to waitOutputThreads(), if the MRErrorThread's reporter is accessed, we get NPE. So to avoid this NPE, am setting this dummy reporter. This patch adds testcases which tests the perl script given in earlier comment as mapper and as reducer. Both test cases (1) perl script as streaming job's mapper and (2) perl script as streaming based reducer ---- fail without the fix and pass with the fix. Please review and provide your comments.
          Hide
          Amareshwari Sriramadasu added a comment -

          Patch looks fine to me.

          Show
          Amareshwari Sriramadasu added a comment - Patch looks fine to me.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Um.. the way the patch works, all the updates to the reporter like incrCounter(), setStatus() before the actual reduce begins will be LOST. Instead if we start the errorThread once a reduce starts, all the updates will be buffered till they are processed and so won't be lost. Thoughts?

          Show
          Vinod Kumar Vavilapalli added a comment - Um.. the way the patch works, all the updates to the reporter like incrCounter(), setStatus() before the actual reduce begins will be LOST. Instead if we start the errorThread once a reduce starts, all the updates will be buffered till they are processed and so won't be lost. Thoughts?
          Hide
          Ravi Gummadi added a comment -

          Attaching patch that delays the starting of error thread till output thread is started so that we don't lose any stderr "report:status:" lines because of dummy reporter in case of having "report:status:" lines before consuming input in the reducer.

          Show
          Ravi Gummadi added a comment - Attaching patch that delays the starting of error thread till output thread is started so that we don't lose any stderr "report:status:" lines because of dummy reporter in case of having "report:status:" lines before consuming input in the reducer.
          Hide
          Amareshwari Sriramadasu added a comment -

          Some comments on the patch:
          1. The current test is not regression test. The test added in earlier patch with empty input, should still be there.
          2. Starting up and shutting down MiniMRCluster for every testcase will just increase the test time. Can you run testStreamingMapStatus and testStreamingReduceStatus with single setup of MiniMRCluster.
          3. Minor : Can you rename mapAttemptID in method validateTaskStderr to just attemptID ?

          Show
          Amareshwari Sriramadasu added a comment - Some comments on the patch: 1. The current test is not regression test. The test added in earlier patch with empty input, should still be there. 2. Starting up and shutting down MiniMRCluster for every testcase will just increase the test time. Can you run testStreamingMapStatus and testStreamingReduceStatus with single setup of MiniMRCluster. 3. Minor : Can you rename mapAttemptID in method validateTaskStderr to just attemptID ?
          Hide
          Ravi Gummadi added a comment -

          Attaching patch by changing the testcases to have

          {perl script with empty input having "reporter:status:" lines and "reporter:counter:" lines written to stderr}

          instead of

          {StderrApp.class being used as streaming task}

          . Because, with nonempty input, earlier patch's testcase was not causing NPE consistently(fails based on timing) without the fix of the patch.

          Please review and provide your comments.

          Show
          Ravi Gummadi added a comment - Attaching patch by changing the testcases to have {perl script with empty input having "reporter:status:" lines and "reporter:counter:" lines written to stderr} instead of {StderrApp.class being used as streaming task} . Because, with nonempty input, earlier patch's testcase was not causing NPE consistently(fails based on timing) without the fix of the patch. Please review and provide your comments.
          Hide
          Amareshwari Sriramadasu added a comment -

          Latest patch looks fine.

          Show
          Amareshwari Sriramadasu added a comment - Latest patch looks fine.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12446581/1813.v1.2.patch
          against trunk revision 952548.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/229/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/229/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/229/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/229/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12446581/1813.v1.2.patch against trunk revision 952548. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/229/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/229/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/229/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/229/console This message is automatically generated.
          Hide
          Ravi Gummadi added a comment -

          Failed tests are not related to this patch.
          TestSimulatorDeterministicReplay failure is because of MAPREDUCE-1834.
          TestSimulatorSerialJobSubmission failed with ZipException and passed when I ran on my local machine.
          TestSimulatorStressJobSubmission also passed when I ran on my local machine.

          Show
          Ravi Gummadi added a comment - Failed tests are not related to this patch. TestSimulatorDeterministicReplay failure is because of MAPREDUCE-1834 . TestSimulatorSerialJobSubmission failed with ZipException and passed when I ran on my local machine. TestSimulatorStressJobSubmission also passed when I ran on my local machine.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Code looks good. I have some testcase improvements in my mind.

          • TestStreamingStatus.testStreamingStatus() seems redundant except for the 'status report from user' check. Can we include the later into testReportingPerlScript() and dump the redundant testcase altogether?
          • TestStreamingStatus covers all possibilities with your patch and TestStreamingEmptyInpNonemptyOut doesn't seem to require its own testcase. Can we combine these two easily?
          Show
          Vinod Kumar Vavilapalli added a comment - Code looks good. I have some testcase improvements in my mind. TestStreamingStatus.testStreamingStatus() seems redundant except for the 'status report from user' check. Can we include the later into testReportingPerlScript() and dump the redundant testcase altogether? TestStreamingStatus covers all possibilities with your patch and TestStreamingEmptyInpNonemptyOut doesn't seem to require its own testcase. Can we combine these two easily?
          Hide
          Ravi Gummadi added a comment -

          Attaching new patch with the testcases refactoring. Moved the validation of task status from testStreamingStatus() to testReporting() and removed testStreamingStatus() method. Modified the perl script of TestStreamingStatus.java to write to stdout(similar to that is there in TestStreamingEmptyInpnonemptyOut.java) and removed the file TestStreamingEmptyInpnonemptyOut.java.

          As there are 3 copies of the method readOutput() in different files, removed 2 of the 3 readOutput() methods and kept the one in MapReduceTestUtil.java.

          Show
          Ravi Gummadi added a comment - Attaching new patch with the testcases refactoring. Moved the validation of task status from testStreamingStatus() to testReporting() and removed testStreamingStatus() method. Modified the perl script of TestStreamingStatus.java to write to stdout(similar to that is there in TestStreamingEmptyInpnonemptyOut.java) and removed the file TestStreamingEmptyInpnonemptyOut.java. As there are 3 copies of the method readOutput() in different files, removed 2 of the 3 readOutput() methods and kept the one in MapReduceTestUtil.java.
          Hide
          Ravi Gummadi added a comment -

          Attaching new patch as I missed replacing one occurence of readOutput() call in earlier patch.

          Show
          Ravi Gummadi added a comment - Attaching new patch as I missed replacing one occurence of readOutput() call in earlier patch.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          +1 for the patch. Submitting to Hudson.

          Show
          Vinod Kumar Vavilapalli added a comment - +1 for the patch. Submitting to Hudson.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12446841/1813.v1.4.patch
          against trunk revision 953490.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 38 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/234/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/234/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/234/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/234/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12446841/1813.v1.4.patch against trunk revision 953490. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 38 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/234/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/234/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/234/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/234/console This message is automatically generated.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          TestSimulatorDeterministicReplay and TestCopyFiles failures are tracked at MAPREDUCE-1834 and MAPREDUCE-1858 respectively. I'm going to check this in.

          Show
          Vinod Kumar Vavilapalli added a comment - TestSimulatorDeterministicReplay and TestCopyFiles failures are tracked at MAPREDUCE-1834 and MAPREDUCE-1858 respectively. I'm going to check this in.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          I just committed this to trunk. Thanks Ravi!

          Show
          Vinod Kumar Vavilapalli added a comment - I just committed this to trunk. Thanks Ravi!

            People

            • Assignee:
              Ravi Gummadi
              Reporter:
              Amareshwari Sriramadasu
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development