Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.2
    • Fix Version/s: 0.22.0
    • Component/s: test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      org.apache.hadoop.mapred.TestBadRecords.testBadMapRed failed with the following
      exception:
      java.io.IOException: Job failed!
      at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1142)
      at org.apache.hadoop.mapred.TestBadRecords.runMapReduce(TestBadRecords.java:94)
      at org.apache.hadoop.mapred.TestBadRecords.testBadMapRed(TestBadRecords.java:211)

      1. ASF.LICENSE.NOT.GRANTED--mr-1617-v1.3.patch
        1 kB
        Amar Kamat
      2. mr-1617-trunk-v1.patch
        0.4 kB
        Luke Lu
      3. mr-1617-trunk-v2.patch
        0.4 kB
        Luke Lu
      4. TestBadRecords.txt
        234 kB
        Amareshwari Sriramadasu

        Issue Links

          Activity

          Amareshwari Sriramadasu created issue -
          Amareshwari Sriramadasu added a comment -

          Attaching the failure log

          Amareshwari Sriramadasu made changes -
          Field Original Value New Value
          Attachment TestBadRecords.txt [ 12439541 ]
          Amareshwari Sriramadasu added a comment -

          After going through the failure log, I think the following is the cause of the failure.

          The test expects the first three attempts of the task to fail with System.exit, RuntimeException, and a timeout (failed to report
          status in 30 seconds), respectively, and the fourth attempt to succeed. But in the test log, the fourth attempt also timed out.

          Here is the log for the fourth attempt:

          2010-03-22 01:25:51,560 INFO  mapred.JobTracker (JobTracker.java:createTaskEntry(2484)) - Adding task (MAP)
          'attempt_20100322012429762_0001_m_000000_3' to tip task_20100322012429762_0001_m_000000, for tracker
          'tracker_host1.foo.com:localhost/127.0.0.1:49080'
          2010-03-22 01:25:51,562 INFO  mapred.TaskTracker (TaskTracker.java:registerTask(2125)) - LaunchTaskAction
          (registerTask): attempt_20100322012429762_0001_m_000000_3 task's state:UNASSIGNED
          2010-03-22 01:25:51,562 INFO  mapred.TaskTracker (TaskTracker.java:run(2062)) - Trying to launch :
          attempt_20100322012429762_0001_m_000000_3 which needs 1 slots
          2010-03-22 01:25:51,562 INFO  mapred.TaskTracker (TaskTracker.java:run(2094)) - In TaskLauncher, current free slots : 2
          and trying to launch attempt_20100322012429762_0001_m_000000_3 which needs 1 slots
          2010-03-22 01:26:21,595 INFO  mapred.TaskTracker (TaskTracker.java:markUnresponsiveTasks(1682)) -
          attempt_20100322012429762_0001_m_000000_3: Task attempt_20100322012429762_0001_m_000000_3 failed to report status for
          30 seconds. Killing!
          2010-03-22 01:26:21,616 INFO  mapred.TaskTracker (TaskTracker.java:purgeTask(1827)) - About to purge task:
          attempt_20100322012429762_0001_m_000000_3
          2010-03-22 01:26:26,619 INFO  mapred.TaskRunner (MapTaskRunner.java:close(43)) -
          attempt_20100322012429762_0001_m_000000_3 done; removing files.
          2010-03-22 01:26:26,620 INFO  mapred.IndexCache (IndexCache.java:removeMap(140)) - Map ID
          attempt_20100322012429762_0001_m_000000_3 not found in cache
          

          For the fourth attempt, attempt_20100322012429762_0001_m_000000_3, I don't see the log line "JVM with ID:xxxx is given
          task: attempt_20100322012429762_0001_m_000000_3".
          This indicates that the JVM's getTask() did not return within 30 seconds (the task timeout configured in the test). This is most likely because of HADOOP-5130. We avoid this in our clusters by setting -Djava.net.preferIPv4Stack=true in mapred.child.java.opts.

          Shall we set the same in the unit test(s) as well?
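
          As a rough cross-check of the log excerpt above, the following sketch extracts the attempt index from the attempt ID and measures the gap between the "Trying to launch" and "Killing!" lines. AttemptLogMath and its method names are hypothetical helpers written for this illustration, not part of Hadoop. It assumes the trailing number of the attempt ID is the zero-based attempt index, which is consistent with the fourth attempt ending in _3.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper for reading the quoted log lines; not part of Hadoop. */
final class AttemptLogMath {
    // Timestamp layout used in the quoted log lines, e.g. "2010-03-22 01:25:51,562".
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    /** Assumed zero-based attempt index: the number after the last underscore. */
    static int attemptIndex(String attemptId) {
        return Integer.parseInt(attemptId.substring(attemptId.lastIndexOf('_') + 1));
    }

    /** Milliseconds elapsed between two log timestamps. */
    static long elapsedMillis(String from, String to) {
        return Duration.between(LocalDateTime.parse(from, LOG_TIME),
                                LocalDateTime.parse(to, LOG_TIME)).toMillis();
    }

    public static void main(String[] args) {
        String id = "attempt_20100322012429762_0001_m_000000_3";
        // Index 3 corresponds to the fourth attempt.
        System.out.println("attempt number: " + (attemptIndex(id) + 1));
        // Launch request (01:25:51,562) to "Killing!" (01:26:21,595).
        System.out.println("launch to kill: "
                + elapsedMillis("2010-03-22 01:25:51,562", "2010-03-22 01:26:21,595") + " ms");
    }
}
```

          With the two timestamps taken from the log, the gap comes to 30033 ms, which lines up with the 30-second task timeout configured in the test.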

          Amareshwari Sriramadasu made changes -
          Link This issue is related to HADOOP-5130 [ HADOOP-5130 ]
          Amar Kamat added a comment -

          Attaching a patch for Yahoo!'s distribution of Hadoop; it is not to be committed here. TestBadRecords fails because the task JVM takes more than 30 seconds to get the task from the TaskTracker. As this is a known IPv6 issue, this patch simply enables the IPv4 stack for JUnit tests. TestBadRecords passes after the patch. test-patch and ant tests passed with the patch.

          Amar Kamat made changes -
          Attachment mr-1617-v1.3.patch [ 12441951 ]
          Luke Lu made changes -
          Assignee Luke Lu [ vicaya ]
          Luke Lu added a comment -

          Incorporated some feedback from Chris: use LOG.is*Enabled checks to minimize log impact when the log is disabled.

          Luke Lu made changes -
          Attachment mr-1617-trunk-v1.patch [ 12444162 ]
          Luke Lu made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Affects Version/s 0.20.2 [ 12314205 ]
          Luke Lu added a comment -

          Oops, the above comment is for MAPREDUCE-1543

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12444162/mr-1617-trunk-v1.patch
          against trunk revision 942764.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/536/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/536/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/536/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/536/console

          This message is automatically generated.

          Amareshwari Sriramadasu added a comment -

          Luke, mapred.child.java.opts has a default value of -Xmx200m. This will be overridden by your change. Your patch should have:
          +<property>
          + <name>mapred.child.java.opts</name>
          + <value>-Xmx200m -Djava.net.preferIPv4Stack=true</value>
          +</property>
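
          The same concern applies whenever the option string is built in code: the flag should be appended to the existing value rather than replacing it. A minimal sketch, assuming the options are a whitespace-separated string; ChildJavaOpts and withIPv4Stack are hypothetical names for illustration, not Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; the class and method names are illustrative, not Hadoop API. */
final class ChildJavaOpts {
    static final String IPV4_FLAG = "-Djava.net.preferIPv4Stack=true";
    private static final String IPV4_KEY = "-Djava.net.preferIPv4Stack=";

    /** Returns existingOpts with the IPv4 flag set, keeping every other option. */
    static String withIPv4Stack(String existingOpts) {
        List<String> opts = new ArrayList<>();
        if (existingOpts != null && !existingOpts.trim().isEmpty()) {
            for (String opt : existingOpts.trim().split("\\s+")) {
                // Drop any earlier setting of the same property so it is not duplicated.
                if (!opt.startsWith(IPV4_KEY)) {
                    opts.add(opt);
                }
            }
        }
        opts.add(IPV4_FLAG);
        return String.join(" ", opts);
    }

    public static void main(String[] args) {
        // The default heap setting survives alongside the new flag.
        System.out.println(withIPv4Stack("-Xmx200m"));
    }
}
```

          Starting from the default -Xmx200m, this yields the same value as the property shown above: the heap limit is kept and the IPv4 flag is added after it.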

          Luke Lu added a comment -

          Good catch, Amarsri. I was testing the JVM default (-Xmx500m) vs. -Xmx200m to see whether the larger default Xmx makes the tests run faster. For the curious, there is no statistically significant difference in cold runs. However, in repeated tests, the smaller Xmx is actually significantly faster.

          Luke Lu made changes -
          Attachment mr-1617-trunk-v2.patch [ 12444245 ]
          Amareshwari Sriramadasu added a comment -

          Patch looks fine.
          Canceling it to run through Hudson.

          Amareshwari Sriramadasu made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Amareshwari Sriramadasu added a comment -

          Submitting for Hudson.

          Amareshwari Sriramadasu made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12444245/mr-1617-trunk-v2.patch
          against trunk revision 943358.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/182/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/182/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/182/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/182/console

          This message is automatically generated.

          Chris Douglas added a comment -

          +1

          I committed this. Thanks Amar and Luke!

          Chris Douglas made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Resolution Fixed [ 1 ]
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #324 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/324/)
          MAPREDUCE-1617. Use IPv4 stack for unit tests. Contributed by Amar Kamat and Luke Lu

          Konstantin Shvachko made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Transition                    Time In Source Status   Execution Times   Last Executer             Last Execution Date
          Patch Available -> Open       1d 3h 57m               1                 Amareshwari Sriramadasu   12/May/10 04:48
          Open -> Patch Available       48d 19h 53m             2                 Amareshwari Sriramadasu   12/May/10 04:49
          Patch Available -> Resolved   9d 5h 55m               1                 Chris Douglas             21/May/10 10:44
          Resolved -> Closed            569d 20h 33m            1                 Konstantin Shvachko       12/Dec/11 06:18

            People

            • Assignee:
              Luke Lu
              Reporter:
              Amareshwari Sriramadasu
            • Votes:
              0
              Watchers:
              0
