Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.2
    • Fix Version/s: 0.22.0
    • Component/s: test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      org.apache.hadoop.mapred.TestBadRecords.testBadMapRed failed with the following
      exception:
      java.io.IOException: Job failed!
      at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1142)
      at org.apache.hadoop.mapred.TestBadRecords.runMapReduce(TestBadRecords.java:94)
      at org.apache.hadoop.mapred.TestBadRecords.testBadMapRed(TestBadRecords.java:211)

      1. ASF.LICENSE.NOT.GRANTED--mr-1617-v1.3.patch
        1 kB
        Amar Kamat
      2. mr-1617-trunk-v1.patch
        0.4 kB
        Luke Lu
      3. mr-1617-trunk-v2.patch
        0.4 kB
        Luke Lu
      4. TestBadRecords.txt
        234 kB
        Amareshwari Sriramadasu

        Issue Links

          Activity

          Hide
          Amareshwari Sriramadasu added a comment -

          Attaching the failure log

          Show
          Amareshwari Sriramadasu added a comment - Attaching the failure log
          Hide
          Amareshwari Sriramadasu added a comment -

          After going through the failure log, I think the following is the cause for failure.

          The test expects first three attempts of the task to fail with System.exit, RuntimeTimeException and Timed out (failed to report
          status in 30 seconds) respectively; and fourth attempt should succeed. But, in the test log, fourth attempt also timed out.

          Here is the log for fourth attempt :

          2010-03-22 01:25:51,560 INFO  mapred.JobTracker (JobTracker.java:createTaskEntry(2484)) - Adding task (MAP)
          'attempt_20100322012429762_0001_m_000000_3' to tip task_20100322012429762_0001_m_000000, for tracker
          'tracker_host1.foo.com:localhost/127.0.0.1:49080'
          2010-03-22 01:25:51,562 INFO  mapred.TaskTracker (TaskTracker.java:registerTask(2125)) - LaunchTaskAction
          (registerTask): attempt_20100322012429762_0001_m_000000_3 task's state:UNASSIGNED
          2010-03-22 01:25:51,562 INFO  mapred.TaskTracker (TaskTracker.java:run(2062)) - Trying to launch :
          attempt_20100322012429762_0001_m_000000_3 which needs 1 slots
          2010-03-22 01:25:51,562 INFO  mapred.TaskTracker (TaskTracker.java:run(2094)) - In TaskLauncher, current free slots : 2
          and trying to launch attempt_20100322012429762_0001_m_000000_3 which needs 1 slots
          2010-03-22 01:26:21,595 INFO  mapred.TaskTracker (TaskTracker.java:markUnresponsiveTasks(1682)) -
          attempt_20100322012429762_0001_m_000000_3: Task attempt_20100322012429762_0001_m_000000_3 failed to report status for
          30 seconds. Killing!
          2010-03-22 01:26:21,616 INFO  mapred.TaskTracker (TaskTracker.java:purgeTask(1827)) - About to purge task:
          attempt_20100322012429762_0001_m_000000_3
          2010-03-22 01:26:26,619 INFO  mapred.TaskRunner (MapTaskRunner.java:close(43)) -
          attempt_20100322012429762_0001_m_000000_3 done; removing files.
          2010-03-22 01:26:26,620 INFO  mapred.IndexCache (IndexCache.java:removeMap(140)) - Map ID
          attempt_20100322012429762_0001_m_000000_3 not found in cache
          

          For the fourth attempt, attempt_20100322012429762_0001_m_000000_3, I don't see the log saying "JVM with ID:xxxx is given
          task: attempt_20100322012429762_0001_m_000000_3".
          This says that jvm's getTask() has not returned in 30 seconds (the task's timeout configured in test). This is most likely because of HADOOP-5130. We avoid this in our clusters by setting -Djava.net.preferIPv4Stack=true in mapred.child.java.opts.

          Shall we set the same in Unit test(s) also ?

          Show
          Amareshwari Sriramadasu added a comment - After going through the failure log, I think the following is the cause for failure. The test expects first three attempts of the task to fail with System.exit, RuntimeTimeException and Timed out (failed to report status in 30 seconds) respectively; and fourth attempt should succeed. But, in the test log, fourth attempt also timed out. Here is the log for fourth attempt : 2010-03-22 01:25:51,560 INFO mapred.JobTracker (JobTracker.java:createTaskEntry(2484)) - Adding task (MAP) 'attempt_20100322012429762_0001_m_000000_3' to tip task_20100322012429762_0001_m_000000, for tracker 'tracker_host1.foo.com:localhost/127.0.0.1:49080' 2010-03-22 01:25:51,562 INFO mapred.TaskTracker (TaskTracker.java:registerTask(2125)) - LaunchTaskAction (registerTask): attempt_20100322012429762_0001_m_000000_3 task's state:UNASSIGNED 2010-03-22 01:25:51,562 INFO mapred.TaskTracker (TaskTracker.java:run(2062)) - Trying to launch : attempt_20100322012429762_0001_m_000000_3 which needs 1 slots 2010-03-22 01:25:51,562 INFO mapred.TaskTracker (TaskTracker.java:run(2094)) - In TaskLauncher, current free slots : 2 and trying to launch attempt_20100322012429762_0001_m_000000_3 which needs 1 slots 2010-03-22 01:26:21,595 INFO mapred.TaskTracker (TaskTracker.java:markUnresponsiveTasks(1682)) - attempt_20100322012429762_0001_m_000000_3: Task attempt_20100322012429762_0001_m_000000_3 failed to report status for 30 seconds. Killing! 2010-03-22 01:26:21,616 INFO mapred.TaskTracker (TaskTracker.java:purgeTask(1827)) - About to purge task: attempt_20100322012429762_0001_m_000000_3 2010-03-22 01:26:26,619 INFO mapred.TaskRunner (MapTaskRunner.java:close(43)) - attempt_20100322012429762_0001_m_000000_3 done; removing files. 2010-03-22 01:26:26,620 INFO mapred.IndexCache (IndexCache.java:removeMap(140)) - Map ID attempt_20100322012429762_0001_m_000000_3 not found in cache For the fourth attempt, attempt_20100322012429762_0001_m_000000_3, I don't see the log saying "JVM with ID:xxxx is given task: attempt_20100322012429762_0001_m_000000_3". This says that jvm's getTask() has not returned in 30 seconds (the task's timeout configured in test). This is most likely because of HADOOP-5130 . We avoid this in our clusters by setting -Djava.net.preferIPv4Stack=true in mapred.child.java.opts. Shall we set the same in Unit test(s) also ?
          Hide
          Amar Kamat added a comment -

          Attaching a patch for Yahoo!'s distribution of Hadoop not to be committed here. TestBadRecords fail because the task jvm take more than 30 secs to get the task from the TaskTracker. As this is a known IPv6 issue, this patch simply enables IPv4 stack for JUnit tests. TestBadRecords passes after the patch. test-patch and ant-tests passed with the patch.

          Show
          Amar Kamat added a comment - Attaching a patch for Yahoo!'s distribution of Hadoop not to be committed here. TestBadRecords fail because the task jvm take more than 30 secs to get the task from the TaskTracker. As this is a known IPv6 issue , this patch simply enables IPv4 stack for JUnit tests. TestBadRecords passes after the patch. test-patch and ant-tests passed with the patch.
          Hide
          Luke Lu added a comment -

          Incorporated some feedback from Chris: use LOG.is*Enabled checks to minimize log impact when the log is disabled.

          Show
          Luke Lu added a comment - Incorporated some feedback from Chris: use LOG.is*Enabled checks to minimize log impact when the log is disabled.
          Hide
          Luke Lu added a comment -

          Oops, the above comment is for MAPREDUCE-1543

          Show
          Luke Lu added a comment - Oops, the above comment is for MAPREDUCE-1543
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12444162/mr-1617-trunk-v1.patch
          against trunk revision 942764.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/536/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/536/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/536/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/536/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12444162/mr-1617-trunk-v1.patch against trunk revision 942764. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/536/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/536/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/536/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/536/console This message is automatically generated.
          Hide
          Amareshwari Sriramadasu added a comment -

          Luke, mapred.child.java.opts has default value of -Xmx200m. This will be overridden by your change. Your patch should have
          +<property>
          + <name>mapred.child.java.opts</name>
          + <value>-Xmx200m -Djava.net.preferIPv4Stack=true</value>
          +</property>

          Show
          Amareshwari Sriramadasu added a comment - Luke, mapred.child.java.opts has default value of -Xmx200m. This will be overridden by your change. Your patch should have +<property> + <name>mapred.child.java.opts</name> + <value>-Xmx200m -Djava.net.preferIPv4Stack=true</value> +</property>
          Hide
          Luke Lu added a comment -

          Good catch, Amarsri. I was testing the jvm default (-Xmx500m) vs -Xmx200m to see if the larger default Xmx makes the tests run faster. For the curious, there is no statistical difference in cold runs. However, in repeated tests, the smaller Xmx is actually significantly faster.

          Show
          Luke Lu added a comment - Good catch, Amarsri. I was testing the jvm default (-Xmx500m) vs -Xmx200m to see if the larger default Xmx makes the tests run faster. For the curious, there is no statistical difference in cold runs. However, in repeated tests, the smaller Xmx is actually significantly faster.
          Hide
          Amareshwari Sriramadasu added a comment -

          Patch looks fine.
          Canceling it to run through hudson.

          Show
          Amareshwari Sriramadasu added a comment - Patch looks fine. Canceling it to run through hudson.
          Hide
          Amareshwari Sriramadasu added a comment -

          submitting for hudson.

          Show
          Amareshwari Sriramadasu added a comment - submitting for hudson.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12444245/mr-1617-trunk-v2.patch
          against trunk revision 943358.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/182/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/182/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/182/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/182/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12444245/mr-1617-trunk-v2.patch against trunk revision 943358. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/182/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/182/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/182/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/182/console This message is automatically generated.
          Hide
          Chris Douglas added a comment -

          +1

          I committed this. Thanks Amar and Luke!

          Show
          Chris Douglas added a comment - +1 I committed this. Thanks Amar and Luke!
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #324 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/324/)
          MAPREDUCE-1617. Use IPv4 stack for unit tests. Contributed by Amar Kamat and Luke Lu

          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #324 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/324/ ) MAPREDUCE-1617 . Use IPv4 stack for unit tests. Contributed by Amar Kamat and Luke Lu

            People

            • Assignee:
              Luke Lu
              Reporter:
              Amareshwari Sriramadasu
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development