Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Only one MR cluster is brought up, so job IDs can no longer clash.

      Description

      It times out with "Could not find /taskTracker/jobcache/jobid/work in any of the configured local directories".

      1. MAPREDUCE-153-v1.0.patch (14 kB, Amar Kamat)
      2. MAPREDUCE-153-v1.1.patch (19 kB, Amar Kamat)
      3. MAPREDUCE-153-v1.1-branch-0.20.patch (17 kB, Amar Kamat)

          Activity

          amar_kamat Amar Kamat added a comment -

          The only error message I could see was

           [junit] org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200905060050_0001/work in any of the configured local directories
           [junit]     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:381)
           [junit]     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
           [junit]     at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.localizeTask(TaskTracker.java:1888)
           [junit]     at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.launchTask(TaskTracker.java:2001)
           [junit]     at org.apache.hadoop.mapred.TaskTracker.launchTaskForJob(TaskTracker.java:880)
           [junit]     at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:874)
           [junit]     at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1739)
           [junit]     at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:97)
           [junit]     at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1704)
          
          ravidotg Ravi Gummadi added a comment -

          I also observed the failure on trunk. It seems to fail if we run the test 2 or 3 times.

          amar_kamat Amar Kamat added a comment -

          The problem occurs when the jobtrackers start within the same minute and the job IDs clash. Attaching a patch that runs all the tests with one mapred cluster. The runtime for this test is now 1m3secs. Working on bringing it down further.

          Result of test-patch
          [exec] +1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 6 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

          This is just a testcase change, hence no ant tests are required.
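
The clash described in this comment can be illustrated with a minimal sketch. It assumes, based only on the id `job_200905060050_0001` in the stack trace above, that a job ID is built from the jobtracker start time at minute resolution plus a per-tracker sequence number. The class and method names are hypothetical and are not Hadoop code.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch, not Hadoop code: job IDs derived from a
// minute-resolution tracker start time plus a sequence number.
class JobIdClashSketch {

    // Assumed pattern: year, month, day, hour, minute (no seconds).
    static String trackerIdentifier(long startMillis) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmm", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // fixed zone keeps the sketch deterministic
        return fmt.format(new Date(startMillis));
    }

    // Builds an ID of the form job_<trackerIdentifier>_<4-digit sequence>.
    static String jobId(long trackerStartMillis, int sequence) {
        return String.format(Locale.ROOT, "job_%s_%04d",
                trackerIdentifier(trackerStartMillis), sequence);
    }

    public static void main(String[] args) {
        long firstStart = 1241571000000L;          // 2009-05-06 00:50:00 UTC
        long secondStart = firstStart + 20_000L;   // 20 seconds later, same minute
        long thirdStart = firstStart + 60_000L;    // one minute later

        // The first job of two trackers started within the same minute gets the same ID.
        System.out.println(jobId(firstStart, 1));
        System.out.println(jobId(secondStart, 1));
        // A tracker started in the next minute gets a different ID.
        System.out.println(jobId(thirdStart, 1));
    }
}
```

Under that assumption, two clusters started in the same minute by one test hand out identical first job IDs, which is consistent with the fix of bringing up a single cluster.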

          amar_kamat Amar Kamat added a comment -

          Attaching a patch that adds 2 more tests:

          1. Test listener events with 0 maps and 0 reducers with setup/cleanup
          2. Test listener events with 0 maps, 0 reducers and no setup/cleanup

          Broke the main test down into subtests and made sure the minimr is brought up only once. Runtime of the testcase is now 1m19secs. This is a testcase-only change, hence no ant test results are required.

          Result of test-patch
          [exec] +1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 6 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

          Currently investigating if unit tests can be written for this testcase.

          jothipn Jothi Padmanabhan added a comment -

          A couple of minor suggestions:

          1. Can we use NullOutputFormat for the tests so that we avoid doing any output promotions?
          2. testQueuedJobKill can be done at the end – that way we could avoid one call to startInitializer
          amar_kamat Amar Kamat added a comment -

          1. Can we use NullOutputFormat for the tests so that we avoid doing any output promotions?

          I think we can keep it as it is and change it in some jira that deals with UtilsForTests.

          2. testQueuedJobKill can be done at the end – that way we could avoid one call to startInitializer

          I think I did it on purpose. The reason is that I am creating only one MR cluster, which is shared across the testcases. I think it's safer not to assume the state of the initializer before running a testcase, hence I forcefully stop/start the initializer. It's just a thread start and stop call.
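
The reasoning above can be sketched as follows. This is a minimal illustration with a hypothetical `FakeCluster` stand-in, not the MiniMR cluster or the actual test. It assumes that some subtest may leave the initializer stopped, and shows why forcing a stop/start before each subtest gives every subtest a known starting state on a shared cluster.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Hadoop code: one cluster shared by several subtests.
class SharedClusterSketch {

    // Stand-in for a test cluster with an initializer thread (names are assumptions).
    static class FakeCluster {
        private boolean initializerRunning = true;

        void stopInitializer() { initializerRunning = false; }
        void startInitializer() { initializerRunning = true; }
        boolean isInitializerRunning() { return initializerRunning; }
    }

    // Forces a known state regardless of what the previous subtest left behind.
    static void resetInitializer(FakeCluster cluster) {
        cluster.stopInitializer();
        cluster.startInitializer();
    }

    // Runs `count` subtests on ONE cluster and records the initializer state
    // each subtest observes when it begins. Every subtest leaves the
    // initializer stopped (an assumption made for illustration).
    static List<Boolean> runSubtests(int count, boolean resetBeforeEach) {
        FakeCluster cluster = new FakeCluster(); // brought up once, shared by all subtests
        List<Boolean> observed = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            if (resetBeforeEach) {
                resetInitializer(cluster);
            }
            observed.add(cluster.isInitializerRunning());
            cluster.stopInitializer(); // subtest body leaves the initializer stopped
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println(runSubtests(3, true));
        System.out.println(runSubtests(3, false));
    }
}
```

Without the reset, later subtests inherit whatever state the earlier ones left, which is the ordering dependency the forced stop/start avoids.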

          jothipn Jothi Padmanabhan added a comment -

          The reason is that I am creating only one MR cluster, which is shared across the testcases. I think it's safer not to assume the state of the initializer before running a testcase, hence I forcefully stop/start the initializer. It's just a thread start and stop call.

          I think this can be easily worked out. However, since the gain by removing one call to the initializer thread start/stop is not much, I am OK with the way things are.

          sharadag Sharad Agarwal added a comment -

          I committed this. Thanks Amar!

          hudson Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #20 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/20/)

          amar_kamat Amar Kamat added a comment -

          Attaching a patch for branch-0.20.


            People

            • Assignee:
              amar_kamat Amar Kamat
              Reporter:
              amar_kamat Amar Kamat