  1. Hadoop Map/Reduce
  2. MAPREDUCE-2657 TaskTracker should handle disk failures
  3. MAPREDUCE-2656

MapReduce tasks fail continuously when one of the several hard disks available on the TaskTracker fails.


    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.20.2, 0.20.3
    • Fix Version/s: 0.20.2, 0.20.3
    • Component/s: None
    • Labels:
      None

      Description

      1. Pull out one hard disk from a TaskTracker node (one of its 10 disks). Some jobs start failing, but processing continues.
      2. Wait for some time (15 mins) and pull out one disk from another TaskTracker.
      3. More jobs fail now, as can be seen in the UI, and processing pauses.

      The following exceptions can be seen in the JobTracker UI for a failed job:

       
      Error initializing attempt_201010221528_10174_m_000011_0:
      java.io.IOException: Expecting a line not the end of stream
       at org.apache.hadoop.fs.DF.parseExecResult(DF.java:110)
       at org.apache.hadoop.util.Shell.runCommand(Shell.java:182)
       at org.apache.hadoop.util.Shell.run(Shell.java:137)
       at org.apache.hadoop.fs.DF.getAvailable(DF.java:74)
       at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:385)
       at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:134)
       at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:113)
       at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:835)
       at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1790)
       at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:104)
       at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1753)
      
      Error initializing attempt_201010221528_10174_m_000011_1:
      org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_201010221528_10174/work
       at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:454)
       at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:134)
       at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:113)
       at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:835)
       at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1790)
       at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:104)
       at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1753)
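
The message "Expecting a line not the end of stream" is raised from DF.parseExecResult, which suggests that `df` produced no data row for the pulled disk. As a minimal sketch only (this is not the Hadoop code or either attached patch), a parser could report "no value" for such output instead of throwing, so the caller can skip that directory. The class `DfOutputParser` and the method `parseAvailableKb` are hypothetical names, and the input is assumed to be the text printed by `df -k <dir>`.

```java
import java.util.OptionalLong;

/** Hypothetical helper, not part of Hadoop: parses `df -k <dir>` output defensively. */
final class DfOutputParser {

    private DfOutputParser() {}

    /**
     * Returns the "Available" column (in KB) from `df -k` output, or empty
     * when the output is missing or malformed, e.g. because the disk is gone.
     */
    static OptionalLong parseAvailableKb(String dfOutput) {
        if (dfOutput == null) {
            return OptionalLong.empty();
        }
        String[] lines = dfOutput.split("\\R");
        if (lines.length < 2) {
            // Only a header line, or nothing at all: treat the volume as unusable.
            return OptionalLong.empty();
        }
        // df may wrap a long device name onto its own line, so join
        // everything after the header before splitting into fields.
        StringBuilder data = new StringBuilder();
        for (int i = 1; i < lines.length; i++) {
            data.append(' ').append(lines[i]);
        }
        String[] fields = data.toString().trim().split("\\s+");
        // Assumed layout: filesystem, 1K-blocks, used, available, use%, mount point.
        if (fields.length < 6) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(fields[3]));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }
}
```

A caller would then treat an empty result as "skip this local directory" rather than failing task initialization.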
      
      

      The TaskTracker log shows the following:

       
      2010-10-25 16:36:24,215 ERROR mapred.TaskTracker (TaskTracker.java:offerService(1211)) - Caught exception: java.io.IOException: Expecting a line not the end of stream
              at org.apache.hadoop.fs.DF.parseExecResult(DF.java:110)
              at org.apache.hadoop.util.Shell.runCommand(Shell.java:182)
              at org.apache.hadoop.util.Shell.run(Shell.java:137)
              at org.apache.hadoop.fs.DF.getAvailable(DF.java:74)
              at org.apache.hadoop.mapred.TaskTracker.getFreeSpace(TaskTracker.java:1586)
              at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1274)
              at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1106)
              at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1848)
              at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3022)
      
      2010-10-25 16:36:24,216 INFO  mapred.TaskTracker (TaskTracker.java:run(1856)) - Lost connection to JobTracker [/192.168.97.1:9001].  Retrying...
      java.lang.Exception: java.io.IOException: Expecting a line not the end of stream
              at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1212)
              at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1848)
              at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3022)
      Caused by: java.io.IOException: Expecting a line not the end of stream
              at org.apache.hadoop.fs.DF.parseExecResult(DF.java:110)
              at org.apache.hadoop.util.Shell.runCommand(Shell.java:182)
              at org.apache.hadoop.util.Shell.run(Shell.java:137)
              at org.apache.hadoop.fs.DF.getAvailable(DF.java:74)
              at org.apache.hadoop.mapred.TaskTracker.getFreeSpace(TaskTracker.java:1586)
              at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1274)
              at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1106)
              ... 2 more
      2010-10-25 16:36:29,550 INFO  mapred.TaskTracker (TaskTracker.java:transmitHeartBeat(1256)) - Resending 'status' to '192.168.97.1' with reponseId '18361
      2010-10-25 16:36:29,550 WARN  mapred.TaskTracker (TaskTracker.java:checkLocalDirs(2982)) - Task Tracker local can not create directory: /hdfsdata/0/mapred/local
      2010-10-25 16:36:32,656 WARN  mapred.TaskTracker (TaskTracker.java:checkLocalDirs(2982)) - Task Tracker local can not create directory: /hdfsdata/0/mapred/local
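
In the log above, one failing directory aborts the whole free-space lookup inside the heartbeat, and the TaskTracker then reports a lost connection to the JobTracker. The sketch below illustrates the alternative of isolating failures per directory. It is an assumption-laden illustration using only java.io.File, not the TaskTracker implementation; `LocalDirSpace`, `usableDirs`, and `maxFreeSpace` are hypothetical names, and reporting the largest free space among healthy directories is a choice made for this example.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not the TaskTracker implementation. */
final class LocalDirSpace {

    private LocalDirSpace() {}

    /** Returns the configured directories that currently exist and are writable. */
    static List<File> usableDirs(List<String> configuredDirs) {
        List<File> usable = new ArrayList<>();
        for (String path : configuredDirs) {
            File dir = new File(path);
            // A missing or read-only directory is skipped instead of
            // failing the lookup for every other directory.
            if (dir.isDirectory() && dir.canWrite()) {
                usable.add(dir);
            }
        }
        return usable;
    }

    /** Largest usable space in bytes across healthy directories, or 0 if none is healthy. */
    static long maxFreeSpace(List<String> configuredDirs) {
        long max = 0L;
        for (File dir : usableDirs(configuredDirs)) {
            max = Math.max(max, dir.getUsableSpace());
        }
        return max;
    }
}
```

With a list such as /hdfsdata/0/mapred/local plus the local directories on the remaining disks, the failed disk's directory would simply drop out of the result while the others are still reported.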
      

      This seems to be fixed in the trunk.

        Attachments

        1. MAPREDUCE-2656.patch
          6 kB
          Devaraj Kavali
        2. HADOOP-7130.patch
          4 kB
          Devaraj Kavali


              People

              • Assignee: Devaraj Kavali
              • Reporter: Devaraj Kavali
              • Votes: 0
              • Watchers: 10
