Hadoop Common / HADOOP-1159

Reducers hang when map output file has a checksum error

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.12.2, 0.12.3
    • None
    • None
    • None

    Description

      Two reduces hung in our sort benchmark. They repeatedly fail to fetch map outputs from node X: reading the map output file on that node hits a checksum error, which results in a NullPointerException on node X. As a result, the two fetching reduces fail constantly.

      2007-03-26 00:02:57,082 WARN org.apache.hadoop.fs.FileSystem: Moving bad file /e/c/k/hqa/tb/tmp/mapred/local2/task_0002_m_022488_0/file.out to /e/c/bad_files/file.out.542279301
      2007-03-26 00:02:57,083 INFO org.apache.hadoop.fs.FSInputChecker: Found checksum error: org.apache.hadoop.fs.ChecksumException: Checksum error: /e/c/k/hqa/tb/tmp/mapred/local2/task_0002_m_022488_0/file.out at 106484224
      at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.verifySum(ChecksumFileSystem.java:254)
      at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.readBuffer(ChecksumFileSystem.java:211)
      at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.read(ChecksumFileSystem.java:167)
      at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:41)
      at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
      at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
      at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
      at java.io.DataInputStream.read(DataInputStream.java:132)
      at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:1659)
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
      at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
      at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
      at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
      at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
      at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
      at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
      at org.mortbay.http.HttpServer.service(HttpServer.java:954)
      at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
      at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
      at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
      at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
      at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
      at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)

      2007-03-26 00:02:57,083 WARN /: /mapOutput?map=task_0002_m_022488_0&reduce=1542:
      java.lang.NullPointerException
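
The failure mode described above (a checksum error while serving a map output that surfaces as an unrelated exception) can be illustrated with a small, self-contained sketch. This is not the Hadoop code or any of the attached patches. The class `ChecksummedFetchSketch`, the exception `ChunkChecksumException`, the `serve` method, and the status strings are all hypothetical names chosen for illustration. The sketch only assumes per-chunk CRC32 checksums and shows one way a server could turn a checksum mismatch into an explicit error reply that the fetching side can act on.

```java
import java.io.IOException;
import java.util.zip.CRC32;

// Hypothetical sketch; not Hadoop's ChecksumFileSystem or MapOutputServlet.
class ChecksummedFetchSketch {

    /** Hypothetical exception carrying the offset of the corrupt chunk. */
    static class ChunkChecksumException extends IOException {
        final long offset;

        ChunkChecksumException(long offset) {
            super("Checksum error at " + offset);
            this.offset = offset;
        }
    }

    /** Computes one CRC32 value per fixed-size chunk of the data. */
    static long[] checksums(byte[] data, int chunkSize) {
        int chunks = (data.length + chunkSize - 1) / chunkSize;
        long[] sums = new long[chunks];
        for (int i = 0; i < chunks; i++) {
            int off = i * chunkSize;
            int len = Math.min(chunkSize, data.length - off);
            CRC32 crc = new CRC32();
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    /** Verifies every chunk against its expected sum; returns the verified byte count. */
    static long verify(byte[] stored, long[] sums, int chunkSize)
            throws ChunkChecksumException {
        long verified = 0;
        for (int i = 0; i < sums.length; i++) {
            int off = i * chunkSize;
            int len = Math.min(chunkSize, stored.length - off);
            if (len <= 0) {
                // Stored data is shorter than the checksums claim: treat as corrupt.
                throw new ChunkChecksumException(off);
            }
            CRC32 crc = new CRC32();
            crc.update(stored, off, len);
            if (crc.getValue() != sums[i]) {
                throw new ChunkChecksumException(off);
            }
            verified += len;
        }
        return verified;
    }

    /** Serves the data, mapping a checksum failure to an explicit error status. */
    static String serve(byte[] stored, long[] sums, int chunkSize) {
        try {
            long n = verify(stored, sums, chunkSize);
            return "200 OK bytes=" + n;
        } catch (ChunkChecksumException e) {
            // Report the corruption explicitly instead of failing in an unrelated way.
            return "500 checksum error at offset " + e.offset;
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        long[] sums = checksums(data, 256);
        System.out.println(serve(data, sums, 256));

        byte[] corrupt = data.clone();
        corrupt[600] ^= 1; // flip one bit inside the third chunk
        System.out.println(serve(corrupt, sums, 256));
    }
}
```

With an explicit error reply of this kind, a fetching task could distinguish a corrupt map output from a transient network failure and report it, rather than retrying the same fetch indefinitely. Whether that matches the approach taken in the attached patches is not stated in this report.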

      Attachments

        1. h1159-2.patch (1.0 kB, Owen O'Malley)
        2. h1159.patch (0.6 kB, Owen O'Malley)
        3. 1159-merge.patch (3 kB, Devaraj Das)
        4. 1159.patch (2 kB, Devaraj Das)

          People

            Assignee: Owen O'Malley (omalley)
            Reporter: Nigel Daley (nidaley)