Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.5.0
    • Fix Version/s: None
    • Component/s: fs
    • Labels:
      None

      Description

      The extended error message that reports the offending values finally paid off: I was able to capture the values that cause the "Summer buffer overflow" exception.

      java.lang.RuntimeException: Summer buffer overflow b.len=4096, off=0, summed=512, read=2880, bytesPerSum=1, inSum=512
      at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:100)
      at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:170)
      at java.io.BufferedInputStream.read1(BufferedInputStream.java:254)
      at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
      at java.io.DataInputStream.read(DataInputStream.java:80)
      at org.apache.hadoop.util.CopyFiles$DFSCopyFilesMapper.copy(CopyFiles.java:190)
      at org.apache.hadoop.util.CopyFiles$DFSCopyFilesMapper.map(CopyFiles.java:391)
      at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:196)
      at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1075)
      Caused by: java.lang.ArrayIndexOutOfBoundsException
      at java.util.zip.CRC32.update(CRC32.java:43)
      at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:98)
      ... 9 more

      Tracing through the code: inside FSDataInputStream.Checker.read(), verifySum gets an EOFException and turns off summing. Among other things, this sets bytesPerSum to 1. Unfortunately, that leads to the ArrayIndexOutOfBoundsException.
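The arithmetic behind that failure can be sketched from the values in the exception message alone. This is a minimal illustration, not the Hadoop source: the class name, method names, and the chunking formula (remaining bytes in the current checksum chunk, capped by the bytes still buffered) are assumptions chosen to be consistent with the reported numbers. Only `java.util.zip.CRC32.update(byte[], int, int)` is a real API, and it rejects a negative length with an ArrayIndexOutOfBoundsException.

```java
import java.util.zip.CRC32;

// Hypothetical sketch; names and the chunking formula are assumptions,
// not code taken from FSDataInputStream.
public class SummerOverflowSketch {

    /** Bytes to feed the CRC next: the rest of the current chunk, capped by what is buffered. */
    static int nextChunkLength(int bytesPerSum, int inSum, int read, int summed) {
        int goal = bytesPerSum - inSum;  // bytes still needed to complete the chunk
        int inBuf = read - summed;       // bytes read but not yet summed
        return Math.min(inBuf, goal);
    }

    /** Returns true if CRC32.update rejects the computed range. */
    static boolean overflows(int bufLen, int off, int bytesPerSum,
                             int inSum, int read, int summed) {
        byte[] b = new byte[bufLen];
        int toSum = nextChunkLength(bytesPerSum, inSum, read, summed);
        try {
            new CRC32().update(b, off + summed, toSum);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Values from the report: b.len=4096, off=0, summed=512, read=2880,
        // bytesPerSum=1, inSum=512.
        System.out.println(nextChunkLength(1, 512, 2880, 512));    // prints -511
        System.out.println(overflows(4096, 0, 1, 512, 2880, 512)); // prints true
    }
}
```

Under these assumptions, resetting bytesPerSum to 1 while inSum is still 512 makes the remaining-chunk length negative, which is enough on its own to trigger the exception even though the buffer has plenty of room.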

      I think the problem is that the original EOF exception was logged and ignored. I propose that we allow the original EOF to propagate back to the caller. (A missing checksum file would still disable checksum checking, but truncated checksum files would be detected.)
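One way to picture the proposed policy is the sketch below. It is an assumption-laden illustration rather than a patch: `ChecksumPolicySketch`, `openSums`, and `readStoredCrc` are hypothetical names, and a plain local file stands in for the checksum file. A missing file quietly disables checking, while a truncated file surfaces as an EOFException to the caller.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical sketch of the proposed behavior; not the Hadoop implementation.
public class ChecksumPolicySketch {

    /** Opens the checksum file, or returns null (checking disabled) if it does not exist. */
    static DataInputStream openSums(File sumFile) {
        try {
            return new DataInputStream(new FileInputStream(sumFile));
        } catch (FileNotFoundException e) {
            return null; // no checksum file: silently skip verification
        }
    }

    /** Reads the next stored CRC; a truncated file throws EOFException to the caller. */
    static int readStoredCrc(DataInputStream sums) throws IOException {
        return sums.readInt(); // deliberately not caught here
    }

    public static void main(String[] args) throws IOException {
        DataInputStream sums = openSums(new File("no-such-file.crc"));
        System.out.println(sums == null ? "checksums disabled" : "checksums enabled");
    }
}
```

The design question raised in the comments still applies: with this split, errors while processing an existing checksum file become user-visible, not just known data corruption.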

        Activity

        Owen O'Malley created issue -
        Doug Cutting added a comment -

        To be clear, currently we ignore errors processing checksums (checksum file not found, too short, timeouts while reading, etc.) so that the checksum system only throws user-visible exceptions when data is known to be corrupt. You're proposing we change this so that, if the checksum file is there, then we may throw user-visible exceptions for errors processing the checksum data (like unexpected eof). Is that right, or are you proposing something else?

        Doug Cutting made changes -
        Field Original Value New Value
        Fix Version/s 0.6.0 [ 12312025 ]
        Eli Collins added a comment -

        Out of date.

        Eli Collins made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Won't Fix [ 2 ]

          People

          • Assignee:
            Owen O'Malley
            Reporter:
            Owen O'Malley
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved:
