Hadoop Common
HADOOP-9665

BlockDecompressorStream#decompress will throw EOFException instead of return -1 when EOF

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.1.2, 2.1.0-beta, 2.3.0
    • Fix Version/s: 1-win, 2.1.0-beta, 1.2.1
    • Component/s: None
    • Labels: None

      Description

      BlockDecompressorStream#decompress ultimately calls rawReadInt, which throws EOFException instead of returning -1 when it encounters the end of a stream. decompress is in turn called by read. However, InputStream#read is supposed to return -1, not throw EOFException, to indicate the end of a stream. This explains why, in LineReader,

            if (bufferPosn >= bufferLength) {
              startPosn = bufferPosn = 0;
              if (prevCharCR)
                ++bytesConsumed; //account for CR from previous read
              bufferLength = in.read(buffer);
              if (bufferLength <= 0)
                break; // EOF
            }
      

      the return value is checked against -1 instead of catching EOFException.
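The contract LineReader relies on can be seen with any standard InputStream: once the stream is exhausted, read() signals EOF through its return value rather than an exception. A minimal demonstration (class name is illustrative, not from Hadoop):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Demonstrates the InputStream#read contract that LineReader depends on:
// at end of stream, read() returns -1 instead of throwing EOFException.
public class ReadContractDemo {
    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {42});
        System.out.println(in.read()); // 42: one byte available
        System.out.println(in.read()); // -1: EOF is signaled by the return value
    }
}
```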

      The problem occurs with SnappyCodec. If an input file is compressed with SnappyCodec, it must be decompressed through BlockDecompressorStream when it is read. If the file is empty, EOFException is thrown from rawReadInt and breaks LineReader.
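A minimal sketch of the expected behavior, assuming the fix pattern is to catch EOFException at the point where the 4-byte block-length header is read and translate it into -1 (the class and method names below are hypothetical, not taken from the patch):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch: when the block-length header cannot be read because
// the underlying stream is already at EOF, return -1 as the InputStream
// contract requires, instead of letting EOFException escape to the caller.
public class EofAwareBlockStream {
    private final DataInputStream in;

    public EofAwareBlockStream(InputStream raw) {
        this.in = new DataInputStream(raw);
    }

    // Reads the next block's length header; -1 signals a clean end of stream.
    public int readBlockLength() throws IOException {
        try {
            return in.readInt(); // analogous to rawReadInt
        } catch (EOFException e) {
            return -1; // end of stream: honor the read() contract
        }
    }

    public static void main(String[] args) throws IOException {
        // An empty input, like an empty Snappy-compressed file.
        EofAwareBlockStream s =
            new EofAwareBlockStream(new ByteArrayInputStream(new byte[0]));
        System.out.println(s.readBlockLength()); // prints -1, no exception
    }
}
```

With this translation in place, callers such as LineReader see -1 from read and terminate cleanly on empty input instead of crashing on EOFException.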

      1. HADOOP-9665.1.patch
        0.9 kB
        Zhijie Shen
      2. HADOOP-9665.2.patch
        3 kB
        Zhijie Shen
      3. HADOOP-9665-branch-1.1.patch
        7 kB
        Zhijie Shen


            People

            • Assignee: Zhijie Shen
            • Reporter: Zhijie Shen
            • Votes: 0
            • Watchers: 5
