  1. Spark
  2. SPARK-26068

ChunkedByteBufferInputStream is truncated by empty chunk

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: Spark Core
    • Labels: None

    Description

      If a ChunkedByteBuffer contains an empty chunk in the middle, the ChunkedByteBufferInputStream built on it is truncated: read returns -1 as soon as it reaches the empty chunk, so none of the data after that chunk is read.

      The problematic code:

      // ChunkedByteBuffer.scala
      // If chunks.next() returns an empty chunk, the second `if` fails and we take
      // the else branch even when chunks.hasNext is still true, so the remaining data is lost.
      override def read(dest: Array[Byte], offset: Int, length: Int): Int = {
        if (currentChunk != null && !currentChunk.hasRemaining && chunks.hasNext) {
          currentChunk = chunks.next()
        }
        if (currentChunk != null && currentChunk.hasRemaining) {
          val amountToGet = math.min(currentChunk.remaining(), length)
          currentChunk.get(dest, offset, amountToGet)
          amountToGet
        } else {
          close()
          -1
        }
      } 
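
      One way to avoid the truncation is to keep advancing until a non-empty chunk is found, rather than advancing only once. The sketch below is a minimal, self-contained illustration and not necessarily the change that was merged for this issue. The class name SkippingChunkInputStream and the helper advance() are hypothetical; only java.io.InputStream and java.nio.ByteBuffer are real APIs. Unlike the snippet above, it does not call close() at end of stream.

```scala
import java.io.InputStream
import java.nio.ByteBuffer

// Hypothetical stand-in for ChunkedByteBufferInputStream: exposes a sequence
// of ByteBuffer chunks as one continuous stream.
class SkippingChunkInputStream(buffers: Seq[ByteBuffer]) extends InputStream {
  private val chunks: Iterator[ByteBuffer] = buffers.iterator
  private var currentChunk: ByteBuffer =
    if (chunks.hasNext) chunks.next() else null

  // Skip every exhausted or empty chunk, not just one. The buggy version
  // used a single `if` here, so an empty chunk ended the stream early.
  private def advance(): Unit = {
    while (currentChunk != null && !currentChunk.hasRemaining && chunks.hasNext) {
      currentChunk = chunks.next()
    }
  }

  override def read(): Int = {
    advance()
    if (currentChunk != null && currentChunk.hasRemaining) {
      currentChunk.get() & 0xFF // convert the signed byte to 0..255
    } else {
      -1
    }
  }

  override def read(dest: Array[Byte], offset: Int, length: Int): Int = {
    advance()
    if (currentChunk != null && currentChunk.hasRemaining) {
      val amountToGet = math.min(currentChunk.remaining(), length)
      currentChunk.get(dest, offset, amountToGet)
      amountToGet
    } else {
      -1 // all chunks consumed
    }
  }
}
```

      With chunks [1, 2], an empty buffer, and [3], successive calls to read(dest, ...) on this sketch return the first two bytes, then the byte after the empty chunk, and only then -1. An alternative with the same effect would be to drop empty chunks before iterating over them.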

          People

            Assignee: Liu, Linhong (liulinhong)
            Reporter: Liu, Linhong (liulinhong)
            Votes: 0
            Watchers: 3