  1. Spark
  2. SPARK-3121

Wrong implementation of implicit bytesWritableConverter

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0.2, 1.1.0, 1.2.0
    • Fix Version/s: 1.0.3, 1.1.1, 1.2.0
    • Component/s: Spark Core
    • Labels: None

    Description

      val path = ... // path to a SequenceFile whose key and value types are both BytesWritable
      val file = sc.sequenceFile[Array[Byte],Array[Byte]](path)
      file.take(1)(0)._1

      This prints the wrong byte array content. The returned array starts with the correct bytes, but seemingly random bytes and zeros are appended to it. BytesWritable has two methods for reading its content:

      getBytes() - returns the entire internal backing array, which is often longer than the value actually stored. The extra space usually contains leftovers from previous, longer values.

      copyBytes() - returns a copy of only the beginning of the internal array, up to the internal length property.
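
The difference between the two methods can be reproduced without Spark. The following is a minimal sketch, assuming hadoop-common (2.x or later, where copyBytes() is available) is on the classpath; the object and method names are hypothetical and exist only for illustration. It reuses one BytesWritable for a longer value and then a shorter one, which is roughly what a record reader does when it recycles the same instance.

```scala
import org.apache.hadoop.io.BytesWritable

// Hypothetical demo object, not part of Spark or Hadoop.
object BytesWritableDemo {

  // Store a 5-byte value, then overwrite it with a 2-byte value
  // in the same BytesWritable instance.
  def reusedBuffer(): BytesWritable = {
    val bw = new BytesWritable()
    bw.set(Array[Byte](1, 2, 3, 4, 5), 0, 5)
    bw.set(Array[Byte](9, 9), 0, 2)
    bw
  }

  def main(args: Array[String]): Unit = {
    val bw = reusedBuffer()
    // Logical length of the current value.
    println(s"getLength        = ${bw.getLength}")
    // Length of the backing array, which is not shrunk by the second set().
    println(s"getBytes.length  = ${bw.getBytes.length}")
    // Whole backing array: the current value plus stale bytes and padding.
    println(s"getBytes         = ${bw.getBytes.mkString(",")}")
    // Only the first getLength bytes, copied into a fresh array.
    println(s"copyBytes        = ${bw.copyBytes().mkString(",")}")
  }
}
```

The exact size of the backing array depends on how the Hadoop version in use grows its buffer, so only the first getLength bytes of getBytes() should be treated as meaningful.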

      It looks like the implicit conversion between BytesWritable and Array[Byte] uses getBytes instead of the correct copyBytes.
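
Until a fixed version is in use, one possible workaround is to read the file with BytesWritable as the key and value type and trim each buffer explicitly. This is a sketch under assumptions, not the change that was merged for this issue: the names ExactBytes, toExactBytes, and readExact are hypothetical, and it assumes spark-core and hadoop-common are on the classpath. It uses Arrays.copyOfRange with getLength, which is equivalent to copyBytes() and also works on Hadoop versions that lack copyBytes().

```scala
import java.util.Arrays

import org.apache.hadoop.io.BytesWritable
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical helper object, not part of Spark.
object ExactBytes {

  // Copy only the valid prefix of the backing array (the first getLength bytes).
  def toExactBytes(bw: BytesWritable): Array[Byte] =
    Arrays.copyOfRange(bw.getBytes, 0, bw.getLength)

  // Read a SequenceFile of (BytesWritable, BytesWritable) and convert every
  // record to exact-length byte arrays. Copying inside map also avoids holding
  // references to Writable instances that the record reader may reuse.
  def readExact(sc: SparkContext, path: String): RDD[(Array[Byte], Array[Byte])] =
    sc.sequenceFile(path, classOf[BytesWritable], classOf[BytesWritable])
      .map { case (k, v) => (toExactBytes(k), toExactBytes(v)) }
}
```

With this helper, `ExactBytes.readExact(sc, path).take(1)(0)._1` should contain only the bytes of the stored key, without trailing leftovers.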

          People

            Assignee: Unassigned
            Reporter: Jakub Dubovsky (dubovsky)