Spark / SPARK-18546

UnsafeShuffleWriter corrupts encrypted shuffle files when merging

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.1.0
    • Component/s: Spark Core
    • Labels: None

      Description

      The merging algorithm in UnsafeShuffleWriter does not take encryption into account. When it merges encrypted files, the resulting data cannot be read, because data encrypted with different initialization vectors ends up interleaved in the same partition data. This leads to exceptions such as the following when the files are read during shuffle:

      com.esotericsoftware.kryo.KryoException: com.ning.compress.lzf.LZFException: Corrupt input data, block did not start with 2 byte signature ('ZV') followed by type byte, 2-byte length)
      	at com.esotericsoftware.kryo.io.Input.fill(Input.java:142)
      	at com.esotericsoftware.kryo.io.Input.require(Input.java:155)
      	at com.esotericsoftware.kryo.io.Input.readInt(Input.java:337)
      	at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:109)
      	at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
      	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721)
      	at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:228)
      	at org.apache.spark.serializer.DeserializationStream.readKey(Serializer.scala:169)
      	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.readNextItem(ExternalAppendOnlyMap.scala:512)
      	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.hasNext(ExternalAppendOnlyMap.scala:533)
      ...
      

      (This trace comes from our internal branch, so the line numbers may not match upstream exactly.)
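
      The failure mode described above can be reproduced outside Spark. The following is a minimal sketch, not Spark's actual code: it assumes each file is written as its own stream consisting of a random 16-byte IV followed by AES/CTR ciphertext, and the class and method names (InterleavedIvDemo, encryptStream, decryptStream, concat) are hypothetical. It compares a merge that concatenates raw encrypted bytes with one that decrypts each stream separately before combining them.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical demo class; the stream layout (IV prefix + AES/CTR body) is an assumption.
public class InterleavedIvDemo {
    static final int IV_LEN = 16;
    static final SecureRandom RNG = new SecureRandom();

    // Runs AES/CTR over data[off, off+len) with the given key and IV.
    static byte[] ctr(int mode, byte[] key, byte[] iv, byte[] data, int off, int len) {
        try {
            Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
            c.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return c.doFinal(data, off, len);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] concat(byte[] a, byte[] b) {
        byte[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    // Writes one independent stream: a fresh random IV, then the ciphertext.
    static byte[] encryptStream(byte[] key, byte[] plain) {
        byte[] iv = new byte[IV_LEN];
        RNG.nextBytes(iv);
        return concat(iv, ctr(Cipher.ENCRYPT_MODE, key, iv, plain, 0, plain.length));
    }

    // Reads a buffer as ONE stream: takes the leading IV and decrypts everything after it.
    static byte[] decryptStream(byte[] key, byte[] data) {
        byte[] iv = Arrays.copyOfRange(data, 0, IV_LEN);
        return ctr(Cipher.DECRYPT_MODE, key, iv, data, IV_LEN, data.length - IV_LEN);
    }

    public static void main(String[] args) {
        byte[] key = new byte[16];
        RNG.nextBytes(key);
        byte[] a = "records from file 1;".getBytes(StandardCharsets.UTF_8);
        byte[] b = "records from file 2;".getBytes(StandardCharsets.UTF_8);
        byte[] expected = concat(a, b);

        byte[] s1 = encryptStream(key, a);
        byte[] s2 = encryptStream(key, b);

        // Naive merge: raw encrypted bytes are concatenated, so the second IV and the
        // second ciphertext are decrypted with the first stream's IV and counter.
        byte[] naive = decryptStream(key, concat(s1, s2));
        System.out.println("naive merge readable: " + Arrays.equals(naive, expected));

        // Stream-aware merge: each stream is decrypted with its own IV first.
        byte[] aware = concat(decryptStream(key, s1), decryptStream(key, s2));
        System.out.println("stream-aware merge readable: " + Arrays.equals(aware, expected));
    }
}
```

      In this sketch the naive merge recovers only the first stream's bytes; everything after it decrypts to garbage, which a downstream decompressor or deserializer would then reject. The stream-aware merge recovers both payloads intact.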

              People

              • Assignee: Marcelo Masiero Vanzin (vanzin)
              • Reporter: Marcelo Masiero Vanzin (vanzin)
              • Votes: 0
              • Watchers: 3
