Spark / SPARK-7873

Serializer re-use + Kryo autoReset disabled leads to ArrayIndexOutOfBounds exception in sort-shuffle bypassMergeSort path

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.4.0
    • Component/s: Shuffle, Spark Core
    • Labels: None

    Description

      This is a somewhat obscure bug, but I think it will seriously impact KryoSerializer users whose custom registrators disable auto-reset. With auto-reset disabled, some of our shuffle paths break, because they end up creating multiple OutputStreams from the same shared SerializerInstance, which is unsafe. To illustrate this, the following test fails in 1.4:

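      import org.apache.spark.SharedSparkContext
      import org.apache.spark.serializer.KryoSerializer
      import org.scalatest.FunSuite

      // RegistratorWithoutAutoReset is a custom KryoRegistrator that disables auto-reset.
      class KryoSerializerAutoResetDisabledSuite extends FunSuite with SharedSparkContext {
        conf.set("spark.serializer", classOf[KryoSerializer].getName)
        conf.set("spark.kryo.registrator", classOf[RegistratorWithoutAutoReset].getName)

        test("sort-shuffle with bypassMergeSort") {
          val myObject = ("Hello", "World")
          // Shuffle 100 copies of the same tuple into 2 partitions and check that they round-trip intact.
          assert(sc.parallelize(Seq.fill(100)(myObject)).repartition(2).collect().toSet === Set(myObject))
        }
      }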
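
The test above refers to RegistratorWithoutAutoReset without showing it. A minimal sketch of such a registrator, assuming it does nothing more than turn off Kryo's auto-reset (the class body is an illustration, not necessarily the definition used in Spark's own test suite), might look like this:

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator

// Assumed definition: a registrator whose only job is to disable auto-reset.
class RegistratorWithoutAutoReset extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    // With auto-reset off, Kryo keeps its reference-tracking state after each
    // top-level write until reset() is called explicitly.
    kryo.setAutoReset(false)
  }
}
```

It would be picked up through the spark.kryo.registrator setting, as in the test's conf.set call.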
      

      This was introduced by a patch (SPARK-3386) that enables serializer re-use in some of the shuffle paths, since constructing new serializer instances is fairly costly for KryoSerializer. We had already fixed another corner-case bug related to this (SPARK-7766), but missed this one. From an engineering risk-management perspective, we probably should have just reverted the original serializer re-use patch and added a big test suite covering the cross-product of configurations and shuffle managers before attempting to fix the defects.

      I think that I have a pretty simple fix for this, but we still might want to consider a revert for 1.4 just to be safe.
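
For illustration, the conservative alternative to sharing one SerializerInstance across several open streams is to give each concurrently open stream its own instance. The sketch below shows that pattern only. It is not the fix applied for this issue, it pays the instance-construction cost that serializer re-use was meant to avoid, and the object and helper names (OneInstancePerStream, openStreams) are hypothetical.

```scala
import java.io.ByteArrayOutputStream
import org.apache.spark.SparkConf
import org.apache.spark.serializer.{KryoSerializer, SerializationStream}

object OneInstancePerStream {
  // Hypothetical helper: opens `n` in-memory streams that are all live at once,
  // each backed by its own SerializerInstance rather than a shared one.
  def openStreams(
      serializer: KryoSerializer,
      n: Int): Seq[(ByteArrayOutputStream, SerializationStream)] =
    (0 until n).map { _ =>
      val bytes = new ByteArrayOutputStream()
      // newInstance() is the costly call that serializer re-use tries to avoid.
      val stream = serializer.newInstance().serializeStream(bytes)
      (bytes, stream)
    }

  def main(args: Array[String]): Unit = {
    val serializer = new KryoSerializer(new SparkConf(false))
    val streams = openStreams(serializer, 2)
    val myObject = ("Hello", "World")
    // Interleave writes across the open streams, roughly as a per-partition writer might.
    for (i <- 0 until 100) streams(i % 2)._2.writeObject(myObject)
    streams.foreach(_._2.close())
  }
}
```

Because no Kryo state is shared between the streams, this pattern does not depend on whether a registrator has disabled auto-reset.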

            People

              Assignee: Josh Rosen (joshrosen)
              Reporter: Josh Rosen (joshrosen)