Spark / SPARK-8502

One character switches into uppercase, causing failures [serialization? shuffle?]

    Description

      This seems to be a weird, random, and hard-to-debug issue: a single character changes case (the letter itself stays the same). We see it roughly every third time we run our 2h+ workflow.

      One example:

      com.esotericsoftware.kryo.KryoException: Unable to find class: [Lorg.apache.sPark.sql.catalyst.expressions.MutableValue
      Serialization trace:
      values (org.apache.spark.sql.catalyst.expressions.SpecificMutableRow)
      at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)

      Notice how `spark` turned into `sPark` (!!!)
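To make the difference between the expected and the observed class name concrete, here is a minimal sketch that compares the two strings character by character and prints the XOR of each differing pair. `CaseFlipCheck` and `charDiffs` are hypothetical names made up for this illustration; nothing here comes from Spark or Kryo.

```scala
object CaseFlipCheck {
  // For two equal-length strings, return (index, xor of the two chars)
  // for every position where they differ.
  def charDiffs(expected: String, actual: String): Seq[(Int, Int)] = {
    require(expected.length == actual.length, "strings must have equal length")
    expected.zip(actual).zipWithIndex.collect {
      case ((e, a), i) if e != a => (i, e ^ a)
    }
  }

  def main(args: Array[String]): Unit = {
    val expected = "org.apache.spark.sql"
    val actual   = "org.apache.sPark.sql"
    charDiffs(expected, actual).foreach { case (i, xor) =>
      val hex  = Integer.toHexString(xor)
      val bits = Integer.bitCount(xor)
      println(s"index $i: xor = 0x$hex, bits set = $bits")
    }
  }
}
```

In ASCII, `p` is 0x70 and `P` is 0x50, so the two names differ by the single bit 0x20. That is consistent with a one-bit change somewhere in the byte stream, but on its own it does not show where the change happens.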

      What I have tracked down so far is that the same "mistake" is present on multiple executors, so quite likely this bug happens during serialization.

      This also happens for other custom data structures, such as a `case class(m: Map[String, String])`, which seems to get deserialized OK but contains a wrong value.
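One possible way to catch a value that deserializes "OK" but is wrong is to carry a checksum of the serialized bytes and compare it on the reading side. The sketch below is only an illustration under stated assumptions: it uses plain Java serialization and `java.util.zip.CRC32` from the standard library instead of Kryo, and `Payload`, `Sealed`, `seal`, and `verify` are hypothetical names, not part of any Spark or Kryo API.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import java.util.zip.CRC32

// Hypothetical stand-in for the custom data structure from the report.
case class Payload(m: Map[String, String])

// Serialized bytes together with the checksum computed when they were written.
final case class Sealed(bytes: Array[Byte], checksum: Long)

object ChecksumCheck {
  // Serialize with Java serialization (an assumption; the report uses Kryo).
  def toBytes(value: AnyRef): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(value) finally out.close()
    buffer.toByteArray
  }

  // CRC32 over the whole byte array.
  def crc(bytes: Array[Byte]): Long = {
    val c = new CRC32()
    c.update(bytes)
    c.getValue
  }

  // Writer side: serialize and record the checksum.
  def seal(value: AnyRef): Sealed = {
    val bytes = toBytes(value)
    Sealed(bytes, crc(bytes))
  }

  // Reader side: true only if the bytes still match the recorded checksum.
  def verify(s: Sealed): Boolean = crc(s.bytes) == s.checksum

  def main(args: Array[String]): Unit = {
    val original = seal(Payload(Map("name" -> "spark")))
    // Simulate a case flip by toggling bit 0x20 in the last byte.
    val damaged = original.bytes.clone()
    damaged(damaged.length - 1) = (damaged(damaged.length - 1) ^ 0x20).toByte
    println(s"original verifies: ${verify(original)}")
    println(s"damaged verifies:  ${verify(original.copy(bytes = damaged))}")
  }
}
```

If the checksum already mismatches right after writing, the change happens before or during serialization; if it only mismatches after reading, the bytes were altered somewhere in between. This narrows down the stage but does not identify the cause.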

      We use Scala 2.10.4 (the same happens with 2.10.5) and Spark 1.3.1, compiled for CDH 5.3.2 with YARN, with the Kryo serializer enabled.
      We also use Algebird 0.10.0 (which requires chill 0.6, versus the chill 0.5 used in Spark 1.3/1.4, but I'm pretty sure we've seen the issue with the older chill 0.5 too).

      Has anyone else seen the issue? Any ideas?

          People

            Assignee: Unassigned
            Reporter: vidmantas zemleris (vidma)