SPARK-8502

One character switches into uppercase, causing failures [serialization? shuffle?]

    Details

      Description

      This seems to be a weird, random, and hard-to-debug issue in which a single character changes case (the letter itself stays the same). We see it in roughly every third run of our 2h+ workflow.

      One example:

      com.esotericsoftware.kryo.KryoException: Unable to find class: [Lorg.apache.sPark.sql.catalyst.expressions.MutableValue
      Serialization trace:
      values (org.apache.spark.sql.catalyst.expressions.SpecificMutableRow)
      at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)

      Notice how `spark` turned into `sPark` (!!!)
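
      As a side note for anyone trying to reproduce this: in ASCII, `p` (0x70) and `P` (0x50) differ in exactly one bit (0x20). The sketch below locates the mismatching character between the expected and the observed class name and counts the differing bits. The class `CaseFlip` and its helper methods are hypothetical names used only for illustration, and the "expected" string assumes the correct package is `org.apache.spark`, as shown in the serialization trace. A one-bit difference is what any ASCII case change looks like, so by itself it does not identify the cause.

```java
public class CaseFlip {
    // Number of bit positions in which the two characters differ.
    static int differingBits(char a, char b) {
        return Integer.bitCount(a ^ b);
    }

    // Index of the first differing character, or -1 if the strings are equal.
    // If one string is a prefix of the other, returns the shorter length.
    static int firstMismatch(String expected, String actual) {
        int n = Math.min(expected.length(), actual.length());
        for (int i = 0; i < n; i++) {
            if (expected.charAt(i) != actual.charAt(i)) {
                return i;
            }
        }
        return expected.length() == actual.length() ? -1 : n;
    }

    public static void main(String[] args) {
        // Assumption: the intended name uses the lowercase "spark" package.
        String expected = "[Lorg.apache.spark.sql.catalyst.expressions.MutableValue";
        // Name as it appears in the KryoException above.
        String actual = "[Lorg.apache.sPark.sql.catalyst.expressions.MutableValue";

        int i = firstMismatch(expected, actual);
        char e = expected.charAt(i);
        char a = actual.charAt(i);
        System.out.printf(
            "index=%d expected=%c (0x%02X) actual=%c (0x%02X) xor=0x%02X bits=%d%n",
            i, e, (int) e, a, (int) a, e ^ a, differingBits(e, a));
    }
}
```

      Running the same comparison on other corrupted values (for example the wrong `Map[String, String]` entries) would show whether they follow the same one-bit pattern.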

      What I have tracked down so far is that the same "mistake" is present on multiple executors, so quite likely this bug happens during serialization.

      This also happens with other custom data structures, such as a `case class(m: Map[String, String])`, which seems to get deserialized OK but contains a wrong value.

      We use Scala 2.10.4 (the same happens with 2.10.5) and Spark 1.3.1, compiled for CDH 5.3.2 with YARN, with the Kryo serializer enabled.
      We also use Algebird 0.10.0 (which requires chill 0.6 rather than the chill 0.5 used in Spark 1.3/1.4, but I'm pretty sure we have seen the issue with the older chill 0.5 too).

      Has anyone else seen the issue? Any ideas?

            People

            • Assignee: Unassigned
            • Reporter: vidma vidmantas zemleris