SPARK-13631

getPreferredLocations race condition in Spark 1.6.0?


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.0
    • Fix Version/s: 1.6.2, 2.0.0
    • Component/s: Scheduler
    • Labels:
      None

      Description

      We are seeing what looks like a regression from Spark 1.2. When we run jobs from multiple threads, we get a crash somewhere inside getPreferredLocations, similar to the one fixed in SPARK-4454, except that it now occurs inside org.apache.spark.MapOutputTrackerMaster.getLocationsWithLargestOutputs rather than in DAGScheduler directly.

      I tried Spark 1.2 with the SPARK-4454 fix applied (without that patch it is only slightly flaky), 1.4.1, and 1.5.2, and all are fine. 1.6.0 crashes almost immediately on our threaded test case, though once in a while it passes.

      The stack trace is huge, but starts like this:

      Caused by: java.lang.NullPointerException: null
      at org.apache.spark.MapOutputTrackerMaster.getLocationsWithLargestOutputs(MapOutputTracker.scala:406)
      at org.apache.spark.MapOutputTrackerMaster.getPreferredLocationsForShuffle(MapOutputTracker.scala:366)
      at org.apache.spark.rdd.ShuffledRDD.getPreferredLocations(ShuffledRDD.scala:92)
      at org.apache.spark.rdd.RDD$$anonfun$preferredLocations$2.apply(RDD.scala:257)
      at org.apache.spark.rdd.RDD$$anonfun$preferredLocations$2.apply(RDD.scala:257)
      at scala.Option.getOrElse(Option.scala:120)
      at org.apache.spark.rdd.RDD.preferredLocations(RDD.scala:256)
      at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal(DAGScheduler.scala:1545)

      The full trace is available here:
      https://gist.github.com/andy256/97611f19924bbf65cf49
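
The sketch below is one plausible reading of the trace and has not been confirmed against the Spark source: a scheduling thread reads a per-shuffle table of map outputs while another thread is still filling it in, so some slots are still null. Everything in it (the OutputTable class, its fields, and both bytesPerHost methods) is hypothetical and only illustrates this class of bug; it is not Spark's code or Spark's fix.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a table of map outputs; NOT Spark's actual code. */
final class OutputTable {
    // One slot per map task. A slot stays null until that task has registered its output.
    private final String[] hostByMapTask;
    private final long[][] sizesByMapTask; // sizesByMapTask[mapId][reducerId]

    OutputTable(int numMapTasks) {
        hostByMapTask = new String[numMapTasks];
        sizesByMapTask = new long[numMapTasks][];
    }

    /** Called as each map task finishes, possibly from another thread. */
    synchronized void register(int mapId, String host, long[] sizes) {
        hostByMapTask[mapId] = host;
        sizesByMapTask[mapId] = sizes;
    }

    /** Unguarded reader: throws NullPointerException if any slot is still empty. */
    Map<String, Long> bytesPerHostUnsafe(int reducerId) {
        Map<String, Long> totals = new HashMap<>();
        for (int i = 0; i < hostByMapTask.length; i++) {
            // Dereferences sizesByMapTask[i] without checking whether it was registered.
            totals.merge(hostByMapTask[i], sizesByMapTask[i][reducerId], Long::sum);
        }
        return totals;
    }

    /** Guarded reader: takes the same lock as register() and skips unregistered slots. */
    synchronized Map<String, Long> bytesPerHostSafe(int reducerId) {
        Map<String, Long> totals = new HashMap<>();
        for (int i = 0; i < hostByMapTask.length; i++) {
            if (sizesByMapTask[i] == null) {
                continue; // map task i has not reported yet
            }
            totals.merge(hostByMapTask[i], sizesByMapTask[i][reducerId], Long::sum);
        }
        return totals;
    }
}
```

If only some map tasks have called register, bytesPerHostUnsafe fails with a NullPointerException, while bytesPerHostSafe returns totals for the outputs registered so far. Whether skipping the missing entries is acceptable depends on how the caller uses the result.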

            People

            • Assignee:
              asloane Andy Sloane
              Reporter:
              asloane Andy Sloane
            • Votes:
              0
              Watchers:
              5
