  1. Spark
  2. SPARK-24687

NoClassDefFoundError thrown during task serialization causes the job to hang


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.0, 2.1.1
    • Fix Version/s: 2.3.3, 2.4.1, 3.0.0
    • Component/s: Spark Core
    • Labels:
      None

      Description

      When the exception below is thrown:

      Exception in thread "dag-scheduler-event-loop" java.lang.NoClassDefFoundError: Lcom/xxx/data/recommend/aggregator/queue/QueueName;
      	at java.lang.Class.getDeclaredFields0(Native Method)
      	at java.lang.Class.privateGetDeclaredFields(Class.java:2436)
      	at java.lang.Class.getDeclaredField(Class.java:1946)
      	at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1659)
      	at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:72)
      	at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:480)
      	at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
      	at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
      	at java.io.ObjectOutputStream.writeClass(ObjectOutputStream.java:1212)
      	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1119)
      	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
      	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
      	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
      	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
      	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
      	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
      	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
      	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
      	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
      	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
      	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
      	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
      	at java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1377)
      	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1173)
      	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
      	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
      	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
      	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
      	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
      	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
      	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
      	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
      	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
      	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
      	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
      	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
      	at java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1377)
      	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1173)
      	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
      	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
      	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
      	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
      	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
      	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
      	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
      	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
      

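      The stage hangs indefinitely instead of being aborted.

      This is because NoClassDefFoundError is not caught by the code below. It is a LinkageError, which scala.util.control.NonFatal treats as fatal, so neither catch clause matches and abortStage is never called.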

          var taskBinary: Broadcast[Array[Byte]] = null
          try {
            // For ShuffleMapTask, serialize and broadcast (rdd, shuffleDep).
            // For ResultTask, serialize and broadcast (rdd, func).
            val taskBinaryBytes: Array[Byte] = stage match {
              case stage: ShuffleMapStage =>
                JavaUtils.bufferToArray(
                  closureSerializer.serialize((stage.rdd, stage.shuffleDep): AnyRef))
              case stage: ResultStage =>
                JavaUtils.bufferToArray(closureSerializer.serialize((stage.rdd, stage.func): AnyRef))
            }
      
            taskBinary = sc.broadcast(taskBinaryBytes)
          } catch {
            // In the case of a failure during serialization, abort the stage.
            case e: NotSerializableException =>
              abortStage(stage, "Task not serializable: " + e.toString, Some(e))
              runningStages -= stage
      
              // Abort execution
              return
            case NonFatal(e) =>
              abortStage(stage, s"Task serialization failed: $e\n${Utils.exceptionString(e)}", Some(e))
              runningStages -= stage
              return
          }
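
      The behaviour described above can be reproduced outside Spark. The following is a minimal sketch, not the DAGScheduler code: NonFatalDemo, narrowCatch and widenedCatch are hypothetical names, and the returned strings stand in for the abortStage calls. It shows that the NonFatal extractor does not match a NoClassDefFoundError, so the error escapes the handler, whereas a handler that matches Throwable does see it. The widened catch is only one possible approach and is not claimed here to be the change that was merged.

```scala
import java.io.NotSerializableException
import scala.util.control.NonFatal

object NonFatalDemo {
  // Mirrors the shape of the catch block quoted in this issue.
  // The thrown value stands in for a failure inside closureSerializer.serialize.
  def narrowCatch(t: Throwable): String =
    try {
      throw t
    } catch {
      case _: NotSerializableException => "abort: not serializable"
      case NonFatal(_)                 => "abort: serialization failed"
    }

  // Hypothetical variant: the second case matches every Throwable,
  // including LinkageError subclasses such as NoClassDefFoundError.
  def widenedCatch(t: Throwable): String =
    try {
      throw t
    } catch {
      case _: NotSerializableException => "abort: not serializable"
      case _: Throwable                => "abort: serialization failed"
    }

  def main(args: Array[String]): Unit = {
    val err = new NoClassDefFoundError("com/example/QueueName") // illustrative class name

    // NonFatal(err) is false because NoClassDefFoundError is a LinkageError.
    println(NonFatal(err)) // false

    // With the narrow handler the error propagates to the caller.
    val outcome =
      try narrowCatch(err)
      catch { case _: LinkageError => "escaped the handler" }
    println(outcome) // escaped the handler

    // With the widened handler the failure is handled.
    println(widenedCatch(err)) // abort: serialization failed
  }
}
```

      Note that matching Throwable also intercepts errors such as OutOfMemoryError, so a handler like this is best limited to cleaning up and aborting rather than attempting to continue.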
      

        Attachments

        1. hanging-960.png
          264 kB
          zhoukang

            People

            • Assignee:
              cane zhoukang
              Reporter:
              cane zhoukang