Details
- Bug
- Status: Resolved
- Critical
- Resolution: Fixed
- 1.1.0, 1.2.0
- None
Description
When deserializing tasks on executors, we sometimes see IOException: unexpected exception type:
java.io.IOException: unexpected exception type
    java.io.ObjectStreamClass.throwMiscException(ObjectStreamClass.java:1538)
    java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1025)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
    org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
    org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:163)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    java.lang.Thread.run(Thread.java:745)
Here are some occurrences of this bug reported on the mailing list and GitHub:
- https://www.mail-archive.com/user@spark.apache.org/msg12129.html
- http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201409.mbox/%3CCAEaWm8UOp9TGarm5scEpPZEy5qxO+H8hU8UjzaH5s-ajyzZB_g@mail.gmail.com%3E
- https://github.com/yieldbot/flambo/issues/13
- https://www.mail-archive.com/user@spark.apache.org/msg13283.html
This is probably caused by throwing exceptions other than IOException from our custom readExternal methods (see http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7u40-b43/java/io/ObjectStreamClass.java#1022). davies spotted an instance of this in TorrentBroadcast, where a failed require throws a different exception (Scala's require throws IllegalArgumentException), but this issue has been reported in Spark 1.1.0 as well. To fix this, I'm going to add try-catch blocks around all of our readExternal and writeExternal methods so that any caught exception that is not already an IOException is rethrown wrapped in an IOException.
This fix should allow us to determine the actual exceptions that are causing deserialization failures.
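The wrapping described above could look roughly like the following Java sketch. It is an illustration only, not the actual Spark patch: the helper wrapAsIOException, the IOBody interface, and the Block class are hypothetical names made up for this example, and the negative-size check merely stands in for a failed require.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ReadExternalWrapDemo {
    /** Hypothetical callback type so the wrapped body may throw any exception. */
    interface IOBody {
        void run() throws Exception;
    }

    /** Hypothetical helper: IOExceptions pass through, everything else is wrapped. */
    static void wrapAsIOException(IOBody body) throws IOException {
        try {
            body.run();
        } catch (IOException e) {
            throw e;
        } catch (Exception e) {
            // Keep the original exception as the cause so it shows up in the stack trace.
            throw new IOException(e);
        }
    }

    /** Illustrative Externalizable class (not Spark code). */
    static class Block implements Externalizable {
        int size;

        public Block() {}  // Externalizable needs a public no-arg constructor

        public Block(int size) {
            this.size = size;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            wrapAsIOException(() -> out.writeInt(size));
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            wrapAsIOException(() -> {
                size = in.readInt();
                // Stand-in for a failed `require`: a non-IOException thrown mid-read.
                if (size < 0) {
                    throw new IllegalArgumentException("requirement failed: negative size " + size);
                }
            });
        }
    }

    /** Serializes an object with plain Java serialization. */
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    /** Deserializes an object with plain Java serialization. */
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }
}
```

In this sketch, deserializing a Block that was written with a negative size fails with an IOException whose getCause() is the original IllegalArgumentException, so the underlying failure stays visible to whoever reads the trace.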