SPARK-27598

DStreams checkpointing does not work with the Spark Shell

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.4.0, 2.4.1, 2.4.2, 3.0.0
    • Fix Version/s: None
    • Component/s: DStreams
    • Labels: None

    Description

      When I restarted a stream with checkpointing enabled, I got this:

      19/04/29 22:45:06 WARN CheckpointReader: Error reading checkpoint from file file:/tmp/checkpoint/checkpoint-1556566950000.bk
      java.io.IOException: java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.streaming.dstream.FileInputDStream.filter of type scala.Function1 in instance of org.apache.spark.streaming.dstream.FileInputDStream
      at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1322)
      at org.apache.spark.streaming.dstream.FileInputDStream.readObject(FileInputDStream.scala:314)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

      It seems that the closure is stored in serialized form, as a java.lang.invoke.SerializedLambda, and cannot be assigned back to a scala.Function1.

      Details of how to reproduce it here: https://gist.github.com/skonto/87d5b2368b0bf7786d9dd992a710e4e6
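
      The gist boils down to something like the following sketch (the paths, batch interval, and use of textFileStream are illustrative assumptions on my part, not a copy of the gist):

      import org.apache.spark.streaming.{Seconds, StreamingContext}

      // Paste into spark-shell; `sc` is the SparkContext the shell provides.
      // The checkpoint and input directories are illustrative.
      val checkpointDir = "file:/tmp/checkpoint"

      def createContext(): StreamingContext = {
        val ssc = new StreamingContext(sc, Seconds(30))
        ssc.checkpoint(checkpointDir)
        // textFileStream is backed by a FileInputDStream, whose filter field
        // holds the closure that fails to deserialize on restart.
        ssc.textFileStream("/tmp/input").print()
        ssc
      }

      // The first run writes checkpoints. After restarting the shell and
      // re-running this, getOrCreate reads them back and hits the
      // ClassCastException above.
      val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
      ssc.start()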

      Maybe this is spark-shell specific and is not expected to work anyway, as I don't see this being an issue with a normal jar.
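
      For comparison, here is a minimal sketch (plain Scala, no Spark; the names are mine) of the serialization round trip that succeeds in a packaged application, which is consistent with the jar case working:

      import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

      // In Scala 2.12 a function literal compiles to an invokedynamic lambda
      // and is written out as java.lang.invoke.SerializedLambda.
      val filter: String => Boolean = _.nonEmpty

      val bos = new ByteArrayOutputStream()
      val oos = new ObjectOutputStream(bos)
      oos.writeObject(filter)
      oos.close()

      // Reading it back turns the SerializedLambda into a real Function1,
      // but only when the capturing class is resolvable at read time, as it
      // is in a normal application; REPL-generated classes seem not to be.
      val ois = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray))
      val roundTripped = ois.readObject().asInstanceOf[String => Boolean]
      println(roundTripped("non-empty")) // prints true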

      Note that with Spark 2.3.3 this still does not work, but it fails with a different error.

          People

            Assignee: Unassigned
            Reporter: Stavros Kontopoulos (skonto)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated: