Spark / SPARK-18805

InternalMapWithStateDStream throws java.lang.StackOverflowError

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 1.6.3, 2.0.2
    • Fix Version/s: None
    • Component/s: DStreams
    • mesos

    Description

      When an InternalMapWithStateDStream is restored from a checkpoint,
      if isValidTime is true and there is no generatedRDD for the given time, the stream recurses without end.

      1) compute is called on InternalMapWithStateDStream.
      2) InternalMapWithStateDStream tries to generate the previous RDD.
      3) The stream looks in generatedRDD to see whether the RDD has already been generated for the given time.
      4) It does not find the RDD, so it checks whether the time is valid.
      5) If the time is valid, it calls compute on InternalMapWithStateDStream.
      6) The cycle restarts from 1).
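
      The cycle in the steps above can be reproduced with a small toy model. This is only a sketch, not Spark's code: ToyStateStream, its generated map, the integer times, and the isValidTime function passed to the constructor are all hypothetical stand-ins. It assumes only what the report states, namely that nothing is cached for the requested time and that the validity check keeps returning true. It does not explain why that happens after a checkpoint restore.

```scala
import scala.collection.mutable

// Toy model of the recursion described above. NOT Spark's implementation:
// every name here is a hypothetical stand-in chosen for illustration.
class ToyStateStream(slide: Int, isValidTime: Int => Boolean) {
  // Stand-in for the per-time cache of generated RDDs (empty after the restore).
  val generated = mutable.Map.empty[Int, String]

  // Steps 3-5: look in the cache; on a miss, compute if the time is valid.
  def getOrCompute(time: Int): Option[String] =
    generated.get(time).orElse {
      if (isValidTime(time)) {
        val rdd = compute(time)
        rdd.foreach(generated.update(time, _))
        rdd
      } else None
    }

  // Steps 1-2: computing a batch first asks for the previous batch's state.
  def compute(time: Int): Option[String] = {
    val previous = getOrCompute(time - slide)
    Some(previous.fold(s"state@${time}")(p => s"state@${time}<-$p"))
  }
}

object RecursionDemo {
  // True if evaluating the stream at `time` exhausts the stack.
  def overflows(stream: ToyStateStream, time: Int): Boolean =
    try { stream.getOrCompute(time); false }
    catch { case _: StackOverflowError => true }
}
```

      In this model, `RecursionDemo.overflows(new ToyStateStream(10, _ => true), 30)` reports an overflow, because every earlier time is considered valid and the recursion never reaches a base case. With a validity check that rejects some earlier time, for example `new ToyStateStream(10, _ > 0)`, the same call terminates.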

      Here is the exception that illustrates this error:

      Exception in thread "streaming-start" java.lang.StackOverflowError
      	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:341)
      	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:341)
      	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
      	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:340)
      	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:340)
      	at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:415)
      	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:335)
      	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:333)
      	at scala.Option.orElse(Option.scala:289)
      	at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:330)
      	at org.apache.spark.streaming.dstream.InternalMapWithStateDStream.compute(MapWithStateDStream.scala:134)
      	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:341)
      	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:341)
      	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
      	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:340)
      	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:340)
      	at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:415)
      	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:335)
      	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:333)
      	at scala.Option.orElse(Option.scala:289)
      	at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:330)
      	at org.apache.spark.streaming.dstream.InternalMapWithStateDStream.compute(MapWithStateDStream.scala:134)
      	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:341)
      

          People

            Assignee: Unassigned
            Reporter: crakjie etienne
            Votes: 0
            Watchers: 11
