Spark / SPARK-2009

Key not found exception when slow receiver starts

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.0.1, 1.1.0
    • Component/s: DStreams
    • Labels:
      None

      Description

      I got a "java.util.NoSuchElementException: key not found: 1401756085000 ms" exception when using a Kafka stream with a 1-second batch period.

      Investigation showed that the cause is that ReceiverLauncher.startReceivers runs asynchronously (it is started in a separate thread):
      https://github.com/vchekan/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala#L206

      A slow-starting receiver, such as Kafka, can easily take more than 2 seconds to start. As a result, "compute" is never called on ReceiverInputDStream before the first batch job executes, so receivedBlockInfo is still empty. The batch job then calls ReceiverInputDStream.getReceivedBlockInfo, which throws the "key not found" exception.

      The patch makes getReceivedBlockInfo more robust by tolerating missing values.
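
The failure mode and the kind of fix described above can be sketched in isolation. This is a minimal illustration, not the actual patch: BatchTime, BlockInfo, and ReceivedBlockInfoSketch are hypothetical stand-ins for Spark's own types, and the only assumption carried over from the report is that receivedBlockInfo is a mutable map keyed by batch time.

```scala
import scala.collection.mutable

// Hypothetical stand-in for Spark's batch time; prints as "<millis> ms".
final case class BatchTime(millis: Long) {
  override def toString: String = s"$millis ms"
}

// Hypothetical stand-in for the per-block metadata reported by a receiver.
final case class BlockInfo(streamId: Int, numRecords: Long)

object ReceivedBlockInfoSketch {
  // Stays empty until the receiver has started and reported blocks for a batch.
  private val receivedBlockInfo =
    mutable.HashMap.empty[BatchTime, Array[BlockInfo]]

  def record(time: BatchTime, blocks: Array[BlockInfo]): Unit =
    receivedBlockInfo(time) = blocks

  // Strict lookup: HashMap.apply throws NoSuchElementException on a missing key.
  def getStrict(time: BatchTime): Array[BlockInfo] =
    receivedBlockInfo(time)

  // Tolerant lookup: a batch with no recorded blocks yields an empty array.
  def getTolerant(time: BatchTime): Array[BlockInfo] =
    receivedBlockInfo.getOrElse(time, Array.empty[BlockInfo])
}
```

With this sketch, calling getStrict for a batch time that was never recorded reproduces the "key not found" error, while getTolerant returns an empty array, so a batch that fires before the receiver is up is simply treated as having no data.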

            People

            • Assignee: Vadim Chekan (vchekan)
            • Reporter: Vadim Chekan (vchekan)
            • Votes: 0
            • Watchers: 6