  1. Spark
  2. SPARK-20140

Remove hardcoded kinesis retry wait and max retries


Details

    Description

      The pull request proposes to remove the hardcoded values for the Amazon Kinesis retry settings, MIN_RETRY_WAIT_TIME_MS and MAX_RETRIES, and make them configurable.

      This change is critical for Kinesis checkpoint recovery when the Kinesis-backed RDD is large.
      A typical Kinesis recovery proceeds as follows:

      • Kinesis throttles a large number of requests during recovery.
      • Retries after throttling do not succeed because the wait period between them is too short.
      • Kinesis throttles on a per-second basis, so the wait period should be configurable for recovery.
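
      The failure mode above can be sketched as a retry loop whose wait time and attempt limit are parameters rather than constants. This is only an illustration, not the code in the patch: `ThrottledError`, `retry_on_throttle`, the doubling of the wait, and the default values of 3 attempts and 100 ms are all assumptions made for the example.

```python
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a Kinesis throttling error."""


def retry_on_throttle(operation, max_attempts=3, wait_time_ms=100, sleep=time.sleep):
    """Call `operation` until it succeeds or `max_attempts` calls have been throttled.

    The wait starts at `wait_time_ms` and doubles after each throttled attempt
    (the doubling is an assumption of this sketch).
    """
    wait_ms = wait_time_ms
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the throttling error
            sleep(wait_ms / 1000.0)  # sleep() takes seconds
            wait_ms *= 2
```

      With a starting wait well below one second and only a few attempts, every retry can land inside the same throttling window, which is why both values need to be tunable for large recoveries.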

      The patch reads the Kinesis retry settings from the following Spark configuration keys:

      • spark.streaming.kinesis.retry.wait.time
      • spark.streaming.kinesis.retry.max.attempts
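
      As a rough sketch of how these two keys might be resolved, the example below looks them up in a plain dictionary and falls back to defaults when they are absent. The helper names, the accepted duration formats ("ms" and "s" suffixes, bare numbers treated as milliseconds), and the defaults of 100 ms and 3 attempts are assumptions for illustration; the patch itself may parse and default these differently.

```python
WAIT_TIME_KEY = "spark.streaming.kinesis.retry.wait.time"
MAX_ATTEMPTS_KEY = "spark.streaming.kinesis.retry.max.attempts"


def parse_duration_ms(value):
    """Convert '250ms', '2s', or a bare number into milliseconds."""
    text = str(value).strip().lower()
    if text.endswith("ms"):
        return int(text[:-2])
    if text.endswith("s"):
        return int(text[:-1]) * 1000
    return int(text)  # bare number: assumed to be milliseconds


def read_retry_config(conf, default_wait_ms=100, default_max_attempts=3):
    """Return (wait_ms, max_attempts) from a dict of configuration strings."""
    wait_ms = parse_duration_ms(conf.get(WAIT_TIME_KEY, default_wait_ms))
    max_attempts = int(conf.get(MAX_ATTEMPTS_KEY, default_max_attempts))
    return wait_ms, max_attempts
```

      For example, a job recovering a large checkpoint could set the wait to "2s" and the attempts to "10", while jobs that set neither key keep the fallback values.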

          People

            yash360@gmail.com Yash Sharma
            Votes: 0
            Watchers: 3
