  1. Spark
  2. SPARK-20140

Remove hardcoded Kinesis retry wait and max retries

    Details

      Description

      The pull request proposes to remove the hardcoded values for Amazon Kinesis: MIN_RETRY_WAIT_TIME_MS and MAX_RETRIES.

      This change is critical for Kinesis checkpoint recovery when the Kinesis-backed RDD is huge.
      The following happens in a typical Kinesis recovery:

      • Kinesis throttles a large number of requests while recovering.
      • Retries after throttling fail to recover because the wait period is too short.
      • Kinesis throttles per second, so the wait period should be configurable for recovery.

      The patch reads the retry settings from the following Spark configuration keys:

      • spark.streaming.kinesis.retry.wait.time
      • spark.streaming.kinesis.retry.max.attempts
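
A minimal sketch of the idea, not the patch itself: a retry loop whose wait time and attempt limit come from configuration instead of constants. The two keys are the ones named above. Everything else is an assumption made for illustration: the `ThrottledError` class, the `retry_or_fail` helper, the default values, the plain dictionary standing in for the Spark configuration, and the doubling of the wait between attempts.

```python
import time

# Keys as named in the issue description.
WAIT_KEY = "spark.streaming.kinesis.retry.wait.time"
ATTEMPTS_KEY = "spark.streaming.kinesis.retry.max.attempts"


class ThrottledError(Exception):
    """Hypothetical stand-in for a Kinesis throttling exception."""


def retry_or_fail(operation, conf, sleep=time.sleep):
    """Run `operation`, retrying on ThrottledError with a doubling wait.

    `conf` is a plain dict standing in for the Spark configuration.
    The defaults below are illustrative assumptions, not values
    taken from the patch.
    """
    wait_ms = int(conf.get(WAIT_KEY, 100))
    max_attempts = int(conf.get(ATTEMPTS_KEY, 3))
    last_error = None
    for attempt in range(max_attempts):
        if attempt > 0:
            # Wait before every retry, then double the wait (assumed backoff).
            sleep(wait_ms / 1000.0)
            wait_ms *= 2
        try:
            return operation()
        except ThrottledError as err:
            last_error = err
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

With a configuration such as `{WAIT_KEY: "1000", ATTEMPTS_KEY: "5"}`, the first retry waits a full second, which is long enough for a per-second throttle window to reset. The `sleep` parameter exists only so the waiting can be replaced in tests.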

            People

            • Assignee:
              yash360@gmail.com Yash Sharma
              Reporter:
              yash360@gmail.com Yash Sharma