Description
I am calling spark-submit passing maxRate, I have a single kinesis receiver, and batches of 1s
spark-submit --conf spark.streaming.receiver.maxRate=10 ....
however a single batch can greatly exceed the stablished maxRate. i.e: Im getting 300 records.
it looks like Kinesis is completely ignoring the spark.streaming.receiver.maxRate configuration.
If you look inside KinesisReceiver.onStart, you see:
val kinesisClientLibConfiguration =
new KinesisClientLibConfiguration(checkpointAppName, streamName, awsCredProvider, workerId)
.withKinesisEndpoint(endpointUrl)
.withInitialPositionInStream(initialPositionInStream)
.withTaskBackoffTimeMillis(500)
.withRegionName(regionName)
This constructor ends up calling another constructor which has a lot of default values for the configuration. One of those values is DEFAULT_MAX_RECORDS which is constantly set to 10,000 records.
Attachments
Attachments
Issue Links
- is cloned by
-
SPARK-23294 Spark Streaming + Rate source + Console Sink : Receiver MaxRate is violated
- Resolved
- links to