Details
-
Bug
-
Status: Resolved
-
P2
-
Resolution: Fixed
-
2.4.0, 2.5.0
-
None
Description
Running job on Flink in streaming mode and get data from a Kafka topic with parallelism > 1 causes an exception:
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0 at java.util.ArrayList.rangeCheck(ArrayList.java:657) at java.util.ArrayList.get(ArrayList.java:433) at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.run(UnboundedSourceWrapper.java:277) at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87) at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:56) at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99) at org.apache.flink.streaming.runtime.tasks.StoppableSourceStreamTask.run(StoppableSourceStreamTask.java:45) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703) at java.lang.Thread.run(Thread.java:748)
It happens when number of Kafka topic partitions is less than value of parallelism (number of task slots).
So, workaround for now can be to set parallelism <= number of topic partitions, thus if parallelism=2 then number_partitions >= 2