Details
Improvement
Status: Closed
Major
Resolution: Invalid
1.0.0
None
None
Important
Description
While running a job without fault tolerance that produces data to Kafka, the job failed with a "Batch Expired" exception. I tried increasing "request.timeout.ms" and "max.block.ms" from 30000 to 60000, but the problem persisted. The only way I found to get around this problem is to use snapshotting.
09:58:11,036 WARN org.apache.kafka.clients.producer.internals.Sender - Got error produce response with correlation id 48106 on topic-partition flinkWordCountNoFaultToleranceSmall-2, retrying (2147483646 attempts left). Error: NETWORK_EXCEPTION
09:58:11,036 WARN org.apache.kafka.clients.producer.internals.Sender - Got error produce response with correlation id 48105 on topic-partition flinkWordCountNoFaultToleranceSmall-2, retrying (2147483646 attempts left). Error: NETWORK_EXCEPTION
09:58:11,036 WARN org.apache.kafka.clients.producer.internals.Sender - Got error produce response with correlation id 48104 on topic-partition flinkWordCountNoFaultToleranceSmall-2, retrying (2147483646 attempts left). Error: NETWORK_EXCEPTION
09:58:11,068 ERROR org.apache.flink.streaming.runtime.tasks.StreamTask - Caught exception while processing timer.
java.lang.RuntimeException: Could not forward element to next operator
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:319)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:300)
at org.apache.flink.streaming.runtime.io.CollectorWrapper.collect(CollectorWrapper.java:48)
at org.apache.flink.streaming.runtime.io.CollectorWrapper.collect(CollectorWrapper.java:29)
at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
at org.apache.flink.streaming.runtime.operators.windowing.AggregatingKeyedTimePanes.evaluateWindow(AggregatingKeyedTimePanes.java:59)
at org.apache.flink.streaming.runtime.operators.windowing.AbstractAlignedProcessingTimeWindowOperator.computeWindow(AbstractAlignedProcessingTimeWindowOperator.java:242)
at org.apache.flink.streaming.runtime.operators.windowing.AbstractAlignedProcessingTimeWindowOperator.trigger(AbstractAlignedProcessingTimeWindowOperator.java:223)
at org.apache.flink.streaming.runtime.tasks.StreamTask$TriggerTask.run(StreamTask.java:606)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Could not forward element to next operator
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:319)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:300)
at org.apache.flink.streaming.runtime.io.CollectorWrapper.collect(CollectorWrapper.java:48)
at org.apache.flink.streaming.runtime.io.CollectorWrapper.collect(CollectorWrapper.java:29)
at org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:37)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:316)
... 15 more
Caused by: java.lang.Exception: Failed to send data to Kafka: Batch Expired
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerBase.checkErroneous(FlinkKafkaProducerBase.java:282)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerBase.invoke(FlinkKafkaProducerBase.java:249)
at org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:37)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:316)
... 20 more
Caused by: org.apache.kafka.common.errors.TimeoutException: Batch Expired
09:58:11,084 INFO org.apache.kafka.clients.producer.KafkaProducer - Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
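For reference, the timeout change mentioned in the description can be sketched roughly as below. This is only a minimal illustration of the producer properties involved, not the actual job code: the class name `ProducerTimeoutConfig`, the method `producerProperties`, and the broker address `localhost:9092` are hypothetical placeholders. Only the keys "request.timeout.ms" and "max.block.ms" and the value 60000 come from the report, which also notes that this change alone did not make the error go away.

```java
import java.util.Properties;

public class ProducerTimeoutConfig {

    // Builds Kafka producer properties with both timeouts raised
    // from 30000 ms to 60000 ms, as described in the report.
    // The bootstrapServers argument is a placeholder supplied by the caller.
    public static Properties producerProperties(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("request.timeout.ms", "60000");
        props.setProperty("max.block.ms", "60000");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is an assumed broker address for illustration only.
        Properties props = producerProperties("localhost:9092");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Properties built this way would then be handed to whichever Kafka producer sink the job uses.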