Details
- Bug
- Status: Resolved
- Critical
- Resolution: Fixed
- 2.0.1
- None
- None
Description
On our production server, we intermittently see Kafka Streams crash with a TimeoutException while committing offsets. The only workaround seems to be restarting the application, which is not acceptable in a production environment.
We have already implemented a ProductionExceptionHandler, but it does not seem to address this.
Please provide a fix for this or a viable workaround.
Application side logs:
2019-11-17 08:28:48.055 +0000 [AggregateJob-614fe688-c9a4-4dad-a881-71488030918b-StreamThread-1] [ERROR] - org.apache.kafka.streams.processor.internals.AssignedStreamsTasks [org.apache.kafka.streams.processor.internals.AssignedTasks:applyToRunningTasks:373] - stream-thread [AggregateJob-614fe688-c9a4-4dad-a881-71488030918b-StreamThread-1] Failed to commit stream task 0_1 due to the following error:
org.apache.kafka.common.errors.TimeoutException: Timeout of 60000ms expired before successfully committing offsets {AggregateJob-1=OffsetAndMetadata{offset=176729402, metadata=''}}
2019-11-17 08:29:00.891 +0000 [AggregateJob-614fe688-c9a4-4dad-a881-71488030918b-StreamThread-1] [ERROR] - [:lambda$init$2:130] - Stream crashed!!! StreamsThread threadId: AggregateJob-614fe688-c9a4-4dad-a881-71488030918b-StreamThread-1
2019-11-17 08:29:00.891 +0000 [AggregateJob-614fe688-c9a4-4dad-a881-71488030918b-StreamThread-1] [ERROR] - [:lambda$init$2:130] - Stream crashed!!! StreamsThread threadId: AggregateJob-614fe688-c9a4-4dad-a881-71488030918b-StreamThread-1
TaskManager MetadataState: GlobalMetadata: [] GlobalStores: [] My HostInfo: HostInfo{host='unknown', port=-1} Cluster(id = null, nodes = [], partitions = [], controller = null) Active tasks: Running: Suspended: Restoring: New: Standby tasks: Running: Suspended: Restoring: New:
org.apache.kafka.common.errors.TimeoutException: Timeout of 60000ms expired before successfully committing offsets {AggregateJob-0=OffsetAndMetadata{offset=189808059, metadata=''}}
Kafka broker logs:
[2019-11-17 13:53:22,774] WARN Client session timed out, have not heard from server in 6669ms for sessionid 0x10068e4a2944c2f (org.apache.zookeeper.ClientCnxn)
[2019-11-17 13:53:22,809] INFO Client session timed out, have not heard from server in 6669ms for sessionid 0x10068e4a2944c2f, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
Regards,
Rohan
Issue Links
- Is contained by
  - KAFKA-9274 Gracefully handle timeout exceptions on Kafka Streams (Resolved)