Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
- Affects Version: 1.1.0
Description
While rolling a log segment, one of our Kafka clusters hit an immediate read error on the same partition. This led to a flood of log messages containing the corresponding stack traces. Data was still appended to the partition, but consumers were unable to read from it. The reason for the exception is unclear.
[2018-07-02 23:53:32,732] INFO [Log partition=ingestion-3, dir=/var/vcap/store/kafka] Rolled new log segment at offset 971865991 in 1 ms. (kafka.log.Log)
[2018-07-02 23:53:32,739] INFO [ProducerStateManager partition=ingestion-3] Writing producer snapshot at offset 971865991 (kafka.log.ProducerStateManager)
[2018-07-02 23:53:32,739] INFO [Log partition=ingestion-3, dir=/var/vcap/store/kafka] Rolled new log segment at offset 971865991 in 1 ms. (kafka.log.Log)
[2018-07-02 23:53:32,750] ERROR [ReplicaManager broker=1] Error processing fetch operation on partition ingestion-3, offset 971865977 (kafka.server.ReplicaManager)
Caused by: java.io.EOFException: Failed to read `log header` from file channel `sun.nio.ch.FileChannelImpl@2e0e8810`. Expected to read 17 bytes, but reached end of file after reading 0 bytes. Started read from position 2147483643.
We mitigated the issue by stopping the affected node and deleting the corresponding directory. Once the partition was recreated for the replica (we use replication factor 2), the other replica experienced the same problem, and we mitigated it the same way.
It is unclear to us what caused this issue. Can you help us find the root cause of this problem?
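For context: the failing read position 2147483643 in the error above is only four bytes below Integer.MAX_VALUE (2147483647), which matches the integer-overflow failure mode described in the linked KAFKA-6292. A minimal sketch (hypothetical variable names, not Kafka's actual code) of how adding a record-header size to an `int` file position near that limit wraps to a negative, invalid position:

```java
public class PositionOverflowSketch {
    public static void main(String[] args) {
        int position = 2147483643;   // read position reported in the error log
        int headerSize = 17;         // bytes of log header the broker tried to read

        // int arithmetic silently wraps past Integer.MAX_VALUE (2147483647)
        int nextInt = position + headerSize;
        System.out.println(nextInt); // prints -2147483636 (wrapped, invalid position)

        // widening one operand to long before adding avoids the wrap
        long nextLong = (long) position + headerSize;
        System.out.println(nextLong); // prints 2147483660
    }
}
```

This is only an illustration of the arithmetic; the actual fix landed in the FileLogInputStream code referenced by KAFKA-6292.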
Attachments
Issue Links
- duplicates KAFKA-6292: KafkaConsumer ran into Unknown error fetching data for topic-partition caused by integer overflow in FileLogInputStream (Resolved)