Kafka / KAFKA-13050

Race between controller creating snapshot and snapshot cleaning

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: controller, kraft
    • Labels: None

      Description

      If the controller tries to take a snapshot using its cached OffsetAndEpoch while snapshot cleaning is in progress, truncation can invalidate that OffsetAndEpoch, and snapshot creation then fails with an IllegalArgumentException:

      [2021-07-08 12:12:41,938] WARN [Controller 1] org.apache.kafka.controller.QuorumController@67e0d836: failed with unknown server exception IllegalArgumentException at epoch -1 in 3207460 us.  Reverting to last committed offset 98. (org.apache.kafka.controller.QuorumController)
      java.lang.IllegalArgumentException: Snapshot id (OffsetAndEpoch(offset=99, epoch=5)) is not valid according to the log: ValidOffsetAndEpoch(kind=SNAPSHOT, offsetAndEpoch=OffsetAndEpoch(offset=180, epoch=8))
      	at kafka.raft.KafkaMetadataLog.createNewSnapshot(KafkaMetadataLog.scala:252)
      	at org.apache.kafka.raft.KafkaRaftClient.lambda$createSnapshot$30(KafkaRaftClient.java:2334)
      	at org.apache.kafka.snapshot.SnapshotWriter.createWithHeader(SnapshotWriter.java:134)
      	at org.apache.kafka.raft.KafkaRaftClient.createSnapshot(KafkaRaftClient.java:2333)
      	at org.apache.kafka.controller.QuorumController$SnapshotGeneratorManager.createSnapshotGenerator(QuorumController.java:351)
      	at org.apache.kafka.controller.QuorumController.checkSnapshotGeneration(QuorumController.java:904)
      	at org.apache.kafka.controller.QuorumController.access$3000(QuorumController.java:121)
      	at org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$handleCommit$0(QuorumController.java:681)
      	at org.apache.kafka.controller.QuorumController$ControlEvent.run(QuorumController.java:311)
      	at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:121)
      	at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:200)
      	at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:173)
      	at java.lang.Thread.run(Thread.java:748)
      [2021-07-08 12:12:41,941] INFO [BrokerMetadataListener id=1] Loading snapshot 180-8. (kafka.server.metadata.BrokerMetadataListener)
      

      This was observed while running a broker in combined mode with artificially low values for the snapshot generation and cleaning settings:

      metadata.log.max.record.bytes.between.snapshots=100
      metadata.log.segment.bytes=1024
      metadata.max.retention.bytes=4096
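
      The race described above can be modeled roughly as follows. This is a simplified sketch, not Kafka code: SnapshotRaceSketch, SnapshotId, ToyLog, cleanUpTo and replay are hypothetical names, and the validity check is an assumed stand-in for whatever KafkaMetadataLog.createNewSnapshot actually verifies. It replays one interleaving that would produce an error of the same shape as the trace, using the offsets and epochs from the log (99-5 cached, 180-8 retained).

```java
import java.util.Optional;

/** Hypothetical, simplified model of the race; none of these types are Kafka classes. */
final class SnapshotRaceSketch {

    /** Stand-in for an (offset, epoch) pair that identifies a snapshot. */
    static final class SnapshotId {
        final long offset;
        final int epoch;

        SnapshotId(long offset, int epoch) {
            this.offset = offset;
            this.epoch = epoch;
        }

        @Override
        public String toString() {
            return offset + "-" + epoch;
        }
    }

    /** Toy log that only tracks the oldest snapshot still retained after cleaning. */
    static final class ToyLog {
        private SnapshotId earliestSnapshot; // null until cleaning has removed a prefix

        /** Assumed cleaning behavior: everything before the given snapshot is dropped. */
        synchronized void cleanUpTo(SnapshotId snapshot) {
            earliestSnapshot = snapshot;
        }

        /** Assumed check: ids that fall before the retained prefix are rejected. */
        synchronized void createSnapshot(SnapshotId id) {
            if (earliestSnapshot != null && id.offset < earliestSnapshot.offset) {
                throw new IllegalArgumentException(
                    "Snapshot id (" + id + ") is not valid according to the log: "
                        + earliestSnapshot);
            }
        }
    }

    /** Replays the interleaving and returns the error message, if one was raised. */
    static Optional<String> replay() {
        ToyLog log = new ToyLog();
        SnapshotId cached = new SnapshotId(99, 5); // controller caches its position
        log.cleanUpTo(new SnapshotId(180, 8));     // cleaner runs before the controller acts
        try {
            log.createSnapshot(cached);            // controller uses the now stale id
            return Optional.empty();
        } catch (IllegalArgumentException e) {
            return Optional.of(e.getMessage());
        }
    }

    public static void main(String[] args) {
        System.out.println(replay().orElse("snapshot created"));
    }
}
```

      In this model the cached id is only safe to use if nothing cleans the log between caching it and calling createSnapshot, which is the window the report describes.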
      

            People

            • Assignee: Unassigned
            • Reporter: David Arthur (mumrah)