  1. Kafka
  2. KAFKA-2960

DelayedProduce may cause message loss during repeated leader change

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.9.0.0
    • Fix Version/s: 0.10.0.0
    • Component/s: core
    • Labels:
      None

      Description

      Related to #KAFKA-1148.
      When a leader replica becomes a follower and then the leader again, it may have truncated its log while it was a follower. When it becomes leader the second time, its ISR may shrink, and if new messages are appended at that moment, a DelayedProduce created during its first term as leader may be satisfied. The client then receives a response with no error, although the messages were actually lost.

      We simulated this scenario, which confirmed that the message loss can happen. According to our broker and client logs, it also appears to be the cause of a data loss we experienced recently.
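
      The sequence can be illustrated with a toy model. This is not the reporter's simulation and not Kafka code: Replica, DelayedProduce, Outcome and simulate() are hypothetical names, and the model assumes that a new follower truncates its log to the high watermark and that a leader whose ISR contains only itself advances the high watermark on every append.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the scenario; not Kafka code. All names are hypothetical. */
final class DelayedProduceLossDemo {

    /** One replica's local state, reduced to what the scenario needs. */
    static final class Replica {
        final List<String> log = new ArrayList<>();
        long highWatermark = 0;   // offsets below this count as committed
        int leaderEpoch = 0;
        boolean leader = false;
        int isrSize = 2;

        void becomeLeader(int epoch, int newIsrSize) {
            leader = true;
            leaderEpoch = epoch;
            isrSize = newIsrSize;
        }

        /** Assumption: a new follower truncates its log to the high watermark. */
        void becomeFollower(int epoch) {
            leader = false;
            leaderEpoch = epoch;
            while (log.size() > highWatermark) {
                log.remove(log.size() - 1);
            }
        }

        /** Appends and returns the offset the high watermark must reach. */
        long append(String message) {
            log.add(message);
            if (isrSize == 1) {
                highWatermark = log.size(); // sole ISR member: commit at once
            }
            return log.size();
        }
    }

    /** A pending produce request that only compares offsets. */
    static final class DelayedProduce {
        final String message;
        final long requiredOffset;

        DelayedProduce(String message, long requiredOffset) {
            this.message = message;
            this.requiredOffset = requiredOffset;
        }

        boolean isSatisfied(Replica r) {
            return r.leader && r.highWatermark >= requiredOffset;
        }
    }

    static final class Outcome {
        final boolean acknowledged;
        final boolean stillInLog;

        Outcome(boolean acknowledged, boolean stillInLog) {
            this.acknowledged = acknowledged;
            this.stillInLog = stillInLog;
        }
    }

    static Outcome simulate() {
        Replica broker = new Replica();
        broker.becomeLeader(1, 2);                          // leader, ISR has 2 replicas
        DelayedProduce pending =
            new DelayedProduce("m1", broker.append("m1"));  // waits for HW >= 1
        broker.becomeFollower(2);                           // truncates to HW = 0, m1 is gone
        broker.becomeLeader(3, 1);                          // leader again, ISR shrunk to itself
        broker.append("m2");                                // reuses offset 0, HW moves to 1
        return new Outcome(pending.isSatisfied(broker),
                           broker.log.contains(pending.message));
    }

    public static void main(String[] args) {
        Outcome o = simulate();
        System.out.println("acknowledged=" + o.acknowledged
            + ", stillInLog=" + o.stillInLog);
    }
}
```

      In this model the old request is acknowledged even though its message is no longer in the log, because the offset comparison cannot tell the first leadership term from the second.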

      I think we should check the leader epoch when sending a response, or satisfy the DelayedProduce on leader change, as described in #KAFKA-1148.

      We may also need a new error code to inform the producer of this error.
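
      A minimal sketch of the leader-epoch check proposed above, assuming the pending request remembers the epoch under which its messages were appended. EpochCheckedProduce, tryComplete and the LEADER_CHANGED result are hypothetical names used for illustration; they are not Kafka's API and not necessarily how the fix was implemented.

```java
/** Sketch of the proposed leader-epoch check; hypothetical names, not Kafka's API. */
final class EpochCheckedProduce {

    /** Hypothetical result codes; the issue only says a new code may be needed. */
    enum Result { PENDING, NONE, LEADER_CHANGED }

    final long requiredOffset;
    final int epochAtAppend;   // leader epoch when the messages were appended

    EpochCheckedProduce(long requiredOffset, int epochAtAppend) {
        this.requiredOffset = requiredOffset;
        this.epochAtAppend = epochAtAppend;
    }

    /** Decides the response from the replica's current state. */
    Result tryComplete(boolean isLeader, int currentEpoch, long highWatermark) {
        if (!isLeader || currentEpoch != epochAtAppend) {
            // Leadership moved since the append: fail rather than trust offsets.
            return Result.LEADER_CHANGED;
        }
        return highWatermark >= requiredOffset ? Result.NONE : Result.PENDING;
    }

    public static void main(String[] args) {
        EpochCheckedProduce p = new EpochCheckedProduce(1, 1);
        System.out.println(p.tryComplete(true, 1, 0)); // same epoch, HW not there yet
        System.out.println(p.tryComplete(true, 3, 1)); // leader again, but a later epoch
    }
}
```

      With such a check, a request created in an earlier leadership term returns an error to the producer, which can then retry, instead of being acknowledged on the basis of an offset that now belongs to different messages.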

            People

            • Assignee: Jiangjie Qin (becket_qin)
            • Reporter: Xing Huang (peoplebike)
            • Votes: 0
            • Watchers: 6

              Dates

              • Created:
                Updated:
                Resolved: