KAFKA-769

On startup, a broker's highwatermark for every topic partition gets reset to zero

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels:

      Description

      There is a race condition between the highwatermark thread and the handleLeaderAndIsrRequest call on the request handler thread. When a broker starts, the highwatermark thread tries to persist the checkpoints of all partitions in ReplicaManager, but the partition map in ReplicaManager is initially empty. The leaderAndIsrRequest updates each partition in turn, so if the highwatermark thread runs during this interval, it overwrites the highwatermark file with an inconsistent state. Because every read of the highwatermark goes to the file, subsequent reads return that inconsistent state.
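
The interleaving described above can be replayed deterministically in a small sketch. Everything here is hypothetical and heavily simplified: `CheckpointFile`, `SketchReplicaManager` and `RaceSketch` are illustrative stand-ins, not the real kafka.server.ReplicaManager, and the file is modelled as an in-memory map. The sketch assumes a missing entry reads back as 0, which matches the symptom in the issue title.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the on-disk highwatermark checkpoint file.
final class CheckpointFile(initial: Map[String, Long]) {
  private var contents: Map[String, Long] = initial

  // Replaces the whole file with the given snapshot.
  def write(hws: Map[String, Long]): Unit = synchronized { contents = hws }

  // Assumption: a partition missing from the file reads back as 0.
  def read(partition: String): Long = synchronized { contents.getOrElse(partition, 0L) }
}

// Hypothetical, simplified replica manager (not the real ReplicaManager).
final class SketchReplicaManager(file: CheckpointFile) {
  private val partitions = mutable.Map.empty[String, Long]

  // What the highwatermark thread does on each tick:
  // persist whatever is currently in the partition map.
  def checkpointHighWatermarks(): Unit = synchronized { file.write(partitions.toMap) }

  // What handling a leaderAndIsrRequest does for one partition:
  // register it and load its highwatermark from the file.
  def addPartition(partition: String): Unit = synchronized {
    partitions(partition) = file.read(partition)
  }
}

object RaceSketch {
  // Replays the bad ordering: checkpoint first, partition registration second.
  def run(): Long = {
    val file = new CheckpointFile(Map("topic-0" -> 42L)) // state left by the previous run
    val rm = new SketchReplicaManager(file)
    rm.checkpointHighWatermarks() // fires while the partition map is still empty
    rm.addPartition("topic-0")    // reads the file that was just overwritten
    file.read("topic-0")
  }
}
```

In this replay the value 42 that was on disk before the restart is lost, because the empty map is checkpointed before the partition is registered.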

      1. KAFKA-769-v1.patch
        2 kB
        Sriram Subramanian

        Activity

        Neha Narkhede made changes -
        Status: Resolved [ 5 ] -> Closed [ 6 ]
        Neha Narkhede made changes -
        Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
        Resolution: Fixed [ 1 ]
        Neha Narkhede added a comment -

        Checked in patch

        Neha Narkhede added a comment -

        +1

        Sriram Subramanian made changes -
        Status: Open [ 1 ] -> Patch Available [ 10002 ]
        Sriram Subramanian added a comment -

        .../main/scala/kafka/server/ReplicaManager.scala | 9 +++++++--
        1 files changed, 7 insertions, 2 deletions

        Sriram Subramanian made changes -
        Attachment: KAFKA-769-v1.patch [ 12570525 ]
        Sriram Subramanian added a comment -

        The patch ensures that the highwatermark thread is started only after the first leaderAndIsrRequest batch.

        Neha Narkhede added a comment -

        An easy way of resolving this would be to start the highwatermark thread only after the first leader and isr request is completed on a newly restarted broker. This is easy to keep track of since the initial leader and isr request has a special init flag turned on. This will ensure that there is no inconsistent state checkpointed to disk since we will wait until the replica manager has finished initializing the highwatermark for all its replicas from disk. Also, this logic will become trickier when we add the features to change the number of replicas online or change the number of partitions online, but we don't have to worry about that right now.
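
A minimal sketch of the approach suggested in this comment, under stated assumptions: `GuardedCheckpointer` and `onLeaderAndIsrRequestCompleted` are hypothetical names, not taken from the actual patch, and the sketch tracks "first request completed" with a local `AtomicBoolean` rather than the request's init flag, whose exact form is not shown in this issue.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical sketch; names are illustrative, not taken from KAFKA-769-v1.patch.
final class GuardedCheckpointer(startCheckpointThread: () => Unit) {
  // Flips to true exactly once, when the first leader and isr request completes.
  private val started = new AtomicBoolean(false)

  // Intended to be called after each leader and isr request has been fully handled,
  // i.e. after the highwatermarks of its partitions have been loaded from disk.
  // Returns true only for the single call that actually started the thread.
  def onLeaderAndIsrRequestCompleted(): Boolean =
    if (started.compareAndSet(false, true)) {
      startCheckpointThread()
      true
    } else false
}
```

Using `compareAndSet` means the checkpoint thread is started at most once even if two request handler threads finish at the same time, and never before the first request has completed.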

        Neha Narkhede made changes -
        Labels: p1
        Sriram Subramanian made changes -
        Assignee: Neha Narkhede [ nehanarkhede ] -> Sriram Subramanian [ sriramsub ]
        Sriram Subramanian added a comment -

        Assigning this back based on the email thread

        Neha Narkhede added a comment -

        Sure, go ahead and attach a patch, if you already have it ready

        Sriram Subramanian added a comment -

        I have a fix for this. Do you still want to take a look?

        Neha Narkhede added a comment -

        I will take a look at this.

        Neha Narkhede made changes -
        Assignee: Sriram Subramanian [ sriramsub ] -> Neha Narkhede [ nehanarkhede ]
        Sriram Subramanian created issue -

          People

          • Assignee: Sriram Subramanian
          • Reporter: Sriram Subramanian
          • Votes: 0
          • Watchers: 2
