Uploaded image for project: 'ZooKeeper'
  1. ZooKeeper
  2. ZOOKEEPER-2418

txnlog diff sync can skip sending some transactions to followers

    XMLWordPrintableJSON

Details

    Description

      If the leader is having disk issues so that its on disk txnlog is behind the in memory commit log, it will send a DIFF that is missing the transactions in between the two.

      Example:
      There are 5 hosts in the cluster. 1 is the leader. 5 is disconnected.
      We commit up to zxid 1000.
      At zxid 450, the leader's disk stalls, but we still commit transactions because 2,3,4 are up and acking writes.
      At zxid 1000, the txnlog on the leader has 1-450 and the commit log has 500-1000.
      Then host 5 regains its connection to the cluster and syncs with the leader. It will receive a DIFF containing zxids 1-450 and 500-1000.

      This is because queueCommittedProposals in the LearnerHandler just queues everything within its zxid range. It doesn't give an error if there is a gap between peerLastZxid and the iterator it is queueing from.

      Attachments

        Issue Links

          Activity

            People

              enixon Brian Nixon
              nwolchko Nicholas Wolchko
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - 168h
                  168h
                  Remaining:
                  Remaining Estimate - 166h 20m
                  166h 20m
                  Logged:
                  Remaining Estimate - 166h 20m
                  1h 40m