ZooKeeper / ZOOKEEPER-1278

acceptedEpoch not handling zxid rollover in lower 32bits

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Duplicate
    • Affects Version/s: 3.4.0, 3.5.0
    • Fix Version/s: None
    • Component/s: server
    • Labels:
      None
    • Release Note:
      Workaround: there is a simple workaround for this issue. Force a leader re-election before the lower 32 bits reach 0xffffffff.

      Most users won't even see this given the number of writes on a typical installation. Say you are doing 500 writes/second: you'd see this after ~3 months if the quorum is stable. Any changes (such as upgrading the server software) would cause the lower 32 bits of the zxid to be reset, thereby staving off this issue for another period.
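
      The ~3 month figure in the release note can be checked with a little arithmetic. The Python sketch below is not part of ZooKeeper; the function names and the default margin are hypothetical, chosen only for illustration. It estimates how long a stable quorum takes to exhaust the lower 32 bits at a given write rate, and shows one way an operator might decide that a zxid is close enough to 0xffffffff to force a re-election.

```python
# Illustrative sketch only: names and the default margin are assumptions,
# not ZooKeeper code or configuration.
COUNTER_MASK = 0xFFFFFFFF   # lower 32 bits of the zxid
SECONDS_PER_DAY = 86400

def days_until_rollover(writes_per_second, current_counter=0):
    """Days until the lower 32 bits reach 0xffffffff at a steady write rate."""
    remaining = COUNTER_MASK - current_counter
    return remaining / writes_per_second / SECONDS_PER_DAY

def near_rollover(zxid, margin=1_000_000):
    """True when the lower 32 bits are within `margin` writes of 0xffffffff."""
    return (zxid & COUNTER_MASK) >= COUNTER_MASK - margin

if __name__ == "__main__":
    # 500 writes/second, starting from a freshly elected leader
    print(round(days_until_rollover(500), 1))  # 99.4 days, i.e. a bit over 3 months
```

      How the current zxid is obtained is left open here; the check only assumes it is available as an integer.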

      Description

      When the lower 32 bits of a zxid "roll over" (a zxid is a 64-bit number; the upper 32 bits are treated as the epoch number), the epoch number is incremented and the lower 32 bits start at 0 again.
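
      To make the layout concrete, here is a minimal Python sketch of the zxid split and of what a plain increment does at the boundary. It is not ZooKeeper's Java code; the helper names are hypothetical. It only shows that adding 1 to a zxid whose lower 32 bits are all set carries into the upper 32 bits, so the epoch embedded in the zxid changes without any election taking place.

```python
# Illustrative sketch only: helper names are assumptions, not ZooKeeper's API.
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1   # 0xffffffff
ZXID_MASK = (1 << 64) - 1                # a zxid is a 64-bit number

def make_zxid(epoch, counter):
    """Pack an epoch (upper 32 bits) and a counter (lower 32 bits)."""
    return ((epoch << COUNTER_BITS) | (counter & COUNTER_MASK)) & ZXID_MASK

def split_zxid(zxid):
    """Return (epoch, counter) for a 64-bit zxid."""
    return zxid >> COUNTER_BITS, zxid & COUNTER_MASK

def next_zxid(zxid):
    """Plain increment; at counter 0xffffffff the carry lands in the epoch."""
    return (zxid + 1) & ZXID_MASK

if __name__ == "__main__":
    last = make_zxid(5, 0xFFFFFFFF)
    epoch, counter = split_zxid(next_zxid(last))
    print(f"epoch={epoch} counter={counter:#x}")  # epoch=6 counter=0x0
```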

      This should work fine; however, as far as I can tell, in the current 3.4/3.5 the acceptedEpoch/currentEpoch files are not being updated for this case.

      See ZOOKEEPER-335 for changes from 3.3 branch.

          Activity

          Patrick Hunt added a comment -

          This turns out to be a duplicate of ZOOKEEPER-1277. That patch causes the leader to be re-elected just prior to rollover. ZOOKEEPER-1277 was applied to 3.3, 3.4, and 3.5 (trunk).

          Patrick Hunt added a comment -

          This patch passes the simple test; however, the others all fail. This is the test/fix from ZOOKEEPER-1277.

          Patrick Hunt added a comment -

          I just tested this with my test from ZOOKEEPER-1277, and it fails without the hzxid change in ZooKeeperServer. However, even with that patch it still fails, I'm assuming because the acceptedEpoch and related files are not being updated properly.

          Camille, can you take a look?


            People

            • Assignee: Patrick Hunt
            • Reporter: Patrick Hunt