Project: Bookkeeper
Parent: BOOKKEEPER-237 Automatic recovery of under-replicated ledgers and its entries
Issue: BOOKKEEPER-378

ReplicationWorker may not get ZK watcher notification on UnderReplication ledger lock deletion.


Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.2.0
    • Component/s: None

    Description

      This issue was found with BK-248; see the comment there.

      The issue is:
      1) Two workers start and try to get the lock for the same ledger.
      2) Both workers find that the lock node does not exist.
      3) Both go ahead and try to create the lock node.
      4) One worker fails with a NodeExists exception.

      The failed worker then just removes that child from its list and waits on a latch for the watch notification.

      Unfortunately, the watch on lockPath was added by the exists check call, and at that time lockPath did not actually exist. So the watch may be invalid, and the worker will never get the notification when the lock is cleaned up by the other worker.
      In this scenario the other worker has only partly replicated the ledger, and the current worker should now take the lock. But it cannot get that notification, because it added the watch when the node did not exist.
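
The sequence above can be reproduced with a small in-memory model. This is only a sketch under stated assumptions: `FakeStore` and `loserSeesDeletion` are hypothetical names, the lock path is made up, and the store is a stand-in that models one-shot watches (a watch set by an exists check on a missing node is triggered, and consumed, by the node's creation). It is not the ZooKeeper client API and not the attached patch. The `rewatchAfterNodeExists` flag shows one possible remedy: set the watch again after the create fails with NodeExists.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CountDownLatch;
import java.util.function.Consumer;

class LockWatchSketch {
    /** Hypothetical in-memory stand-in for a store with one-shot watches. */
    static class FakeStore {
        private final Set<String> nodes = new HashSet<>();
        private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

        /** Reports whether the node exists; registers a one-shot watch if watcher is non-null. */
        boolean exists(String path, Consumer<String> watcher) {
            if (watcher != null) {
                watches.computeIfAbsent(path, k -> new ArrayList<>()).add(watcher);
            }
            return nodes.contains(path);
        }

        /** Creates the node; returns false if it already exists (models NodeExists). */
        boolean create(String path) {
            if (!nodes.add(path)) {
                return false;
            }
            fire(path, "NodeCreated");
            return true;
        }

        void delete(String path) {
            if (nodes.remove(path)) {
                fire(path, "NodeDeleted");
            }
        }

        /** Delivers the event and drops the watches: each watch fires at most once. */
        private void fire(String path, String event) {
            List<Consumer<String>> registered = watches.remove(path);
            if (registered != null) {
                registered.forEach(w -> w.accept(event));
            }
        }
    }

    /** Replays steps 1-4; returns true if the losing worker is notified of the lock deletion. */
    static boolean loserSeesDeletion(boolean rewatchAfterNodeExists) {
        FakeStore store = new FakeStore();
        String lockPath = "/locks/ledger-1"; // hypothetical path
        CountDownLatch latch = new CountDownLatch(1);
        Consumer<String> watcher = event -> {
            if (event.equals("NodeDeleted")) {
                latch.countDown();
            }
        };

        // Steps 1-2: both workers see no lock node; worker B sets its watch now.
        store.exists(lockPath, null);    // worker A
        store.exists(lockPath, watcher); // worker B, node is absent at this point

        // Step 3: worker A creates the lock; B's watch fires with NodeCreated and is consumed.
        store.create(lockPath);

        // Step 4: worker B's create fails because the node already exists.
        boolean created = store.create(lockPath);
        if (!created && rewatchAfterNodeExists) {
            // Possible remedy: watch again now that the node is known to exist.
            if (!store.exists(lockPath, watcher)) {
                latch.countDown(); // lock already released in the meantime
            }
        }

        // Worker A releases the lock after partly replicating the ledger.
        store.delete(lockPath);
        return latch.getCount() == 0;
    }

    public static void main(String[] args) {
        System.out.println("original flow notified: " + loserSeesDeletion(false));
        System.out.println("re-watch flow notified: " + loserSeesDeletion(true));
    }
}
```

In this model the original flow leaves the latch untouched, so a real worker would block indefinitely, while the re-watch variant is released when the lock node is deleted. The interleaving is replayed on a single thread to keep the outcome deterministic.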

      Attachments

        1. BOOKKEEPER-378.diff
          7 kB
          Ivan Kelly
        2. BOOKKEEPER-378.patch
          7 kB
          Uma Maheswara Rao G
        3. BOOKKEEPER-378.patch
          1 kB
          Uma Maheswara Rao G

            People

              Assignee: Uma Maheswara Rao G (umamaheswararao)
              Reporter: Uma Maheswara Rao G (umamaheswararao)
              Votes: 0
              Watchers: 4
