  1. Bookkeeper
  2. BOOKKEEPER-668

Race between PerChannelBookieClient#channelDisconnected() and disconnect() calls can make clients hang while adding/reading entries when multiple bookies fail

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.2.1, 4.3.0
    • Fix Version/s: 4.2.2, 4.3.0
    • Component/s: bookkeeper-client
    • Labels:
      None

      Description

      1. A ledger was created with ensemble size 2 and quorum size 2, and entries were written.
      2. While entries were being read, 2 of the 3 bookies in the cluster were killed and restarted.
      3. The client hung in the read call, waiting for the sync counter notification.

      Although I was not able to reproduce this in a few attempts, the logs and the code suggest that the following sequence is the problem:

      1. BookieWatcher received the notification of the change in available bookies first.
      2. BookieWatcher called PerChannelBookieClient#disconnect() for the failed bookies, which set 'this.channel=null;'.
      3. The PerChannelBookieClient#channelDisconnected() call arrived afterwards and proceeded silently, without notifying the pending read ops of the error.

      So the client hangs waiting for the result.
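
The ordering described above can be replayed in a small, single-threaded Java sketch. Everything in it is an assumption made for illustration: SketchClient, addPendingRead, the failPendingOnDisconnect switch and errorsDelivered are hypothetical names, not the real PerChannelBookieClient code. The early return in channelDisconnected() is only one guess at how the event ends up "proceeding silently". The switch shows one possible remedy (failing pending reads inside disconnect()), which is not necessarily the fix that was committed for this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: SketchClient is a hypothetical stand-in,
// not the real PerChannelBookieClient.
class ChannelRaceSketch {

    static class SketchClient {
        private final boolean failPendingOnDisconnect; // hypothetical switch
        private Object channel = new Object();          // non-null while "connected"
        private final List<Runnable> pendingReads = new ArrayList<>();

        SketchClient(boolean failPendingOnDisconnect) {
            this.failPendingOnDisconnect = failPendingOnDisconnect;
        }

        synchronized void addPendingRead(Runnable onError) {
            pendingReads.add(onError);
        }

        // Step 2: explicit disconnect requested by the watcher.
        void disconnect() {
            List<Runnable> toFail = new ArrayList<>();
            synchronized (this) {
                channel = null;
                if (failPendingOnDisconnect) {
                    toFail.addAll(pendingReads);
                    pendingReads.clear();
                }
            }
            toFail.forEach(Runnable::run); // run callbacks outside the lock
        }

        // Step 3: the network-level disconnect event arrives afterwards.
        void channelDisconnected() {
            List<Runnable> toFail = new ArrayList<>();
            synchronized (this) {
                if (channel == null) {
                    return; // channel already cleared: pending reads are never failed here
                }
                channel = null;
                toFail.addAll(pendingReads);
                pendingReads.clear();
            }
            toFail.forEach(Runnable::run);
        }
    }

    // Replays steps 2 and 3 in order and counts the error callbacks that fired.
    static int errorsDelivered(boolean failPendingOnDisconnect) {
        SketchClient client = new SketchClient(failPendingOnDisconnect);
        AtomicInteger errors = new AtomicInteger();
        client.addPendingRead(errors::incrementAndGet);
        client.disconnect();
        client.channelDisconnected();
        return errors.get();
    }

    public static void main(String[] args) {
        System.out.println("errors delivered, as described: " + errorsDelivered(false));
        System.out.println("errors delivered, with the assumed remedy: " + errorsDelivered(true));
    }
}
```

With the switch off, no error callback fires, so a caller blocked on that callback would wait forever. With the switch on, the pending read is failed exactly once, whichever of the two calls arrives first.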

              People

              • Assignee:
                hustlmsp Sijie Guo
                Reporter:
                vinayakumarb Vinayakumar B
              • Votes:
                0
                Watchers:
                4

                Dates

                • Created:
                  Updated:
                  Resolved: