ZooKeeper
  1. ZooKeeper
  2. ZOOKEEPER-502

bookkeeper create calls completion too many times

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0
    • Component/s: contrib-bookkeeper
    • Labels:
      None

      Description

      when calling the asynchronous version of create, the completion routine is called more than once.

        Issue Links

          Activity

          Benjamin Reed created issue -
          Hide
          Benjamin Reed added a comment -

          this patch adds a test case that reproduces the problem.

          Show
          Benjamin Reed added a comment - this patch adds a test case that reproduces the problem.
          Benjamin Reed made changes -
          Field Original Value New Value
          Attachment ZOOKEEPER-502.patch [ 12415801 ]
          Benjamin Reed made changes -
          Summary bookkeeper create call completion too many times bookkeeper create calls completion too many times
          Benjamin Reed made changes -
          Component/s contrib-bookkeeper [ 12312643 ]
          Hide
          Utkarsh Srivastava added a comment -

          The cause is the following:

          In Action 1 of createLedger, 4 calls to zk are fired off : one getChildren() and 3 create()s. The callback for the getChildren() moves the op into Action 2 which is not correct because the 3 creates are still pending. Now the create callback just assumes that its already in action 2, and hence advances to action 3. Now the same op is enqueued twice, and its action has been set to 3. The double queuing explains why the callback is called twice.

          Since in this sequence of events action 2 is skipped altogether, and we end up with 0 bookies. If the dequeuer actually happens to dequeue before the action is set to 3, then action 2 will also be carried out which explains why we get 0 bookies only sometimes and not always (which explains ZOOKEEPER-503).

          In general, this style of asynchronous programming with stage numbers is error-prone, and hard to read. Object creation is cheap, and operations like openLedger and createLedger are rare. Why not just create anonymous inner classes as callbacks instead of doing this state machine?

          Show
          Utkarsh Srivastava added a comment - The cause is the following: In Action 1 of createLedger, 4 calls to zk are fired off : one getChildren() and 3 create()s. The callback for the getChildren() moves the op into Action 2 which is not correct because the 3 creates are still pending. Now the create callback just assumes that its already in action 2, and hence advances to action 3. Now the same op is enqueued twice, and its action has been set to 3. The double queuing explains why the callback is called twice. Since in this sequence of events action 2 is skipped altogether, and we end up with 0 bookies. If the dequeuer actually happens to dequeue before the action is set to 3, then action 2 will also be carried out which explains why we get 0 bookies only sometimes and not always (which explains ZOOKEEPER-503 ). In general, this style of asynchronous programming with stage numbers is error-prone, and hard to read. Object creation is cheap, and operations like openLedger and createLedger are rare. Why not just create anonymous inner classes as callbacks instead of doing this state machine?
          Benjamin Reed made changes -
          Link This issue is part of ZOOKEEPER-503 [ ZOOKEEPER-503 ]
          Utkarsh Srivastava made changes -
          Link This issue is part of ZOOKEEPER-507 [ ZOOKEEPER-507 ]
          Hide
          Mahadev konar added a comment -

          Committed in ZOOKEEPER-507.

          Show
          Mahadev konar added a comment - Committed in ZOOKEEPER-507 .
          Mahadev konar made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Fix Version/s 3.3.0 [ 12313976 ]
          Resolution Fixed [ 1 ]
          Patrick Hunt made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Transition Time In Source Status Execution Times Last Executer Last Execution Date
          Open Open Resolved Resolved
          172d 22h 21m 1 Mahadev konar 26/Jan/10 23:19
          Resolved Resolved Closed Closed
          58d 18h 5m 1 Patrick Hunt 26/Mar/10 17:24

            People

            • Assignee:
              Flavio Junqueira
              Reporter:
              Benjamin Reed
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development