HBase / HBASE-20828 Finish-up AMv2 Design/List of Tenets/Specification of operation / HBASE-20939

There will be race when we call suspendIfNotReady and then throw ProcedureSuspendedException


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0-alpha-1, 2.0.1, 2.2.0, 2.1.1
    • Component/s: amv2
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      This pattern is very common in our procedure implementations. For example, AssignProcedure calls AM.queueAssign and then suspends itself until the AM finishes processing the assign request.

      But there can be a race. Consider this sequence:
      1. We call suspendIfNotReady on an event, and it returns true, so we need to wait.
      2. The event is woken up, and the procedure is added back to the scheduler.
      3. A worker picks up the procedure and finishes it.
      4. We finally throw ProcedureSuspendedException, and the ProcedureExecutor suspends us and stores the state in the procedure store.
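
The four steps can be replayed deterministically in a toy model. This is only a sketch: ToyEvent, the State enum, the in-memory store, and the scheduler queue are hypothetical stand-ins and not the HBase classes. The interleaving is forced by running the steps in order on one thread, which is enough to show how the last write leaves a stale WAITING record.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class SuspendRaceSketch {
  enum State { RUNNABLE, WAITING, FINISHED }

  /** Toy stand-in for a procedure event; not the HBase ProcedureEvent class. */
  static final class ToyEvent {
    private boolean ready = false;
    private final Queue<Long> waiters = new ArrayDeque<>();

    /** Returns true if the caller has to wait (step 1). */
    boolean suspendIfNotReady(long procId) {
      if (ready) {
        return false;
      }
      waiters.add(procId);
      return true;
    }

    /** Marks the event ready and hands all waiters back to the scheduler (step 2). */
    void wake(Queue<Long> scheduler) {
      ready = true;
      while (!waiters.isEmpty()) {
        scheduler.add(waiters.poll());
      }
    }
  }

  /** Replays steps 1 to 4 in order and returns the state left in the toy store. */
  static State replayRace() {
    Map<Long, State> store = new HashMap<>();
    Queue<Long> scheduler = new ArrayDeque<>();
    ToyEvent event = new ToyEvent();
    long procId = 1L;
    store.put(procId, State.RUNNABLE);

    // Step 1: worker A runs the procedure, which decides it has to wait.
    boolean mustWait = event.suspendIfNotReady(procId);

    // Step 2: the event is woken before worker A has recorded the suspension.
    event.wake(scheduler);

    // Step 3: worker B takes the procedure from the scheduler and finishes it.
    Long picked = scheduler.poll();
    if (picked != null) {
      store.put(picked, State.FINISHED);
    }

    // Step 4: worker A only now records the suspension, overwriting FINISHED.
    if (mustWait) {
      store.put(procId, State.WAITING);
    }
    return store.get(procId);
  }

  public static void main(String[] args) {
    System.out.println("final stored state: " + replayRace());
  }
}
```

In this model the stored state ends up as WAITING while the scheduler and the event's waiter list are both empty, so nothing will ever wake the procedure again.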

      So we have a half-done procedure in the procedure store forever. This may cause assertion failures when loading procedures. It is also possible that the worker in step 3 cannot finish the procedure, because suspending still has to restore some state, for example adding something to RootProcedureState. Either way, the result is an assertion failure or some other unexpected error.

      And this cannot be fixed by simply adding a lock in the procedure, as most of the work is done in the ProcedureExecutor after we throw ProcedureSuspendedException.
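
One possible direction, shown here as an assumption and not as a description of the attached patch, is for the executor itself to hold a per-procedure lock across both the execute call and the suspend bookkeeping that follows it. The sketch below uses hypothetical names (ExecutionLockSketch, execLocks, the toy store) and two latches to force the same interleaving as in the steps listed earlier; the second worker blocks until the first has finished recording the suspension.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

class ExecutionLockSketch {
  enum State { RUNNABLE, WAITING, FINISHED }

  // Toy procedure store and per-procedure execution locks (both hypothetical).
  private final Map<Long, State> store = new ConcurrentHashMap<>();
  private final Map<Long, ReentrantLock> execLocks = new ConcurrentHashMap<>();

  private ReentrantLock lockFor(long procId) {
    return execLocks.computeIfAbsent(procId, id -> new ReentrantLock());
  }

  private static void await(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  /** Replays the racy interleaving with the lock held by each worker. */
  static State replayWithLock() {
    ExecutionLockSketch exec = new ExecutionLockSketch();
    final long procId = 1L;
    exec.store.put(procId, State.RUNNABLE);
    CountDownLatch suspendRequested = new CountDownLatch(1);
    CountDownLatch woken = new CountDownLatch(1);

    Thread workerA = new Thread(() -> {
      ReentrantLock lock = exec.lockFor(procId);
      lock.lock();
      try {
        suspendRequested.countDown();           // step 1: procedure asks to wait
        await(woken);                           // step 2 happens while we hold the lock
        exec.store.put(procId, State.WAITING);  // step 4: suspension is recorded
      } finally {
        lock.unlock();
      }
    });

    Thread workerB = new Thread(() -> {
      await(suspendRequested);
      woken.countDown();                        // step 2: event woken, procedure rescheduled
      ReentrantLock lock = exec.lockFor(procId);
      lock.lock();                              // step 3 blocks until worker A unlocks
      try {
        exec.store.put(procId, State.FINISHED);
      } finally {
        lock.unlock();
      }
    });

    workerA.start();
    workerB.start();
    try {
      workerA.join();
      workerB.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return exec.store.get(procId);
  }

  public static void main(String[] args) {
    System.out.println("final stored state: " + replayWithLock());
  }
}
```

Because the lock lives in the executor and covers the bookkeeping done after the procedure returns, the WAITING record is always written before the resumed execution writes FINISHED. A lock taken only inside the procedure body would already be released by the time the suspension is stored.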

      Attachments

        1. HBASE-20939.patch
          14 kB
          Duo Zhang
        2. HBASE-20939.patch
          14 kB
          Duo Zhang


          People

            zhangduo Duo Zhang
            zhangduo Duo Zhang
            Votes: 0
            Watchers: 4
