  1. HBase
  2. HBASE-21291

Add a test for bypassing stuck state-machine procedures

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 3.0.0, 2.2.0, 2.1.1, 2.0.3
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      bypass now throws an Exception if passed a lockWait <= 0; i.e., bypass prevents an operator from getting stuck waiting forever on an entity lock (lockWait == 0).

      Description

            if (!procedure.isFailed()) {
              if (subprocs != null) {
                if (subprocs.length == 1 && subprocs[0] == procedure) {
                  // Procedure returned itself. Quick-shortcut for a state machine-like procedure;
                  // i.e. we go around this loop again rather than go back out on the scheduler queue.
                  subprocs = null;
                  reExecute = true;
                  LOG.trace("Short-circuit to next step on pid={}", procedure.getProcId());
                } else {
                  // Yield the current procedure, and make the subprocedure runnable
                  // subprocs may come back 'null'.
                  subprocs = initializeChildren(procStack, procedure, subprocs);
                  LOG.info("Initialized subprocedures=" +
                    (subprocs == null? null:
                      Stream.of(subprocs).map(e -> "{" + e.toString() + "}").
                      collect(Collectors.toList()).toString()));
                }
              } else if (procedure.getState() == ProcedureState.WAITING_TIMEOUT) {
                LOG.debug("Added to timeoutExecutor {}", procedure);
                timeoutExecutor.add(procedure);
              } else if (!suspended) {
                // No subtask, so we are done
                procedure.setState(ProcedureState.SUCCESS);
              }
            }
      

      The current implementation of ProcedureExecutor sets reExecute to true for a state-machine-like procedure that returns itself. If such a procedure is stuck in one state, the loop never exits.
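
      To make the failure mode concrete, here is a minimal sketch of the re-execute shortcut. It is not HBase code: `ReExecuteSketch`, `runWithShortcut`, and the `maxSpins` cap are hypothetical names introduced for illustration, and the cap exists only so the demo terminates. A step returning a non-negative value stands in for "the procedure returned itself".

```java
import java.util.function.IntUnaryOperator;

/** Hypothetical, simplified model of the re-execute shortcut; not HBase code. */
final class ReExecuteSketch {

  /**
   * Runs a state-machine step repeatedly, the way a worker would while it
   * keeps hold of the procedure's lock.
   *
   * @param step       maps the current state to the next state; a negative
   *                   result means "no more steps, procedure is done"
   * @param startState initial state
   * @param maxSpins   artificial cap so this demo cannot spin forever
   *                   (an assumption of the sketch, not part of the real loop)
   * @return number of steps executed before the loop exited
   */
  static int runWithShortcut(IntUnaryOperator step, int startState, int maxSpins) {
    int state = startState;
    int executed = 0;
    boolean reExecute;
    do {
      int next = step.applyAsInt(state);
      executed++;
      // Procedure "returned itself": go around again instead of going back
      // to the scheduler queue. Nothing is released between iterations.
      reExecute = next >= 0;
      if (reExecute) {
        state = next;
      }
    } while (reExecute && executed < maxSpins);
    return executed;
  }
}
```

      A healthy state machine such as `s -> s < 2 ? s + 1 : -1` leaves the loop after three steps, while a stuck one such as `s -> s` only stops because of the artificial cap.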

                IdLock.Entry lockEntry = procExecutionLock.getLockEntry(proc.getProcId());
                try {
                  executeProcedure(proc);
                } catch (AssertionError e) {
                  LOG.info("ASSERT pid=" + proc.getProcId(), e);
                  throw e;
                } finally {
                  procExecutionLock.releaseLockEntry(lockEntry);
      

      Because a procedure acquires the IdLock before execution and releases it only after execution completes, a state-machine procedure never releases the IdLock until it finishes.
      As a result, bypassProcedure does not work, because it first tries to grab the same IdLock:

          IdLock.Entry lockEntry = procExecutionLock.tryLockEntry(procedure.getProcId(), lockWait);
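
      The lock contention described above can be sketched with a plain `java.util.concurrent` lock. This is an illustration under stated assumptions, not the HBase implementation: `BypassLockSketch` and its methods are hypothetical, a `ReentrantLock` stands in for the per-procedure IdLock entry, and `IllegalArgumentException` is an assumed exception type (the release note only says "an Exception").

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical model of bypass vs. a worker that never releases the lock. */
final class BypassLockSketch {
  // Stand-in for the per-procedure IdLock entry (assumption of this sketch).
  private final ReentrantLock procLock = new ReentrantLock();

  /** Returns true if the lock was obtained within lockWaitMs, false on timeout. */
  boolean tryBypass(long lockWaitMs) {
    if (lockWaitMs <= 0) {
      // Mirrors the release note: a non-positive wait is rejected up front.
      throw new IllegalArgumentException("lockWait must be > 0, got " + lockWaitMs);
    }
    try {
      if (!procLock.tryLock(lockWaitMs, TimeUnit.MILLISECONDS)) {
        return false; // the executing worker still holds the lock
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    try {
      return true; // real code would mark the procedure as bypassed here
    } finally {
      procLock.unlock();
    }
  }

  /** Attempts a bypass while a simulated stuck worker holds the lock. */
  boolean tryBypassWhileWorkerHoldsLock(long lockWaitMs) {
    CountDownLatch held = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    Thread worker = new Thread(() -> {
      procLock.lock();
      try {
        held.countDown();
        release.await(); // models a procedure looping without finishing
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        procLock.unlock();
      }
    });
    worker.setDaemon(true);
    worker.start();
    try {
      held.await();
      return tryBypass(lockWaitMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      release.countDown(); // let the simulated worker finish
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }
}
```

      In this sketch the bypass attempt times out for as long as the worker holds the lock and succeeds once the lock is free. The bounded wait is what keeps the operator from blocking indefinitely.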
      

        Attachments

        1. HBASE-21291.master.001.patch
          5 kB
          Jingyun Tian
        2. HBASE-21291.master.002.patch
          6 kB
          Jingyun Tian
        3. HBASE-21291.master.003.patch
          6 kB
          Jingyun Tian
        4. HBASE-21291.master.004.patch
          6 kB
          Jingyun Tian
        5. HBASE-21291.master.005.patch
          6 kB
          Jingyun Tian

              People

              • Assignee:
                tianjingyun Jingyun Tian
                Reporter:
                tianjingyun Jingyun Tian