Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.0, 2.0.1
    • Fix Version/s: 3.0.0, 2.1.1, 2.0.2
    • Component/s: amv2
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      After HBASE-20846, we restore lock info for procedures. However, there is a case where the lock can be held by a procedure that has already succeeded. Since that procedure will not execute again, it never releases the lock and holds it forever.

      1. All children of pid=1208 had finished, but the master was killed before procedure 1208 was woken up.

      2018-08-05 02:20:14,465 INFO  [PEWorker-8] procedure2.ProcedureExecutor(1659): Finished subprocedure(s) of pid=1208, ppid=1206, state=RUNNABLE, hasLock=true; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034; resume parent processing.

      2018-08-05 02:20:14,466 INFO  [PEWorker-8] procedure2.ProcedureExecutor(1296): Finished pid=1232, ppid=1208, state=SUCCESS, hasLock=false; AssignProcedure table=IntegrationTestBigLinkedList, region=c2a23a735f16df57299dba6fd4599f2f, target=e010125050127.bja,60020,1533403109034 in 1.5060sec

      2. The master restarts. Since procedure 1208 held the lock before the restart, the lock was restored for it.

      2018-08-05 02:20:30,803 DEBUG [Thread-15] procedure2.ProcedureExecutor(456): Loading pid=1208, ppid=1206, state=SUCCESS, hasLock=false; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034

      2018-08-05 02:20:30,818 DEBUG [Thread-15] procedure2.Procedure(898): pid=1208, ppid=1206, state=SUCCESS, hasLock=false; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034 held the lock before restarting, call acquireLock to restore it.

      2018-08-05 02:20:30,818 INFO  [Thread-15] procedure.MasterProcedureScheduler(631): pid=1208, ppid=1206, state=SUCCESS, hasLock=false; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034 checking lock on c2a23a735f16df57299dba6fd4599f2f
      

      3. Since procedure 1208 is already in the SUCCESS state, it will not execute again, so it holds the lock forever.

      We need to check the state of the procedure before restoring locks. If the procedure is already finished (success or rollback), we do not need to acquire the lock for it.
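
      The proposed check could be sketched roughly as below. This is a self-contained toy model, not the HBase code: the class LockRestoreSketch, the Proc type, its fields, and restoreLocks are all hypothetical names chosen for illustration, and the set of finished states is assumed to be success and rolled back, as described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of lock restoration after a master restart.
// All names here are hypothetical; this is not the HBase implementation.
public class LockRestoreSketch {

    enum State { RUNNABLE, WAITING, SUCCESS, ROLLEDBACK }

    static final class Proc {
        final long pid;
        final State state;
        final boolean heldLockBeforeRestart; // lock info persisted before the restart
        final String region;                 // resource the lock protects

        Proc(long pid, State state, boolean heldLockBeforeRestart, String region) {
            this.pid = pid;
            this.state = state;
            this.heldLockBeforeRestart = heldLockBeforeRestart;
            this.region = region;
        }

        // Assumption: "finished" means success or rolled back, per the description.
        boolean isFinished() {
            return state == State.SUCCESS || state == State.ROLLEDBACK;
        }
    }

    // Returns region -> owning pid for every lock restored after loading.
    static Map<String, Long> restoreLocks(List<Proc> loaded) {
        Map<String, Long> owners = new HashMap<>();
        for (Proc p : loaded) {
            if (!p.heldLockBeforeRestart) {
                continue; // nothing to restore
            }
            if (p.isFinished()) {
                // A finished procedure never runs again, so it could never
                // release a restored lock. Skip it.
                continue;
            }
            owners.put(p.region, p.pid);
        }
        return owners;
    }

    public static void main(String[] args) {
        List<Proc> loaded = new ArrayList<>();
        // pid=1208 from the logs: already SUCCESS, but held the lock before restart.
        loaded.add(new Proc(1208L, State.SUCCESS, true, "c2a23a735f16df57299dba6fd4599f2f"));
        // A made-up in-flight procedure that should get its lock back.
        loaded.add(new Proc(1300L, State.RUNNABLE, true, "hypothetical-region"));

        Map<String, Long> owners = restoreLocks(loaded);
        System.out.println("Restored locks: " + owners);
    }
}
```

      In this sketch, the region locked by pid=1208 is left free after the restart, while the lock of the still-runnable procedure is restored. Without the isFinished() check, pid=1208 would be recorded as the owner and nothing would ever release the lock.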

            People

            • Assignee:
              allan163 Allan Yang
              Reporter:
              allan163 Allan Yang
            • Votes:
              0
              Watchers:
              4
