Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Invalid
    • None
    • 2.0.1
    • None
    • None

    Description

      Here is how we get into a stuck scenario in hbase-2.0.0RC2.

      • Assign a region. The Master moves it to OPENING state, then RPCs the RS.
      • RS opens region. Tells Master.
      • Master tries to complete the assign by updating hbase:meta.
      • hbase:meta is hosed because I'd deployed a bad patch that blocked hbase:meta updates.
      • Master is stuck retrying RPCs to the RS hosting hbase:meta; it wants to record the new OPEN state in hbase:meta.
      • I kill Master because I want to fix the broken patch.
      • On restart, a script sets table to be DISABLED.
      • As part of startup, we go to assign regions.
      • We skip assigning regions because the table is DISABLED; i.e. we skip the replay of the unfinished assign.
      • The region is now a free agent; no lock is held, so the queued unassign that is part of the disable table can run.
      • It fails because the region is in OPENING state; an UnexpectedStateException is thrown.

      We then loop, logging the above complaint.

      Resolution requires finishing the previous assign first; only then can we disable.
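
The sequence above can be sketched as a toy state machine. This is only an illustration under stated assumptions: every class, method, and flag below (StuckRegionSketch, replayOnStartup, skipWhenDisabled, and the local UnexpectedStateException) is hypothetical and is not HBase's actual assignment code. It also assumes, for simplicity, that unassign is only legal from OPEN.

```java
// Toy model of the scenario described above. Every name here is hypothetical;
// this is not HBase's AssignmentManager or procedure code.
final class StuckRegionSketch {

    enum RegionState { OFFLINE, OPENING, OPEN, CLOSED }

    // Stand-in for the exception named in the report; not the real HBase class.
    static final class UnexpectedStateException extends RuntimeException {
        UnexpectedStateException(String message) { super(message); }
    }

    static final class Region {
        RegionState state = RegionState.OFFLINE;

        // Master moves the region to OPENING and asks the RS to open it.
        void startAssign() { state = RegionState.OPENING; }

        // Master records OPEN (in the real system, by updating hbase:meta).
        void finishAssign() {
            if (state != RegionState.OPENING) {
                throw new UnexpectedStateException("cannot finish assign from " + state);
            }
            state = RegionState.OPEN;
        }

        // Unassign queued by the disable-table operation; assumed to need OPEN.
        void unassign() {
            if (state != RegionState.OPEN) {
                throw new UnexpectedStateException("cannot unassign from " + state);
            }
            state = RegionState.CLOSED;
        }
    }

    // Startup replay. With skipWhenDisabled=true this mimics the reported
    // behaviour: an unfinished assign on a DISABLED table is never completed.
    static void replayOnStartup(Region region, boolean tableDisabled, boolean skipWhenDisabled) {
        boolean skip = tableDisabled && skipWhenDisabled;
        if (region.state == RegionState.OPENING && !skip) {
            region.finishAssign();
        }
    }

    public static void main(String[] args) {
        // Reported path: replay is skipped, so the unassign hits OPENING.
        Region stuck = new Region();
        stuck.startAssign();
        replayOnStartup(stuck, true, true);
        try {
            stuck.unassign();
        } catch (UnexpectedStateException e) {
            System.out.println("stuck: " + e.getMessage());
        }

        // Proposed order: finish the previous assign first, then disable.
        Region recovered = new Region();
        recovered.startAssign();
        replayOnStartup(recovered, true, false);
        recovered.unassign();
        System.out.println("recovered: " + recovered.state);
    }
}
```

The skipWhenDisabled flag exists only to contrast the two orderings; whether the real fix belongs in startup replay or in the disable path is not settled by this sketch.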

      Let me try and write a test to manufacture this state.

      Attachments

        1. HBASE-20498.branch-2.001.patch
          1 kB
          Michael Stack

            People

              Assignee: stack Michael Stack
              Reporter: stack Michael Stack