  1. HBase
  2. HBASE-7245

Recovery on failed snapshot restore

Details

    Description

      Restore updates both the file system and meta. It seems that an inopportune failure before meta is completely updated could leave things in an inconsistent state that would require hbck to fix.

      We should define what the semantics are for recovering from this. Some suggestions:

      1) Fail forward (see a log entry saying the restore's meta edits were not completed, then gather the information necessary to rebuild them from the fs, and complete the meta edits).
      2) Fail backward (see a log entry saying the restore's meta edits were not completed, then delete the incomplete snapshot region entries from meta).

      I think I prefer 1: if two processes somehow end up updating meta (say the original master didn't die and a new one started up), their updates would be idempotent. If we used 2, we could still have a race and still end up in a bad place.
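
      A minimal sketch of why option 1 is safe to replay, using a toy in-memory model rather than HBase code. Everything here is hypothetical: the journal set stands in for the "meta edits not completed" log entry, the region set for what is found on the fs, and the map for the restored table's rows in meta. None of these names are HBase APIs.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory model; none of these names are HBase APIs.
final class RestoreRecoverySketch {
    // Stand-in for the log entry saying the restore's meta edits were not completed.
    static final String IN_PROGRESS = "RESTORE_IN_PROGRESS";

    /**
     * Fail forward: rebuild the table's meta rows from what is on the file system.
     * Only absolute state is written, so repeated runs converge to the same result.
     */
    static void failForward(Set<String> journal, Set<String> regionsOnFs, Map<String, String> meta) {
        if (!journal.contains(IN_PROGRESS)) {
            return; // no unfinished restore, nothing to recover
        }
        // Drop meta rows for regions the restored layout no longer contains.
        meta.keySet().retainAll(regionsOnFs);
        // Add (or overwrite) a row for every region present on the file system.
        for (String region : regionsOnFs) {
            meta.put(region, "restored");
        }
        // Clear the marker last, so a crash before this line just triggers another run.
        journal.remove(IN_PROGRESS);
    }

    public static void main(String[] args) {
        Set<String> journal = new TreeSet<>(Set.of(IN_PROGRESS));
        Set<String> fs = new TreeSet<>(Set.of("r1", "r2", "r3"));
        // Meta was only partially updated before the failure: r3 is missing, stale r0 remains.
        Map<String, String> meta = new TreeMap<>(Map.of("r0", "old", "r1", "restored"));

        failForward(journal, fs, meta);
        Map<String, String> afterFirst = new TreeMap<>(meta);

        // Simulate a second master that also saw the marker and replays recovery.
        journal.add(IN_PROGRESS);
        failForward(journal, fs, meta);

        System.out.println(meta.equals(afterFirst));
        System.out.println(meta.keySet());
    }
}
```

      In this model the second run leaves meta unchanged, which is the idempotence argued for above. A fail-backward variant would have to decide which entries count as "incomplete", and that decision is where two racing processes could disagree.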

          People

            Assignee: Unassigned
            Reporter: Jonathan Hsieh (jmhsieh)