KUDU-2993

Allow Kudu to start up with a fresh data directory without running update_dirs

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.12.0
    • Component/s: fs
    • Labels: None

    Description

      In the event of a disk failure, the current workflow is:

      1. The Kudu operator shuts down Kudu for a maintenance window.
      2. The data center operator replaces the failed disk.
      3. The Kudu operator runs the kudu fs update_dirs tool.
      4. The Kudu operator restarts Kudu.

      Step 3 is unlike what most systems require, and as an operator it would be nice not to have to do it. Once the disk is replaced, Kudu should recognize that it is OK to start up (e.g. because it finds a completely empty disk where it expected an existing one), and perhaps run the update_dirs tool automatically.
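
      The "completely empty disk" check could look roughly like the sketch below. It is illustrative Python, not Kudu's C++ implementation, and the function name is_fresh_data_dir is hypothetical. It assumes a replaced disk shows up as an existing, mounted directory with no entries at all.

```python
import os


def is_fresh_data_dir(path):
    """Return True if `path` is an existing directory with no entries.

    Hypothetical helper: a path that does not exist at all is NOT
    treated as fresh, since that looks more like a typo or an
    unmounted disk than a replaced one.
    """
    if not os.path.isdir(path):
        return False
    with os.scandir(path) as entries:
        # An empty directory yields no entries at all.
        return next(entries, None) is None
```

      Treating a nonexistent path differently from an empty one is a design choice in this sketch: it keeps a mistyped directory from being silently treated as a replaced disk.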

      An argument could be made that we shouldn't do this unless we're sure the operator wants it, since replacing a disk may result in failed tablets. If the missing directory was actually caused by a simple user input error, running the tool automatically would fail some tablets unnecessarily. But given that many Kudu operators automate their deployment of Kudu, it's hard to think of a case in which they wouldn't want Kudu to run the tool.

      If the tool fails because the "missing" directory turned out to be a disk failure, we should simply start Kudu with that data dir marked as failed.
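
      The proposed startup behavior, including the fallback above, can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the change that resolved this issue: start_with_auto_update, needs_update, and update_dirs are hypothetical names, and the two callables stand in for Kudu's own detection logic and for the update_dirs tool.

```python
from enum import Enum


class DirState(Enum):
    HEALTHY = "healthy"
    FAILED = "failed"


def start_with_auto_update(data_dirs, needs_update, update_dirs):
    """Decide the state of each data dir at startup.

    data_dirs:    configured data directory paths.
    needs_update: hypothetical callable, True if the dir looks new
                  or missing relative to the on-disk metadata.
    update_dirs:  hypothetical callable standing in for the
                  update_dirs tool; assumed to raise OSError when
                  the underlying disk is actually broken.
    """
    states = {d: DirState.HEALTHY for d in data_dirs}
    for d in data_dirs:
        if not needs_update(d):
            continue
        try:
            update_dirs(d)
        except OSError:
            # Do not abort startup: come up with this dir marked failed.
            states[d] = DirState.FAILED
    return states
```

      In this sketch, startup never aborts because of a single directory: a directory that cannot be updated is reported as failed and the remaining directories are still processed.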

            People

              Assignee: Andrew Wong (awong)
              Reporter: Andrew Wong (awong)