Kudu / KUDU-434

Add/replace nodes in a tablet (handle permanent node failures) - ability to perform manual changes

Details

    Description

      This tracks work required to handle a few similar use cases:

      • A node crashes and never comes back. A new node must be recruited to bring the tablet's replication factor back up to 3.
        • Equivalently, a node loses its disks and is therefore effectively a new node.
      • A new node is added to the cluster, and some existing data is balanced onto it.

      Implementation-wise:

      • Ability to bootstrap a new node into a quorum (by copying existing tablet data)
      • Ability to perform a config change on a quorum to include a new peer
      • Administrative commands to explicitly migrate data for manual load balancing (automatic load balancing is tracked separately)
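
      The three pieces above could fit together roughly as in the toy model below. This is only a sketch under stated assumptions: the Tablet class and every function name are hypothetical illustrations, not Kudu's actual API, and a real implementation would drive each step through consensus rather than mutating in-memory sets.

```python
from dataclasses import dataclass, field


@dataclass
class Tablet:
    """Toy model of one tablet's replica set (hypothetical, not a Kudu type)."""
    tablet_id: str
    replicas: set = field(default_factory=set)  # nodes in the current config
    data_on: set = field(default_factory=set)   # nodes holding a copy of the data
    target_replication: int = 3


def bootstrap_replica(tablet, source, new_node):
    """Copy existing tablet data from a live replica onto the new node."""
    if source not in tablet.replicas or source not in tablet.data_on:
        raise ValueError(f"{source} is not a live replica of {tablet.tablet_id}")
    tablet.data_on.add(new_node)


def change_config_add(tablet, new_node):
    """Change the config to include a new peer; it must hold the data first."""
    if new_node not in tablet.data_on:
        raise ValueError(f"{new_node} has not been bootstrapped")
    tablet.replicas.add(new_node)


def change_config_remove(tablet, node):
    """Change the config to drop a peer and forget its copy of the data."""
    tablet.replicas.discard(node)
    tablet.data_on.discard(node)


def replace_failed_replica(tablet, failed, new_node):
    """Permanent failure: drop the dead peer, then bootstrap and add a new one."""
    change_config_remove(tablet, failed)
    if not tablet.replicas:
        raise RuntimeError("no surviving replica to copy from")
    source = sorted(tablet.replicas)[0]  # any surviving replica works here
    bootstrap_replica(tablet, source, new_node)
    change_config_add(tablet, new_node)


def move_replica(tablet, from_node, to_node):
    """Manual load balancing: add the new replica before removing the old one."""
    bootstrap_replica(tablet, from_node, to_node)
    change_config_add(tablet, to_node)
    change_config_remove(tablet, from_node)


if __name__ == "__main__":
    t = Tablet("tablet-1", replicas={"a", "b", "c"}, data_on={"a", "b", "c"})
    replace_failed_replica(t, failed="c", new_node="d")
    print(sorted(t.replicas))  # ['a', 'b', 'd']
```

      In this sketch, move_replica adds the destination before removing the source so the replica count never drops below the target during a manual migration. Whether the real change-config should order the steps that way is a design choice this example assumes rather than one the issue specifies.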

            People

              mpercy Mike Percy
              tlipcon Todd Lipcon
              Votes: 0
              Watchers: 5
