Uploaded image for project: 'Kudu'
  1. Kudu
  2. KUDU-2963

Catalog manager never gives up on CreateTablet RPCs

    XMLWordPrintableJSON

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Bug
    • Affects Version/s: 1.11.0
    • Fix Version/s: NA
    • Component/s: master
    • Labels:

      Description

      This is a problem when there aren't enough live tservers upon which to place a tablet's replicas, or when a chosen tserver doesn't create the replica quickly enough. If the catalog manager decides to replace the tablet, the replaced tablet's CreateTablet RPCs continue to retry ad infinitum. If the previously dead tservers then come back to life, they must needlessly process the CreateTablet RPCs.

      The tablets are eventually deleted, either through explicit DeleteTablet RPCs (triggered by the catalog manager replacement process), or by heartbeating, but it's an unnecessary drain on cluster resources.

      We should probably abort CreateTablet RPCs for tablets that have been removed from their table.

      CreateTableITest_TestCreateWhenMajorityOfReplicasFailCreation demonstrates this acutely.

        Attachments

          Activity

            People

            • Assignee:
              bankim Bankim Bhavsar
              Reporter:
              adar Adar Dembo
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: