Kudu / KUDU-849

TS may miss alter schema command if it races with the first tablet report

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Private Beta
    • Fix Version/s: None
    • Component/s: master
    • Labels: None

    Description

      alter_table-test occasionally fails with a timeout because of something that looks like the following race:

      • TS starts up, and reports its tablet. We get to the part of HandleReportedTablet where we check the schema version, and the schema is up-to-date. However, the thread handling the tablet report hasn't yet gotten to the point of adding the tablet to the replica list of the tablet.
      • A client comes in with an Alter Table request. The master bumps the schema version and then iterates over all the tablets.
        • The tablet doesn't yet have a location for the replica being reported, so no alter request is sent to that TS.
      • We then finish handling the tablet report and add the replica to the list for this tablet.

      After this, the alter table gets "stuck": the replica never receives the alter request. It could be unstuck by restarting the TS (to trigger a new tablet report) or probably by issuing another alter command.
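
      The interleaving described above can be replayed deterministically in a toy model. This is only a sketch, not Kudu's code: the Master and TabletServer classes and all method names are hypothetical, and it assumes that a tablet report carrying a stale schema version makes the master send an alter request, which is what un-sticking by restarting the TS suggests.

```python
from dataclasses import dataclass, field


@dataclass
class TabletServer:
    """Toy tablet server holding one replica (hypothetical, not Kudu's class)."""
    name: str
    schema_version: int = 0

    def alter_schema(self, version: int) -> None:
        # Apply an alter request; never move the version backwards.
        self.schema_version = max(self.schema_version, version)


@dataclass
class Master:
    """Toy master tracking the table's schema version and known replicas."""
    schema_version: int = 0
    replicas: list = field(default_factory=list)

    def check_schema(self, ts: TabletServer) -> bool:
        # First half of report handling: is the reported schema current?
        return ts.schema_version == self.schema_version

    def add_replica(self, ts: TabletServer) -> None:
        # Second half of report handling: record the replica location.
        if ts not in self.replicas:
            self.replicas.append(ts)

    def alter_table(self) -> int:
        # Bump the version, then notify only the replicas known right now.
        self.schema_version += 1
        for ts in self.replicas:
            ts.alter_schema(self.schema_version)
        return len(self.replicas)

    def handle_report(self, ts: TabletServer) -> None:
        # A whole report handled with no interleaving (assumed behavior:
        # a stale reported version triggers an alter request).
        if not self.check_schema(ts):
            ts.alter_schema(self.schema_version)
        self.add_replica(ts)


def simulate_race():
    """Replay the three steps of the race in the problematic order."""
    master, ts = Master(), TabletServer("ts-1")
    up_to_date = master.check_schema(ts)  # report thread: schema looks current
    sent = master.alter_table()           # alter thread: no replica known yet
    master.add_replica(ts)                # report thread: replica added too late
    return master, ts, up_to_date, sent


if __name__ == "__main__":
    master, ts, up_to_date, sent = simulate_race()
    print(up_to_date, sent, ts.schema_version, master.schema_version)
    master.handle_report(ts)              # a fresh report un-sticks the alter
    print(ts.schema_version)
```

      In this model the alter sends zero requests, so the TS stays at version 0 while the master is at version 1. A later full report brings the TS up to date, matching the restart workaround.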

            People

              Assignee: Todd Lipcon (tlipcon)
              Reporter: Todd Lipcon (tlipcon)
              Votes: 0
              Watchers: 1
