Kudu / KUDU-2718

master_failover-itest when HMS is enabled is flaky


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 1.9.0
    • Fix Version/s: NA
    • Component/s: test
    • Labels:
      None

      Description

      This was a failure in HmsConfigurations/MasterFailoverTest.TestDeleteTableSync/1, where GetParam() = 2, but it's likely possible in every multi-master test with HMS integration enabled.

      It looks like there was a leader master election at the time that the client tried to create the table being tested. The master managed to create the table in HMS, but then there was a failure replicating in Raft because another master was elected leader. So the client retried the request on a different master, but the HMS piece of CreateTable failed because the HMS already knew about the table.

      Thing is, there's code to roll back the HMS table creation if this happens, so I don't see why the retried CreateTable failed at the HMS with "table already exists". Perhaps this is a case where, even though we succeeded in dropping the table from the HMS, the HMS doesn't reflect the drop immediately?
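
To make the suspected sequence concrete, here is a toy model of it. This is only a sketch under stated assumptions: `FakeHms`, `master_create_table`, `client_create_table`, and the `drop_lag` knob are all hypothetical names invented for illustration, and none of this is Kudu's catalog manager code or the real HMS API. It assumes the master writes to the HMS first, then replicates via Raft, and drops the HMS entry if replication fails.

```python
class AlreadyExists(Exception):
    """Raised when the (toy) metastore already has a table with that name."""


class NotLeader(Exception):
    """Raised when a master loses leadership before its Raft write commits."""


class FakeHms:
    """Hypothetical stand-in for the HMS. With drop_lag > 0, a dropped table
    stays visible to the next drop_lag create attempts, which models the
    'drop is not reflected immediately' guess from the description."""

    def __init__(self, drop_lag=0):
        self.tables = set()
        self.drop_lag = drop_lag
        self.pending_drops = {}  # table name -> create attempts still seeing it

    def create_table(self, name):
        remaining = self.pending_drops.get(name, 0)
        if remaining > 0:
            # The earlier drop has not become visible yet.
            self.pending_drops[name] = remaining - 1
            raise AlreadyExists(name)
        self.pending_drops.pop(name, None)
        if name in self.tables:
            raise AlreadyExists(name)
        self.tables.add(name)

    def drop_table(self, name):
        self.tables.discard(name)
        if self.drop_lag > 0:
            self.pending_drops[name] = self.drop_lag


def master_create_table(hms, name, is_leader_at_commit):
    """One master's attempt: HMS first, then Raft, rolling back HMS on failure."""
    hms.create_table(name)
    if not is_leader_at_commit:
        hms.drop_table(name)  # assumed rollback of the HMS entry
        raise NotLeader(name)
    return "created"


def client_create_table(hms, name, leadership_per_attempt):
    """Client retries on the next master whenever the current one is not leader."""
    for is_leader in leadership_per_attempt:
        try:
            return master_create_table(hms, name, is_leader)
        except NotLeader:
            continue
    raise NotLeader(name)
```

With `FakeHms()` the retry after a lost election succeeds, which is what the rollback is meant to guarantee. With `FakeHms(drop_lag=1)` the same retry raises `AlreadyExists`, matching the symptom in the log. Whether the real HMS can behave this way is exactly the open question here.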

      I'm attaching the full log.

            People

            • Assignee: Hao Hao (hahao)
            • Reporter: Adar Dembo (adar)
