Kudu / KUDU-2718

master_failover-itest when HMS is enabled is flaky

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 1.9.0
    • Fix Version/s: NA
    • Component/s: test
    • Labels: None

    Description

      This was a failure in HmsConfigurations/MasterFailoverTest.TestDeleteTableSync/1, where GetParam() = 2, but it's likely possible in every multi-master test with HMS integration enabled.

      It looks like there was a leader master election at the time that the client tried to create the table being tested. The master managed to create the table in HMS, but then there was a failure replicating in Raft because another master was elected leader. So the client retried the request on a different master, but the HMS piece of CreateTable failed because the HMS already knew about the table.

      Thing is, there's code to roll back the HMS table creation if this happens, so I don't see why the retried CreateTable failed at the HMS with "table already exists". Perhaps this is a case where, even though we succeeded in dropping the table from the HMS, the HMS doesn't reflect the drop immediately?
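
      To make the suspected sequence concrete, here is a minimal toy simulation in Python. It is only a sketch of the hypothesis above, not Kudu or HMS code: ToyHms, ToyMaster, run_scenario, and the drop_visible_immediately switch are all hypothetical names invented for illustration, and the delayed-drop behavior is an assumption, not something confirmed about the real HMS.

```python
class NotLeaderError(Exception):
    """Raised when Raft replication fails because leadership was lost."""


class TableExistsError(Exception):
    """Raised by the toy HMS when a table name is already taken."""


class ToyHms:
    """Toy stand-in for the Hive Metastore (hypothetical, not the real API)."""

    def __init__(self, drop_visible_immediately):
        self.tables = set()
        self.drop_visible_immediately = drop_visible_immediately

    def create_table(self, name):
        if name in self.tables:
            raise TableExistsError(f"table already exists: {name}")
        self.tables.add(name)

    def drop_table(self, name):
        # Assumption under test: a drop may be acknowledged without being
        # visible to the next request. When the flag is False the table stays.
        if self.drop_visible_immediately:
            self.tables.discard(name)


class ToyMaster:
    """Toy master: creates the table in the HMS first, then replicates."""

    def __init__(self, hms, is_leader):
        self.hms = hms
        self.is_leader = is_leader
        self.catalog = set()

    def create_table(self, name):
        self.hms.create_table(name)
        if not self.is_leader:
            # Replication failed: roll back the HMS entry, then tell the
            # client to retry elsewhere.
            self.hms.drop_table(name)
            raise NotLeaderError("leadership lost during CreateTable")
        self.catalog.add(name)


def run_scenario(drop_visible_immediately):
    """Replays the failover: the first master loses leadership mid-request."""
    hms = ToyHms(drop_visible_immediately)
    old_leader = ToyMaster(hms, is_leader=False)
    new_leader = ToyMaster(hms, is_leader=True)
    for master in (old_leader, new_leader):
        try:
            master.create_table("default.t")
            return "created"
        except NotLeaderError:
            continue  # the client retries on the next master
        except TableExistsError as e:
            return f"failed: {e}"
    return "failed: no leader"


print(run_scenario(drop_visible_immediately=True))
print(run_scenario(drop_visible_immediately=False))
```

      In this toy model the retry succeeds when the rollback is visible right away and fails with "table already exists" only when the drop is not yet visible, which matches the symptom in the log but does not prove that this is the actual cause.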

      I'm attaching the full log.

      Attachments

        1. master_failover-itest.1.txt
          1.36 MB
          Adar Dembo

            People

              Assignee: Hao Hao (hahao)
              Reporter: Adar Dembo (adar)
              Votes: 0
              Watchers: 3
