Uploaded image for project: 'Kudu'
  1. Kudu
  2. KUDU-2735

RemoteKsckTest.TestClusterWithLocation is flaky

    XMLWordPrintableJSON

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.9.0
    • Fix Version/s: n/a
    • Component/s: test
    • Labels:
      None

      Description

      RemoteKsckTest.TestClusterWithLocation is flaky

      Alexey took a look at it and here is the analysis:

      In essence, due to slowness of TSAN builds, connection negotiation from kudu CLI to one of master servers timed out, so one of the preconditions of the test didn't meet.  The error output by the test was:

      /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/ksck_remote-test.cc:523: Failure
      Failed                                                                          
      Bad status: Network error: failed to gather info from all masters: 1 of 3 had errors
      

      The corresponding error in the master's log was:

      W0221 12:38:27.119146 31380 negotiation.cc:313] Failed RPC negotiation. Trace:  
      0221 12:38:23.949428 (+     0us) reactor.cc:583] Submitting negotiation task for client connection to 127.25.42.190:51799
      0221 12:38:25.362220 (+1412792us) negotiation.cc:98] Waiting for socket to connect
      0221 12:38:25.363489 (+  1269us) client_negotiation.cc:167] Beginning negotiation
      0221 12:38:25.369976 (+  6487us) client_negotiation.cc:244] Sending NEGOTIATE NegotiatePB request
      0221 12:38:25.431582 (+ 61606us) client_negotiation.cc:261] Received NEGOTIATE NegotiatePB response
      0221 12:38:25.431610 (+    28us) client_negotiation.cc:355] Received NEGOTIATE response from server
      0221 12:38:25.432659 (+  1049us) client_negotiation.cc:182] Negotiated authn=SASL
      0221 12:38:27.051125 (+1618466us) client_negotiation.cc:483] Received TLS_HANDSHAKE response from server
      0221 12:38:27.062085 (+ 10960us) client_negotiation.cc:471] Sending TLS_HANDSHAKE message to server
      0221 12:38:27.062132 (+    47us) client_negotiation.cc:244] Sending TLS_HANDSHAKE NegotiatePB request
      0221 12:38:27.064391 (+  2259us) negotiation.cc:304] Negotiation complete: Timed out: Client connection negotiation failed: client connection to 127.25.42.190:51799: BlockingWrite timed out
      

      We are seeing this on the flaky test dashboard for both TSAN and ASAN builds.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                wdberkeley William Berkeley
                Reporter:
                mpercy Mike Percy
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: