KUDU-1073

Single TS falling too far behind hung YCSB

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Cannot Reproduce
    • Affects Version/s: Private Beta
    • Fix Version/s: n/a
    • Component/s: client, consensus
    • Labels: None

    Description

      The following sequence of events caused a YCSB job to fail:

      • A tablet server fell behind for some reason (root cause not yet determined; it may simply have been a bit slow).
      • The leader GCed the logs needed to catch it up, and so stopped sending it heartbeats or any other messages.
      • The lagging server had one write pending.
      • The Java client apparently kept retrying, over and over, against that same server.
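
      The second step above comes down to a simple index comparison. The sketch below is a hypothetical model, not Kudu code: the class name, the method names, and the idea of a single "first retained index" are assumptions made for illustration. It shows when a follower can no longer be caught up from the leader's log, and how far GC could advance without stranding a given set of followers.

```java
import java.util.List;

/** Hypothetical model (not Kudu code): can a leader still catch a follower up from its log? */
final class LogRetentionSketch {
    private LogRetentionSketch() {}

    /**
     * @param followerLastIndex        index of the last log entry the follower has received
     * @param leaderFirstRetainedIndex index of the oldest entry the leader still holds after GC
     * @return true if the leader still holds the next entry the follower needs
     */
    static boolean canCatchUp(long followerLastIndex, long leaderFirstRetainedIndex) {
        // The follower needs entry followerLastIndex + 1 next; it must not have been GCed.
        return followerLastIndex + 1 >= leaderFirstRetainedIndex;
    }

    /**
     * Caps the first retained index that GC would like to move to, so that every
     * listed follower can still receive the entry it needs next.
     */
    static long safeFirstRetainedIndex(long desiredFirstRetainedIndex,
                                       List<Long> followerLastIndexes) {
        long limit = desiredFirstRetainedIndex;
        for (long last : followerLastIndexes) {
            limit = Math.min(limit, last + 1);
        }
        return limit;
    }
}
```

      In this model, a follower whose last index is 50 cannot be caught up once the leader retains only entries from 100 onward. Whether retention should really be held back for a slow follower is a design trade-off (disk usage on the leader versus re-copying the replica) that this issue does not settle.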

      The server with the pending transaction may actually have been the leader at the time the write was issued; otherwise it is unclear why the Java client keeps retrying against it. Alternatively, the Java client may have gotten an error from the leader and failed over to the follower, where its RPCs are now timing out.
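
      For contrast with the behavior described above, here is a minimal sketch of a retry loop that moves on to a different replica after each failed attempt and gives up after a bounded number of tries. It is not the Kudu Java client's actual retry logic: `ReplicaRetrySketch`, `writeWithFailover`, and the `tryWrite` callback are hypothetical names, and replicas are represented as plain strings.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch (not the Kudu client): bounded retries that rotate across replicas. */
final class ReplicaRetrySketch {
    private ReplicaRetrySketch() {}

    /**
     * Tries the write against replicas in round-robin order.
     *
     * @param replicas    candidate servers for the tablet
     * @param tryWrite    returns true if the write succeeded on the given replica
     * @param maxAttempts upper bound on total attempts, so the caller cannot hang forever
     * @return the replica that accepted the write, or null if every attempt failed
     */
    static String writeWithFailover(List<String> replicas,
                                    Predicate<String> tryWrite,
                                    int maxAttempts) {
        if (replicas.isEmpty()) {
            return null;
        }
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // Rotate to the next replica instead of retrying the same one.
            String target = replicas.get(attempt % replicas.size());
            if (tryWrite.test(target)) {
                return target;
            }
        }
        return null;
    }
}
```

      A real client would also need backoff between attempts and a way to refresh its view of which replica is the leader; both are omitted here to keep the sketch short.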

            People

              Assignee: jdcryans (Jean-Daniel Cryans)
              Reporter: tlipcon (Todd Lipcon)