Phoenix Tephra / TEPHRA-257

If start() encounters an RPC timeout, an invalid transaction is left behind

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.13.0-incubating
    • Fix Version/s: None
    • Component/s: core
    • Labels: None

    Description

      Suppose the following scenario:

      • a thrift client starts a transaction
      • the server responds, but for whatever reason it is slow
      • by the time the response is sent, the client has timed out the connection
      • now the server has started a transaction, but the client has no knowledge of it
      • that transaction will never be committed or aborted and eventually times out
      • it becomes an invalid transaction
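
      The sequence above can be reproduced with a toy model. This is only a sketch: ToyTxServer, clientStart and expireAll are hypothetical names made up for illustration, not Tephra or Thrift classes, and the delay and timeout are plain numbers rather than real network behavior.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of the scenario; none of these names exist in Tephra. */
public class OrphanedTxDemo {

  /** Hypothetical stand-in for the server-side transaction state. */
  static class ToyTxServer {
    private final AtomicLong nextId = new AtomicLong(1);
    final Set<Long> inProgress = new HashSet<>();
    final Set<Long> invalid = new HashSet<>();

    /** Starts a transaction and records it as in progress. */
    long start() {
      long id = nextId.getAndIncrement();
      inProgress.add(id);
      return id;
    }

    /** Models the server-side timeout: every in-progress transaction becomes invalid. */
    void expireAll() {
      invalid.addAll(inProgress);
      inProgress.clear();
    }
  }

  /** Returns the transaction id the client learns, or null if its RPC timed out first. */
  static Long clientStart(ToyTxServer server, long serverDelayMs, long clientTimeoutMs) {
    long id = server.start();             // the server has already started the transaction
    if (serverDelayMs > clientTimeoutMs) {
      return null;                        // the client gave up before the response arrived
    }
    return id;
  }

  public static void main(String[] args) {
    ToyTxServer server = new ToyTxServer();
    Long seen = clientStart(server, 500, 100);
    System.out.println("client saw: " + seen);
    System.out.println("in progress on server: " + server.inProgress);
    server.expireAll();
    System.out.println("invalid after timeout: " + server.invalid);
  }
}
```

      In this model the client never learns the id, so nothing can commit or abort the transaction before the server-side timeout moves it to the invalid set.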

      This is a common scenario when HDFS is slow and the write load is high: a lot of change ids have to be written to a slow transaction log, so responses are delayed. Under these conditions we generate invalid transactions systematically, which eventually degrades the performance of the entire system.

      It would be good if the server could detect this situation and abort the transaction immediately. This is safe to do whenever sending the response fails, because we then know that the client did not receive it, and hence it will not generate data with that transaction id.

      This is a tricky change, though: Thrift does not give us a way to intercept exceptions from socket failures. We would have to copy a Thrift class (ProcessFunction) and change it to handle exceptions that occur during the write of the response.
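
      The proposed behavior can be sketched in plain Java, independent of Thrift. Everything here is an assumption for illustration: ToyTxManager, ResponseWriter and handleStart are hypothetical names, and ResponseWriter merely stands in for the point where the response is written, which in the real server sits inside Thrift's ProcessFunction as noted above.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

/** Sketch of "abort if the response cannot be sent"; not Tephra's implementation. */
public class AbortOnFailedResponse {

  /** Hypothetical minimal transaction manager. */
  static class ToyTxManager {
    private long nextId = 1;
    final Set<Long> inProgress = new HashSet<>();

    long start() {
      long id = nextId++;
      inProgress.add(id);
      return id;
    }

    void abort(long id) {
      inProgress.remove(id);
    }
  }

  /** Hypothetical stand-in for writing and flushing the RPC response. */
  interface ResponseWriter {
    void write(long txId) throws IOException;
  }

  /** Starts a transaction and aborts it again if the response write fails. */
  static void handleStart(ToyTxManager txManager, ResponseWriter out) throws IOException {
    long txId = txManager.start();
    try {
      out.write(txId);
    } catch (IOException e) {
      // The client never received txId, so it cannot have written data under it.
      txManager.abort(txId);
      throw e;
    }
  }

  public static void main(String[] args) {
    ToyTxManager txManager = new ToyTxManager();
    try {
      // Simulate a client that has already closed its connection.
      handleStart(txManager, id -> { throw new IOException("connection closed by client"); });
    } catch (IOException e) {
      System.out.println("write failed: " + e.getMessage());
    }
    System.out.println("in progress after failed write: " + txManager.inProgress);
  }
}
```

      The sketch only covers failures that the server can observe while writing. A response that is written successfully but arrives after the client has given up would still leave the transaction to time out.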

          People

            Assignee: Poorna Chandra (poorna)
            Reporter: Andreas Neumann (anew)
