KUDU-1218

Under pressure client will retry write only to find that a previous attempt succeeded

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: Public beta
    • Fix Version/s: None
    • Component/s: client, impala, tserver
    • Labels: None

      Description

      While inserting a large data set into Kudu from Impala, JD and I observed the following issue: at some point the writes appear to be throttled, ending in either a timeout or an explicit rejection. The C++ client then retries the operation. However, by this point the previous attempt has already succeeded on the server, so the retried write fails with "Row already exists in MemRowSet".
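The sequence described above can be reproduced with a toy model. This is only a sketch under stated assumptions: `ToyTabletServer`, `AlreadyPresent`, and `insert_with_naive_retry` are hypothetical names invented for illustration and are not part of the Kudu client or server API. The server applies the write but the acknowledgement is lost, so the client's retry hits the row it just inserted.

```python
class AlreadyPresent(Exception):
    """Stands in for the "Row already exists in MemRowSet" error."""


class ToyTabletServer:
    """Hypothetical in-memory stand-in for a tablet server (not the Kudu API)."""

    def __init__(self, drop_first_response=True):
        self.rows = {}
        self.drop_next_response = drop_first_response

    def insert(self, key, value):
        if key in self.rows:
            raise AlreadyPresent(f"Row already exists: {key!r}")
        self.rows[key] = value  # the write is applied on the server...
        if self.drop_next_response:
            self.drop_next_response = False
            # ...but the client never sees the acknowledgement.
            raise TimeoutError("response not delivered in time")


def insert_with_naive_retry(server, key, value, max_attempts=3):
    """Retry on timeout without knowing whether the earlier attempt was applied."""
    for _ in range(max_attempts):
        try:
            server.insert(key, value)
            return "ok"
        except TimeoutError:
            # The client cannot tell "not applied" from "applied, ack lost".
            continue
    return "gave up"


server = ToyTabletServer()
try:
    outcome = insert_with_naive_retry(server, key=1, value="a")
except AlreadyPresent as err:
    outcome = f"failed: {err}"
```

In this model the row is stored exactly once, yet the caller sees a failure, which mirrors what Impala observes in the report.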

      This behavior is very unfortunate, since Impala will believe that the data is corrupt even though the actual error is deeper in the communication between the client and Kudu.

      I think we will need some additional information to track whether a timed-out or rejected write op will still be processed in Kudu even though the client is forced to retry.
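One possible shape for that additional information is a per-request identifier that stays fixed across retries, so the server can recognize a retry and return the recorded outcome of the first attempt. The following is a minimal sketch of that idea, not a description of how Kudu resolved the issue; `DedupTabletServer`, `RetryingClient`, and the `(client_id, sequence number)` request ID are all assumptions made for illustration.

```python
import itertools


class DedupTabletServer:
    """Hypothetical server that remembers the outcome of each request ID."""

    def __init__(self, drop_first_response=True):
        self.rows = {}
        self.completed = {}  # request_id -> outcome recorded on first execution
        self.drop_next_response = drop_first_response

    def insert(self, request_id, key, value):
        if request_id in self.completed:
            # A retry of a request that was already processed: replay the outcome.
            result = self.completed[request_id]
        else:
            if key in self.rows:
                result = "already_present"
            else:
                self.rows[key] = value
                result = "ok"
            self.completed[request_id] = result
        if self.drop_next_response:
            self.drop_next_response = False
            raise TimeoutError("response not delivered in time")
        return result


class RetryingClient:
    """Hypothetical client that reuses one request ID for all attempts of a write."""

    def __init__(self, server, client_id):
        self.server = server
        self.client_id = client_id
        self._seq = itertools.count()

    def insert(self, key, value, max_attempts=3):
        request_id = (self.client_id, next(self._seq))  # fixed across retries
        for _ in range(max_attempts):
            try:
                return self.server.insert(request_id, key, value)
            except TimeoutError:
                continue
        return "gave up"


server = DedupTabletServer()
client = RetryingClient(server, client_id="impala-node-1")
first = client.insert(1, "a")   # ack of the first attempt is lost, the retry succeeds
second = client.insert(1, "b")  # a genuinely new request for an existing key
```

With this scheme a retry of the same logical write reports success, while a distinct write to an existing key still reports a duplicate. A real implementation would also need to bound and expire the `completed` table, which this sketch ignores.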

      This is critical because an insert will look as if it inserts the same row twice and aborts, even though the row was already inserted. This leaves the system in an inconsistent state from Impala's perspective.

              People

              • Assignee: Unassigned
              • Reporter: Martin Grund (mgrund)
              • Votes: 0
              • Watchers: 1

                Dates

                • Created:
                  Updated:
                  Resolved: