Solr / SOLR-7347

clock skew can cause data loss


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: SolrCloud

    Description

      The high bits of each version are derived from the system clock, so versions chosen by different leaders are only ordered correctly if those leaders' clocks agree.
      If a new leader's clock is behind the old leader's by roughly the time between the old leader receiving its last update and another replica becoming leader (or more), any updates to the same document can be lost until the new leader's clock catches up with the old leader's clock.

      1) replica1 is the leader and indexes document A, choosing version X (and forwards to replicas)
      2) replica1 goes down
      3) replica2 becomes the new leader
      4) replica2 indexes an update for document A, and chooses version Y (which is less than X due to clock skew) and forwards to replica3
      5) replica3 checks for reordered updates, finds version X and thus drops version Y
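
The five steps above can be reproduced with a small simulation. This is only a sketch and not Solr's actual code: the class names `VersionClock` and `Replica`, the function `simulate`, the 20-bit shift, and the simplified reorder check are all assumptions made for illustration.

```python
LOW_BITS = 20  # assumption: low-order bits reserved below the clock value


class VersionClock:
    """Hypothetical leader-side version generator driven by a wall clock."""

    def __init__(self, now_ms):
        self.now_ms = now_ms  # callable returning this node's clock in ms
        self.last = 0

    def new_version(self):
        candidate = self.now_ms() << LOW_BITS
        # Monotonic on this node only; knows nothing about other leaders.
        if candidate <= self.last:
            candidate = self.last + 1
        self.last = candidate
        return candidate


class Replica:
    """Hypothetical follower that drops updates it believes are reordered."""

    def __init__(self):
        self.docs = {}  # doc id -> (version, value)

    def apply_forwarded(self, doc_id, version, value):
        current = self.docs.get(doc_id)
        if current is not None and current[0] >= version:
            return False  # looks like an old, reordered update: dropped
        self.docs[doc_id] = (version, value)
        return True


def simulate(skew_ms, failover_ms=50):
    """Run steps 1-5; replica2's clock is skew_ms behind replica1's."""
    true_time = 1_000_000
    replica1 = VersionClock(lambda: true_time)
    replica2 = VersionClock(lambda: true_time + failover_ms - skew_ms)
    replica3 = Replica()

    x = replica1.new_version()                   # step 1: leader picks X
    replica3.apply_forwarded("A", x, "first")    # ...and forwards it
    # steps 2-3: replica1 goes down, replica2 becomes leader
    y = replica2.new_version()                   # step 4: new leader picks Y
    accepted = replica3.apply_forwarded("A", y, "second")  # step 5
    return x, y, accepted
```

With `simulate(skew_ms=5000)` the second update is dropped because Y is less than X; with `simulate(skew_ms=0)` it is accepted. In this model the update is lost whenever the skew is at least the time between the two updates.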

      This should be rare... you need a big enough clock skew and updates to the same document with different leaders within that time window. We should still fix this of course.


            People

              Assignee: Unassigned
              Reporter: Yonik Seeley
              Votes: 0
              Watchers: 1
