Solr / SOLR-12969

Inconsistency with leader when PeerSync returns ALREADY_IN_SYNC

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 6.6.5, 7.5
    • Fix Version/s: 8.0
    • Component/s: replication (java)
    • Labels: None

    Description

      Under certain circumstances, replication fails between a leader and follower.  The follower will not receive updates from the leader, even though the leader has a newer version.  If the leader is restarted, it will get the older version from the follower.

       

      This was discussed on the mailing list and risdenk wrote a script that demonstrates this error.  He also verified that the error occurs when the script is run outside of docker.

       

      Here is the scenario of the failure:

      • A collection with 1 shard and 2 replicas
      • Stop the non-leader replica (B)
      • Index more than 100 documents into the collection
      • Start replica B; PeerSync fails, so it falls back to segment replication
      • Index the 101st document into the collection
        • Leader's tlog: [1, 2, 3, ..., 100, 101]
        • Replica's tlog: [101]
      • Stop replica B
      • Index the 102nd document into the collection
      • Start replica B; when it runs PeerSync:
        • Leader's tlog: [1, 2, 3, ..., 100, 101, 102]
        • Replica's tlog: [101]
        • Leader's high (80th): 80
        • Replica's low: 101
        • By comparison: replica's low > leader's high => ALREADY_IN_SYNC, so replica B never receives document 102
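
The comparison in the last step can be reproduced with a small, self-contained Java model. This is only a sketch of the scenario's arithmetic, not Solr's actual PeerSync code: the class and method names are hypothetical, "high" is assumed to be the 80th-oldest leader version (as the numbers above suggest), and "low" is assumed to be the replica's oldest tlog version. The real implementation may compute its thresholds differently.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

/** Simplified model of the version-window comparison in the scenario (hypothetical names). */
final class PeerSyncWindowSketch {

    enum Outcome { ALREADY_IN_SYNC, NEEDS_SYNC }

    // Assumption taken from the scenario: "high" is the 80th-oldest leader version.
    static final int HIGH_POSITION = 80;

    /** Leader's "high" threshold; expects a non-empty tlog. */
    static long leaderHigh(List<Long> leaderTlog) {
        List<Long> sorted = leaderTlog.stream().sorted().collect(Collectors.toList());
        int index = Math.min(HIGH_POSITION, sorted.size()) - 1;
        return sorted.get(index);
    }

    /** Replica's "low" threshold, modelled as its oldest tlog version; expects a non-empty tlog. */
    static long replicaLow(List<Long> replicaTlog) {
        return Collections.min(replicaTlog);
    }

    /** The check from the scenario: replica's low > leader's high => ALREADY_IN_SYNC. */
    static Outcome compare(List<Long> leaderTlog, List<Long> replicaTlog) {
        return replicaLow(replicaTlog) > leaderHigh(leaderTlog)
                ? Outcome.ALREADY_IN_SYNC
                : Outcome.NEEDS_SYNC;
    }

    /** Leader versions that are newer than anything in the replica's tlog. */
    static List<Long> newerThanReplica(List<Long> leaderTlog, List<Long> replicaTlog) {
        long replicaNewest = Collections.max(replicaTlog);
        return leaderTlog.stream()
                .filter(version -> version > replicaNewest)
                .sorted()
                .collect(Collectors.toList());
    }

    /** Helper: versions from..to inclusive. */
    static List<Long> range(long from, long to) {
        return LongStream.rangeClosed(from, to).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Long> leader = range(1, 102);          // leader's tlog: [1, ..., 102]
        List<Long> replica = Arrays.asList(101L);   // replica's tlog: [101]

        System.out.println(compare(leader, replica));          // prints ALREADY_IN_SYNC
        System.out.println(newerThanReplica(leader, replica)); // prints [102]
    }
}
```

In this model the check reports ALREADY_IN_SYNC even though version 102 exists only on the leader, which is the inconsistency described in this issue.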

      Attachments

        1. SOLR-12969.patch
          20 kB
          Cao Manh Dat
        2. SOLR-12969.patch
          25 kB
          Cao Manh Dat
        3. SOLR-12969.patch
          26 kB
          Cao Manh Dat

          People

            Assignee: Cao Manh Dat (caomanhdat)
            Reporter: Jeremy Smith