Ratis / RATIS-2148

Snapshot transfer may cause followers to trigger reloadStateMachine incorrectly

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.1
    • Component/s: snapshot
    • Labels: None

    Description

      The gRPC streaming snapshot sender sends all snapshot requests at once and performs error handling only after everything has been sent. The last snapshot request is used as the completion flag. As a result, the last request can be received successfully even though an earlier request has failed. The sender handles the failure event when it retransmits the snapshot. The receiver, however, triggers state.reloadStateMachine because it successfully received the last request, even though the snapshot it received is incomplete.
       
      1. An MD5 mismatch exception occurred before the last SnapshotRequest was received.
      2. The last snapshot request then arrived, was received successfully, and the index was updated.
      3. The snapshot reception was nevertheless incomplete, and reloadStateMachine was triggered.

       
      I suggest using a flag to record whether any request in the snapshot transfer has failed.
      Once an exception occurs, the remaining requests of that transfer are not processed.
      Alternatively, the sender could wait for the receiver's reply to each request and resend it if the reply reports an error.
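
The first suggestion could look roughly like the sketch below. It is only an illustration of the idea, not the Ratis implementation: `SnapshotStreamGuard`, `Result`, and `onChunk` are hypothetical names, and the checksum result is passed in as a boolean rather than computed. The point is that the completion flag on the last request is honored only if no earlier request in the same transfer failed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical receiver-side guard for one snapshot transfer; not a Ratis class. */
final class SnapshotStreamGuard {
  /** Outcome of handling a single snapshot request (chunk). */
  enum Result { ACCEPTED, REJECTED, COMPLETED }

  // Set once any chunk of this transfer fails; never reset for the same transfer.
  private boolean failed = false;
  // Indices of the chunks that were accepted before any failure.
  private final List<Integer> accepted = new ArrayList<>();

  /**
   * @param chunkIndex index of the chunk within the transfer
   * @param digestOk   whether the chunk passed its checksum (for example MD5) check
   * @param done       whether the sender marked this chunk as the last one
   */
  Result onChunk(int chunkIndex, boolean digestOk, boolean done) {
    if (failed) {
      // An earlier chunk failed: ignore the rest, including the final chunk.
      return Result.REJECTED;
    }
    if (!digestOk) {
      failed = true;
      return Result.REJECTED;
    }
    accepted.add(chunkIndex);
    // Only a clean transfer may report completion.
    return done ? Result.COMPLETED : Result.ACCEPTED;
  }

  boolean hasFailed() {
    return failed;
  }

  List<Integer> acceptedChunks() {
    return accepted;
  }
}
```

With a guard like this, the caller would update the snapshot index and reload the state machine only on `COMPLETED`; a `REJECTED` final chunk leaves the follower waiting for the retransmitted snapshot.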
       
      Finally, the current retry granularity is the entire snapshot directory rather than a single chunk, so a failure causes a large number of snapshot files to be sent again. This can be optimized later.
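
As a rough illustration of what chunk-level retry might involve, the sketch below computes which chunks still need to be sent, given the set the receiver has acknowledged. It assumes per-chunk acknowledgements exist, which the issue does not state; `ChunkRetryPlanner` and `chunksToResend` are hypothetical names, not part of Ratis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for chunk-level retry; not a Ratis class. */
final class ChunkRetryPlanner {
  private ChunkRetryPlanner() {
  }

  /**
   * Returns the indices in [0, totalChunks) that the receiver has not acknowledged,
   * in ascending order. Only these would be resent instead of the whole directory.
   */
  static List<Integer> chunksToResend(int totalChunks, Set<Integer> acknowledged) {
    List<Integer> pending = new ArrayList<>();
    for (int i = 0; i < totalChunks; i++) {
      if (!acknowledged.contains(i)) {
        pending.add(i);
      }
    }
    return pending;
  }
}
```

For a transfer of five chunks where chunks 0, 1, and 3 were acknowledged, this returns chunks 2 and 4.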

      Attachments

        1. image-2024-09-03-14-24-25-652.png
          34 kB
          yuuka
        2. image-2024-09-03-14-25-22-174.png
          40 kB
          yuuka
        3. image-2024-09-03-14-27-39-406.png
          40 kB
          yuuka
        4. image-2024-09-03-14-28-31-529.png
          34 kB
          yuuka
        5. image-2024-09-03-14-30-02-751.png
          114 kB
          yuuka
        6. image-2024-09-03-14-33-40-760.png
          285 kB
          yuuka
        7. image-2024-09-03-14-33-49-573.png
          285 kB
          yuuka

            People

              Assignee: tohsakarin__ yuuka
              Reporter: tohsakarin__ yuuka
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2h 10m