Solr / SOLR-8709

Add checksum to the TopicStream to ensure delivery of all documents within a Topic


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      Currently, the TopicStream can miss documents if version numbers arrive out of order. Because the TopicStream sorts on version number, it will only miss out-of-order versions that span commit boundaries. Stress testing was not able to reproduce a missed-document scenario (see comment below), but code review points to the possibility of this happening.

      To resolve this issue, we can keep a checksum of the version numbers for a sliding time window. The checksum can be verified on each run, and if the checksums don't match, the documents from the time window can be resent. As long as the time window is longer than the softCommit interval, this will guarantee delivery of all documents for the Topic. This won't guarantee one-time delivery, but it should provide a reasonable expectation of one-time delivery.
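
      A minimal sketch of the sliding-window check described above, for discussion only. The class and method names (TopicWindowChecksum, recordDelivered, needsResend) are hypothetical and are not existing Solr APIs. The checksum function is also an assumption: the description does not say which one to use, so this sketch uses a wrapping sum of version numbers because it is order-independent.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr: tracks which versions a topic has
// delivered inside a sliding time window and compares them to the index.
class TopicWindowChecksum {
    private final long windowMs;
    // version -> document timestamp (ms) for documents already delivered
    private final Map<Long, Long> delivered = new HashMap<>();

    TopicWindowChecksum(long windowMs) {
        this.windowMs = windowMs;
    }

    // Remember that a document with this version was sent to the consumer.
    void recordDelivered(long version, long docTimestampMs) {
        delivered.put(version, docTimestampMs);
    }

    // Order-independent checksum (wrapping sum) of a set of versions.
    // Assumption: a simple sum; a stronger hash would lower collision risk.
    static long checksum(long[] versions) {
        long sum = 0L;
        for (long v : versions) {
            sum += v;
        }
        return sum;
    }

    // Checksum of the delivered versions whose timestamp is still in the window.
    long deliveredChecksum(long nowMs) {
        long cutoff = nowMs - windowMs;
        // Forget entries that have slid out of the window.
        delivered.values().removeIf(ts -> ts < cutoff);
        long sum = 0L;
        for (long v : delivered.keySet()) {
            sum += v;
        }
        return sum;
    }

    // True when the index holds versions in the window that were not delivered
    // (or vice versa), meaning the window's documents should be resent.
    boolean needsResend(long nowMs, long[] indexedVersionsInWindow) {
        return deliveredChecksum(nowMs) != checksum(indexedVersionsInWindow);
    }
}
```

      On each run the caller would pass in the versions the index currently reports for the window. If needsResend returns true, the documents from that window are sent again, which is why delivery can happen more than once. The window must be longer than the softCommit interval for a late-committed version to still be inside it when the check runs.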

          People

            Assignee: Unassigned
            Reporter: Joel Bernstein (jbernste)
