  1. Solr
  2. SOLR-13050

SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection leader

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.7, 8.0
    • Component/s: None
    • Labels: None

    Description

      The autoscaling SystemLogListener records event history in the .system collection, which creates a chicken/egg problem: when a nodeLost event is for the node hosting the .system collection's leader, there is a window of time during leader election in which adding the "Document" that represents that nodeLost event to the .system collection can fail.

      This isn't a silent failure: the SystemLogListener, acting in the role of a Solr client, is informed that the "add" failed, but it doesn't/can't do much to deal with the situation other than "log" (to the slf4j Logger) that it wasn't able to add the doc.
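
      To make the failure window concrete, here is a minimal, self-contained sketch. It does not use Solr's classes: `EventStore`, `ElectingStore`, `recordOnce` and `recordWithRetry` are hypothetical names invented for illustration. `ElectingStore` stands in for a .system collection that rejects adds while it has no leader, `recordOnce` mimics the log-and-move-on behavior described above, and `recordWithRetry` shows one possible mitigation (bounded retries). The retry is an assumption for discussion only, not a description of how this issue was actually resolved.

```java
import java.util.ArrayList;
import java.util.List;

public class NodeLostLoggingSketch {

    /** Hypothetical stand-in for the client call that adds a document to .system. */
    interface EventStore {
        void add(String eventDoc) throws Exception;
    }

    /** Simulates a collection with no leader for the first few add attempts. */
    static class ElectingStore implements EventStore {
        private int failuresLeft;
        final List<String> stored = new ArrayList<>();

        ElectingStore(int failuresLeft) {
            this.failuresLeft = failuresLeft;
        }

        @Override
        public void add(String eventDoc) throws Exception {
            if (failuresLeft > 0) {
                failuresLeft--;
                throw new Exception("no leader available for .system");
            }
            stored.add(eventDoc);
        }
    }

    /** Behavior described in the issue: a single attempt, and only a log line on failure. */
    static boolean recordOnce(EventStore store, String eventDoc) {
        try {
            store.add(eventDoc);
            return true;
        } catch (Exception e) {
            System.err.println("Could not record event: " + e.getMessage());
            return false;
        }
    }

    /** Hypothetical mitigation: retry a bounded number of times, pausing between attempts. */
    static boolean recordWithRetry(EventStore store, String eventDoc,
                                   int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (recordOnce(store, eventDoc)) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt status and give up.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Single attempt during the election window: the event record is lost.
        ElectingStore first = new ElectingStore(2);
        System.out.println("recorded once: " + recordOnce(first, "nodeLost"));

        // Bounded retries that outlast the window: the event record is kept.
        ElectingStore second = new ElectingStore(2);
        System.out.println("recorded with retry: "
                + recordWithRetry(second, "nodeLost", 5, 10L));
    }
}
```

      In the sketch, the single attempt fails because the simulated election is still in progress, so nothing is stored; the retrying variant succeeds only if the election finishes before the attempts run out, which is why retries narrow the window rather than close it.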


      I'm not sure how much of a "real world" impact this has on users, but I noticed the issue while diagnosing a jenkins test failure and wanted to track it.

      Attachments

        1. SOLR-13050.test-workaround.patch
          4 kB
          Chris M. Hostetter
        2. jenkins.sarowe__Lucene-Solr-tests-7.x__7104.log.txt
          2.19 MB
          Chris M. Hostetter

          People

            Assignee: Andrzej Bialecki (ab)
            Reporter: Chris M. Hostetter (hossman)
            Votes: 0
            Watchers: 2
