Solr / SOLR-13050

SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection leader

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.7, 8.0
    • Component/s: None
    • Labels:
      None

      Description

      A chicken-and-egg problem with the way the autoscaling SystemLogListener uses the .system collection to record event history is that, in the case of a nodeLost event for the .system collection's leader, there is a window of time during leader election in which trying to add the "Document" representing that nodeLost event to the .system collection can fail.

      This isn't a silent failure: the SystemLogListener, acting in the role of a Solr client, is informed that the "add" failed, but it doesn't/can't do much to deal with this situation other than to "log" (to the slf4j Logger) that it wasn't able to add the doc.
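
      To make the window concrete, here is a minimal, self-contained Java sketch of the failure mode. It does not use any Solr or SolrJ API: EventStore, ElectingStore and addWithRetry are hypothetical names invented for illustration, and the retry loop is only one conceivable mitigation, not necessarily what the eventual fix does. A single attempt (the behavior described above) logs the failure and the event is never recorded; retrying past the simulated election records it.

```java
import java.util.ArrayList;
import java.util.List;

public class NodeLostRetrySketch {

    /** Hypothetical stand-in for "add a document to the .system collection". */
    interface EventStore {
        void add(String eventDoc) throws Exception;
    }

    /** Simulates a collection that has no leader for the first few add attempts. */
    static class ElectingStore implements EventStore {
        private int failuresLeft;
        final List<String> docs = new ArrayList<>();

        ElectingStore(int failuresLeft) {
            this.failuresLeft = failuresLeft;
        }

        @Override
        public void add(String eventDoc) throws Exception {
            if (failuresLeft > 0) {
                failuresLeft--;
                throw new Exception("no leader available (election in progress)");
            }
            docs.add(eventDoc);
        }
    }

    /** Tries the add up to maxAttempts times; returns true if the event was recorded. */
    static boolean addWithRetry(EventStore store, String eventDoc,
                                int maxAttempts, long pauseMillis) throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                store.add(eventDoc);
                return true;
            } catch (Exception e) {
                // With maxAttempts == 1 this is all that happens: log and give up.
                System.err.println("add failed (attempt " + attempt + "): " + e.getMessage());
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // One attempt while the election is in progress: the event is lost.
        ElectingStore single = new ElectingStore(2);
        System.out.println("single attempt recorded: " + addWithRetry(single, "nodeLost", 1, 0));

        // Several attempts: the add succeeds once the simulated election is over.
        ElectingStore retried = new ElectingStore(2);
        System.out.println("with retries recorded:   " + addWithRetry(retried, "nodeLost", 5, 10));
    }
}
```

      In a real cluster the length of the election window is not known in advance, so any attempt count and pause would be a tuning guess rather than a guarantee.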


      I'm not sure how much of a "real world" impact this has on users, but I noticed the issue while diagnosing a Jenkins test failure and wanted to track it.

        Attachments

        1. SOLR-13050.test-workaround.patch
          4 kB
          Chris M. Hostetter
        2. jenkins.sarowe__Lucene-Solr-tests-7.x__7104.log.txt
          2.19 MB
          Chris M. Hostetter

            People

            • Assignee: Andrzej Bialecki (ab)
            • Reporter: Chris M. Hostetter (hossman)