SOLR-5805

SolrCloud: run a healthcheck in a background thread

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.7
    • Fix Version/s: None
    • Component/s: SolrCloud
    • Labels: None

    Description

      From a discussion on the mailing list:

      We had a brief SolrCloud outage this weekend when a node's SSD began to fail, but the node still appeared to be up to the rest of the SolrCloud cluster (i.e. still green in clusterstate.json). Distributed queries that reached this node would fail, but whatever heartbeat keeps the node in clusterstate.json must have continued to succeed.

      We eventually had to power the node down to get it to be removed from clusterstate.json.

      Mark Miller:
      "One simple improvement might even be a background thread that periodically checks some local readings and, depending on the results, pulls itself out of the mix as best it can (removes itself from clusterstate.json or simply closes its zk connection)."
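
The background-thread idea quoted above could look roughly like the following. This is a minimal sketch, not Solr code: `LocalHealthMonitor`, `canWriteAndSync`, the failure threshold, and the `onUnhealthy` callback are all hypothetical names chosen for illustration. The "local reading" here is assumed to be a small write-and-fsync probe against the data directory, and what the node actually does to leave the cluster (for example closing its ZooKeeper connection) is left to the caller-supplied callback.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a periodic local health check; not part of Solr. */
class LocalHealthMonitor implements AutoCloseable {
    private final BooleanSupplier probe;        // returns true when the node looks healthy
    private final int maxConsecutiveFailures;   // assumed threshold before giving up
    private final Runnable onUnhealthy;         // e.g. close the ZooKeeper connection (caller's choice)
    private final AtomicBoolean tripped = new AtomicBoolean(false);
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "local-health-check");
            t.setDaemon(true); // do not keep the JVM alive just for the check
            return t;
        });
    private int consecutiveFailures = 0;

    LocalHealthMonitor(BooleanSupplier probe, int maxConsecutiveFailures, Runnable onUnhealthy) {
        this.probe = probe;
        this.maxConsecutiveFailures = maxConsecutiveFailures;
        this.onUnhealthy = onUnhealthy;
    }

    /** Schedules the check to run repeatedly in the background thread. */
    void start(long period, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(this::checkOnce, period, period, unit);
    }

    /** Runs one probe; fires onUnhealthy once after enough consecutive failures. */
    synchronized void checkOnce() {
        boolean healthy;
        try {
            healthy = probe.getAsBoolean();
        } catch (RuntimeException e) {
            healthy = false; // a throwing probe counts as a failed reading
        }
        if (healthy) {
            consecutiveFailures = 0;
            return;
        }
        consecutiveFailures++;
        if (consecutiveFailures >= maxConsecutiveFailures && tripped.compareAndSet(false, true)) {
            onUnhealthy.run();
        }
    }

    boolean isTripped() {
        return tripped.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    /** Example probe: write one byte to a temp file in dir and force it to disk. */
    static boolean canWriteAndSync(Path dir) {
        Path probeFile = null;
        try {
            probeFile = Files.createTempFile(dir, "health", ".probe");
            try (FileChannel ch = FileChannel.open(probeFile, StandardOpenOption.WRITE)) {
                ch.write(ByteBuffer.wrap(new byte[] {1}));
                ch.force(true); // flush data and metadata to the device
            }
            return true;
        } catch (IOException e) {
            return false;
        } finally {
            if (probeFile != null) {
                try {
                    Files.deleteIfExists(probeFile);
                } catch (IOException ignored) {
                    // best-effort cleanup of the probe file
                }
            }
        }
    }
}
```

A node might wire this up as `new LocalHealthMonitor(() -> LocalHealthMonitor.canWriteAndSync(dataDir), 3, leaveCluster).start(10, TimeUnit.SECONDS)`, where `dataDir` and `leaveCluster` are supplied by the caller. Requiring several consecutive failures is one way to avoid dropping out of the cluster on a single transient error. Note that a failing disk may hang I/O rather than return an error, so a real probe would probably also need a timeout, which this sketch omits.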

          People

            Assignee: Unassigned
            Reporter: Gregg Donovan (greggny3)
            Votes: 3
            Watchers: 7