Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
6.6.2
-
None
-
None
-
None
Description
If a SOLR node gets a corrupted checksum in its index, it will still report as "healthy" but all writes routing to that node will fail. Thus leading to a total write outage which is totally silent server side.
I think this behavior should report within the cluster status. I think "Down" is appropriate, and then that should lead to a recovery triggering. Perhaps though a new category of "CorruptIndex" or something is more appropriate. However it happens, I believe the right response is (a) release leader lock if held (b) once new leader is elected, copy that leader's index