Solr currently "supports" disabling the updateLog, even though the docs describe it as "required" for NRT replicas. However, when the updateLog is disabled and a replica is in the BUFFERING state (e.g. during MigrateCmd or SplitShardCmd), updates are silently lost. While most users will likely never consider disabling the updateLog, it seems worthwhile to offer a better-supported option.
Options as discussed in ASF Slack:
- No longer support disabling the updateLog as it is considered an integral feature in SolrCloud. This might be undesirable for use cases where some data loss is acceptable and the updateLog takes up too much space.
- Improve Solr documentation to explicitly outline the risks of disabling the updateLog.
- Add logging to indicate when an update is swallowed in this state.
- My preferred option: Support disabling the updateLog by providing additional replica states besides BUFFERING, so that no data is lost when the updateLog is disabled and a replica goes offline for an operation like a split. Some ideas:
  - REJECTING: Fail updates so that the client can retry once the operation is complete.
  - BLOCKING: Stall the update until the operation is complete, and then execute it.
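
To make the two proposed states a bit more concrete, here is a rough, standalone sketch of the semantics I have in mind. It does not use any Solr classes: `UpdateGate`, `UpdateGateState`, `UpdateRejectedException`, and the timeout parameter are all hypothetical names made up for illustration, and updates are plain strings. A real change would presumably hook into the existing update processing path instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical states for illustration; these are not existing Solr states.
enum UpdateGateState { ACTIVE, REJECTING, BLOCKING }

// Hypothetical exception a client could treat as "retry later".
class UpdateRejectedException extends RuntimeException {
    UpdateRejectedException(String message) { super(message); }
}

// Hypothetical gate that decides what happens to an incoming update.
class UpdateGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition stateChanged = lock.newCondition();
    private final List<String> applied = new ArrayList<>();
    private UpdateGateState state = UpdateGateState.ACTIVE;

    // Switch state and wake up any updates stalled in BLOCKING.
    void setState(UpdateGateState newState) {
        lock.lock();
        try {
            state = newState;
            stateChanged.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // ACTIVE: apply. REJECTING: fail fast. BLOCKING: wait up to timeoutMs, then re-check.
    void submit(String update, long timeoutMs) {
        lock.lock();
        try {
            long remainingNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            while (state == UpdateGateState.BLOCKING) {
                if (remainingNanos <= 0) {
                    throw new UpdateRejectedException("timed out waiting for operation to finish");
                }
                remainingNanos = stateChanged.awaitNanos(remainingNanos);
            }
            if (state == UpdateGateState.REJECTING) {
                throw new UpdateRejectedException("replica is not accepting updates; retry later");
            }
            applied.add(update);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new UpdateRejectedException("interrupted while waiting");
        } finally {
            lock.unlock();
        }
    }

    // Snapshot of the updates that were actually applied.
    List<String> applied() {
        lock.lock();
        try {
            return new ArrayList<>(applied);
        } finally {
            lock.unlock();
        }
    }
}
```

In both states the update is either applied or the caller gets an explicit failure, so nothing is dropped silently. The timeout in the BLOCKING path is my own addition to keep a long-running operation from stalling clients indefinitely; whether that is desirable is open for discussion.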
Feedback is welcome; once we establish a path forward I'd be happy to pick it up. If others are interested I can document my findings as well.