Description
Spin-off from SOLR-11449:
A section of code in moveNormalReplica ensures that we don't lose a shard leader during a move. There's no corresponding protection in moveHdfsReplica, which means that moving a replica that is also a shard leader may lead to data loss (e.g. when replicationFactor=1).
Also, there's no rollback strategy when moveHdfsReplica partially fails, unlike in moveNormalReplica, where the code simply skips deleting the original replica. It seems that the code should attempt to restore the original replica in this case? When RF=1 and such a failure occurs, not restoring the original replica means a lost shard.
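To illustrate the kind of rollback suggested above, here is a minimal, self-contained sketch. It is not Solr code: `Cluster`, `addReplica`, `deleteReplicaKeepData` and `moveReplica` are hypothetical names that model cluster state in memory. It assumes the HDFS-style move removes the original replica first (keeping its data) and then creates the new one, so a failure in the second step has to be undone by re-creating the replica on the source node.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class MoveWithRollbackSketch {

    /** Hypothetical in-memory stand-in for cluster state with RF=1: shard -> node hosting its only replica. */
    static final class Cluster {
        final Map<String, String> replicaNodeByShard = new HashMap<>();
        // Nodes on which replica creation fails; used only to simulate a partial failure.
        final Set<String> failingNodes = new HashSet<>();

        void addReplica(String shard, String node) {
            if (failingNodes.contains(node)) {
                throw new IllegalStateException("cannot create replica on " + node);
            }
            replicaNodeByShard.put(shard, node);
        }

        /** Removes the replica from the cluster state; the underlying data is assumed to be kept. */
        void deleteReplicaKeepData(String shard) {
            replicaNodeByShard.remove(shard);
        }
    }

    /**
     * Moves the single replica of a shard to targetNode.
     * Returns true if the move succeeded, false if it failed and the original replica was restored.
     */
    static boolean moveReplica(Cluster cluster, String shard, String targetNode) {
        String sourceNode = cluster.replicaNodeByShard.get(shard);
        if (sourceNode == null) {
            throw new IllegalArgumentException("no replica for shard " + shard);
        }
        // Step 1: drop the original replica (assumption: this happens before the new one exists).
        cluster.deleteReplicaKeepData(shard);
        try {
            // Step 2: create the replacement on the target node.
            cluster.addReplica(shard, targetNode);
            return true;
        } catch (IllegalStateException e) {
            // Rollback: without this step the shard would have no replica at all when RF=1.
            cluster.addReplica(shard, sourceNode);
            return false;
        }
    }
}
```

In this sketch a failed move leaves the shard where it started rather than without any replica; a real implementation would also need to handle the case where the restore itself fails.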
Attachments
Issue Links
- is blocked by
  - SOLR-11661 New HDFS collection reuses unremoved data from a deleted HDFS collection with same name causes inconsistent view of documents (Closed)