Description
The problem we saw was that splitting a shard took a long time and, once it finished, the sub-shards together contained fewer documents than the original shard.
The root cause was eventually tracked down to the disappearing documents not falling into the hash ranges of the sub-shards.
Could the SolrIndexSplitter split report per-segment numDocs for the parent and the sub-shards, and log at least a warning for any discrepancy (documents falling into none of the sub-shards, or documents falling into several sub-shards)?
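To illustrate the kind of check being asked for, here is a minimal sketch in plain Java. It does not use any Solr classes: SplitCoverageCheck, HashRange and Counts are hypothetical names made up for this example, and the document hashes are assumed to be already computed. It only shows how documents matching zero or several sub-shard ranges could be counted for one segment.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Solr.
public class SplitCoverageCheck {

    /** Inclusive hash range standing in for a sub-shard's range (illustrative type). */
    public static final class HashRange {
        final int min;
        final int max;

        public HashRange(int min, int max) {
            this.min = min;
            this.max = max;
        }

        boolean includes(int hash) {
            return hash >= min && hash <= max;
        }
    }

    /** Per-segment result: total docs, docs in no sub-shard, docs in several sub-shards. */
    public static final class Counts {
        public final int total;
        public final int orphaned;
        public final int duplicated;

        Counts(int total, int orphaned, int duplicated) {
            this.total = total;
            this.orphaned = orphaned;
            this.duplicated = duplicated;
        }

        public boolean consistent() {
            return orphaned == 0 && duplicated == 0;
        }
    }

    /** Classifies each document hash by how many sub-shard ranges contain it. */
    public static Counts check(int[] docHashes, List<HashRange> subRanges) {
        int orphaned = 0;
        int duplicated = 0;
        for (int hash : docHashes) {
            int matches = 0;
            for (HashRange range : subRanges) {
                if (range.includes(hash)) {
                    matches++;
                }
            }
            if (matches == 0) {
                orphaned++;      // would disappear after the split
            } else if (matches > 1) {
                duplicated++;    // would end up in several sub-shards
            }
        }
        return new Counts(docHashes.length, orphaned, duplicated);
    }

    public static void main(String[] args) {
        List<HashRange> ranges = Arrays.asList(new HashRange(0, 99), new HashRange(100, 199));
        Counts counts = check(new int[] {5, 150, 250}, ranges);
        if (!counts.consistent()) {
            System.out.println("WARN total=" + counts.total
                    + " orphaned=" + counts.orphaned
                    + " duplicated=" + counts.duplicated);
        }
    }
}
```

Counting orphaned and duplicated documents separately matters because one of each would cancel out in a plain numDocs comparison.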
Additionally, could a case be made for erroring out when discrepancies are detected, i.e. not proceeding with the shard split? This could either always be an error, or be controlled by an optional verifyNumDocs=false/true parameter for the SPLITSHARD action.
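As a rough sketch of how such a switch might behave (this is a proposal only: verifyNumDocs does not exist today, and NumDocsVerifier is a hypothetical name, not Solr code), the split could compare the parent's numDocs with the sub-shard total and either fail or merely warn:

```java
// Hypothetical illustration of the proposed verifyNumDocs behaviour; not Solr code.
public class NumDocsVerifier {

    /**
     * Returns null when the sub-shard totals match the parent.
     * On a mismatch, throws if verifyNumDocs is true; otherwise returns
     * a warning message that the caller could log.
     */
    public static String verify(long parentNumDocs, long[] subShardNumDocs, boolean verifyNumDocs) {
        long sum = 0;
        for (long n : subShardNumDocs) {
            sum += n;
        }
        if (sum == parentNumDocs) {
            return null;
        }
        String msg = "numDocs mismatch: parent=" + parentNumDocs + " subShardsTotal=" + sum;
        if (verifyNumDocs) {
            // Abort: do not proceed with the shard split.
            throw new IllegalStateException(msg);
        }
        return msg;
    }

    public static void main(String[] args) {
        String warning = verify(1000, new long[] {400, 550}, false);
        if (warning != null) {
            System.out.println("WARN " + warning);
        }
    }
}
```

A plain total comparison like this would miss the case where one lost document is offset by one duplicated document, so it would work best alongside per-document range checks.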