There's an odd situation that arises when a Solr node starts up (after a crash) and tries to recover from its tlog, causing deletes to be redundantly & excessively applied – at a minimum it produces really confusing log messages...
- UpdateLog.init(...) creates TransactionLog instances for the most recent log files found (based on numRecordsToKeep) and then builds a RecentUpdates instance from them
- Delete entries from the RecentUpdates are used to populate 2 lists:
- oldDeletes (for deleteById)
- deleteByQueries (for deleteByQuery)
- Then when UpdateLog.recoverFromLog is called, a LogReplayer is used to replay any (uncommitted) TransactionLog entries
- during replay, UpdateLog delegates to the UpdateRequestProcessorChain for the various adds/deletes, etc...
- when an add makes it to RunUpdateProcessor, it delegates to DirectUpdateHandler2, which (independent of the fact that we're in log replay) calls UpdateLog.getDBQNewer for every add, looking for any "reordered" deletes that have a version greater than the add
- if it finds any DBQs "newer" than the document being added, it does a low-level IndexWriter.updateDocument and then immediately executes all the newer DBQs ... once per add
- these deletes are also still executed as part of the normal tlog replay, because they are in the tlog.
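The interaction above can be sketched in miniature. This is not Solr's actual code – the real logic lives in DirectUpdateHandler2.addDoc and UpdateLog.getDBQNewer, and the types and names here are simplified stand-ins – but it shows why every replayed add re-executes every "newer" DBQ:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the Solr classes discussed above; the counts,
// not the names, are the point.
class ReplaySketch {
    static class DBQ {
        long version; String query;
        DBQ(long v, String q) { version = v; query = q; }
    }

    // populated from RecentUpdates when UpdateLog.init scans the tlog files
    static List<DBQ> deleteByQueries = new ArrayList<>();
    static int dbqExecutions = 0;

    // analogue of UpdateLog.getDBQNewer(version): all DBQs with a higher version
    static List<DBQ> getDBQNewer(long version) {
        List<DBQ> newer = new ArrayList<>();
        for (DBQ dbq : deleteByQueries) {
            if (dbq.version > version) newer.add(dbq);
        }
        return newer.isEmpty() ? null : newer;
    }

    // analogue of the add path: even during log replay, every add checks for
    // "reordered" DBQs and executes any it finds, immediately
    static void addDoc(long version) {
        List<DBQ> newer = getDBQNewer(version);
        if (newer != null) {
            // writer.updateDocument(...) would happen here
            for (DBQ dbq : newer) {
                dbqExecutions++;  // each newer DBQ re-executed for this one add
            }
        }
    }

    public static void main(String[] args) {
        // tlog contents: 90 adds (versions 1..90), then 5 DBQs (versions 91..95)
        for (long v = 91; v <= 95; v++) deleteByQueries.add(new DBQ(v, "q" + v));
        for (long v = 1; v <= 90; v++) addDoc(v);   // replay the 90 adds
        int normalReplay = deleteByQueries.size();  // the 5 DBQs also replay normally
        System.out.println(dbqExecutions + normalReplay); // 90*5 + 5 = 455
    }
}
```

Every one of the 90 replayed adds sees all 5 DBQs as "newer", so the deletes run 450 times pre-emptively plus 5 times in the normal replay pass.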
Which means if you are recovering from a tlog with 90 addDocs followed by 5 DBQs, each of those 5 DBQs will be executed 91 times – and for 90 of those executions, a DUH2 INFO log message will say "Reordered DBQs detected. ..." even though the only reason they are out of order is that Solr is deliberately applying them out of order.
- At a minimum we should improve the log messages
- Ideally we should stop (pre-emptively) applying these deletes during tlog replay.
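One possible shape of that change, sketched with stand-in types: UpdateCommand.REPLAY is a real Solr flag constant carried by replayed commands, but this method and counter are hypothetical, not the actual DirectUpdateHandler2 code.

```java
// Hypothetical sketch: skip the reordered-DBQ lookup when the add carries the
// log-replay flag, since the DBQs will be applied by the replay itself anyway.
class ReplayAwareAdd {
    static final int REPLAY = 0x02;   // stand-in for UpdateCommand.REPLAY
    static int dbqLookups = 0;        // counts calls to the stand-in getDBQNewer

    static void getDBQNewer(long version) { dbqLookups++; }

    static void addDoc(long version, int flags) {
        if ((flags & REPLAY) == 0) {
            // only live (non-replay) adds check for reordered DBQs
            getDBQNewer(version);
        }
        // ... normal index update ...
    }

    public static void main(String[] args) {
        for (long v = 1; v <= 90; v++) addDoc(v, REPLAY); // replayed adds: no lookup
        addDoc(91, 0);                                    // a live add still checks
        System.out.println(dbqLookups); // prints 1
    }
}
```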
- is related to
SOLR-8760 PeerSync replay of ADDs older than ourLowThreshold interacting with DBQs to stall new leadership