Description
Here is a bit of background:
- I wanted to implement a custom merging strategy with its own (global) I/O flow control,
- currently, the ConcurrentMergeScheduler (CMS) is tightly coupled with a few classes – MergeRateLimiter, OneMerge, IndexWriter.
Looking at the code, it seems to me that everything related to I/O control could be pulled out into the classes that explicitly control the merging process, that is, only MergePolicy and MergeScheduler. By default, one could then run without any additional I/O accounting overhead (which is currently in there even if one doesn't use the CMS's throughput control).
Such a refactoring would also be a chance to move things where they belong: job aborting would go into OneMerge (it is currently in RateLimiter), and the rate limiter's lifecycle would be bound to OneMerge (MergeScheduler could then use per-merge or global accounting, as it pleases).
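To make the proposed split concrete, here is a minimal sketch of how the responsibilities could be divided. It is not the actual patch and not Lucene's API: `IoThrottle`, `GlobalThrottle` and `SketchMerge` are hypothetical names invented for illustration, the pause calculation is a deliberately naive bytes-over-rate formula, and `IllegalStateException` merely stands in for whatever exception an aborted merge would really throw.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical I/O accounting hook that a scheduler would hand to each merge. */
interface IoThrottle {
    /** Records {@code bytes} written and returns how many nanoseconds the caller should pause. */
    long pauseNanos(long bytes);

    /** Default: no accounting and no pausing, so unthrottled merges pay no overhead. */
    IoThrottle NONE = bytes -> 0L;
}

/** Hypothetical throttle shared by all merges, giving the scheduler global accounting. */
final class GlobalThrottle implements IoThrottle {
    private final double bytesPerSec;
    private final AtomicLong totalBytes = new AtomicLong();

    GlobalThrottle(double mbPerSec) {
        this.bytesPerSec = mbPerSec * 1024 * 1024;
    }

    @Override
    public long pauseNanos(long bytes) {
        totalBytes.addAndGet(bytes);
        // Naive assumption: pause for exactly as long as these bytes "cost" at the target rate.
        return (long) (bytes / bytesPerSec * 1_000_000_000L);
    }

    long totalBytes() {
        return totalBytes.get();
    }
}

/** Hypothetical stand-in for OneMerge: owns its abort flag and its throttle reference. */
final class SketchMerge {
    private final IoThrottle throttle;
    private volatile boolean aborted;

    SketchMerge(IoThrottle throttle) {
        this.throttle = throttle;
    }

    void abort() {
        aborted = true;
    }

    boolean isAborted() {
        return aborted;
    }

    /** Called from the merge's write path; abort checking lives here, not in the throttle. */
    long onBytesWritten(long bytes) {
        if (aborted) {
            throw new IllegalStateException("merge aborted");
        }
        return throttle.pauseNanos(bytes);
    }
}
```

In this sketch a scheduler that wants global accounting passes the same `GlobalThrottle` instance to every merge, one that wants per-merge accounting creates a new instance per merge, and one that wants neither passes `IoThrottle.NONE`.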
Just a thought and some initial refactorings for discussion.