Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Not A Problem
Description
In Lucene, an optimize() call iteratively merges segments until only one is left. While it's merging, it (ultimately) needs to make a copy of the entire index, because readers attempting to open the index "mid-optimize" need to see a consistent copy of the index.
In Solr, we have control over when new readers/searchers get opened, so what if, when we received an <optimize/> command, we made iterative partial optimize calls under the covers and only opened a new searcher when we were finished with all of them? In theory this seems like it would help reduce the disk space used during optimize, without really affecting the time it takes to "optimize".
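A rough sketch of the control flow being proposed, in plain Java with no Lucene or Solr dependency. Everything named here (SegmentMerger, FakeIndex, iterativeOptimize) is hypothetical and only exists for illustration. The sketch assumes a partial optimize call that merges down to at most N segments, similar in spirit to Lucene's IndexWriter.optimize(int maxNumSegments). It models only the ordering of calls, not disk usage.

```java
import java.util.ArrayList;
import java.util.List;

public class IterativeOptimizeSketch {

    /** Hypothetical stand-in for a writer that supports partial optimizes. */
    interface SegmentMerger {
        int segmentCount();

        /** Merge segments until at most maxSegments remain. */
        void optimize(int maxSegments);
    }

    /** Toy in-memory "index" that only tracks how many segments exist. */
    static class FakeIndex implements SegmentMerger {
        private int segments;
        final List<Integer> targets = new ArrayList<>(); // record of requested targets

        FakeIndex(int segments) {
            this.segments = segments;
        }

        public int segmentCount() {
            return segments;
        }

        public void optimize(int maxSegments) {
            targets.add(maxSegments);
            segments = Math.min(segments, Math.max(1, maxSegments));
        }
    }

    /**
     * Steps the segment count down one at a time with partial optimize calls,
     * and opens a new searcher only once, after the last call has finished.
     * Returns the number of partial optimize calls made.
     */
    static int iterativeOptimize(SegmentMerger index, Runnable openSearcher) {
        int calls = 0;
        for (int target = index.segmentCount() - 1; target >= 1; target--) {
            index.optimize(target); // no new searcher is opened between steps
            calls++;
        }
        openSearcher.run(); // single searcher open at the very end
        return calls;
    }

    public static void main(String[] args) {
        FakeIndex index = new FakeIndex(5);
        int[] searchersOpened = {0};
        int calls = iterativeOptimize(index, () -> searchersOpened[0]++);
        System.out.println("partial optimize calls: " + calls);
        System.out.println("segments left: " + index.segmentCount());
        System.out.println("searchers opened: " + searchersOpened[0]);
    }
}
```

Starting from 5 segments, the loop requests targets 4, 3, 2, 1 and the searcher is opened exactly once. Whether this would actually lower peak disk usage depends on how the real writer releases old segment files between steps, which this toy model does not attempt to show.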
These are the threads that prompted this idea...
http://old.nabble.com/eternal-optimize-interrupted-to24805680.html#a24805680
http://old.nabble.com/Re%3A-eternal-optimize-interrupted-to24928754.html#a24928754
http://old.nabble.com/Optimization-of-large-shard-succeeded-to25809281.html#a25809281