Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
None
Description
I observed Lucene indexing contributing up to 99% of repository growth. While the size of the index itself is well within reasonable bounds, the overall turnover of data being written and then removed again can be as much as 99%.
In the case of the TarMK, this negatively impacts overall system performance due to a fast-growing number of tar files / segments, poor locality of reference, cache misses and thrashing when looking up segments, and vastly prolonged garbage collection cycles.
Attachments
Issue Links
- breaks
  - OAK-6704 Set default merge policy to tiered as CommitMitigatingTieredMergePolicy seems to be bad for performance (Closed)
- is related to
  - OAK-6269 Support non chunk storage in OakDirectory (Closed)
  - OAK-6710 CommitMitigated merge policy should not reduce performance significantly (Closed)
- relates to
  - OAK-6514 Make Lucene merge policy configurable (Closed)
  - OAK-6211 Optimize indexing for file-system-centric deployments (Open)
  - OAK-6412 Consider upgrading to newer Lucene versions (Open)
  - OAK-2808 Active deletion of 'deleted' Lucene index files from DataStore without relying on full scale Blob GC (Closed)