Affects Version/s: 3.5
Fix Version/s: 4.0-ALPHA
We have a large 50 GB index, optimized down to a single segment, with a 66 MB .tii file. This index has no norms and no field cache.
It takes about 5 seconds to load this index. Profiling reveals that 60% of that time is spent in GrowableWriter.set(index, value), and most of the time in set(...) is spent resizing the PackedInts.Mutable current.
In the constructor for TermInfosReaderIndex, the writer is initialized with the line:
GrowableWriter indexToTerms = new GrowableWriter(4, indexSize, false);
For our index, using four as the bit estimate results in 27 resizes.
The last value in indexToTerms is going to be roughly tiiFileLength, so the required bit width can be estimated up front. If you instead use:
int bitEstimate = (int) Math.ceil(Math.log10(tiiFileLength) / Math.log10(2));
GrowableWriter indexToTerms = new GrowableWriter(bitEstimate, indexSize, false);
load time improves to about 2 seconds.
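For reference, here is a minimal standalone sketch of the bit estimate, with no Lucene dependency. The class name BitEstimate and both method names are hypothetical, chosen only for illustration. It compares the log-based formula above with an integer-only variant built on Long.numberOfLeadingZeros, and assumes the 66 MB file length is 66 * 2^20 bytes.

```java
// Hypothetical helper, not part of Lucene: estimates how many bits are
// needed to store values up to maxValue in a packed array.
final class BitEstimate {

    // Integer-only variant: position of the highest set bit, plus one.
    // Returns at least 1 so that a max value of 0 still gets one bit.
    static int bitsRequired(long maxValue) {
        if (maxValue < 0) {
            throw new IllegalArgumentException("maxValue must be >= 0");
        }
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    // The formula from the report: ceil(log2(maxValue)).
    // For an exact power of two, ceil(log2(n)) is mathematically one bit
    // short (8 needs 4 bits, but log2(8) = 3).
    static int logEstimate(long maxValue) {
        return (int) Math.ceil(Math.log10(maxValue) / Math.log10(2));
    }

    public static void main(String[] args) {
        // Assumption: 66 MB means 66 * 2^20 bytes.
        long tiiFileLength = 66L * 1024 * 1024;
        System.out.println("bitsRequired = " + bitsRequired(tiiFileLength));
        System.out.println("logEstimate  = " + logEstimate(tiiFileLength));
    }
}
```

Under that assumption both variants give 27 bits for our .tii file. An estimate that is one bit short for exact powers of two should be harmless here, since the writer still grows on demand; it only costs one extra resize.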
|Resolution||Fixed [ 1 ]|
|Status||Open [ 1 ]||Resolved [ 5 ]|
|Assignee||Michael McCandless [ mikemccand ]|
|Fix Version/s||4.0 [ 12314025 ]|
|Status||Resolved [ 5 ]||Closed [ 6 ]|