I would like to propose some small improvements to this nice feature.
I've worked out a patch (will attach shortly). Doron, if you agree
I'll commit it; otherwise we can iterate first. Thanks!
- Renamed "withNrm()" to "getHasMergedNorms" to be more
descriptive. Also changed the field to "hasMergedNorms".
- Explicitly store "hasMergedNorms" in the segments_N file.
I think in general we should favor storing things like this
explicitly instead of relying on IO operations (fileExists).
We've made great progress lately in reducing such IO operations, so
I'd like to keep that up when possible.
I created a new FORMAT_MERGED_NORMS in SegmentInfos for this. The
change is fully backwards compatible (old indices work fine). I
extended TestBackwardsCompatibility to test this.
This has the nice side effect that, for indices written after this
is committed, "SegmentInfo.getHasMergedNorms" no longer has to
create the fleeting CompoundFileReader (which was somewhat spooky to
me). For indices written before this is committed but after the
first version was committed (10 days ago), the check is still
needed, so I've left it in there with a comment.
- Fixed the TestDoc unit test to actually create & return
SegmentInfo instances instead of constructing a new SegmentInfo
every time (which causes problems whenever we add something to
SegmentInfo). The test is still correct, and it should hold up
better over time as we make changes to SegmentInfo.
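
To make the segments_N change concrete, here is a rough, self-contained sketch of the idea. It is not the patch itself: the class name, the constant values, the tri-state encoding and the byte layout are all assumptions made up for illustration. It only shows the pattern of writing the flag explicitly under a newer format marker, and falling back to the directory check when reading an older file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.function.BooleanSupplier;

// Illustrative only: not the actual SegmentInfos/SegmentInfo code.
final class MergedNormsFormatSketch {
    // Hypothetical format markers; here a newer format is "more negative".
    static final int FORMAT_OLD = -2;
    static final int FORMAT_MERGED_NORMS = -3;

    // Hypothetical tri-state: CHECK_DIR means "not stored, ask the directory".
    static final byte NO = -1;
    static final byte CHECK_DIR = 0;
    static final byte YES = 1;

    // Write a tiny stand-in for one segments_N entry.
    static byte[] write(int format, boolean hasMergedNorms) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(format);
        if (format <= FORMAT_MERGED_NORMS) {
            // Only the newer format stores the flag explicitly.
            out.writeByte(hasMergedNorms ? YES : NO);
        }
        out.flush();
        return bytes.toByteArray();
    }

    // Read the stored state; older files yield CHECK_DIR.
    static byte read(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int format = in.readInt();
        return format <= FORMAT_MERGED_NORMS ? in.readByte() : CHECK_DIR;
    }

    // Resolve the flag, using the (expensive) legacy check only when needed.
    static boolean hasMergedNorms(byte[] data, BooleanSupplier legacyFileCheck)
            throws IOException {
        byte state = read(data);
        if (state == CHECK_DIR) {
            return legacyFileCheck.getAsBoolean();
        }
        return state == YES;
    }

    public static void main(String[] args) throws IOException {
        byte[] newer = write(FORMAT_MERGED_NORMS, true);
        byte[] older = write(FORMAT_OLD, true);
        // The lambda stands in for the fileExists / CompoundFileReader check.
        System.out.println(hasMergedNorms(newer, () -> false));
        System.out.println(hasMergedNorms(older, () -> true));
    }
}
```

With this shape, a file written in the newer format never triggers the fallback, which is the IO saving described above; only files in the older format still pay for the directory check.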