An alternative would be to put both static methods into CodecUtils, but this would also not help with changes in format.
Or I can make the SIWriter do its own (private) thing. Yeah, that's an "abstraction violation" (public Version ctor), and, yeah, future places that need to write/read version constants (e.g.
LUCENE-5954) will have to dup this code, but then the format is clearly owned by that writer/reader. We are already debating 4 vs 3 ints (a format change...).
Can we encode 3 ints instead of 4? As far as I know, the 'prerelease' was added to support 4.0-alpha/4.0-beta. This was confusing (my fault), and this confusion ultimately worked its way into an index corruption bug. I think we should try to contain it to 4.0 instead and not keep things complicated like that.
OK... but should we expect never to use prerelease again (e.g. for 5.0)?
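To make the 3-ints option concrete, here is a minimal sketch of writing and reading a version as major, minor, bugfix with no prerelease field. It is only an illustration under stated assumptions: `VersionTriple` and its methods are hypothetical names, and it uses plain `java.io` streams rather than Lucene's own DataOutput/DataInput, so it is not the actual SIWriter code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the version a segment-info writer would record.
final class VersionTriple {
    final int major;
    final int minor;
    final int bugfix;

    VersionTriple(int major, int minor, int bugfix) {
        if (major < 0 || minor < 0 || bugfix < 0) {
            throw new IllegalArgumentException(
                "negative version component: " + major + "." + minor + "." + bugfix);
        }
        this.major = major;
        this.minor = minor;
        this.bugfix = bugfix;
    }

    // Writes exactly three ints (12 bytes); there is no prerelease field.
    void write(DataOutputStream out) throws IOException {
        out.writeInt(major);
        out.writeInt(minor);
        out.writeInt(bugfix);
    }

    // Reads three ints back and rejects obviously corrupt values.
    static VersionTriple read(DataInputStream in) throws IOException {
        int major = in.readInt();
        int minor = in.readInt();
        int bugfix = in.readInt();
        if (major < 0 || minor < 0 || bugfix < 0) {
            throw new IOException(
                "corrupt version: " + major + "." + minor + "." + bugfix);
        }
        return new VersionTriple(major, minor, bugfix);
    }

    // Convenience wrappers over in-memory buffers, for illustration only.
    static byte[] toBytes(VersionTriple v) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            v.write(out);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static VersionTriple fromBytes(byte[] encoded) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(encoded)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If prerelease ever had to come back, it would mean a fourth int and therefore a format change, which is the trade-off being debated here.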
Can we consider just making a new 5.0 si writer? It's a pain to bump the codec version, but I'll do the work here. We can remove conditionals like 'supports checksums' as well.
Separately we should make it easier to roll a new Codec version ... it's bad if it's "daunting" since it pressures us to hide biggish changes under the existing writers.
We can follow up on this by improving the exceptions for tiny "slurp-in" classes like this (I would personally, as in do the work, also fix .fnm, segments_N, .nvm, .dvm, .fdt, .tvx as well). I would add a CodecUtil.addSuppressedChecksum or something, to easily allow these guys to 'annotate' any exception on init with checksum failure information. These files are small but important, and it would help considering we are dodging challenges like JVM bugs here.
Big +1: this would mean that on any strange exception when reading these files, we would also see whether their checksum did or did not match? This saves the extra hassle of asking the user to run CheckIndex to figure out if that file was corrupt...
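A rough sketch of what such an 'annotate with checksum' helper could look like. Everything here is an assumption for illustration: `ChecksumAnnotator` is a hypothetical class, the method only borrows the `addSuppressedChecksum` name floated above, and it works on an in-memory byte array with `java.util.zip.CRC32` instead of Lucene's index inputs.

```java
import java.io.IOException;
import java.util.zip.CRC32;

// Hypothetical helper, not an existing CodecUtil method.
final class ChecksumAnnotator {

    // Computes the CRC32 of the file contents (excluding any stored footer).
    static long crc32(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    // Attaches the checksum verdict to the original exception as a
    // suppressed exception, so the primary failure stays on top.
    static void addSuppressedChecksum(Throwable priorException,
                                      byte[] payload,
                                      long expectedChecksum) {
        long actual = crc32(payload);
        String status;
        if (actual == expectedChecksum) {
            status = "checksum passed (" + Long.toHexString(actual)
                + "); the file bytes match what was written";
        } else {
            status = "checksum failed (expected=" + Long.toHexString(expectedChecksum)
                + " actual=" + Long.toHexString(actual) + "); the file is likely corrupt";
        }
        priorException.addSuppressed(new IOException(status));
    }
}
```

A reader would call this from its catch block before rethrowing, so the stack trace shows both the original failure and whether the checksum matched, without a separate CheckIndex run.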
I also want to bump the 5.0 codec anyway, to fix the bug where Lucene42TermVectorsFormat uses the same codecName as Lucene41StoredFieldsFormat in the codec header. That's a stupid bug we should fix.
I think I'll break out the format change from this issue, and leave this as just improving the Version error messages, having it not judge major version, etc... I'll open a new issue for Lucene50Codec.