This patch only changes the on-disk format, right? The specialized in-memory readers are still backed by native arrays (short/int/long, etc.)?
I.e., in general, I think the version constants should be created once and then not changed (write once), and VERSION_CURRENT changes to point to whichever is most recent.
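A minimal sketch of the write-once scheme suggested above. The class name, the constant names other than VERSION_CURRENT, the numeric values and both helper methods are assumptions for illustration, not the patch's actual code:

```java
// Hypothetical holder for format versions; not the actual PackedInts code.
final class FormatVersions {
    // Each constant is assigned once when its format is introduced
    // and is never edited afterwards.
    static final int VERSION_START = 0;         // assumed: old long-aligned format
    static final int VERSION_BYTE_ALIGNED = 1;  // assumed: new byte-aligned format

    // The only constant that moves: it points at the most recent format.
    static final int VERSION_CURRENT = VERSION_BYTE_ALIGNED;

    // Rejects versions this code does not know how to read.
    static void checkVersion(int version) {
        if (version < VERSION_START || version > VERSION_CURRENT) {
            throw new IllegalArgumentException("Unsupported version: " + version);
        }
    }

    // Readers can branch on the stored version to pick the decoding path.
    static boolean isLongAligned(int version) {
        return version < VERSION_BYTE_ALIGNED;
    }
}
```

With this layout, adding a future format means adding one new constant and repointing VERSION_CURRENT; data written with an older version keeps decoding the same way.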
Ok, I'll change it.
That careful anonymous subclass in PackedInts to handle seeking to the end when the last value is read is sort of sneaky ... this should only kick in when reading the old (long-aligned) format, right?
This only happens when reading the old format AND the number of bytes used to serialize the array is not a multiple of 8. I'll add an assert to make sure that this condition can only be true with the old format.
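To make the condition concrete, here is a rough sketch of the size arithmetic, assuming the old format pads the packed bits up to a whole number of 8-byte longs and the new format pads only up to a whole byte. The class and method names are hypothetical and are not taken from the patch:

```java
// Hypothetical helper illustrating the padding arithmetic; not the patch's code.
final class PackedSizes {
    // Bytes needed when the packed bits are rounded up to whole bytes.
    static long byteAlignedSize(long valueCount, int bitsPerValue) {
        long bits = valueCount * bitsPerValue;
        return (bits + 7) / 8;
    }

    // Bytes needed when the packed bits are rounded up to whole 64-bit longs.
    static long longAlignedSize(long valueCount, int bitsPerValue) {
        long bits = valueCount * bitsPerValue;
        return 8 * ((bits + 63) / 64);
    }

    // Padding bytes a reader of the long-aligned format must skip after the
    // last value. Non-zero exactly when the byte-aligned size is not a multiple of 8.
    static long trailingBytes(long valueCount, int bitsPerValue) {
        return longAlignedSize(valueCount, bitsPerValue)
            - byteAlignedSize(valueCount, bitsPerValue);
    }
}
```

For example, 10 values at 3 bits each occupy 30 bits: 4 bytes when byte-aligned, 8 bytes when long-aligned, so 4 trailing bytes would need skipping. With 64 values at 1 bit each, both sizes are 8 bytes and nothing is skipped.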
Or ... maybe... we should not "promise" this (no trailing wasted bytes) in the API?
Or maybe we expose a new explicit method to "seek to the end of this packed ints" or something (e.g., "skipTrailingBytes").
These were my first ideas, but the truth is that I was very scared to break something (for example doc values rely on the assumption that after reading the last value of a direct array, the whole stream is consumed). Fixing PackedInts to make sure those assumptions are still true looked easier to me as I was able to create "fake" long-aligned packed ints and make sure that the whole stream was consumed after reading the last value.
But your option makes perfect sense to me and I will do it if you think it is cleaner.
Thanks for the review!