Here is a new patch. I fixed assertHistogram so that it is only called from within an assertion, and added the suggested docs.
Maybe the visitor should also take BytesRef? Codec impls could read a whole byte values block in at once
I am not sure codecs could leverage this. I think a serious codec impl would do prefix compression to save space, so it could not read a large block of bytes at once anyway: at every iteration it would need to concatenate the shared prefix with the suffix that is specific to the value.
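To illustrate the point about prefix compression, here is a minimal sketch of what a reader would have to do per value. All names (PrefixBlockSketch, ValueVisitor, visitBlock) are hypothetical and are not part of the patch or of Lucene's API; the block layout (one shared prefix plus one suffix per value) is an assumption made only for the example.

```java
import java.nio.charset.StandardCharsets;

class PrefixBlockSketch {

    /** Hypothetical visitor; NOT Lucene's API. Only the first `length` bytes of `scratch` are valid. */
    interface ValueVisitor {
        void visit(int docID, byte[] scratch, int length);
    }

    /**
     * Visits every value of a prefix-compressed block. The shared prefix is
     * copied into the scratch buffer once, but each suffix still has to be
     * copied behind it on every iteration before the value can be handed out.
     */
    static void visitBlock(byte[] prefix, byte[][] suffixes, int[] docIDs, ValueVisitor visitor) {
        int maxSuffix = 0;
        for (byte[] suffix : suffixes) {
            maxSuffix = Math.max(maxSuffix, suffix.length);
        }
        byte[] scratch = new byte[prefix.length + maxSuffix];
        System.arraycopy(prefix, 0, scratch, 0, prefix.length);
        for (int i = 0; i < suffixes.length; i++) {
            // Per-value copy: this is the step that prevents exposing one big block of bytes.
            System.arraycopy(suffixes[i], 0, scratch, prefix.length, suffixes[i].length);
            visitor.visit(docIDs[i], scratch, prefix.length + suffixes[i].length);
        }
    }

    public static void main(String[] args) {
        byte[] prefix = "ab".getBytes(StandardCharsets.US_ASCII);
        byte[][] suffixes = {
            "c".getBytes(StandardCharsets.US_ASCII),
            "de".getBytes(StandardCharsets.US_ASCII)
        };
        visitBlock(prefix, suffixes, new int[] {7, 9}, (docID, scratch, length) ->
            System.out.println(docID + ":" + new String(scratch, 0, length, StandardCharsets.US_ASCII)));
    }
}
```

The scratch buffer is reused across iterations, so a visitor that wants to keep a value has to copy it.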
We could also fix BKDWriter.writeCommonPrefixes to save the copy there, though that's just once per leaf block.
I remember trying it out and it didn't help.
Have you tweaked 20 to see if that's a good value? Sorting BKD points is rather costly since when we swap, we swap whole values (docID, maybe ord, then the byte value for this field).
I remember tweaking it a long time ago when I worked on this Sorter abstraction. Values in [20,50] looked fine when sorting simple ints (so both comparisons and swaps were cheap), so I picked 20 to err on the safe side. It's true that it might be different with points, which have costly swaps.
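For anyone who wants to experiment with the threshold, here is a rough, self-contained sketch of the idea: a quicksort that falls back to insertion sort below a configurable size, written against a compare/swap interface so swaps can be counted. It is not the Sorter code from the patch. The names (ThresholdSortSketch, Sortable, IntArraySortable) are hypothetical, and the sketch has no protection against deep recursion on bad inputs.

```java
class ThresholdSortSketch {

    /** Hypothetical compare/swap abstraction, loosely in the spirit of a Sorter. */
    interface Sortable {
        int compare(int i, int j);
        void swap(int i, int j);
    }

    /** int[] adapter that counts comparisons and swaps so thresholds can be compared. */
    static class IntArraySortable implements Sortable {
        final int[] values;
        long compares, swaps;

        IntArraySortable(int[] values) { this.values = values; }

        public int compare(int i, int j) { compares++; return Integer.compare(values[i], values[j]); }

        public void swap(int i, int j) { swaps++; int t = values[i]; values[i] = values[j]; values[j] = t; }
    }

    /** Sorts [from, to): quicksort above the threshold, insertion sort at or below it. */
    static void sort(Sortable s, int from, int to, int threshold) {
        if (to - from <= threshold) {
            insertionSort(s, from, to);
            return;
        }
        int mid = (from + to) >>> 1;
        s.swap(mid, to - 1);              // park the middle element at the end as the pivot
        int store = from;
        for (int i = from; i < to - 1; i++) {
            if (s.compare(i, to - 1) < 0) {
                s.swap(i, store++);       // move smaller elements to the left part
            }
        }
        s.swap(store, to - 1);            // put the pivot in its final slot
        sort(s, from, store, threshold);
        sort(s, store + 1, to, threshold);
    }

    static void insertionSort(Sortable s, int from, int to) {
        for (int i = from + 1; i < to; i++) {
            for (int j = i; j > from && s.compare(j - 1, j) > 0; j--) {
                s.swap(j - 1, j);
            }
        }
    }
}
```

Running it with thresholds such as 20 and 50 on the same input and comparing the `swaps` counters gives a first idea of how the trade-off shifts when swaps are the expensive operation, as they are for points.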