Thanks Robert for adding the version check. I think we should indeed never copy raw bytes across versions.
The clone is necessary because otherwise the naive merge routine, which is used e.g. when chunks contain deleted documents, would seek on the underlying IndexInput without notifying the checksumming wrapper about it.
I tried to fix it by using a checksumming index input for this naive merge routine as well, but that doesn't work: a checksumming input can only seek forward, and for every document the routine needs to seek to the start offset of the block, which implies seeking backward whenever you merge two term vectors that are in the same block.
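To illustrate the backward-seek problem, here is a minimal sketch of a forward-only checksumming input. The class ForwardOnlyChecksumInput and its methods are hypothetical stand-ins written for this comment, not Lucene's actual classes; the only assumption carried over is that the checksum must see every byte exactly once, in order.

```java
import java.util.zip.CRC32;

/** Hypothetical stand-in for a checksumming input; not Lucene's actual API. */
public class ForwardOnlyChecksumInput {
    private final byte[] data;
    private final CRC32 crc = new CRC32();
    private int pos = 0;

    public ForwardOnlyChecksumInput(byte[] data) {
        this.data = data;
    }

    /** Reads one byte and feeds it to the running checksum. */
    public byte readByte() {
        byte b = data[pos++];
        crc.update(b);
        return b;
    }

    /** Seeks forward by reading (and digesting) the skipped bytes. */
    public void seek(int target) {
        if (target < pos) {
            // Going back would digest the same bytes twice and corrupt the checksum.
            throw new IllegalStateException(
                "cannot seek backward: pos=" + pos + " target=" + target);
        }
        while (pos < target) {
            readByte();
        }
    }

    public int position() {
        return pos;
    }

    public long checksum() {
        return crc.getValue();
    }
}
```

With two term vectors in the same block, the second document requires seeking back to the block start, which is exactly the call that throws here.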
I added a minor modification to Robert's patch to make sure that the index input of the vectors reader and the clone that is used for checksumming are moved in parallel. Otherwise, if you stop using bulk merging after a few documents (e.g. if deletions shift chunks in such a way that all of them need to be rebuilt), the checksum input might need to read the whole file when checking integrity.
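A rough model of why moving the two inputs in parallel matters, under my own simplifying assumptions: the file is a byte array, chunk boundaries are given as end offsets, and ChecksumCursor / bytesReadAtCheck are hypothetical names that do not appear in the patch. It only counts how many bytes the final integrity check has to read.

```java
import java.util.zip.CRC32;

public class ParallelChecksumSketch {

    /** Hypothetical forward-only checksum cursor over an in-memory file. */
    static final class ChecksumCursor {
        private final byte[] file;
        private final CRC32 crc = new CRC32();
        private int pos = 0;

        ChecksumCursor(byte[] file) {
            this.file = file;
        }

        /** Digests bytes up to target and returns how many bytes were read. */
        int advanceTo(int target) {
            if (target < pos) {
                throw new IllegalStateException("cannot seek backward");
            }
            int read = target - pos;
            crc.update(file, pos, read);
            pos = target;
            return read;
        }
    }

    /** Returns the number of bytes left for the final integrity check to read. */
    static int bytesReadAtCheck(byte[] file, int[] chunkEnds, boolean keepInSync) {
        ChecksumCursor checksum = new ChecksumCursor(file);
        for (int end : chunkEnds) {
            // The reader's own input moves past each chunk here (not modeled);
            // the checksum cursor follows only when the two are kept in sync.
            if (keepInSync) {
                checksum.advanceTo(end);
            }
        }
        // Integrity check: digest whatever the cursor has not seen yet.
        return checksum.advanceTo(file.length);
    }

    public static void main(String[] args) {
        byte[] file = new byte[100];
        int[] chunkEnds = {30, 60, 90};
        System.out.println(bytesReadAtCheck(file, chunkEnds, true));
        System.out.println(bytesReadAtCheck(file, chunkEnds, false));
    }
}
```

In this model the in-sync cursor leaves only the tail of the file for the integrity check, while the cursor that was left behind has to read everything at that point.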