Description
I was playing with on-the-fly checksum verification and this made me stumble upon an issue with BufferedChecksumIndexInput.
I have some code that skips over a DataInput by reading bytes into /dev/null, eg.
private static final byte[] SKIP_BUFFER = new byte[1024]; private static void skipBytes(DataInput in, long numBytes) throws IOException { assert numBytes >= 0; for (long skipped = 0; skipped < numBytes; ) { final int toRead = (int) Math.min(numBytes - skipped, SKIP_BUFFER.length); in.readBytes(SKIP_BUFFER, 0, toRead); skipped += toRead; } }
It is fine to read into this static buffer, even from multiple threads, since the content that is read doesn't matter here. However, it breaks with BufferedChecksumIndexInput because of the way that it updates the checksum:
@Override public void readBytes(byte[] b, int offset, int len) throws IOException { main.readBytes(b, offset, len); digest.update(b, offset, len); }
If you are unlucky enough so that a concurrent call to skipBytes started modifying the content of b before the call to digest.update(b, offset, len) finished, then your checksum will be wrong.
I think we should make BufferedChecksumIndexInput read into a private buffer first instead of relying on the user-provided buffer.