Description
When FileUtils.checksumCRC32 is called from multiple threads (to improve throughput when calculating CRCs for a large folder), it returns incorrect CRC values because the code is not thread-safe.
The following simple test demonstrates the issue:
    @Test
    public void should() throws ExecutionException, InterruptedException {
        File testFile = new File("C:\\Temp\\large-file.txt");
        // ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(5);
        List<Future<Long>> futures = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            futures.add(scheduler.submit(() -> FileUtils.checksumCRC32(testFile)));
        }
        List<Long> crcs = new ArrayList<>();
        for (Future<Long> future : futures) {
            crcs.add(future.get());
        }
        Assertions.assertThat(crcs).allMatch(c -> crcs.get(0).equals(c));
    }
In the above code, with a thread-pool size of 1, all calculated CRCs for the file are the same. With a thread-pool size greater than 1, the CRCs differ.
The issue appears to be related to the shared SKIP_BYTE_BUFFER used in IOUtils.consume. All threads read into this one buffer, which is harmless when the data really is discarded. However, FileUtils.checksum wraps the file in a CheckedInputStream, which updates the CRC from the bytes that were just read into the shared buffer. When several threads write to that buffer concurrently, one thread can overwrite the bytes before another thread's checksum has been updated from them, so the CRC is computed over the wrong data.
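As a possible workaround until the library is fixed, the CRC can be computed with the JDK classes alone, giving each call its own buffer so that no state is shared between threads. This is only a minimal sketch: the class name SafeCrc32, the method name checksum, and the 8192-byte buffer size are hypothetical choices made for illustration and are not part of Commons IO.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

// Hypothetical helper, not part of Commons IO.
final class SafeCrc32 {

    private SafeCrc32() {
    }

    // Computes the CRC32 of a file using a buffer private to this call.
    static long checksum(Path file) throws IOException {
        CRC32 crc = new CRC32();
        // Allocated per call, so concurrent callers never share it.
        byte[] buffer = new byte[8192];
        try (InputStream in = new CheckedInputStream(Files.newInputStream(file), crc)) {
            // CheckedInputStream updates the CRC from the bytes placed in
            // 'buffer' on each read; the data itself is then discarded.
            while (in.read(buffer) != -1) {
                // nothing to do
            }
        }
        return crc.getValue();
    }
}
```

In the test above, replacing FileUtils.checksumCRC32(testFile) with SafeCrc32.checksum(testFile.toPath()) should yield identical values regardless of pool size, since each task then reads into its own buffer.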