Details
Bug
Status: Resolved
Critical
Resolution: Fixed
1.26.0, 1.26.1
Description
I've run into a bug that occurs when reading a 7z file from several threads simultaneously, even though each thread opens its own SevenZFile instance. The following code illustrates the problem. The archive file.7z is attached.
import java.io.InputStream;
import java.nio.file.Paths;
import java.util.stream.IntStream;

import org.apache.commons.compress.archivers.sevenz.SevenZArchiveEntry;
import org.apache.commons.compress.archivers.sevenz.SevenZFile;

public class TestZip {
    public static void main(final String[] args) {
        final Runnable runnable = () -> {
            try {
                try (final SevenZFile sevenZFile = SevenZFile.builder().setPath(Paths.get("file.7z")).get()) {
                    SevenZArchiveEntry sevenZArchiveEntry;
                    while ((sevenZArchiveEntry = sevenZFile.getNextEntry()) != null) {
                        if ("file4.txt".equals(sevenZArchiveEntry.getName())) { // The entry must not be the first of the ZIP archive to reproduce
                            final InputStream inputStream = sevenZFile.getInputStream(sevenZArchiveEntry);
                            // treatments...
                            break;
                        }
                    }
                }
            } catch (final Exception e) {
                // java.io.IOException: Checksum verification failed
                e.printStackTrace();
            }
        };
        IntStream.range(0, 30).forEach(i -> new Thread(runnable).start());
    }
}
Below is the output I receive on version 1.26:
java.io.IOException: Checksum verification failed
    at org.apache.commons.compress.utils.ChecksumVerifyingInputStream.verify(ChecksumVerifyingInputStream.java:98)
    at org.apache.commons.compress.utils.ChecksumVerifyingInputStream.read(ChecksumVerifyingInputStream.java:92)
    at org.apache.commons.io.IOUtils.skip(IOUtils.java:2422)
    at org.apache.commons.io.IOUtils.skip(IOUtils.java:2380)
    at org.apache.commons.compress.archivers.sevenz.SevenZFile.getCurrentStream(SevenZFile.java:912)
    at org.apache.commons.compress.archivers.sevenz.SevenZFile.getInputStream(SevenZFile.java:988)
    at com.infotel.arcsys.nativ.archiving.zip.TestZip.lambda$main$0(TestZip.java:21)
    at java.base/java.lang.Thread.run(Thread.java:833)
The issue appears to have been introduced in the transition from version 1.25 to 1.26 of Apache Commons Compress. In the library's SevenZFile class, the private method getCurrentStream switched from the library's own IOUtils.skip(InputStream, long) to the method with the same signature in Commons IO, which behaves differently. With version 1.26, that method reads into a shared, unsynchronized buffer that is in theory only ever written to (SCRATCH_BYTE_BUFFER_WO). ChecksumVerifyingInputStream, however, presumably computes its checksum from the bytes read into that buffer, so a concurrent skip in another thread can overwrite them first and cause the checksum verification to fail. The problem seems to be resolved by specifying the Supplier of the buffer to use:
try (InputStream stream = deferredBlockStreams.remove(0)) {
    org.apache.commons.io.IOUtils.skip(stream, Long.MAX_VALUE,
            () -> new byte[org.apache.commons.io.IOUtils.DEFAULT_BUFFER_SIZE]);
}
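To make the suspected mechanism easier to reason about without the attached archive, here is a minimal JDK-only sketch. It is not the Commons Compress or Commons IO code: the class SkipBufferSketch and its methods skipAll and countChecksumFailures are hypothetical names, and java.util.zip.CheckedInputStream stands in for ChecksumVerifyingInputStream on the assumption that both checksum the bytes after they land in the caller's buffer. Each thread drains its own stream by reading into a buffer obtained from a Supplier, mirroring the proposed fix.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

public class SkipBufferSketch {

    /** Hypothetical helper: drains the stream by reading into a supplied buffer. */
    public static long skipAll(final InputStream in, final Supplier<byte[]> bufferSupplier) throws IOException {
        final byte[] buffer = bufferSupplier.get();
        long total = 0;
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            total += n;
        }
        return total;
    }

    /** Skips distinct data in several threads; returns how many threads saw a wrong CRC. */
    public static int countChecksumFailures(final int threads, final Supplier<byte[]> bufferSupplier) {
        final AtomicInteger failures = new AtomicInteger();
        final List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            // Each thread gets its own data, so overlapping writes to a shared buffer would matter.
            final byte[] data = new byte[1 << 20];
            new Random(t).nextBytes(data);
            final CRC32 expected = new CRC32();
            expected.update(data, 0, data.length);
            final long expectedValue = expected.getValue();

            final Thread worker = new Thread(() -> {
                // CheckedInputStream updates its CRC from the caller's buffer after each read.
                try (CheckedInputStream in = new CheckedInputStream(new ByteArrayInputStream(data), new CRC32())) {
                    skipAll(in, bufferSupplier);
                    if (in.getChecksum().getValue() != expectedValue) {
                        failures.incrementAndGet();
                    }
                } catch (final IOException e) {
                    failures.incrementAndGet();
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (final Thread worker : workers) {
            try {
                worker.join();
            } catch (final InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return failures.get();
    }

    public static void main(final String[] args) {
        // Safe variant: a fresh buffer per call, as in the proposed fix.
        System.out.println("per-call buffers, failures: " + countChecksumFailures(8, () -> new byte[8192]));
        // Racy variant: one buffer shared by all threads; the result varies from run to run.
        final byte[] shared = new byte[8192];
        System.out.println("shared buffer, failures: " + countChecksumFailures(8, () -> shared));
    }
}
```

With a fresh buffer per call, no thread can observe another thread's bytes, so every checksum should match. The shared-buffer run is a data race and may or may not report failures on a given machine, which is consistent with the intermittent nature of the original error.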