[COMPRESS-569] OutOfMemoryError on a crafted tar file


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.21
    • Fix Version/s: 1.21
    • Component/s: None
    • Labels: None

    Description

      Apache Commons Compress at commit
      https://github.com/apache/commons-compress/commit/1b7528fbd6295a3958daf1b1114621ee5e40e83c
      throws an OutOfMemoryError on a crafted tar archive that is less than
      a KiB in size, after consuming roughly 5 minutes of CPU time on my
      machine:

      Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
      at java.base/sun.nio.cs.UTF_8.newDecoder(UTF_8.java:70)
      at org.apache.commons.compress.archivers.zip.NioZipEncoding.newDecoder(NioZipEncoding.java:182)
      at org.apache.commons.compress.archivers.zip.NioZipEncoding.decode(NioZipEncoding.java:135)
      at org.apache.commons.compress.archivers.tar.TarUtils.parseName(TarUtils.java:311)
      at org.apache.commons.compress.archivers.tar.TarUtils.parseName(TarUtils.java:275)
      at org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeader(TarArchiveEntry.java:1550)
      at org.apache.commons.compress.archivers.tar.TarArchiveEntry.<init>(TarArchiveEntry.java:554)
      at org.apache.commons.compress.archivers.tar.TarArchiveEntry.<init>(TarArchiveEntry.java:570)
      at org.apache.commons.compress.archivers.tar.TarFile.getNextTarEntry(TarFile.java:250)
      at org.apache.commons.compress.archivers.tar.TarFile.<init>(TarFile.java:211)
      at org.apache.commons.compress.archivers.tar.TarFile.<init>(TarFile.java:94)
      at TarFileTimeout.main(TarFileTimeout.java:22)

      I attached both the tar file and a Java reproducer for this issue.
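
      The reproducer itself is not shown inline; based on the stack trace
      above, a minimal equivalent likely looks like the sketch below (the
      archive file name and exact structure are assumptions):

      import java.io.File;

      import org.apache.commons.compress.archivers.tar.TarFile;

      public class TarFileTimeout {
          public static void main(String[] args) throws Exception {
              // Merely opening the crafted archive triggers the bug: the
              // TarFile constructor eagerly scans all entry headers and, on
              // the malformed entry, loops forever while accumulating entry
              // metadata until the heap is exhausted.
              try (TarFile tarFile = new TarFile(new File("timeout.tar"))) {
                  // Never reached on an affected version.
              }
          }
      }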

      Quoting Stefan Bodewig's analysis of this issue:
      Your archive contains an entry with a claimed size of -512 bytes. When
      TarFile reads entries it tries to skip the content of the entry as it
      is only interested in meta data on a first scan and does so by
      positioning the input stream right after the data of the current
      entry. In this case it positions it 512 bytes backwards, right at the
      start of the current entry's meta data again. This leads to an
      infinite loop that reads a new entry, stores the meta data,
      repositions the stream and starts over again. Over time the list of
      collected meta data eats up all available memory.
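
      In other words, the fix has to reject an entry whose claimed size is
      negative before that size is used to advance the read position. A
      minimal sketch of such a check follows; the class and method names are
      hypothetical, and the actual fix in Commons Compress may be structured
      differently:

      import java.io.IOException;

      final class TarEntrySizeCheck {
          /**
           * Validates a tar entry's claimed size before it is used to
           * advance the read position. A negative size would move the
           * position backwards onto a header that has already been parsed,
           * so the same entry would be read again forever.
           */
          static long checkedSize(final long claimedSize) throws IOException {
              if (claimedSize < 0) {
                  throw new IOException("Broken archive: entry claims a size of "
                      + claimedSize + " bytes");
              }
              return claimedSize;
          }

          public static void main(final String[] args) throws IOException {
              System.out.println(checkedSize(1024)); // prints 1024
              System.out.println(checkedSize(-512)); // throws IOException
          }
      }

      Failing fast with an IOException turns an unbounded memory leak into
      an ordinary parse error that callers can handle.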

      Attachments

        1. timeout.tar (0.5 kB), attached by Fabian Meumertzheim
        2. TarFileTimeout.java (1 kB), attached by Fabian Meumertzheim


          People

            Assignee: Unassigned
            Reporter: Fabian Meumertzheim
            Votes: 0
            Watchers: 2
