Commons Compress / COMPRESS-181

Tar files created by AIX native tar, and which contain symlinks, cannot be read by TarArchiveInputStream

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2, 1.3
    • Fix Version/s: 1.4
    • Component/s: Archivers
    • Labels:
      None
    • Environment:

      AIX 5.3

      Description

      A simple tar file that was created on AIX with the native tar utility (/usr/bin/tar) and that contains a symbolic link cannot be loaded by TarArchiveInputStream:

      java.io.IOException: Error detected parsing the header
      	at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:201)
      	at Extractor.extract(Extractor.java:13)
      	at Extractor.main(Extractor.java:28)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.tools.ant.taskdefs.ExecuteJava.run(ExecuteJava.java:217)
      	at org.apache.tools.ant.taskdefs.ExecuteJava.execute(ExecuteJava.java:152)
      	at org.apache.tools.ant.taskdefs.Java.run(Java.java:771)
      	at org.apache.tools.ant.taskdefs.Java.executeJava(Java.java:221)
      	at org.apache.tools.ant.taskdefs.Java.executeJava(Java.java:135)
      	at org.apache.tools.ant.taskdefs.Java.execute(Java.java:108)
      	at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
      	at org.apache.tools.ant.Task.perform(Task.java:348)
      	at org.apache.tools.ant.Target.execute(Target.java:390)
      	at org.apache.tools.ant.Target.performTasks(Target.java:411)
      	at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1399)
      	at org.apache.tools.ant.Project.executeTarget(Project.java:1368)
      	at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
      	at org.apache.tools.ant.Project.executeTargets(Project.java:1251)
      	at org.apache.tools.ant.Main.runBuild(Main.java:809)
      	at org.apache.tools.ant.Main.startAnt(Main.java:217)
      	at org.apache.tools.ant.launch.Launcher.run(Launcher.java:280)
      	at org.apache.tools.ant.launch.Launcher.main(Launcher.java:109)
      Caused by: java.lang.IllegalArgumentException: Invalid byte 0 at offset 0 in '{NUL}1722000726 ' len=12
      	at org.apache.commons.compress.archivers.tar.TarUtils.parseOctal(TarUtils.java:99)
      	at org.apache.commons.compress.archivers.tar.TarArchiveEntry.parseTarHeader(TarArchiveEntry.java:819)
      	at org.apache.commons.compress.archivers.tar.TarArchiveEntry.<init>(TarArchiveEntry.java:314)
      	at org.apache.commons.compress.archivers.tar.TarArchiveInputStream.getNextTarEntry(TarArchiveInputStream.java:199)
      	... 29 more
      

      Tested with 1.2 and the 1.4 nightly build from Feb 23 (Implementation-Build: trunk@r1292625; 2012-02-23 03:20:30+0000)

          Activity

          Robert Clark created issue -
          Robert Clark added a comment -

          An example tar file, containing a symbolic link, created by the AIX native tar utility

          Robert Clark made changes -
          Attachment added: simple-aix-native-tar.tar [ 12516187 ]
          Robert Clark made changes -
          Description: removed the closing sentence "I don't have a place to post the example tar file, but I can send it to anyone who wants it (size is 10240 bytes)". The rest of the description, including the stack trace and the tested versions, is unchanged.
          Sebb added a comment -

          The problem field is the modification time for the symbolic link, which Compress expects to be either all null or valid octal with trailing null/space.

          Note: 7zip reads the file OK but complains that there is data after the end of the archive.
          It treats the link field as having no mtime.

          I've yet to find any documentation that says a leading null is allowed.
          Perhaps AIX tar is being lazy and failing to null the whole mtime field (which Compress could handle).
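
          The field format described above (either all NUL, or octal digits followed by trailing NUL/space bytes) can be sketched as a small check. This is only an illustration under that stated reading, not the code in TarUtils.parseOctal; the class and method names are hypothetical, and the sample field is rebuilt from the exception message in the description.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Commons Compress.
class OctalFieldCheck {

    /** True if the field is all NUL, or octal digits followed only by NUL/space bytes. */
    static boolean isWellFormed(byte[] field) {
        boolean allNul = true;
        for (byte b : field) {
            if (b != 0) {
                allNul = false;
                break;
            }
        }
        if (allNul) {
            return true;
        }
        int i = 0;
        while (i < field.length && field[i] >= '0' && field[i] <= '7') {
            i++;
        }
        int digits = i;
        while (i < field.length && (field[i] == 0 || field[i] == ' ')) {
            i++;
        }
        // Need at least one digit, at least one trailer byte, and nothing else.
        return digits > 0 && i > digits && i == field.length;
    }

    /** Rebuilds the 12-byte mtime field from the exception message: NUL + "1722000726 ". */
    static byte[] aixMtimeField() {
        byte[] field = new byte[12]; // field[0] stays NUL
        byte[] rest = "1722000726 ".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(rest, 0, field, 1, rest.length);
        return field;
    }

    public static void main(String[] args) {
        // An all-NUL field is accepted.
        System.out.println(isWellFormed(new byte[12]));
        // Octal digits with a trailing space are accepted.
        System.out.println(isWellFormed("01722000726 ".getBytes(StandardCharsets.US_ASCII)));
        // The AIX field starts with NUL but is not all NUL, so it is rejected.
        System.out.println(isWellFormed(aixMtimeField()));
    }
}
```

          Under these assumptions the AIX field is the only one of the three samples that is rejected, which matches the "Invalid byte 0 at offset 0" failure.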

          Stefan Bodewig added a comment -

          GNU tar extracts it with a date/time of 1978-02-15 08:55 - which more or less looks as if it had translated the leading null to an ASCII 0 (and it looks as if that was supposed to be an ASCII 1 to match the timestamp of the dir).
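
          The arithmetic behind this observation can be checked with a few lines of Java. This is a sketch only: the class and method names are made up for illustration, and the two digit strings are the field from the exception message with the leading NUL replaced by an ASCII 0 and an ASCII 1 respectively.

```java
import java.time.Instant;

// Hypothetical helper for illustration; not part of Commons Compress.
class MtimeArithmetic {

    /** Interprets a string of octal digits as seconds since the Unix epoch. */
    static Instant fromOctalSeconds(String octalDigits) {
        return Instant.ofEpochSecond(Long.parseLong(octalDigits, 8));
    }

    public static void main(String[] args) {
        // Leading NUL read as ASCII '0': 256377302 seconds.
        System.out.println(fromOctalSeconds("01722000726")); // 1978-02-15T07:55:02Z
        // Leading NUL replaced by ASCII '1': 1330119126 seconds.
        System.out.println(fromOctalSeconds("11722000726")); // 2012-02-24T21:32:06Z
    }
}
```

          The first value is 07:55 UTC, which would read as 08:55 in a UTC+1 local time zone. The second value falls in February 2012, which is consistent with the guess that the first digit was meant to be a 1.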

          Stefan Bodewig added a comment - - edited

          The from_header function in GNU tar's list.c contains a workaround for this case:

            /* Accommodate buggy tar of unknown vintage, which outputs leading
               NUL if the previous field overflows.  */
            where += !*where;
          

          This basically skips the first byte if it is a binary 0.
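
          For comparison, a Java rendering of the same idea might look like the sketch below. It is an assumption-laden illustration, not GNU tar's parser and not Commons Compress code: the class and method names are hypothetical, and the loop only handles plain octal digits ended by NUL or space.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; mirrors "where += !*where" from GNU tar.
class SkipLeadingNul {

    /** Parses an octal field, stepping over a single leading NUL byte if present. */
    static long parseOctal(byte[] field) {
        int i = (field.length > 0 && field[0] == 0) ? 1 : 0; // skip one leading NUL
        long value = 0;
        for (; i < field.length; i++) {
            byte b = field[i];
            if (b == 0 || b == ' ') {
                break; // trailing NUL or space ends the number
            }
            if (b < '0' || b > '7') {
                throw new IllegalArgumentException("Invalid byte " + b + " at offset " + i);
            }
            value = (value << 3) + (b - '0');
        }
        return value;
    }

    /** Rebuilds the 12-byte mtime field from the exception message: NUL + "1722000726 ". */
    static byte[] aixMtimeField() {
        byte[] field = new byte[12]; // field[0] stays NUL
        byte[] rest = "1722000726 ".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(rest, 0, field, 1, rest.length);
        return field;
    }

    public static void main(String[] args) {
        System.out.println(parseOctal(aixMtimeField())); // 256377302
    }
}
```

          With the leading NUL skipped, the remaining digits parse to 256377302 seconds since the epoch, a date in February 1978.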

          Sebb added a comment -

          I don't think the previous field has overflowed in this case - it's all ASCII '0'.

          I'm not sure that the GNU tar approach is sensible, as the time value is then meaningless.
          I think it would be better to skip the entire field.

          But do we just skip the field (as if it were all null), skip it only in "non-strict" mode, or perhaps generate some kind of warning?

          Stefan Bodewig added a comment -

          It doesn't look like an overflow was the reason but if you look at the timestamp it certainly reads as if the first byte was a binary 0 by accident (if you put an ASCII 1 in there it is identical to the timestamp of the directory).

          In any case the resulting timestamp is not what it used to be, so using any other timestamp would be as valid as trying to parse the rest.

          Sebb added a comment -

          I agree - it looks like the first byte was set to null accidentally, or possibly the intention was to invalidate the stamp.

          "In any case the resulting timestamp is not what it used to be, so using any other timestamp would be as valid as trying to parse the rest."

          Sorry, I don't follow.

          I'm suggesting that using the corrupted timestamp is worse than ignoring it by treating it as all nulls.

          Stefan Bodewig added a comment -

          We don't really have an option to ignore a timestamp unless we allow ArchiveEntry#getLastModifiedDate to return null.

          What I was trying to say is it doesn't matter much which timestamp we return as any choice is wrong. Returning the equivalent of a 0 timestamp is fine with me. Unfortunately we don't have an infrastructure for warnings (would have been good for COMPRESS-176 as well), something for an API redesign in 2.0, I guess.

          Sebb added a comment -

          OK, let's return the timestamp as 0.
          This will presumably display as 1970-01-01 00:00.
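
          The agreed behaviour can be sketched as follows. This is a minimal illustration of "a leading NUL means timestamp 0", not the change committed to Commons Compress; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.time.Instant;

// Hypothetical helper for illustration; not the actual Commons Compress fix.
class LeadingNulAsZero {

    /** Returns 0 if the field starts with NUL, otherwise the parsed octal value. */
    static long parseMtime(byte[] field) {
        if (field.length == 0 || field[0] == 0) {
            return 0L; // treat the whole field as unset
        }
        long value = 0;
        for (int i = 0; i < field.length; i++) {
            byte b = field[i];
            if (b == 0 || b == ' ') {
                break; // trailing NUL or space ends the number
            }
            if (b < '0' || b > '7') {
                throw new IllegalArgumentException("Invalid byte " + b + " at offset " + i);
            }
            value = (value << 3) + (b - '0');
        }
        return value;
    }

    /** Rebuilds the 12-byte mtime field from the exception message: NUL + "1722000726 ". */
    static byte[] aixMtimeField() {
        byte[] field = new byte[12]; // field[0] stays NUL
        byte[] rest = "1722000726 ".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(rest, 0, field, 1, rest.length);
        return field;
    }

    public static void main(String[] args) {
        long seconds = parseMtime(aixMtimeField());
        System.out.println(Instant.ofEpochSecond(seconds)); // 1970-01-01T00:00:00Z
    }
}
```

          A field without the leading NUL is still parsed normally, so only the malformed entry ends up with the epoch timestamp.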

          Stefan Bodewig made changes -
          Link This issue is related to COMPRESS-182 [ COMPRESS-182 ]
          Stefan Bodewig added a comment -

          Robert, could you delete and re-add the attachment, granting the ASF a license to include it this time? That way we could add the tar to our testsuite.

          Robert Clark made changes -
          Attachment removed: simple-aix-native-tar.tar [ 12516187 ]
          Robert Clark added a comment -

          Re-uploaded example archive, with license grant so it can be included in the tests for the project.

          Robert Clark made changes -
          Attachment added: simple-aix-native-tar.tar [ 12516740 ]
          Stefan Bodewig made changes -
          Affects Version/s: removed 1.4 [ 12318850 ]
          Stefan Bodewig added a comment -

          Fixed with svn revision 1296420.

          Thanks for the test case.

          Stefan Bodewig made changes -
          Status: Open [ 1 ] to Resolved [ 5 ]
          Fix Version/s: added 1.4 [ 12318850 ]
          Resolution: set to Fixed [ 1 ]
          Transition: Open to Resolved
          Time In Source Status: 4d 2h 13m
          Execution Times: 1
          Last Executer: Stefan Bodewig
          Last Execution Date: 02/Mar/12 20:01

            People

            • Assignee:
              Unassigned
              Reporter:
              Robert Clark