  1. Commons Compress
  2. COMPRESS-183

Support for de/encoding of tar entry names other than plain 8BIT conversion.


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3
    • Fix Version/s: 1.4
    • Component/s: Archivers

    Description

      The names of tar entries are currently encoded and decoded by a plain 8-bit conversion of byte to char and vice versa. This prevents the use of encodings such as UTF-8 in file names. Whether using UTF-8 (or any other non-ASCII encoding) in file names is sensible is a topic of its own. However, tar archives containing files whose names have been encoded with UTF-8 do exist in the wild. commons-compress currently cannot read these names correctly because the encoding is hardcoded to plain 8-bit.
      The supplied patch allows encodings other than 8-bit to be used by means of a TarArchiveCodec structure. It does not change the default behaviour; it only adds the option of using a different encoding.
      A method was added to the TarUtilsTest JUnit test to cover the added functionality.
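
      To illustrate the problem described above, here is a minimal sketch that uses only the Java standard library. It is not the code from the attached patch and does not use the TarArchiveCodec API; the class and method names (TarNameDecoding, decode8Bit, decodeWithCharset) are hypothetical and chosen for this example, as is the 100-byte name field. It contrasts a plain 8-bit byte-to-char conversion with a charset-aware decode of the same name bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from commons-compress or the patch.
public class TarNameDecoding {

    // Plain 8-bit conversion: every byte becomes exactly one char.
    // Multi-byte UTF-8 sequences are therefore split into separate chars.
    static String decode8Bit(byte[] buf, int offset, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = offset; i < offset + length; i++) {
            if (buf[i] == 0) {
                break; // the name field is NUL-padded
            }
            sb.append((char) (buf[i] & 0xFF));
        }
        return sb.toString();
    }

    // Charset-aware decoding: the bytes up to the first NUL are decoded as a whole.
    static String decodeWithCharset(byte[] buf, int offset, int length, Charset cs) {
        int end = offset;
        while (end < offset + length && buf[end] != 0) {
            end++;
        }
        return new String(buf, offset, end - offset, cs);
    }

    public static void main(String[] args) {
        // Assumed 100-byte, NUL-padded name field holding a UTF-8 encoded name.
        byte[] field = new byte[100];
        byte[] raw = "caf\u00e9.txt".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(raw, 0, field, 0, raw.length);

        // The 8-bit path yields two chars (U+00C3, U+00A9) in place of U+00E9.
        System.out.println(decode8Bit(field, 0, field.length));
        // The charset-aware path recovers the original name.
        System.out.println(decodeWithCharset(field, 0, field.length, StandardCharsets.UTF_8));
    }
}
```

      For pure ASCII names both paths return the same string, which is consistent with the patch leaving the default behaviour unchanged.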

      Attachments

        1. patch-tar-name-encoding.diff
          23 kB
          Joao Schim
        2. patch-tar-name-encoding.diff
          23 kB
          Joao Schim
        3. patch-tar-name-encoding.diff
          22 kB
          Joao Schim

            People

              Assignee: Unassigned
              Reporter: Joao Schim (joao@schimsalabim.eu)
