Commons IO / IO-471

Support for additional encodings needed in ReversedLinesFileReader

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4
    • Fix Version/s: 2.5
    • Component/s: Utilities
    • Labels: None

    Description

      I'm working on a product that uses Commons IO via Jackrabbit Oak. While
      testing the launch of this product on Japanese Windows Server 2012 R2,
      I came across the following exception:
      "(java.io.UnsupportedEncodingException: Encoding windows-31j is not
      supported yet (feel free to submit a patch))"

      windows-31j is the IANA name for Windows code page 932 (Japanese). On that
      system it is the charset returned by Charset.defaultCharset(), which
      org.apache.commons.io.input.ReversedLinesFileReader uses [0].

      This issue can be resolved by adding a check for
      'windows-31j' to ReversedLinesFileReader.

      The attached patch includes that addition, along with the checks needed to
      support Simplified Chinese (GBK), Traditional Chinese (x-windows-950) and
      Korean (x-windows-949).

      This is safe because a newline byte can never appear as part of a
      multi-byte character in any of those encodings, so a reader that scans
      backward for newline bytes cannot split a character.
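
      The claim about newline bytes can be spot-checked with a small program.
      The following is only a sketch and is not part of the attached patch:
      the class name NewlineByteCheck and its method are hypothetical, only
      BMP characters are examined, and the charsets are looked up by the names
      used in this issue, which assumes the running JRE ships them (they are
      skipped otherwise).

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper, not part of Commons IO or of the attached patch.
public class NewlineByteCheck {

    /**
     * Counts the BMP characters whose multi-byte encoding in the given
     * charset contains a line-feed (0x0A) or carriage-return (0x0D) byte.
     */
    static int countNewlineBytesInMultiByteChars(Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        int violations = 0;
        for (int cp = 0; cp <= 0xFFFF; cp++) {
            char c = (char) cp;
            // Skip surrogates (not characters on their own) and unmappable characters.
            if (Character.isSurrogate(c) || !encoder.canEncode(c)) {
                continue;
            }
            byte[] bytes = String.valueOf(c).getBytes(charset);
            if (bytes.length < 2) {
                continue; // single-byte characters are not of interest here
            }
            for (byte b : bytes) {
                if (b == 0x0A || b == 0x0D) {
                    violations++;
                    break; // count each character at most once
                }
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        // Charset names taken from this issue; availability depends on the JRE.
        String[] names = {"windows-31j", "GBK", "x-windows-949", "x-windows-950"};
        for (String name : names) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not available in this JRE");
                continue;
            }
            int n = countNewlineBytesInMultiByteChars(Charset.forName(name));
            System.out.println(name + ": " + n + " multi-byte characters contain a newline byte");
        }
    }
}
```

      If the claim holds, every available charset in the list should report 0.
      By contrast, an encoding such as UTF-16LE does not pass this check,
      because for example U+010A is encoded with a 0x0A byte.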

      Attachments

        1. test-file-x-windows-950.bin
          0.0 kB
          Leandro Reis
        2. test-file-x-windows-949.bin
          0.0 kB
          Leandro Reis
        3. test-file-windows-31j.bin
          0.0 kB
          Leandro Reis
        4. test-file-gbk.bin
          0.0 kB
          Leandro Reis
        5. commons-io-reversedlinesfilereader-ccjk-encodings.txt
          6 kB
          Leandro Reis

            People

              Assignee: Kristian Rosenvold (krosenvold)
              Reporter: Leandro Reis (lreis)
              Votes: 0
              Watchers: 3