Commons IO / IO-166

Fix URL decoding in FileUtils.toFile()

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 2.0
    • Component/s: Utilities
    • Labels:
      None

      Description

      The sequence "%2520" should decode to "%20".
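
As an illustration of the expected behavior, here is a minimal sketch of single-pass decoding. It uses the JDK's java.net.URLDecoder as a stand-in, not the FileUtils.toFile() implementation; the class name SinglePassDecode and the helper decodeOnce are hypothetical. Note that URLDecoder also turns '+' into a space, which is form-encoding behavior and not necessarily what a file URL decoder should do.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; not part of Commons IO.
class SinglePassDecode {

    // Decodes percent-escapes exactly once, reading the escaped bytes as UTF-8.
    static String decodeOnce(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String once = decodeOnce("%2520");   // "%25" becomes "%", "20" stays literal
        String twice = decodeOnce(once);     // a second pass turns "%20" into a space

        System.out.println("[" + once + "]");   // prints [%20]
        System.out.println("[" + twice + "]");  // prints [ ]
    }
}
```

A decoder that effectively runs two passes would turn "%2520" into a space, which changes the file name the URL refers to.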

      1. IO-166.patch
        7 kB
        Benjamin Bentmann
      2. IO-166.patch
        2 kB
        Benjamin Bentmann

          Activity

          Niall Pemberton added a comment - Fixed, thanks for the patch: http://svn.apache.org/viewvc?view=revision&revision=1002457
          Benjamin Bentmann added a comment -

          New patch to also address the following issues:

          1. URL decoding should use UTF-8
          2. URL decoding should be lenient

          Rationale for 1. is to bring the method in sync with the behavior of the decoding done by the JDK, i.e. the output from

          URI url = new URI("file:/home/%C3%A4%C3%B6%C3%BC%C3%9F");
          System.out.println(new File(url));
          System.out.println(FileUtils.toFile(url.toURL()));
          

          is currently

          /home/äöüß
          /home/äöüÃ?
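
To make the charset dependence concrete, the following sketch decodes the same escaped path once as UTF-8 and once as ISO-8859-1. It relies only on java.net.URLDecoder from the JDK and does not reproduce the FileUtils.toFile() code path; the class name CharsetComparison and the helper decode are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; not part of Commons IO.
class CharsetComparison {

    // Percent-decodes s and interprets the resulting bytes in the given charset.
    static String decode(String s, String charsetName) {
        try {
            return URLDecoder.decode(s, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String escaped = "/home/%C3%A4%C3%B6%C3%BC%C3%9F";

        // Each two-byte escape pair forms one character when read as UTF-8.
        String utf8 = decode(escaped, "UTF-8");
        // Read as ISO-8859-1, every escaped byte becomes its own character.
        String latin1 = decode(escaped, "ISO-8859-1");

        System.out.println(utf8 + " (" + utf8.length() + " chars)");
        System.out.println(latin1 + " (" + latin1.length() + " chars)");
    }
}
```

The UTF-8 result has four characters after "/home/", while the ISO-8859-1 result has eight, one per escaped byte. How the second line renders depends on the console encoding.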
          

          Rationale for 2. is to work better with invalid URLs returned by bad class loaders. There are still enough class loader implementations out there that will return a URL like "file:/<snip>/%file.txt" when queried for a resource named "%file.txt", i.e. the URL is not encoded at all and as such can include literal percent characters. Hence I believe it is preferable for the method to simply pass such characters through literally instead of failing with an exception.
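
The two points combined could look roughly like the sketch below: well-formed escapes are collected as bytes and decoded as UTF-8, and a '%' that is not followed by two hex digits is copied through unchanged. This is an assumption about one way to do it, not the code from the attached patch; the class LenientDecode, its methods, and the "/tmp/" prefix in the usage example are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not the patched FileUtils code.
class LenientDecode {

    // Decodes "%XX" escapes as UTF-8 bytes; a malformed '%' passes through literally.
    static String decodeLenient(String s) {
        StringBuilder out = new StringBuilder(s.length());
        ByteArrayOutputStream pending = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            if (isEscape(s, i)) {
                // Collect consecutive escaped bytes so multi-byte characters decode together.
                pending.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 3;
            } else {
                flush(pending, out);
                out.append(s.charAt(i));
                i++;
            }
        }
        flush(pending, out);
        return out.toString();
    }

    // True if position i holds '%' followed by two hexadecimal digits.
    private static boolean isEscape(String s, int i) {
        return s.charAt(i) == '%'
                && i + 2 < s.length()
                && Character.digit(s.charAt(i + 1), 16) >= 0
                && Character.digit(s.charAt(i + 2), 16) >= 0;
    }

    // Appends any collected bytes to out, interpreted as UTF-8.
    private static void flush(ByteArrayOutputStream pending, StringBuilder out) {
        if (pending.size() > 0) {
            out.append(new String(pending.toByteArray(), StandardCharsets.UTF_8));
            pending.reset();
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeLenient("/tmp/%file.txt")); // prints /tmp/%file.txt
        System.out.println(decodeLenient("%2520"));          // prints %20
    }
}
```

One limitation of this kind of leniency: an unencoded name that happens to contain '%' followed by two hex digits (for example a name starting with "%fa") is still treated as an escape, so it cannot be told apart from a properly encoded URL.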


            People

            • Assignee:
              Unassigned
              Reporter:
              Benjamin Bentmann
            • Votes:
              0
              Watchers:
              0
