Uploaded image for project: 'Commons IO'
  1. Commons IO
  2. IO-556

Unexpected behavior of FileNameUtils.normalize may lead to limited path traversal vulnerabilies

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments


    • Bug
    • Status: Resolved
    • Major
    • Resolution: Duplicate
    • 1.1, 1.2, 1.3, 1.3.1, 1.3.2, 1.4, 2.0.1, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6
    • 2.7
    • Utilities
    • all


      I sent this report in an Email to security@apache.org on 2017-10-16. I did not receive any kind of response yet (2017-11-18 as of writing). I am now posting it publicly, to open the issue up for discussion, and hopefully initiate a fix.

      This report is not about a vulnerability in commons-io per se, but an unexpected behavior that has a high chance of introducing a path traversal vulnerability when using FilenameUtils.normalize to sanitize user input. The traversal is limited to a single out-of-bounds-stepping "/../" segment.

      FilenameUtils.normalize("//../foo");        // returns "//../foo" or "\\\\..\\foo", based on java.io.File.separatorChar
      FilenameUtils.normalize("\\\\..\\foo");        // returns "//../foo" or "\\\\..\\foo", based on java.io.File.separatorChar
      Possible impact (example)

      Consider a web-application that uses FilenameUtils.normalize to sanitize a user-supplied file name string, and then appends the sanitized value to a configured upload directory to store the uploaded content in:

      String fileName = "//../foo";            // actually user-supplied (e.g. from multipart-POST request)
      fileName = FilenameUtils.normalize(fileName);    // still holds the same value ("//../foo")   
      if (fileName != null) {
          File newFile = new File("/base/uploads", fileName);    // java.io.File treats double forward slashes as singles
                                      // newFile now points to "/base/uploads//../foo"
          newFile = newFile.getCanonicalFile();            // newFile now points to "/base/foo", which should be inaccessible
          // Write contents to newFile...
      } else {
          // Assume malicious activity, handle error
      Relevant code locations
      • org.apache.commons.io.FilenameUtils#getPrefixLength : everything between a leading "//" and the next "/" is treated as a UNC server name, and ignored in all further validation logic of org.apache.commons.io.FilenameUtils#doNormalize .

      One might argue that the given example is a misuse of the FilenameUtils.normalize method, and that everyone using it should expect absolute paths, full UNC paths, etc. to be returned by the method.
      Furthermore, the vulnerability can only occur due to the lax behavior of java.io.File .

      On the other hand, I believe that the JavaDoc of FilenameUtils.normalize suggests to most readers, that this method is exactly what is needed to sanitize file names:

           * Normalizes a path, removing double and single dot path steps,
           * and removing any final directory separator.
           * <p>
           * This method normalizes a path to a standard format.
           * The input may contain separators in either Unix or Windows format.
           * The output will contain separators in the format of the system.
           * <p>
           * A trailing slash will be removed.
           * A double slash will be merged to a single slash (but UNC names are handled).
           * A single dot path segment will be removed.
           * A double dot will cause that path segment and the one before to be removed.
           * If the double dot has no parent path segment to work with, {@code null}
           * is returned.
           * <p>
           * The output will be the same on both Unix and Windows except
           * for the separator character.
           * <pre>
           * /foo//               --&gt;   /foo
           * /foo/./              --&gt;   /foo
           * /foo/../bar          --&gt;   /bar
           * /foo/../bar/         --&gt;   /bar
           * /foo/../bar/../baz   --&gt;   /baz
           * //foo//./bar         --&gt;   /foo/bar
           * /../                 --&gt;   null
           * ../foo               --&gt;   null
           * foo/bar/..           --&gt;   foo
           * foo/../../bar        --&gt;   null
           * foo/../bar           --&gt;   bar
           * //server/foo/../bar  --&gt;   //server/bar
           * //server/../bar      --&gt;   null
           * C:\foo\..\bar        --&gt;   C:\bar
           * C:\..\bar            --&gt;   null
           * ~/foo/../bar/        --&gt;   ~/bar
           * ~/../bar             --&gt;   null
           * </pre>
           * (Note the file separator returned will be correct for Windows/Unix)
           * @param filename  the filename to normalize, null returns null
           * @return the normalized filename, or null if invalid. Null bytes inside string will be removed

      I have done a quick survey of the usages of the method in public GitHub repositories. I have found numerous projects that suffer from the limited path traversal vulnerability that is described here because of this very issue. This includes Webservers, Web-Frameworks, Archive-Extraction-Software, and others.

      Preventing path traversal attacks is not trivial, and many people turn to libraries like commons-io to avoid making mistakes when implementing parsing logic on their own. They trust that FilenameUtils will provide them with the most complete, and suitable sanitation logic for file names.
      In addition, ".." is not a valid UNC host name according to https://msdn.microsoft.com/de-de/library/gg465305.aspx , so prohibiting it shouldn't result in any major problems.


        Issue Links


          This comment will be Viewable by All Users Viewable by All Users


            Unassigned Unassigned
            Lukas Euler Lukas Euler
            0 Vote for this issue
            5 Start watching this issue




                Issue deployment