Tika / TIKA-2858

JAXRS server: allow passwords with special chars (MIME encoded words)


Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.20
    • Fix Version/s: None
    • Component/s: server
    • Labels: None

    Description

      Tika Server allows passing a document password in a special Password request header; however, I don't believe this header allows for passwords with non-US-ASCII characters, or for passwords with leading or trailing spaces.

      One potential solution would be to allow MIME encoded-word values (RFC 2047) in the password header so that one could specify any password with only US-ASCII. This extra decoding could be enabled / disabled with some other flag or header value, in order to avoid any breaking changes for clients that are not encoding this header (e.g. if the password happens to literally be "=?UTF-8?B??=").
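
      To make the proposal concrete, here is a minimal sketch of the encode / decode round trip. It assumes only the Base64 ("B") form of an encoded-word in a single word, with decoding enabled by an explicit opt-in flag. The class name PasswordHeaderCodec, its methods, and the decodeEncodedWords flag are hypothetical and are not part of Tika. RFC 2047 also defines a "Q" encoding and a 75-character limit per encoded-word, neither of which is handled here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika: it only illustrates the proposal.
final class PasswordHeaderCodec {

    // Matches one B-encoded (Base64) encoded-word: =?charset?B?payload?=
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?\\s]+)\\?[bB]\\?([A-Za-z0-9+/=]*)\\?=");

    private PasswordHeaderCodec() {}

    // Client side: wrap any password in a US-ASCII-only encoded-word.
    static String encode(String password) {
        String payload = Base64.getEncoder()
                .encodeToString(password.getBytes(StandardCharsets.UTF_8));
        return "=?UTF-8?B?" + payload + "?=";
    }

    // Server side: decode only when the caller opted in (assumed flag);
    // otherwise, or if the value is not an encoded-word, return it untouched.
    static String decode(String headerValue, boolean decodeEncodedWords) {
        if (!decodeEncodedWords) {
            return headerValue;
        }
        Matcher m = ENCODED_WORD.matcher(headerValue);
        if (!m.matches()) {
            return headerValue;
        }
        // Throws for an unknown charset name or malformed Base64 payload.
        Charset charset = Charset.forName(m.group(1));
        byte[] raw = Base64.getDecoder().decode(m.group(2));
        return new String(raw, charset);
    }
}
```

      With this sketch, a password of 4 literal spaces would be sent as =?UTF-8?B?ICAgIA==?=, and a client whose password is literally "=?UTF-8?B??=" keeps working as long as it does not opt in to decoding.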

      Attached are 2 sample PDF files that I'm unable to use with Tika Server due to their passwords. These passwords are a bit contrived, but I have come across this issue with real passwords. I've included the passwords in code blocks to prevent the issue editor / viewer from collapsing multiple spaces into one.

      The file named "protected - 4 space password.pdf" has a password of 4 literal spaces:

      // Password is on line below (4 literal spaces)
          
      

      The file named "protected - Unicode password.pdf" has a password of mostly special characters, with 2 leading spaces and 2 trailing spaces thrown in for good measure:

      // Password is on following line (with 2 leading spaces, 2 trailing spaces)
        ! < > " \ € œ ¤ ¼ ½ 𠜎 𩶘 😀  
      

       


          People

            Assignee: Unassigned
            Reporter: Ross Johnson (rossj)
