Tika / TIKA-3320

TikaServer Header Name is Case-sensitive


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.25
    • Fix Version/s: 1.26
    • Component/s: core, server
    • Labels: None

      Description

      It seems that TikaServer 1.25 header names such as “X-Tika-PDFOcrStrategy” are case-sensitive. The same can be confirmed for the latest version on the main branch.

      This causes problems in systems where request headers are automatically lowercased before being passed down to TikaServer.

       

      According to https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2

      "Field names are case-insensitive"

       

      The issue is due to the following:

      First, a case-sensitive startsWith check is made for "X-Tika-PDF" or "X-Tika-OCR". Then getDeclaredField is called on the respective config class to look up the field, and the corresponding setter method is invoked.

      The same behavior is present in the newer TikaServer.
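
The failure mode described above can be reproduced with a minimal sketch. DemoPdfConfig and matches are hypothetical names made up for illustration, not Tika's real classes or methods; the sketch only assumes a case-sensitive prefix check followed by a getDeclaredField lookup, as described in this report.

```java
import java.lang.reflect.Field;

public class CaseSensitiveHeaderDemo {
    // Hypothetical stand-in for a config class; not the real Tika config.
    static class DemoPdfConfig {
        private String ocrStrategy;
    }

    static final String PREFIX = "X-Tika-PDF";

    /** Returns true if the header resolves to a field under case-sensitive rules. */
    public static boolean matches(String headerName) {
        // Case-sensitive prefix check: a lowercased header fails here.
        if (!headerName.startsWith(PREFIX)) {
            return false;
        }
        String suffix = headerName.substring(PREFIX.length());
        if (suffix.isEmpty()) {
            return false;
        }
        // Assumed convention: lowercase the first letter to get the field name.
        String fieldName = Character.toLowerCase(suffix.charAt(0)) + suffix.substring(1);
        try {
            // getDeclaredField is also case-sensitive.
            Field field = DemoPdfConfig.class.getDeclaredField(fieldName);
            return field != null;
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(matches("X-Tika-PDFOcrStrategy")); // original casing
        System.out.println(matches("x-tika-pdfocrstrategy")); // lowercased by a proxy
    }
}
```

In this sketch the header with its original casing resolves to the field, while the lowercased variant is rejected at the prefix check before any field lookup happens.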

       

      Possible solution:

      Use a case-insensitive check for startsWith. For getDeclaredField, we can assume that a config class has only one field for any given name, irrespective of case, so the field can be found by comparing the header name against the declared field names case-insensitively. Then derive the setter name from the actual field name and invoke it.
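
A minimal sketch of the proposed approach, under stated assumptions: DemoPdfConfig and apply are hypothetical names, only String-typed fields are handled, and the setter is assumed to follow the usual "set" + capitalized field name convention. It is not the fix that was committed to Tika.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

public class CaseInsensitiveHeaderSketch {
    // Hypothetical stand-in for a config class; not the real Tika config.
    public static class DemoPdfConfig {
        private String ocrStrategy;
        public void setOcrStrategy(String value) { this.ocrStrategy = value; }
        public String getOcrStrategy() { return ocrStrategy; }
    }

    /** Applies a header to the config, ignoring case; returns true if a field was set. */
    public static boolean apply(Object config, String prefix, String headerName, String value) {
        // Case-insensitive replacement for startsWith.
        if (!headerName.regionMatches(true, 0, prefix, 0, prefix.length())) {
            return false;
        }
        String suffix = headerName.substring(prefix.length());
        for (Field field : config.getClass().getDeclaredFields()) {
            String name = field.getName();
            // Sketch limitation: only String fields, no type conversion.
            if (field.getType() == String.class && name.equalsIgnoreCase(suffix)) {
                // Derive the setter from the actual field name, not the header.
                String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
                try {
                    Method method = config.getClass().getMethod(setter, String.class);
                    method.invoke(config, value);
                    return true;
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Could not invoke " + setter, e);
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        DemoPdfConfig config = new DemoPdfConfig();
        apply(config, "X-Tika-PDF", "x-tika-pdfocrstrategy", "no_ocr");
        System.out.println(config.getOcrStrategy());
    }
}
```

Scanning getDeclaredFields keeps the real field name available, so the setter name is built with the correct casing even when the incoming header is fully lowercased.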

            People

            • Assignee: Unassigned
            • Reporter: Subhajit Das (subhajitdas298)
