Tika / TIKA-3718

Special PDF document causes Tika parser to hang


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.28.1, 2.3.0
    • Fix Version/s: None
    • Component/s: app
    • Labels: None
    • Environment: Reproduced on Windows with Java 8. However, the problem does not appear to be environment-specific.

    Description

      Attempting to parse the attached "map.pdf" causes the Tika parser to hang due to an infinite loop involving "PDFStreamParser" logic.

      This problem occurs in both tika-app 1.28.1 and 2.3.0.

      It is also worth noting that Acrobat itself becomes unresponsive when attempting to open this document.

      To reproduce the problem, just run:

      java -jar tika-app-1.28.1.jar map.pdf
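
      Because the command above never returns when the hang occurs, it can help to run it under a time limit. The following is a minimal sketch using only the standard Java process API. The class name TikaHangCheck, the helper finishesWithin, and the 60-second limit are illustrative assumptions and are not part of Tika.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs a command and kills it if it exceeds a time limit.
public class TikaHangCheck {

    /** Returns true if the command exited within the limit, false if it had to be killed. */
    static boolean finishesWithin(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        // Send output to a temp file so a full pipe buffer cannot block the child.
        File log = File.createTempFile("tika-hang-check", ".log");
        log.deleteOnExit();

        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .redirectOutput(log)
                .start();

        if (process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            return true; // exited on its own
        }
        process.destroyForcibly(); // still running after the limit: treat as a hang
        process.waitFor();
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Same command as in the report; the jar and PDF are assumed to be in the working directory.
        List<String> command = Arrays.asList("java", "-jar", "tika-app-1.28.1.jar", "map.pdf");
        boolean finished = finishesWithin(command, 60); // 60 s is an arbitrary limit
        System.out.println(finished ? "Parse finished" : "Parse did not finish; process killed");
    }
}
```

      A result of false only indicates that the process exceeded the chosen limit. It does not by itself confirm where the loop occurs.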

      Attachments

        Issue Links

        Activity


          People

            Assignee: Unassigned
            Reporter: David Avant

            Dates

              Created:
              Updated:
