Tika / TIKA-3718

Special PDF document causes Tika parser to hang

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.28.1, 2.3.0
    • Fix Version/s: None
    • Component/s: app
    • Labels: None
    • Environment: The problem can be reproduced on Windows with Java 8. However, it does not appear to be environment-specific.

    Description

      Attempting to parse the attached "map.pdf" causes the Tika parser to hang due to an infinite loop involving "PDFStreamParser" logic.

      This problem occurs in both tika-app 1.28.1 and 2.3.0.

      It is also worth noting that Acrobat itself becomes unresponsive when attempting to open this document.

      To reproduce the problem, just run:

      java -jar tika-app-1.28.1.jar map.pdf
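
      When reproducing a hang like this from code rather than from the command line, it can help to put a time limit around the parse call so the test harness itself does not block forever. The following is only a minimal sketch of that idea and is not part of Tika: the class name HangGuard and the method finishesWithin are hypothetical, and the looping task is a stand-in for the real parse call (for example, a call such as new Tika().parseToString(new File("map.pdf")), assuming the Tika facade is on the classpath).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not a Tika class.
class HangGuard {

    /**
     * Returns true if the task terminates (normally or with an exception)
     * within the timeout, and false if it is still running when time is up.
     */
    public static boolean finishesWithin(Callable<?> task, long timeout, TimeUnit unit) {
        // A daemon worker thread cannot keep the JVM alive if the task never returns.
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread worker = new Thread(runnable, "parse-worker");
            worker.setDaemon(true);
            return worker;
        });
        Future<?> future = executor.submit(task);
        try {
            future.get(timeout, unit);
            return true;
        } catch (ExecutionException e) {
            // The task failed, but it did terminate, so it is not a hang.
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            // Only interrupts the worker; a loop that ignores interrupts keeps running.
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in for the hanging parse; replace with the real parse call to test map.pdf.
        Callable<String> simulatedHang = () -> {
            while (true) {
                Thread.sleep(100);
            }
        };
        boolean finished = finishesWithin(simulatedHang, 2, TimeUnit.SECONDS);
        System.out.println(finished ? "parse finished" : "parse still running after 2 seconds");
    }
}
```

      Note that an in-process timeout only detects the hang. A thread stuck in a loop that never checks for interruption cannot be stopped this way, so running the parse in a separate process that can be killed is the more robust option.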

      Attachments

        1. map.pdf, 1.61 MB, David Avant

            People

              Assignee: Unassigned
              Reporter: David Avant
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated: