PDFBOX-4778

Avoid illegal matrix values


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.18, 3.0.0 PDFBox
    • Fix Version/s: 2.0.19, 3.0.0 PDFBox
    • Component/s: Parsing
    • Labels: None

    Description

      Tim Allison found a number of problematic PDFs, and processing one of them is very slow or never finishes.

      The attached PDF has a broken content stream, so many matrix operations are skipped because of missing operands. This leads to illegal values when matrices are multiplied. Some of the calculated text positions are broken as a result, and text extraction with sorting is very slow or runs indefinitely.
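
      As a rough illustration of the idea in the title, the sketch below multiplies two row-major 3x3 matrices and rejects a product that contains NaN or infinite entries. The class MatrixSketch and its methods are hypothetical names chosen for this example and are not PDFBox's API. Throwing an exception is also an assumption, since the description does not say how illegal values are actually handled.

```java
/** Hypothetical helper for illustration only; not PDFBox's Matrix class. */
final class MatrixSketch {
    private MatrixSketch() {
    }

    /** Returns true if every entry is a finite number (no NaN, no +/-Infinity). */
    static boolean allFinite(float[] m) {
        for (float v : m) {
            if (Float.isNaN(v) || Float.isInfinite(v)) {
                return false;
            }
        }
        return true;
    }

    /**
     * Multiplies two row-major 3x3 matrices.
     * Assumption: an illegal (non-finite) product is reported by throwing,
     * so broken values cannot propagate into later text positions.
     */
    static float[] multiplyChecked(float[] a, float[] b) {
        if (a.length != 9 || b.length != 9) {
            throw new IllegalArgumentException("expected two 3x3 matrices");
        }
        float[] r = new float[9];
        for (int row = 0; row < 3; row++) {
            for (int col = 0; col < 3; col++) {
                float sum = 0f;
                for (int k = 0; k < 3; k++) {
                    sum += a[row * 3 + k] * b[k * 3 + col];
                }
                r[row * 3 + col] = sum;
            }
        }
        if (!allFinite(r)) {
            throw new IllegalArgumentException("matrix product contains illegal values");
        }
        return r;
    }
}
```

      A caller could catch the exception and skip the offending operator instead of continuing with broken coordinates; whether that matches the committed fix is not confirmed by this report.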

      Attachments

        1. hang-090214-015108-51.pdf
          1.45 MB
          Andreas Lehmkühler

          People

            Assignee: Andreas Lehmkühler (lehmi)
            Reporter: Andreas Lehmkühler (lehmi)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved:
