  1. PDFBox
  2. PDFBOX-4778

Avoid illegal matrix values

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.18, 3.0.0 PDFBox
    • Fix Version/s: 2.0.19, 3.0.0 PDFBox
    • Component/s: Parsing
    • Labels:
      None

      Description

      Tim Allison found a number of problematic PDFs. Processing one of them is very slow or never finishes.

      The attached PDF has a broken content stream, so many matrix operations are skipped because of missing values. The skipped operations lead to illegal values when matrices are multiplied. As a result, some of the calculated text positions are broken, and text extraction with sorting ends up very slow or never terminates.
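
      The failure mode described above can be illustrated with a minimal sketch. It assumes "illegal values" means non-finite floats (NaN or infinity) and that matrices are stored as row-major 3x3 float arrays. The class MatrixGuard and its methods are hypothetical names for illustration only; this is not the PDFBox API and not necessarily how the issue was fixed.

```java
/**
 * Hypothetical helper, not part of PDFBox: multiplies two 3x3 matrices
 * and rejects results that contain NaN or infinite entries.
 */
final class MatrixGuard {
    private MatrixGuard() {
    }

    /** Returns true if every entry is a finite number (neither NaN nor infinite). */
    static boolean isFinite(float[] values) {
        for (float v : values) {
            if (!Float.isFinite(v)) {
                return false;
            }
        }
        return true;
    }

    /**
     * Multiplies two row-major 3x3 matrices (arrays of length 9).
     * Assumption: failing fast is preferable to propagating broken values
     * into later text position calculations.
     */
    static float[] multiply(float[] a, float[] b) {
        if (a.length != 9 || b.length != 9) {
            throw new IllegalArgumentException("Expected two 3x3 matrices");
        }
        float[] c = new float[9];
        for (int row = 0; row < 3; row++) {
            for (int col = 0; col < 3; col++) {
                float sum = 0f;
                for (int k = 0; k < 3; k++) {
                    sum += a[row * 3 + k] * b[k * 3 + col];
                }
                c[row * 3 + col] = sum;
            }
        }
        if (!isFinite(c)) {
            throw new IllegalArgumentException("Matrix multiplication produced illegal values");
        }
        return c;
    }
}
```

      With a check like this, a caller could catch the exception and skip the offending operator instead of carrying a broken matrix into text positioning and sorting. Whether to throw, skip, or fall back to a default matrix is a design choice this sketch leaves open.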

        Attachments

        1. hang-090214-015108-51.pdf
          1.45 MB
          Andreas Lehmkühler

              People

              • Assignee:
                lehmi Andreas Lehmkühler
                Reporter:
                lehmi Andreas Lehmkühler
              • Votes:
                0
                Watchers:
                2

                Dates

                • Created:
                  Updated:
                  Resolved: