  1. PDFBox
  2. PDFBOX-581

Avoid warnings for graphics operations when extracting text

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: Text extraction
    • Labels: None

    Description

      PDFStreamEngine logs a warning for every PDF operator it encounters for which no OperatorProcessor has been explicitly configured. This is annoying for tasks like text extraction, where many graphics operators can simply be ignored.

      To solve this, we can either disable the warnings entirely or add an explicit "Ignore" operator processor that does nothing for a selected set of operators. I'm inclined to implement the latter, as I think it's a good idea to keep logging warnings for truly unexpected operators.
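
      A minimal sketch of the second option, to make the idea concrete. It does not use the PDFBox API: OperatorHandler, IgnoreHandler and SketchStreamEngine are hypothetical stand-ins, and the warning text is made up for illustration. The point is only that operators registered with a no-op handler are skipped silently, while operators with no handler at all still produce a warning.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an operator processor; not the PDFBox API. */
interface OperatorHandler {
    void process(String operator, List<String> operands);
}

/** Handler that deliberately does nothing with the operator it receives. */
final class IgnoreHandler implements OperatorHandler {
    @Override
    public void process(String operator, List<String> operands) {
        // Intentionally empty: the operator is known but irrelevant here.
    }
}

/** Hypothetical, heavily simplified model of a content stream engine. */
final class SketchStreamEngine {
    private final Map<String, OperatorHandler> handlers = new HashMap<>();
    // Warnings are collected in a list instead of a real logger (assumption).
    private final List<String> warnings = new ArrayList<>();

    void register(String operator, OperatorHandler handler) {
        handlers.put(operator, handler);
    }

    void processOperator(String operator, List<String> operands) {
        OperatorHandler handler = handlers.get(operator);
        if (handler == null) {
            // Only operators with no configured handler are reported.
            warnings.add("Unsupported operator: " + operator);
            return;
        }
        handler.process(operator, operands);
    }

    List<String> warnings() {
        return warnings;
    }
}
```

      With this shape, a text extraction setup would register one shared IgnoreHandler for graphics operators such as "re" (rectangle), "f" (fill) and "S" (stroke), and anything left unregistered would still show up in the warnings.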

          People

            Assignee: Jukka Zitting (jukkaz)
            Reporter: Jukka Zitting (jukkaz)
