PDFBox / PDFBOX-4371

Improve ExtractText utility so that it can extract rotated text automatically

    Details

      Description

      In a first step, detect all rotations by analyzing the effective text rendering matrix. In a second step, run one text extraction per detected rotation: prepend an appropriate transform to the page content stream so that text at that rotation has angle == 0, then filter out any text that is still rotated. Test file: the file from PDFBOX-4368.
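The two steps can be illustrated with plain matrix arithmetic. The following is only a sketch under stated assumptions, not the code from the attached ExtractAngledText.java: it uses `java.awt.geom.AffineTransform` in place of PDFBox's own matrix type, rounds angles to whole degrees, and the class and method names (`RotationSketch`, `angleOf`, `collectAngles`, `counterRotation`, `isUpright`) are hypothetical.

```java
import java.awt.geom.AffineTransform;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper class; none of these names come from PDFBox.
public class RotationSketch {

    /** Angle of a text matrix in whole degrees, normalized to [0, 360). */
    static int angleOf(AffineTransform m) {
        // For a matrix [a b c d e f], the rotation angle is atan2(b, a).
        // In AffineTransform terms, a = getScaleX() and b = getShearY().
        long deg = Math.round(Math.toDegrees(Math.atan2(m.getShearY(), m.getScaleX())));
        return (int) ((deg + 360) % 360);
    }

    /** Step 1: collect the distinct rotation angles seen on a page. */
    static SortedSet<Integer> collectAngles(List<AffineTransform> textMatrices) {
        SortedSet<Integer> angles = new TreeSet<>();
        for (AffineTransform m : textMatrices) {
            angles.add(angleOf(m));
        }
        return angles;
    }

    /** Step 2a: the transform to prepend so that text at 'angle' becomes upright. */
    static AffineTransform counterRotation(int angle) {
        return AffineTransform.getRotateInstance(-Math.toRadians(angle));
    }

    /** Step 2b: keep only text whose effective angle is 0 after prepending. */
    static boolean isUpright(AffineTransform textMatrix, AffineTransform prepended) {
        AffineTransform effective = new AffineTransform(prepended);
        effective.concatenate(textMatrix); // effective = prepended x textMatrix
        return angleOf(effective) == 0;
    }

    public static void main(String[] args) {
        List<AffineTransform> matrices = List.of(
                new AffineTransform(), // upright text
                AffineTransform.getRotateInstance(Math.toRadians(90)),
                AffineTransform.getRotateInstance(Math.toRadians(45)),
                AffineTransform.getRotateInstance(Math.toRadians(90)));

        for (int angle : collectAngles(matrices)) {
            AffineTransform prepend = counterRotation(angle);
            long kept = matrices.stream().filter(m -> isUpright(m, prepend)).count();
            System.out.println("angle " + angle + ": " + kept + " text run(s) kept");
        }
    }
}
```

In a real extraction the matrices would come from the text positions reported by the text stripper, and the filter in step 2b would drop text positions rather than count them.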

        Attachments

        1. ExtractAngledText.java
          4 kB
          Tilman Hausherr

              People

              • Assignee:
                tilman Tilman Hausherr
                Reporter:
                tilman Tilman Hausherr