  1. PDFBox
  2. PDFBOX-4371

Improve ExtractText utility so that it can extract rotated text automatically


Details

    Description

      In a first step, detect all rotations by analyzing the effective text rendering matrix. In a second step, run a text extraction for each detected rotation: prepend an appropriate transform to the page content stream (so that the text at that rotation has angle == 0), then filter out any text that is still rotated. Test file: the file from PDFBOX-4368.
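      The two steps can be illustrated with plain matrix arithmetic. The following is only a minimal sketch, not the attached ExtractAngledText.java: it uses java.awt.geom.AffineTransform as a stand-in for the text rendering matrix, and the class name RotationSketch and its methods are hypothetical. It also assumes rotation about the origin and ignores page rotation and crop box offsets, which a real implementation would have to account for.

```java
import java.awt.geom.AffineTransform;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of PDFBox: models the two-step idea
// with AffineTransform standing in for the text rendering matrix.
class RotationSketch {

    // Rotation of a text matrix, rounded to whole degrees in [0, 360).
    static int angleOf(AffineTransform textMatrix) {
        // For a rotation matrix, shearY is sin(theta) and scaleX is cos(theta).
        double radians = Math.atan2(textMatrix.getShearY(), textMatrix.getScaleX());
        int degrees = (int) Math.round(Math.toDegrees(radians));
        return Math.floorMod(degrees, 360);
    }

    // Step 1: collect the distinct rotation angles seen on a page.
    static SortedSet<Integer> detectAngles(List<AffineTransform> textMatrices) {
        SortedSet<Integer> angles = new TreeSet<>();
        for (AffineTransform m : textMatrices) {
            angles.add(angleOf(m));
        }
        return angles;
    }

    // Step 2a: the transform to prepend so that text at 'angle' becomes upright.
    static AffineTransform counterRotation(int angle) {
        return AffineTransform.getRotateInstance(-Math.toRadians(angle));
    }

    // Step 2b: the filter; keep only text whose effective angle is now 0.
    static boolean isUpright(AffineTransform textMatrix, AffineTransform prepended) {
        AffineTransform effective = new AffineTransform(prepended);
        effective.concatenate(textMatrix); // prepended transform applies last
        return angleOf(effective) == 0;
    }

    public static void main(String[] args) {
        List<AffineTransform> matrices = List.of(
                new AffineTransform(),                                  // upright text
                AffineTransform.getRotateInstance(Math.toRadians(90)),  // rotated text
                AffineTransform.getRotateInstance(Math.toRadians(-45)));
        for (int angle : detectAngles(matrices)) {
            AffineTransform prepend = counterRotation(angle);
            long kept = matrices.stream().filter(m -> isUpright(m, prepend)).count();
            System.out.println("angle " + angle + ": " + kept + " text run(s) kept");
        }
    }
}
```

      In this sketch each pass keeps only the text that the prepended transform brings back to angle 0, so every text run is extracted exactly once, in the pass that matches its own rotation.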

      Attachments

        1. ExtractAngledText.java
          4 kB
          Tilman Hausherr

            People

              tilman Tilman Hausherr
              tilman Tilman Hausherr