PDFBox / PDFBOX-508

Spacing lost because the "Tc" operator is ignored

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0-incubator
    • Fix Version/s: 1.0.0
    • Component/s: Text extraction
    • Labels: None
    • Environment: JDK 1.6.0_16

    Description

      Follow-up to PDFBOX-234 (https://issues.apache.org/jira/browse/PDFBOX-234).

      Spacing is lost in the extracted text because the "Tc" (character spacing) operator is ignored.
      Example:
      ****************************************
      BT
      6 0 0 6 244.0800018311 795.8400268555 Tm
      6.5475001335 Tc
      (41) Tj
      ****************************************
      Here PDFTextStripper.writeText() returns "41", without the spacing that Tc introduces between the two characters.
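      To illustrate why the Tc value matters here: in the PDF text model, the horizontal advance after each glyph is (w0 * Tfs + Tc + Tw) * Th in text space, so a Tc of 6.5475 (scaled by the factor 6 in the Tm above, roughly 39 user-space units) leaves a wide gap after "4". The sketch below is not the attached patch and does not use PDFBox classes. It only shows the arithmetic and one possible space-insertion heuristic. The class name, the glyph widths (500 and 250), the font size of 1, and the half-space threshold are all assumptions, since the example stream shows no Tf operator.

```java
public class TcSpacingSketch {

    /**
     * Horizontal displacement of one glyph in text space (horizontal writing),
     * following the PDF text model: tx = (w0 * Tfs + Tc + Tw) * Th.
     * glyphWidth1000 is the glyph width in 1/1000 text units; wordSpacing (Tw)
     * should be 0 unless the glyph is the single-byte space character.
     */
    public static double advance(double glyphWidth1000, double fontSize,
                                 double charSpacing, double wordSpacing,
                                 double horizScale) {
        return (glyphWidth1000 / 1000.0 * fontSize + charSpacing + wordSpacing) * horizScale;
    }

    /**
     * Hypothetical heuristic (an assumption, not PDFBox behavior): if the extra
     * gap added by Tc exceeds half the width of a space glyph, emit a space
     * between consecutive characters of the string.
     */
    public static String extract(String text, double glyphWidth1000, double fontSize,
                                 double charSpacing, double spaceWidth1000) {
        double threshold = 0.5 * spaceWidth1000 / 1000.0 * fontSize;
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            out.append(text.charAt(i));
            boolean lastChar = (i == text.length() - 1);
            if (!lastChar && charSpacing > threshold) {
                out.append(' ');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Tc is taken from the issue; widths and font size are assumed values.
        double tc = 6.5475001335;
        System.out.println(advance(500, 1.0, tc, 0.0, 1.0));
        System.out.println(extract("41", 500, 1.0, tc, 250)); // prints "4 1"
    }
}
```

      With Tc set to 0 the same call returns "41", which matches the output reported above when the operator is ignored.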

      Attachments

        1. 2a_repl2.pdf
          181 kB
          Dmitry Gutso
        2. 2a.pdf
          103 kB
          Dmitry Gutso
        3. PDFStreamEngine_For_Spacing.diff
          4 kB
          Dmitry Gutso
        4. TextPosition_for_Spacing.diff
          1 kB
          Dmitry Gutso

            People

              Assignee: Unassigned
              Reporter: Dmitry Gutso (gtsdmtry)