PDFBox / PDFBOX-2425

Extracted text has extra spaces

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.8.7, 1.8.10, 1.8.11, 2.0.0
    • Fix Version/s: None
    • Component/s: Text extraction
    • Labels: None

    Description

      This is a very old issue, originally reported as PDFBOX-37. When text is extracted from the attached file, PDFTextStripper inserts extra spaces inside words in the title text:

      A Framework  for D i s t r i bu t ed  Au thor i z a t i on*  
      (Extended Abstract) 
      Thoma s  Y .C .  Woo  S imon  S. L am  
      Depa r tmen t  of  Compu t e r  Sc i ences  
      Th e  Un i v e r s i t y  of  T ex a s  a t  Au s t i n  
      Au s t i n ,  T exa s  78712-1188  
      1 In t r oduc t i on  
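
      The symptom above can be quantified with a small check. The following is a minimal sketch, not part of PDFBox: the class name ExtraSpaceCheck and its helper methods are hypothetical, and the expected title string is an assumption reconstructed by removing the stray spaces from the first extracted line. It confirms that the glyph sequence is intact and counts how many word breaks the extraction added.

```java
// Hypothetical helper, not part of PDFBox: measures spurious word breaks
// by comparing an extracted line against the text a reader would expect.
class ExtraSpaceCheck {

    /** Removes all whitespace so only the character sequence is compared. */
    public static String stripWhitespace(String s) {
        return s.replaceAll("\\s+", "");
    }

    /** Counts whitespace-separated tokens in a line (0 for a blank line). */
    public static int tokenCount(String s) {
        String trimmed = s.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /**
     * Returns the number of extra word breaks in the extracted line,
     * or -1 if the two lines differ in more than whitespace.
     */
    public static int spuriousBreaks(String expected, String extracted) {
        if (!stripWhitespace(expected).equals(stripWhitespace(extracted))) {
            return -1;
        }
        return tokenCount(extracted) - tokenCount(expected);
    }

    public static void main(String[] args) {
        // Assumed intended title (reconstructed, not taken from the PDF itself).
        String expected = "A Framework for Distributed Authorization*";
        // First line of the extracted text quoted in this issue.
        String extracted = "A Framework  for D i s t r i bu t ed  Au thor i z a t i on*  ";
        System.out.println("Spurious word breaks: " + spuriousBreaks(expected, extracted));
    }
}
```

      A result of 0 would mean the extracted line has the expected word boundaries; a positive result means only whitespace differs, which matches the behaviour described in this issue.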
      

      Attachments

        1. WooLam93c.pdf
          624 kB
          John Hewson
        2. WooLam93c-Visible-p1.pdf
          1.04 MB
          Tilman Hausherr

            People

              Assignee: Unassigned
              Reporter: John Hewson (jahewson)
              Votes: 1
              Watchers: 3