PDFBox / PDFBOX-603

PDFBox performance issue: Encoding.java getCharacter() method tweak

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0-incubator
    • Fix Version/s: 1.0.0
    • Component/s: Text extraction
    • Labels: None
    • Environment: All

    Description

      During parsing and text extraction, the Encoding.getCharacter(COSName) method is invoked repeatedly.

      It performs a string test up front, even though the case that test handles occurs only rarely. The code should be restructured slightly so that the test runs later, i.e. the method should succeed fast and fail slow.

      I'll post an attachment that rewrites the method slightly. The performance gain is fairly significant.
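
      A minimal sketch of the "succeed fast, fail slow" ordering described above. This is not the code from the attached Encoding.java: the class name EncodingSketch, the String keys (standing in for COSName), and the "uniXXXX" name test are all assumptions chosen only to illustrate a rare-case string check being moved behind the common table lookup.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an encoding class; not PDFBox's actual Encoding.
public class EncodingSketch {
    // Name-to-character table consulted on almost every call.
    private final Map<String, String> nameToCharacter = new HashMap<>();

    public void addMapping(String name, String character) {
        nameToCharacter.put(name, character);
    }

    public String getCharacter(String name) {
        // Succeed fast: the common case is a plain table hit.
        String character = nameToCharacter.get(name);
        if (character != null) {
            return character;
        }
        // Fail slow: only after a miss do we pay for the rare string test
        // (assumed here to be a "uniXXXX" style glyph name).
        if (name.length() == 7 && name.startsWith("uni")) {
            return decodeUniName(name);
        }
        return null;
    }

    // Decodes the four hex digits after "uni"; returns null if they are not valid hex.
    private static String decodeUniName(String name) {
        try {
            int code = Integer.parseInt(name.substring(3), 16);
            return code < 0 ? null : String.valueOf((char) code);
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

      With this ordering, a name found in the table never reaches the string test. One consequence of the reordering in this sketch is that a table entry takes precedence over the special-case decoding if a name happens to match both.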

      Attachments

        1. Encoding.java
          11 kB
          Mel Martinez

          People

            Assignee: jukkaz Jukka Zitting
            Reporter: m.martinez Mel Martinez