PDFBox / PDFBOX-601

PDFBox performance issue: PDSimpleFont, PDFont performance tweaks


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0-incubator
    • Fix Version/s: 1.0.0
    • Component/s: PDModel
    • Labels: None
    • Environment: All

    Description

      During text extraction, font size / descriptor / encoding attributes are accessed repeatedly in order to do positional calculations and byte-character conversions.

      The current code has several accessors for these attributes that redo rather slow calculations on every call, even though the font object's state has not changed.

      The results of these calculations should be cached in instance fields once computed. This greatly improves performance.
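
      As a rough illustration of the caching idea described above, here is a minimal sketch. It is not the attached PDFont or PDSimpleFont code: the class CachedFontMetrics, its width map, and the computeCount counter are all hypothetical names made up for this example. It assumes the underlying font data does not change after construction, which is the same condition the description relies on.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a font object; not part of the PDFBox API.
final class CachedFontMetrics {
    // Immutable-after-construction source data (assumed never to change).
    private final Map<String, Float> glyphWidths;

    // Cached result; null means "not computed yet".
    private Float cachedAverageWidth;

    // Counts how often the slow path actually runs (for demonstration only).
    private int computeCount = 0;

    CachedFontMetrics(Map<String, Float> glyphWidths) {
        this.glyphWidths = new HashMap<>(glyphWidths);
    }

    // Accessor that computes the value on first use and reuses it afterwards.
    float getAverageWidth() {
        if (cachedAverageWidth == null) {
            cachedAverageWidth = computeAverageWidth();
        }
        return cachedAverageWidth;
    }

    // Stand-in for a "rather slow" calculation over the font data.
    private float computeAverageWidth() {
        computeCount++;
        if (glyphWidths.isEmpty()) {
            return 0f;
        }
        float total = 0f;
        for (float width : glyphWidths.values()) {
            total += width;
        }
        return total / glyphWidths.size();
    }

    int getComputeCount() {
        return computeCount;
    }
}
```

      However often getAverageWidth() is called, the calculation runs only once. If the font state could change, the cached field would have to be reset in every mutator; this sketch also makes no attempt at thread safety.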

      I'll attach new versions of PDFont, PDFontDescriptorDictionary and PDSimpleFont that have these tweaks.

      Attachments

        1. PDSimpleFont.java
          12 kB
          Mel Martinez
        2. PDFontDescriptorDictionary.java
          15 kB
          Mel Martinez
        3. PDFont.java
          30 kB
          Mel Martinez

          People

            Assignee: Jukka Zitting (jukkaz)
            Reporter: Mel Martinez (m.martinez)