PDFBox / PDFBOX-4869

Reading standard 14 fonts is slow

    Details

      Description

      I am testing text extraction from PDFs and profiling the execution.

      I found that the second-biggest time consumer is the static initialization code in Standard14Fonts, which loads the fonts from the PDFBox jar.

      The culprit seems to be the direct use of the stream returned by getResourceAsStream.
      That would be a ZipInputStream when PDFBox is used as a jar.

      Wrapping it in a buffered stream reduces the load time a lot.
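
      The idea can be sketched as follows. This is only an illustration of the technique, not the attached patch: the class name, the helper methods, and the resource path are hypothetical, and the sketch assumes the consumer reads the stream in small pieces, which is the case where buffering typically helps most.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of PDFBox.
class BufferedResourceDemo {

    // Opens a classpath resource and wraps it in a BufferedInputStream.
    // Returns null when the resource is missing, as getResourceAsStream does.
    public static InputStream openBuffered(Class<?> anchor, String resourceName) {
        InputStream raw = anchor.getResourceAsStream(resourceName);
        return raw == null ? null : new BufferedInputStream(raw);
    }

    // Reads the stream one byte at a time and returns the number of bytes read.
    // Single-byte reads are the access pattern that is costly on an unbuffered
    // decompressing stream.
    public static long countBytes(InputStream in) {
        long count = 0;
        try {
            while (in.read() != -1) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // The resource path below is an assumption made up for this example.
        String name = "/hypothetical/fonts/Helvetica.afm";
        try (InputStream in = openBuffered(BufferedResourceDemo.class, name)) {
            if (in == null) {
                System.out.println("Resource not found: " + name);
            } else {
                System.out.println("Read " + countBytes(in) + " bytes");
            }
        }
    }
}
```

      With the wrapper in place, most single-byte reads are served from the in-memory buffer (8192 bytes by default) instead of each one reaching the underlying jar stream.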

       

        Attachments

        1. PDFBOX-4869.patch
          11 kB
          Alfred

            People

            • Assignee: Tilman Hausherr (tilman)
            • Reporter: Alfred (Faltiska)
            • Votes: 0
            • Watchers: 3

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated: 1m
                Remaining: 1m
                Logged: Not Specified