PDFBox / PDFBOX-3421

Optimize float to string conversion in PDAbstractContentStream

Details

    Description

      Drawing lines in a PDF, like many other operations, writes coordinates to the content stream. Currently, the PDAbstractContentStream#writeOperand(float) method uses the NumberFormat class to convert the float values. This is inefficient for several reasons:

      • NumberFormat is designed for locale-dependent formatting, which is not needed here.
      • NumberFormat formats the value according to a pattern, which is also not needed here.
      • The formatting first creates a String object, converts it to a byte array with ASCII encoding, and then writes that array to the stream. This generates a lot of garbage.

      A different approach to formatting real operands should be used.
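
One possible direction, shown only as a minimal sketch: scale the value to a fixed number of fraction digits, round once, and write the ASCII digits straight into a reusable byte buffer, so that no String or intermediate byte array is created per operand. The class name FastFloatFormat, the method format, the five-digit limit, and the -1 fallback convention are all assumptions made for this illustration; they are not taken from the attached patches.

```java
// Hypothetical helper, not the code from the attached patches.
final class FastFloatFormat {
    // Powers of ten for up to 5 fraction digits (an assumed limit for this sketch).
    private static final long[] POWERS_OF_TEN = {1L, 10L, 100L, 1000L, 10000L, 100000L};

    private FastFloatFormat() {
    }

    /**
     * Writes value as ASCII decimal text into buffer, starting at index 0.
     * Returns the number of bytes written, or -1 if the value is not handled
     * here and the caller should fall back to another formatter.
     * The buffer must hold at least 17 bytes (sign, 10 digits, '.', 5 digits).
     */
    static int format(float value, int maxFractionDigits, byte[] buffer) {
        if (Float.isNaN(value) || Float.isInfinite(value)
                || Math.abs(value) > Integer.MAX_VALUE
                || maxFractionDigits < 0 || maxFractionDigits >= POWERS_OF_TEN.length) {
            return -1;
        }
        long scale = POWERS_OF_TEN[maxFractionDigits];
        // Round once at the requested precision, then work on integers only.
        long scaled = Math.round((double) Math.abs(value) * scale);
        long integerPart = scaled / scale;
        long fractionPart = scaled % scale;

        int pos = 0;
        // Skip the sign when the rounded result is zero, to avoid "-0".
        if (value < 0 && scaled != 0) {
            buffer[pos++] = '-';
        }
        pos = writeDigits(integerPart, buffer, pos);
        if (fractionPart != 0) {
            buffer[pos++] = '.';
            // Emit fraction digits from the most significant one, keeping
            // leading zeros and stopping before any trailing zeros.
            long divisor = scale / 10;
            while (fractionPart != 0) {
                buffer[pos++] = (byte) ('0' + fractionPart / divisor);
                fractionPart %= divisor;
                divisor /= 10;
            }
        }
        return pos;
    }

    // Writes a non-negative integer as ASCII digits and returns the new position.
    private static int writeDigits(long n, byte[] buffer, int pos) {
        int start = pos;
        do {
            buffer[pos++] = (byte) ('0' + n % 10);
            n /= 10;
        } while (n != 0);
        // Digits were produced least significant first, so reverse them in place.
        for (int i = start, j = pos - 1; i < j; i++, j--) {
            byte tmp = buffer[i];
            buffer[i] = buffer[j];
            buffer[j] = tmp;
        }
        return pos;
    }
}
```

A caller would keep one buffer per stream, call format, and pass the result to OutputStream#write(byte[], int, int); when -1 is returned it would fall back to the existing NumberFormat path. Whether this is actually faster would have to be confirmed by measurement.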

      Attachments

        1. PDFBOX-3421_Float_formatting_performance_rev1.patch
          6 kB
          Michael Doswald
        2. pdfbox-performance-floatformat.zip
          377 kB
          Michael Doswald
        3. PDFBOX-3421_Float_formatting_performance_rev2.patch
          13 kB
          Michael Doswald
        4. PDFBOX-3421_Float_formatting_performance_rev3.patch
          17 kB
          Michael Doswald
        5. PDFBOX-3421_Float_formatting_performance_rev4.patch
          17 kB
          Michael Doswald

        Activity

          People

            tilman Tilman Hausherr
            michaeldoswald Michael Doswald
            Votes: 0
            Watchers: 3
