PDFBOX-5824

Allow COSDictionary.MAP_THRESHOLD to be defined as System property

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.0.3 PDFBox, 4.0.0
    • Fix Version/s: 3.0.3 PDFBox, 4.0.0
    • Component/s: PDModel
    • Labels: None
    • Flags: Patch

    Description

      COSDictionary.MAP_THRESHOLD controls which Map class is used to optimize memory usage. By default, a SmallMap is used. However, if the number of items in a COSDictionary reaches the MAP_THRESHOLD value (hardcoded to 1,000), the references are copied to a LinkedHashMap.
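
      The switch described above can be illustrated with a minimal, self-contained sketch. It is not PDFBox code: the class ThresholdDictionarySketch, its String keys, and its list-based compact storage are hypothetical stand-ins for COSDictionary and SmallMap, chosen only to show the one-time copy into a LinkedHashMap once the item count reaches the threshold.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT the PDFBox implementation: a dictionary that
// starts with compact storage and switches to a LinkedHashMap at a threshold.
final class ThresholdDictionarySketch {
    // Value quoted in the issue description.
    static final int MAP_THRESHOLD = 1000;

    // Compact storage (assumption): alternating key/value slots, linear lookup.
    private List<Object> compact = new ArrayList<>();
    // Hash-based storage, created only once the threshold has been reached.
    private Map<String, Object> hashed;

    void put(String key, Object value) {
        if (hashed == null && compact.size() / 2 >= MAP_THRESHOLD) {
            // One-time copy of every entry, preserving insertion order.
            hashed = new LinkedHashMap<>();
            for (int i = 0; i < compact.size(); i += 2) {
                hashed.put((String) compact.get(i), compact.get(i + 1));
            }
            compact = null;
        }
        if (hashed != null) {
            hashed.put(key, value);
            return;
        }
        // Linear scan: cheap for small dictionaries, costly for large ones.
        for (int i = 0; i < compact.size(); i += 2) {
            if (compact.get(i).equals(key)) {
                compact.set(i + 1, value);
                return;
            }
        }
        compact.add(key);
        compact.add(value);
    }

    Object get(String key) {
        if (hashed != null) {
            return hashed.get(key);
        }
        for (int i = 0; i < compact.size(); i += 2) {
            if (compact.get(i).equals(key)) {
                return compact.get(i + 1);
            }
        }
        return null;
    }

    int size() {
        return hashed != null ? hashed.size() : compact.size() / 2;
    }

    boolean usesHashMap() {
        return hashed != null;
    }
}
```

      In this sketch every dictionary that grows past the threshold pays for the linear scans while it is small and then for one full copy, which is the cost the request aims to avoid for documents known to be large.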

      For larger documents, where a COSDictionary is expected to be substantially larger than this limit, this copying occurs frequently. Additionally, SmallMap.keySet is not efficient. The attached screenshot shows PDFBox performance with SmallMap (in red) versus always using LinkedHashMap and ignoring the threshold (in green).

      Would it be beneficial to allow MAP_THRESHOLD to be defined as a System property?

      If set to 0, LinkedHashMap would be used from the start. If not set, the threshold would default to the current MAP_THRESHOLD value and SmallMap would still be used initially, so the current behaviour would not change.
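
      A minimal sketch of how such a lookup could work, under stated assumptions: the property name pdfbox.cosdictionary.mapThreshold and the class MapThresholdConfigSketch are hypothetical and do not come from this issue or from PDFBox, and the fallback for negative values is a choice made for this sketch only.

```java
// Hypothetical sketch, NOT the committed PDFBox change: resolve the map
// threshold from a system property, keeping today's value as the default.
final class MapThresholdConfigSketch {
    // Assumed property name, used for illustration only.
    static final String PROPERTY = "pdfbox.cosdictionary.mapThreshold";
    // Current hardcoded value quoted in the issue description.
    static final int DEFAULT_MAP_THRESHOLD = 1000;

    static int resolveThreshold() {
        // Integer.getInteger returns the default when the property is
        // missing or is not a valid integer.
        int value = Integer.getInteger(PROPERTY, DEFAULT_MAP_THRESHOLD);
        // Assumption of this sketch: negative values fall back to the default.
        return value < 0 ? DEFAULT_MAP_THRESHOLD : value;
    }

    // A threshold of 0 means the dictionary starts out as a LinkedHashMap.
    static boolean startsWithLinkedHashMap(int threshold) {
        return threshold == 0;
    }
}
```

      With a lookup like this, the property would be passed as a standard JVM option (-D followed by the property name and value), and leaving it unset keeps the existing threshold.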

      Attachments

        1. Screenshot 2024-05-21 at 11.00.25.jpg
          381 kB
          Jonathan Prates

            People

              Assignee: Andreas Lehmkühler (lehmi)
              Reporter: Jonathan Prates (thumbox)
              Votes: 1
              Watchers: 4
