Uploaded image for project: 'PDFBox'
  1. PDFBox
  2. PDFBOX-5695

Improve PDFBox Logging

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 4.0.0
    • 4.0.0
    • None

    Description

      I know this has been an issue before PDFBOX-693, but since then a lot of time has passed. I propose to switch Logging to either SLF4J (my preference) or Log4J.

      Why? Currently, PDFBox uses commons logging. That library seems not to be actively maintained anymore, the last release was in 2014 with the comment „the main purpose of the 1.2 release is to drop support for Java 1.1“. PDFBox 4 requires at least Java 11. The Java language has evolved a lot over the last few years, and so have to logging frameworks.

      Maintainabilty and Performance

      The main point that speaks against commons logging IMHO is that you have to make compromises between readability and performance. Let’s look at an example (taken from COSParser.java):

       

      if (offsetOrObjstmObNr != null)
      { 
          LOG.debug("Set missing offset " + offsetOrObjstmObNr + " for object " + objKey);
          document.getXrefTable().put(objKey, offsetOrObjstmObNr);
      } 
      

      Each time the if-statement's body is entered, the logging message is constructed. This involves:

      • converting offsetOrObjstmObNr to a String
      • converting objKey to a String which in turn does two more conversions of integer and long values to a String, creates a new String by joining results of former conversions
      • creating a new String consisting of the results of the both operations above and some static text

      A lot of temporary objects are created and all of this happens regardless of whether logging is necessary or not.

      Both SLF4J and Log4J (and even JUL logging) support a message format syntax where arguments to logging calls are only evaluated when needed:

       

      if (offsetOrObjstmObNr != null)
      {
          LOG.debug("Set missing offset {} for object {}", offsetOrObjstmObNr, objKey);
          document.getXrefTable().put(objKey, offsetOrObjstmObNr);
      }
      

      Using common s logging, you can guard the logging statement:

       

      if (offsetOrObjstmObNr != null)
      {
          if (LOG.isDebugEnabled())
          {
              LOG.debug("Set missing offset " + offsetOrObjstmObNr + " for object " + objKey);
          }
          document.getXrefTable().put(objKey, offsetOrObjstmObNr);
      }
      

      This works, but makes the code less readable and harder to maintain because the level has to be changed in two places instead of one.

      Flexibility

      Another thing is that both SLF4J and Log4J offer you a facade that you can bridge to nearly every logging implementation in widespread use nowadays whereas commons logging is a full logging implementation that most of us do not need (there are ways to make commons logging work with other backends, but they are more of a workaround than a clean solution).

      JPMS support

      And of course, both SLF4J and Log4J offer first class support for the module system.

      So what are the PDFBox maintainer's thoughts on this? I think PDFBox 4 offers an opportunity to switch to a new Logging facade/framework, and if there's a concensus, I volunteer to contribute a patch to make the transition.

      Support Lambda Logging

      In some cases, even using a message format ist not enough, for example there are instances in the code where a `byte[]` value is logged as a String. Since the String conversion has to be done explicitly, it would be done before the actual logging method is entered. This can be solved using lambdas by passing `() -> new String(value)` instead (syntax may vary).

      Alternatives

      The alternatives (sorted according to my own preference) are:

      • use SLF4J as logging facade
      • use Log4J2 as logging facade
      • use JUL logging (comes with a cost when used with other logging backends)
      • stay with commons logging and add guards in the code where they are not yet present
      • stay with commons logging and try to convince the maintainers (if there still are any) to add a message format support that is comparable to what the other three and then switch to the new version of commons logging once the functionality is there (and yes, if there's a chance this gets approved by the maintainers I'd see if I can contribute the necessary changes to commons logging provided the idea gets support over there)

      Cheers,
      Axel

      Attachments

        1. Optimize_debug_logging_across_multiple_files.patch
          62 kB
          Axel Howind
        2. use_Property_for_log4j_version_replace_commons_logging_by_log4j-api_(fontbox).patch
          64 kB
          Axel Howind
        3. replace_commons-logging_with_log4j_(xmpbox).patch
          1 kB
          Axel Howind
        4. replace_commons-logging_with_log4j_(pdfbox-tools).patch
          10 kB
          Axel Howind
        5. replace_commons-logging_with_log4j_(pdfbox-io).patch
          9 kB
          Axel Howind
        6. update_log4j_to_2_21_1.patch
          0.8 kB
          Axel Howind
        7. replace_commons-logging_with_log4j_(pdfbox).patch
          311 kB
          Axel Howind
        8. replace_commons-logging_with_log4j_(pdfbox-debugger).patch
          30 kB
          Axel Howind
        9. replace_commons-logging_with_log4j_(examples).patch
          33 kB
          Axel Howind
        10. fix_log_statement_with_wrong_parameter_count.patch
          1 kB
          Axel Howind
        11. replace_commons-logging_with_log4j.patch
          7 kB
          Axel Howind
        12. replace_commons-logging_with_log4j2_(cleanup_POM_files).patch
          4 kB
          Axel Howind
        13. Fixed_DebugLogAppender_not_being_initialised_and_runtime_exceptions_in_LogDialog.patch
          5 kB
          Axel Howind
        14. replace_commons-logging_by_log4j_(benchmark).patch
          3 kB
          Axel Howind
        15. add_logging_imports_to_benchmark_classes.patch
          2 kB
          Axel Howind
        16. update_log4j_to_2_22_0.patch
          0.8 kB
          Axel Howind
        17. image-2023-11-25-16-22-45-143.png
          42 kB
          Tilman Hausherr
        18. add_missing_version_for_log4j-core.patch
          0.7 kB
          Axel Howind
        19. use_maven-shade-plugin.patch
          10 kB
          Axel Howind
        20. fix_debugger_in_app.patch
          3 kB
          Axel Howind
        21. remove_bouncycastle_from_debugger_app.patch
          0.9 kB
          Axel Howind
        22. remove_console_logging_when_running_the_debugger_app.patch
          2 kB
          Axel Howind

        Activity

          People

            lehmi Andreas Lehmkühler
            axh Axel Howind
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: