PDFBOX-2318: NPE in new DomXmpParser when no type is found

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.0.0
    • Component/s: XmpBox

    Description

      Discovered this bug while trying to sync both TIKA and PDFBOX to their current SNAPSHOT builds.

      The issue came to light when running Tika's JpegParserTest.testJPEGEmptyEXIFDateTime() JUnit test case: the test file contains the property photoshop:LegacyIPTCDigest, which is not defined in the PhotoshopSchema.

      This causes a null Type to be created in DomXmpParser.parseDescriptionRoot(), which leads to the NPE. The solution in my patch is to default to text for any undefined type. It may also be beneficial to log a warning about such types so that the schema files can be amended. (LegacyIPTCDigest has not been added to the schema in this patch.)
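
      The fallback described above can be sketched roughly as follows. This is not the attached patch and does not use the XmpBox API: the class TypeFallbackSketch, the KNOWN_TYPES table, and the methods lookupType and resolveType are all hypothetical names invented for illustration, and types are represented as plain strings. It only shows the general shape of "look up the type, and default to text with a warning when nothing is found".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Logger;

// Illustrative sketch only: every name here is hypothetical, not XmpBox API.
final class TypeFallbackSketch {
    private static final Logger LOG =
            Logger.getLogger(TypeFallbackSketch.class.getName());

    // Hypothetical stand-in for a schema's property-to-type table.
    private static final Map<String, String> KNOWN_TYPES = new HashMap<>();
    static {
        KNOWN_TYPES.put("photoshop:City", "Text");
        KNOWN_TYPES.put("photoshop:DateCreated", "Date");
    }

    // Returns null for properties the schema does not define,
    // which is the situation the report describes.
    static String lookupType(String qualifiedName) {
        return KNOWN_TYPES.get(qualifiedName);
    }

    // Defaults to "Text" for undefined properties and logs a warning,
    // so that callers never receive a null type.
    static String resolveType(String qualifiedName) {
        String type = lookupType(qualifiedName);
        if (type == null) {
            LOG.warning("No type defined for " + qualifiedName
                    + "; defaulting to Text");
            return "Text";
        }
        return type;
    }

    public static void main(String[] args) {
        System.out.println(resolveType("photoshop:DateCreated"));
        System.out.println(resolveType("photoshop:LegacyIPTCDigest"));
    }
}
```

      The warning keeps unknown properties visible, so a missing schema entry can still be noticed and added later instead of being silently treated as text.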

      Related to the Tika work in TIKA-1285.

      Attachments

        1. PDFBOX-2318.patch (3 kB, Jeremy Anderson)
        2. jsr170-1.0.pdf (2.39 MB, Jeremy Anderson)

          People

            Assignee: Andreas Lehmkühler (lehmi)
            Reporter: Jeremy Anderson (rpialum)
            Votes: 0
            Watchers: 3
