Tika / TIKA-811

Upgrade metadata-extractor version for OpenJDK 7 support

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0, 1.1
    • Fix Version/s: 1.3
    • Component/s: parser
    • Labels:
    • Environment:

      OpenJDK 7

      Description

      The metadata-extractor library (2.4.0-beta-1) is quite old and depends on some Sun classes, which makes it unable to run on OpenJDK 7, now the default JDK on Linux distributions.
      Upgrading the library to the new version 2.5.0-RC3 fixes this issue, but the API has changed.
      Attaching a patch to the MetadataExtractor class (and its tests) to take advantage of this.
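
      A minimal sketch of how one might check whether a given runtime still ships a Sun-internal class that a library relies on. The issue does not name the classes involved; `com.sun.image.codec.jpeg.JPEGDecodeParam` is used here only as an assumed example, and `SunClassProbe` and `isClassAvailable` are hypothetical names introduced for illustration.

      ```java
      public class SunClassProbe {

          // Returns true if the named class can be loaded by the current runtime.
          static boolean isClassAvailable(String className) {
              try {
                  // initialize=false: only check for presence, do not run static initializers.
                  Class.forName(className, false, SunClassProbe.class.getClassLoader());
                  return true;
              } catch (ClassNotFoundException | LinkageError e) {
                  return false;
              }
          }

          public static void main(String[] args) {
              // Assumed example of a Sun-internal class; the issue does not say which ones are used.
              String candidate = "com.sun.image.codec.jpeg.JPEGDecodeParam";
              System.out.println(candidate + " available: " + isClassAvailable(candidate));
          }
      }
      ```

      Running this on each target JDK gives a quick indication of whether a library compiled against that class can load there.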

      1. metadata.diff
        25 kB
        Emmanuel Hugonnet

          Activity

          Emmanuel Hugonnet added a comment -

          Patch to fix the issue with upgrading to MetadataExtractor 2.5.0-RC3

          Nick Burch added a comment -

          Do you know if 2.5.0-RC3 is available in Maven Central, or will we need to wait for the final 2.5.0 release to be able to get it via Maven?

          Emmanuel Hugonnet added a comment -

          Alas, it is not in Maven Central. The binary and the source code can be downloaded from http://code.google.com/p/metadata-extractor/downloads/list (but we would have to create the javadoc artifact). It depends on an Adobe jar.
          You can look at our own Nexus at https://www.silverpeas.org/nexus/content/groups/silverpeas/com/drewnoakes/metadata-extractor/ for the POM information and to try my code.

          Miguel Moquillon added a comment -

          Tika is now at version 1.1 and still depends on the old beta version 2.4.0-beta-1 of metadata-extractor. The latter is currently at stable version 2.6.2.

          This version is available in the Nexus at https://www.silverpeas.org/nexus/ (more exactly at https://www.silverpeas.org/nexus/content/groups/silverpeas/com/drewnoakes/metadata-extractor/2.6.2/). Its dependency on the Adobe XMP Core library is also available in this Nexus (more exactly at https://www.silverpeas.org/nexus/content/groups/silverpeas/com/adobe/xmp/xmp-core/5.1.0/). You can take them and upload them to the central Maven repository.

          Please integrate the patch so that Tika is compatible with OpenJDK 7 by depending on the latest version of metadata-extractor. This will save us from patching Tika ourselves each time a new version of Tika is published.

          Jörg Ehrlich added a comment -

          Tika can integrate with version 5.1.1 of the XMPCore library. Version 5.1.0 had accidentally been compiled for JDK 1.7 which is incompatible with Tika. But Version 5.1.1 is compatible with Tika.
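
          A class compiled for JDK 1.7 carries class-file major version 51, which older JVMs refuse to load. The following is a rough sketch of reading that version from a class-file header; `ClassVersionCheck` and `majorVersion` are hypothetical names, and the in-memory header stands in for a real class taken from a jar.

          ```java
          import java.io.ByteArrayInputStream;
          import java.io.DataInputStream;
          import java.io.IOException;
          import java.io.InputStream;

          public class ClassVersionCheck {

              // Reads the class-file header and returns the major version
              // (49 = Java 5, 50 = Java 6, 51 = Java 7).
              static int majorVersion(InputStream in) throws IOException {
                  DataInputStream data = new DataInputStream(in);
                  if (data.readInt() != 0xCAFEBABE) {
                      throw new IOException("Not a class file: bad magic number");
                  }
                  data.readUnsignedShort();        // minor version, not needed here
                  return data.readUnsignedShort(); // major version
              }

              public static void main(String[] args) throws IOException {
                  // Hand-built header for illustration: magic, minor 0, major 51.
                  byte[] header = {
                      (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 51
                  };
                  int major = majorVersion(new ByteArrayInputStream(header));
                  System.out.println("major version: " + major);
              }
          }
          ```

          In practice the stream would come from an entry in the library's jar, and the result would be compared against the highest class-file version the target JVM accepts.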

          Emmanuel Hugonnet added a comment -

          Yep, and all is working fine:
          Tika 1.1 + patch + metadata-extractor-2.6.2 + XMPCore-5.1.1
          Cheers

          Jukka Zitting added a comment -

          See http://www.sonatype.com/people/2009/02/why-putting-repositories-in-your-poms-is-a-bad-idea/ for why we want to avoid using other repositories than the central one.

          To make it possible for Tika to upgrade to a more recent version of metadata-extractor, you should work with the upstream project to get their latest releases uploaded to the central repository. See https://support.sonatype.com/entries/20914616-how-do-i-get-my-software-into-central for pointers on how to do that. It might even be possible to set up an automatic sync from the silverpeas repository to the central one.

          Jukka Zitting added a comment -

          Looking at TIKA-915 it appears that the process of getting the latest metadata-extractor version to the central repository is already in progress.

          Ray Gauss II added a comment -

          Thanks for the patch Emmanuel, I wish I had seen it earlier as I ended up duplicating some of your work.

          There were some additional issues presented by 2.6.2, along with a deprecated TIFF parsing method and some failing tests. I've also refactored the GeotagHandler, and all seems to be working now.

          This should be resolved in r1366967.

          Ray Gauss II added a comment -

          Resolved by r1366967


            People

            • Assignee:
              Ray Gauss II
              Reporter:
              Emmanuel Hugonnet
            • Votes:
              0
              Watchers:
              5
