Tika / TIKA-862

JPSS HDF5 files not being detected appropriately

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0
    • Fix Version/s: None
    • Component/s: parser
    • Labels: None

    Description

      As commented in TIKA-614, JPSS HDF5 files are not being properly detected by Tika. See this comment:

      from minfing:

      We were trying to extract metadata from our h5 file (i.e. with JPSS extension). We ran the following command line:

      [ryu@localhost hdf5extractor]$ java -jar tika-app-1.0.jar -m \
      > /usr/local/staging/products/h5/SVM13_npp_d20120122_t1659139_e1700381_b01225_c20120123000312144174_noaa_ops.h5
      Content-Encoding: windows-1252
      Content-Length: 22187952
      Content-Type: text/plain
      resourceName: SVM13_npp_d20120122_t1659139_e1700381_b01225_c20120123000312144174_noaa_ops.h5
      [ryu@localhost hdf5extractor]$
      

      We noticed that the content type is text/plain and that there are only 4 lines of output (we expected a lot of metadata).

      Let me know if more information is needed. Thanks!

      Richard
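
One way to narrow down a report like this is to check whether the file carries the HDF5 format signature at an offset where a detector would look for it. The sketch below is not Tika's detection code and does not claim to explain this particular file. It assumes only what the HDF5 file format specification states: the 8-byte signature sits at offset 0, or at 512, 1024, 2048, and so on when a user block precedes the superblock. The class name `Hdf5SignatureCheck` and the method `findSignatureOffset` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

public class Hdf5SignatureCheck {

    // 8-byte HDF5 format signature: 0x89 'H' 'D' 'F' '\r' '\n' 0x1a '\n'
    private static final byte[] SIGNATURE = {
        (byte) 0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n'
    };

    /**
     * Returns the offset at which the HDF5 signature is found, or -1 if it is
     * absent. Candidate offsets are 0, 512, 1024, 2048, ... (doubling each time).
     */
    static long findSignatureOffset(byte[] data) {
        long offset = 0;
        while (offset + SIGNATURE.length <= data.length) {
            int start = (int) offset;
            byte[] window = Arrays.copyOfRange(data, start, start + SIGNATURE.length);
            if (Arrays.equals(window, SIGNATURE)) {
                return offset;
            }
            // First candidate after 0 is 512; later candidates double.
            offset = (offset == 0) ? 512 : offset * 2;
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Hdf5SignatureCheck <file>");
            return;
        }
        // Reads the whole file into memory; acceptable for a quick diagnostic.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        long offset = findSignatureOffset(data);
        if (offset < 0) {
            System.out.println("No HDF5 signature found at any candidate offset");
        } else {
            System.out.println("HDF5 signature found at offset " + offset);
        }
    }
}
```

If the signature turns up at a non-zero offset, that would suggest the file begins with a user block, which is worth mentioning when reporting a detection problem.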

            People

              chrismattmann Chris A. Mattmann
              minfing Richard Yu