  1. Tika
  2. TIKA-2684

Tika does not extract *.fits header text, just file level metadata


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 1.18
    • Fix Version/s: None
    • Component/s: metadata, mime, parser
    • Labels:
      None
    • External issue ID:
      Tika-874

      Description

      Tika only pulls file-level metadata for *.fits (Flexible Image Transport System) files, using:

      java -jar tika-app-1.18.jar --gui

      Content-Length: 699840
      Content-Type: application/fits
      X-Parsed-By: org.apache.tika.parser.DefaultParser
      X-Parsed-By: org.apache.tika.parser.gdal.GDALParser
      X-TIKA:digest:MD5: d93e8f4654902c45c7f3e4f4bf5f63e2
      X-TIKA:digest:SHA256: da7c0f1b6643850856cba100e9b3e8db76b80e91583eb088635c416a2b4161b3
      resourceName: WFPC2u5780205r_c0fx.fits

      rather than the text from the header (shown here as extracted with astropy):

      SIMPLE  =                    T / file does conform to FITS standard
      BITPIX  =                  -32 / number of bits per data pixel
      NAXIS   =                    3 / number of data axes
      NAXIS1  =                  200 / length of data axis 1
      NAXIS2  =                  200 / length of data axis 2
      NAXIS3  =                    4 / length of data axis 3
      EXTEND  =                    T / FITS dataset may contain extensions
      COMMENT   FITS (Flexible Image Transport System) format is defined in 'Astronomy
      COMMENT   and Astrophysics', volume 376, page 359; bibcode: 2001A&A...376..359H
      BSCALE  =                1.0E0 / REAL = TAPE*BSCALE + BZERO
      BZERO   =                0.0E0 /
      OPSIZE  =                 2112 / PSIZE of original image
      ORIGIN  = 'STScI-STSDAS'       / Fitsio version 21-Feb-1996
      FITSDATE= '2004-01-09'         / Date FITS file was created
      FILENAME= 'u5780205r_cvt.c0h'  / Original filename
      ALLG-MAX=           3.777701E3 / Data max in all groups
      ALLG-MIN=          -7.319537E1 / Data min in all groups
      ODATTYPE= 'FLOATING'           / Original datatype: Single precision real
      SDASMGNU=                    4 / Number of groups in original image
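For reference, the header text above is a sequence of fixed-width 80-character ASCII "cards", stored in 2880-byte blocks and terminated by an END card. The following is a minimal sketch, using only the Python standard library, of how that text could be pulled from the start of a file. It is not how Tika or astropy do it; the names `read_primary_header` and `demo_header` are hypothetical, and the demo uses a small synthetic header rather than the file from this report.

```python
import io

CARD_LEN = 80     # each FITS header card is 80 ASCII characters
BLOCK_LEN = 2880  # header blocks are padded to multiples of 2880 bytes


def read_primary_header(stream):
    """Return the primary header cards (right-stripped) up to the END card.

    Assumption: blank padding cards are skipped, which is a simplification.
    """
    cards = []
    while True:
        block = stream.read(BLOCK_LEN)
        if len(block) < BLOCK_LEN:
            raise ValueError("truncated FITS header: END card not found")
        for i in range(0, BLOCK_LEN, CARD_LEN):
            card = block[i:i + CARD_LEN].decode("ascii").rstrip()
            if card == "END":
                return cards
            if card:
                cards.append(card)


def demo_header():
    """Build a synthetic one-block header in memory and parse it."""
    raw = [
        "SIMPLE  =                    T / file does conform to FITS standard",
        "BITPIX  =                  -32 / number of bits per data pixel",
        "NAXIS   =                    0 / number of data axes",
        "END",
    ]
    data = "".join(card.ljust(CARD_LEN) for card in raw).encode("ascii")
    data = data.ljust(BLOCK_LEN, b" ")  # pad the block with spaces
    return read_primary_header(io.BytesIO(data))
```

With a real file, the same function would be called on a binary handle, for example `read_primary_header(open("WFPC2u5780205r_c0fx.fits", "rb"))`. Extension headers, if any, follow the primary data unit and are not handled by this sketch.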

       

      This capability was mentioned in Tika-874. I'm looking at netCDF files/headers as a model for this behaviour.

      Thank you!

              People

              • Assignee:
                Unassigned
                Reporter:
                sborda Susan