Tika / TIKA-874

Identify FITS (Flexible Image Transport System) files

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2
    • Component/s: mime
    • Labels:
      None

      Description

      Tika does not have a defined signature for application/fits files. I have created a patch (based on file(1) magic) to address identification of such files, including a simple unit test.

      This patch only handles identification, not parsing of FITS files.

        Activity

        pete.s.may Peter May added a comment (edited)

        This patch identifies FITS files using the signature that the file(1) command relies on, which is also specified in RFC 4047 (http://fits.gsfc.nasa.gov/rfc4047.txt).
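
        To illustrate what signature-based identification amounts to, here is a minimal sketch in plain Java. It is not the patch itself and does not use Tika's API: the class and method names are hypothetical, and the 30-byte pattern ("SIMPLE  =" followed by spaces, with the logical value T in column 30) is an assumption taken from the FITS fixed-format first header card, since the exact pattern in the patch is not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper (not part of Tika): checks for an assumed FITS signature. */
final class FitsSignatureSketch {

    // Assumed signature: keyword "SIMPLE" padded to 8 columns, "=" in column 9,
    // spaces up to column 29, and the logical value 'T' in column 30.
    static final byte[] MAGIC = buildMagic();

    private static byte[] buildMagic() {
        byte[] magic = new byte[30];
        Arrays.fill(magic, (byte) ' ');
        byte[] keyword = "SIMPLE  =".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(keyword, 0, magic, 0, keyword.length);
        magic[29] = (byte) 'T';
        return magic;
    }

    /** Returns true if the buffer begins with the assumed 30-byte signature. */
    static boolean looksLikeFits(byte[] header) {
        if (header == null || header.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (header[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

        A caller would read the first 30 bytes of a file and pass them to looksLikeFits; a match only suggests the file is FITS and says nothing about whether the rest of the file parses.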

        It includes a simple unit test (added to TestMimeTypes) that uses a FITS file created by converting https://github.com/apache/tika/blob/trunk/tika-parsers/src/test/resources/test-documents/testJPEG.jpg to a FITS image with ImageMagick.

        Comments welcome.

        chrismattmann Chris A. Mattmann added a comment
        • Updated the fix version; no affects version is set because this is a new feature.
        chrismattmann Chris A. Mattmann added a comment
        • Patch applied in r1299703. Thank you, Peter!
        rahul_k Rahul Khanna added a comment (edited)

        I've created a parser for FITS files that extracts metadata using the nom.tam.fits library available at http://heasarc.gsfc.nasa.gov/docs/heasarc/fits/java/v1.0/ . The code is used in The Australian National University's Data Commons project available at https://github.com/anu-doi/anudc . Code for the parser can be viewed at https://github.com/anu-doi/anudc/blob/master/DcShared/src/main/java/au/edu/anu/dcbag/metadata/FitsParser.java .

        I was wondering if the Apache Tika Project is accepting contributions in the form of parsers created by other users such as myself. If yes, how can I submit the code?

        gagravarr Nick Burch added a comment

        We do, where appropriate. (Sometimes it's better for the parser to live in the same project as the library it depends on, and for Tika to just include both as dependencies.)

        Your best bet is probably to start a thread on dev@tika.apache.org, and we can all work out between us if we're best off bringing the code into Tika or if it's best to leave it outside and simply depend on it.

        chrismattmann Chris A. Mattmann added a comment

        Hi Rahul: I would recommend creating a new issue here on the TIKA JIRA and then, as Nick mentioned, moving the discussion of this new parser to dev@tika and referencing the JIRA issue there.

        I for one am happy to help shepherd the parser into Tika if it makes sense, and to help you earn the merit to shepherd it yourself.


          People

          • Assignee: chrismattmann Chris A. Mattmann
          • Reporter: pete.s.may Peter May
          • Votes: 0
          • Watchers: 5
