  Tika / TIKA-1843

Tika parser for SEG-Y files and new MIME type application/segy

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: mime, parser
    • Labels:
      None

      Description

      This ticket covers the parsing of SEG-Y files (extensions .seg, .segy and .sgy).
      The SEG-Y format is used to store seismic data; more information is available at http://pubs.usgs.gov/of/2001/of01-326/HTML/FILEFORM.HTM.

      I have:

      • added a new MIME type application/segy matching the file name extensions .segy, .seg and .sgy.
      • created a new SEGYParser, matching that MIME type.
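The extension matching described above can be sketched in plain Java. This is only an illustration and not the code from the patch: in Tika the mapping is normally declared as glob patterns in the MIME type registry rather than written by hand. The class name `SegyExtensionSketch`, the method `detectByName`, and the case-insensitive comparison are assumptions made for this example.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Tika or of the patch. */
final class SegyExtensionSketch {
    static final String SEGY_MIME = "application/segy";

    // The three extensions named in this ticket, without the leading dot.
    private static final Set<String> EXTENSIONS = Set.of("seg", "segy", "sgy");

    /** Returns "application/segy" if the file name ends in a SEG-Y extension, else null. */
    static String detectByName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension present
        }
        // Assumption: extensions are compared case-insensitively.
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return EXTENSIONS.contains(ext) ? SEGY_MIME : null;
    }

    public static void main(String[] args) {
        System.out.println(detectByName("survey.sgy"));
        System.out.println(detectByName("notes.txt"));
    }
}
```

Name-based matching alone cannot tell a SEG-Y file from an unrelated file that happens to use the same extension, so it is only a first hint for the parser.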

      To parse the SEG-Y files, I am using a modified version of the sigrun code (available under the Apache license at https://github.com/mikhail-aksenov/sigrun). Notably, I have made a fix and changed some method signatures so that it can read from a ReadableByteChannel instead of a FileChannel.
      For the moment, I have put it directly into Tika's new segy package. Is this the right thing to do, or should I reference it as an external library, which would mean modifying the pom.xml?
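To illustrate why the switch to ReadableByteChannel matters, here is a minimal sketch of reading the start of a SEG-Y stream without seeking. It is not the sigrun code or the patch: the class `SegyChannelSketch` and its methods are hypothetical. It relies on the standard SEG-Y layout of a 3200-byte textual header followed by a 400-byte big-endian binary header, and it assumes a blocking channel.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

/** Hypothetical sketch; names are illustrative and not taken from sigrun or Tika. */
final class SegyChannelSketch {
    static final int TEXT_HEADER_BYTES = 3200;
    static final int BINARY_HEADER_BYTES = 400;

    /** Fills the buffer completely; a single read() on a channel may return fewer bytes. */
    static void readFully(ReadableByteChannel channel, ByteBuffer buffer) throws IOException {
        while (buffer.hasRemaining()) {
            if (channel.read(buffer) < 0) {
                throw new EOFException("Stream ended before the header was complete");
            }
        }
    }

    /** Returns {sample interval in microseconds, samples per trace, data sample format code}. */
    static int[] readBinaryHeaderFields(ReadableByteChannel channel) throws IOException {
        // A plain channel cannot seek, so the textual header is read and discarded.
        readFully(channel, ByteBuffer.allocate(TEXT_HEADER_BYTES));

        ByteBuffer header = ByteBuffer.allocate(BINARY_HEADER_BYTES).order(ByteOrder.BIG_ENDIAN);
        readFully(channel, header);

        // Offsets are relative to the start of the binary header
        // (file bytes 3217-3218, 3221-3222 and 3225-3226, counted from 1).
        int sampleIntervalMicros = header.getShort(16);
        int samplesPerTrace = header.getShort(20);
        int formatCode = header.getShort(24);
        return new int[] {sampleIntervalMicros, samplesPerTrace, formatCode};
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for a SEG-Y file: headers only, no traces.
        byte[] data = new byte[TEXT_HEADER_BYTES + BINARY_HEADER_BYTES];
        ByteBuffer.wrap(data).putShort(TEXT_HEADER_BYTES + 16, (short) 4000);
        ReadableByteChannel channel = Channels.newChannel(new ByteArrayInputStream(data));
        System.out.println(readBinaryHeaderFields(channel)[0]);
    }
}
```

Because the sketch only depends on ReadableByteChannel, the same code works for any InputStream wrapped with Channels.newChannel, which is what a parser receives, and not only for files opened through a FileChannel.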

      Thanks and best regards,
      Giovanni

            People

            • Assignee:
              Unassigned
            • Reporter:
              gusai Giovanni Usai
            • Votes:
              1
            • Watchers:
              3
