  1. Tika
  2. TIKA-1892

Mime Magic for application/x-mobipocket-ebook and application/x-shapefile

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.12
    • Fix Version/s: 1.13
    • Component/s: mime
    • Labels: None
    • Flags: Patch

    Description

      Our FHT analysis of Mobipocket ebook files and shapefiles shows a high correlation among the initial header bytes. Further inspection of such files, taken from sources available online and from the TREC polar data sets, revealed common bytes that can be used for MIME identification.
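To illustrate the kind of header comparison described above, here is a minimal sketch, assuming "FHT" refers to file header/trailer analysis and simplifying it to a per-offset agreement count. The function names `header_agreement` and `constant_offsets`, the `header_len` and `threshold` parameters, and the sample bytes are hypothetical and are not taken from the analysis that was actually run.

```python
from collections import Counter


def header_agreement(samples, header_len=16):
    """For each header offset, return (most common byte, share of samples with it).

    Samples shorter than an offset are ignored at that offset.
    """
    result = []
    for offset in range(header_len):
        column = [s[offset] for s in samples if len(s) > offset]
        if not column:
            break  # no sample is long enough to reach this offset
        byte, count = Counter(column).most_common(1)[0]
        result.append((byte, count / len(column)))
    return result


def constant_offsets(samples, header_len=16, threshold=1.0):
    """Offsets where at least `threshold` of the samples share the same byte."""
    agreement = header_agreement(samples, header_len)
    return [i for i, (_, share) in enumerate(agreement) if share >= threshold]


# Hypothetical sample headers: four shared leading bytes, then differing data.
samples = [
    b"\x00\x00\x27\x0aAAAA",
    b"\x00\x00\x27\x0aBBBB",
    b"\x00\x00\x27\x0aCCCC",
]
print(constant_offsets(samples, header_len=8))
```

Offsets that stay constant across many independent samples are candidates for a magic rule; offsets that vary are not.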

      patch content
      <mime-type type="application/x-netcdf">
        <acronym>NETCDF</acronym>
        <_comment>Network Common Data Format</_comment>
        <magic priority="60">
          <match value="CDF" type="string" offset="0" />
        </magic>
        <glob pattern="*.nc"/>
      </mime-type>
      <mime-type type="application/x-mobipocket-ebook">
        <acronym>MOBI</acronym>
        <_comment>Mobipocket Ebook</_comment>
        <magic priority="60">
          <match value="BOOKMOBI" type="string" offset="23" />
        </magic>
        <glob pattern="*.mobi"/>
      </mime-type>
      <mime-type type="application/x-shapefile">
        <acronym>ESRI Shapefiles</acronym>
        <_comment>ESRI Shapefiles</_comment>
        <magic priority="60">
          <match value="0x0000270a" type="big32" offset="2" />
        </magic>
        <glob pattern="*.shp"/>
      </mime-type>
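
As a rough illustration of how `string` and `big32` match rules like these could be evaluated against the start of a file, here is a minimal sketch. It is not Tika's detector: it ignores `priority`, globs, and the other features of the real implementation. The `RULES` table copies its offsets and values verbatim from the patch above, and the names `matches` and `detect` are hypothetical.

```python
import struct

# (mime type, match type, offset, value), copied verbatim from the patch above.
RULES = [
    ("application/x-netcdf", "string", 0, b"CDF"),
    ("application/x-mobipocket-ebook", "string", 23, b"BOOKMOBI"),
    ("application/x-shapefile", "big32", 2, 0x0000270A),
]


def matches(data, kind, offset, value):
    """Return True if `data` satisfies a single match rule."""
    if kind == "string":
        # Compare the raw bytes at the given offset with the expected string.
        return data[offset:offset + len(value)] == value
    if kind == "big32":
        # Read a 32-bit big-endian unsigned integer at the given offset.
        if len(data) < offset + 4:
            return False
        return struct.unpack_from(">I", data, offset)[0] == value
    raise ValueError(f"unsupported match type: {kind}")


def detect(data):
    """Return the MIME type of the first rule that matches, or None."""
    for mime, kind, offset, value in RULES:
        if matches(data, kind, offset, value):
            return mime
    return None
```

A call such as `detect(open(path, "rb").read(64))` would then apply the three rules to the first 64 bytes of a file, with `path` standing in for any local file.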

          People

            Assignee: Unassigned
            Reporter: Suman Kashyap (kashyap5)
            Votes: 0
            Watchers: 3
