Tika / TIKA-1892

Mime Magic for application/x-mobipocket-ebook and application/x-shapefile

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.12
    • Fix Version/s: 1.13
    • Component/s: mime
    • Labels:
      None
    • Flags:
      Patch

      Description

      Our FHT (file header/trailer) analysis of Mobipocket ebook files and shapefiles shows a high correlation among their initial header bytes. Further inspection of such files, drawn from publicly available sources and the TREC Polar data sets, revealed byte sequences common to each type that can be used for MIME identification.
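The header comparison described above can be illustrated with a small sketch. This is a simplified stand-in, not the FHT tooling used for the analysis: the function name `common_header_bytes`, the `header_len` window, and the `threshold` parameter are all hypothetical. It reports the offsets at which the same byte value appears in (nearly) every sample.

```python
from collections import Counter


def common_header_bytes(samples, header_len=16, threshold=1.0):
    """Return {offset: byte} for header offsets where a single byte value
    occurs in at least `threshold` (a fraction) of the samples.

    `samples` is a list of bytes objects, e.g. the leading bytes of files
    that are all believed to be of one type.
    """
    stable = {}
    if not samples:
        return stable
    for offset in range(header_len):
        # Skip offsets that some sample is too short to contain.
        if any(len(s) <= offset for s in samples):
            continue
        column = [s[offset] for s in samples]
        byte, count = Counter(column).most_common(1)[0]
        if count / len(samples) >= threshold:
            stable[offset] = byte
    return stable


if __name__ == "__main__":
    # Synthetic headers: offsets 0, 1 and 3 agree, offset 2 varies.
    headers = [b"ABxD", b"AByD", b"ABzD"]
    print(common_header_bytes(headers, header_len=4))
```

Offsets that stay constant across a large, varied sample are candidates for a magic rule; offsets that vary are not.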

      Patch content:
      <mime-type type="application/x-netcdf">
        <acronym>NETCDF</acronym>
        <_comment>Network Common Data Format</_comment>
        <magic priority="60">
          <match value="CDF" type="string" offset="0" />
        </magic>
        <glob pattern="*.nc"/>
      </mime-type>
      <mime-type type="application/x-mobipocket-ebook">
        <acronym>MOBI</acronym>
        <_comment>Mobipocket Ebook</_comment>
        <magic priority="60">
          <match value="BOOKMOBI" type="string" offset="23" />
        </magic>
        <glob pattern="*.mobi"/>
      </mime-type>
      <mime-type type="application/x-shapefile">
        <acronym>ESRI Shapefiles</acronym>
        <_comment>ESRI Shapefiles</_comment>
        <magic priority="60">
          <match value="0x0000270a" type="big32" offset="2" />
        </magic>
        <glob pattern="*.shp"/>
      </mime-type>
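To make the semantics of the `<match>` elements concrete, here is a minimal sketch of how `string` and `big32` rules could be evaluated against a file's leading bytes. It is not Tika's detector: the names `RULES`, `matches`, and `detect` are hypothetical, and priorities, masks, and nested matches are left out. The offsets and values are copied from the patch as posted and have not been checked against the format specifications here, so treat them as parameters to verify against real files.

```python
import struct

# (mime type, match type, offset, value), copied from the patch as posted.
RULES = [
    ("application/x-netcdf", "string", 0, b"CDF"),
    ("application/x-mobipocket-ebook", "string", 23, b"BOOKMOBI"),
    ("application/x-shapefile", "big32", 2, 0x0000270A),
]


def matches(data, kind, offset, value):
    """Return True if `data` satisfies one magic rule."""
    if kind == "string":
        # Literal byte sequence starting at `offset`.
        return data[offset:offset + len(value)] == value
    if kind == "big32":
        # 32-bit big-endian unsigned integer starting at `offset`.
        chunk = data[offset:offset + 4]
        return len(chunk) == 4 and struct.unpack(">I", chunk)[0] == value
    raise ValueError("unsupported match type: " + kind)


def detect(data, rules=RULES):
    """Return the MIME type of the first matching rule, or None."""
    for mime, kind, offset, value in rules:
        if matches(data, kind, offset, value):
            return mime
    return None


if __name__ == "__main__":
    # Synthetic buffer built to satisfy the MOBI rule as written.
    print(detect(b"\x00" * 23 + b"BOOKMOBI"))
```

Running a matcher like this over a labelled sample set is a quick way to confirm that each offset actually hits before a rule is committed.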

            People

            • Assignee: Unassigned
            • Reporter: Suman Kashyap (kashyap5)