  Tika / TIKA-1204

DWFX files detection


    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.4
    • Fix Version/s: None
    • Component/s: detector, mime
    • Labels:

      Description

      DWFX files are AutoCAD Design Web Format files and follow the Open Packaging Conventions (OPC).
      Tika "correctly" detects these files as application/zip.
      It would be better if Tika could recognize the true MIME type: model/vnd.dwfx+xps.
      Please add logic to ZipContainerDetector so that DWFX files can be detected. We need a method that behaves like detectOfficeOpenXML(OPCPackage pkg):

      // Look up the DWFX document-sequence relationship at the package root.
      PackageRelationshipCollection core = pkg.getRelationshipsByType("http://schemas.autodesk.com/dwfx/2007/relationships/documentsequence");
      if (core.size() != 1) {
          // Invalid DWFX Package received
          return null;
      }
      // Report the content type of the part the relationship points to.
      PackagePart corePart = pkg.getPart(core.getRelationship(0));
      String coreType = corePart.getContentType();
      return MediaType.parse(coreType);
      

      Thank you,
      Marco
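
      For illustration only, the relationship check proposed above can be approximated without POI by reading the OPC root relationships part (_rels/.rels) straight from the zip stream. This is a rough sketch, not Tika code: the class name DwfxSniffer and the method detect are hypothetical, the substring match stands in for real XML parsing, and it does not enforce the "exactly one relationship" rule from the snippet above. It also assumes the input has already been identified as a zip container.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Tika or POI.
final class DwfxSniffer {
    // Relationship type quoted in the issue description.
    static final String DWFX_REL =
        "http://schemas.autodesk.com/dwfx/2007/relationships/documentsequence";
    // MIME type requested in the issue description.
    static final String DWFX_MIME = "model/vnd.dwfx+xps";
    // Fallback for zip containers that carry no DWFX relationship.
    static final String ZIP_MIME = "application/zip";

    // Scans the zip for the OPC root relationships part and reports
    // DWFX_MIME when that part mentions the DWFX relationship type.
    static String detect(InputStream in) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("_rels/.rels".equals(entry.getName())) {
                    // readAllBytes stops at the end of the current entry.
                    String rels = new String(zip.readAllBytes(), StandardCharsets.UTF_8);
                    // Crude substring test; a real detector would parse the XML.
                    return rels.contains(DWFX_REL) ? DWFX_MIME : ZIP_MIME;
                }
            }
        }
        return ZIP_MIME;
    }
}
```

      Streaming the entries avoids opening the whole package, at the cost of missing a relationships part that appears late in a large archive until the scan reaches it.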

        Attachments

        1. General assembly filter.dwfx
          1.66 MB
          Marco Quaranta

          Activity

            People

            • Assignee: Unassigned
            • Reporter: 101000 Marco Quaranta
            • Votes: 0
            • Watchers: 3

              Dates

              • Created:
                Updated: