Details
- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
Description
IdentifyMimeType uses Tika configured with a custom-mimetypes.xml [1] that defines, among others, the flowfile-v* MIME types. However, these definitions do not specify priorities. As a result, a NiFi FlowFile V3 package whose payload contains, for example, HTML including the string:
<html xmlns=
will be identified as "application/xhtml+xml" [2]. That type does match the pattern, but application/flowfile-v3 is the more correct identification. To fix this, I believe we need to specify a higher priority for the FlowFile V3 "magic"...
[1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/resources/org/apache/tika/mime/custom-mimetypes.xml#L26-L31
[2] https://gitbox.apache.org/repos/asf?p=tika.git;a=blob;f=tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml;hb=refs/heads/master
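A minimal sketch of what such a change might look like in custom-mimetypes.xml, offered as an illustration rather than the actual fix. The match value, offset, and priority number are assumptions and are not copied from the NiFi source. Tika's magic priority is understood to default to 50 when unspecified, so any higher value should make this entry win when several magic patterns match the same content.

```xml
<!-- Sketch only: match value, offset, and priority are assumptions,
     not taken from the NiFi repository or from the committed fix. -->
<mime-info>
    <mime-type type="application/flowfile-v3">
        <_comment>NiFi FlowFile V3</_comment>
        <!-- A priority above the default lets this magic take precedence
             over lower-priority patterns such as the "<html xmlns=" match. -->
        <magic priority="70">
            <match value="NiFiFF3" type="string" offset="0"/>
        </magic>
    </mime-type>
</mime-info>
```

Because the FlowFile package header sits at the start of the content while the HTML string only appears inside the payload, anchoring the match at offset 0 with a raised priority should be enough to prefer the package type.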
Issue Links
- is duplicated by: NIFI-11166 IdentifyMimeType processor identifies flowfile-v3 as video/x-ms-wmv when containing wmv file (Resolved)
- links to