Details
- Type: Task
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Whether we use actual ML or build rules from patterns we see in the data, it would be useful to gather features (field names, directory names, etc.) from zip-based file types in our regression corpus to (potentially) improve the efficiency of MIME detection.
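As a rough illustration of the kind of feature gathering described above, here is a minimal sketch in Python (not Tika code, and not a proposed implementation). It reads only the entry names of a zip container and counts entry names, top-level directories, and extensions. The function names `zip_name_features` and `build_sample`, and the `entry:` / `topdir:` / `ext:` / `dir:` key prefixes, are hypothetical and chosen purely for illustration.

```python
import io
import posixpath
import zipfile
from collections import Counter


def zip_name_features(source):
    """Collect name-based features from a zip container.

    `source` is a path or a seekable binary file-like object.
    Only entry metadata is read; no entry is decompressed.
    """
    features = Counter()
    with zipfile.ZipFile(source) as zf:
        for info in zf.infolist():
            name = info.filename
            if info.is_dir():
                # Explicit directory entries end with "/".
                features["dir:" + name.rstrip("/")] += 1
                continue
            features["entry:" + name] += 1
            parent = posixpath.dirname(name)
            if parent:
                # Keep only the top-level directory,
                # e.g. "word" for "word/document.xml".
                features["topdir:" + parent.split("/")[0]] += 1
            ext = posixpath.splitext(name)[1].lower()
            if ext:
                features["ext:" + ext] += 1
    return features


def build_sample():
    """Build a small in-memory zip that loosely resembles an OOXML file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("word/document.xml", "<document/>")
        zf.writestr("word/styles.xml", "<styles/>")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for key, count in sorted(zip_name_features(build_sample()).items()):
        print(key, count)
```

Run over a corpus, counters like these could be summed per detected MIME type to see which names are distinctive, whether the end goal is hand-written rules or input to a classifier.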
Issue Links
- relates to TIKA-2849 "TikaInputStream copies the input stream locally" (Resolved)