Details
Type: Improvement
Status: Closed
Priority: Minor
Resolution: Duplicate
Description
Currently, only the jcr:mimeType property is used to determine a document's MIME type, and thus which text extractor is applied when indexing it. If the jcr:mimeType property is missing or set to a generic value such as "application/octet-stream", the indexer could fall back to heuristics based on the node name or on magic numbers within the binary stream to determine the type of the document.
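A minimal sketch of such a fallback, for illustration only. The class name MimeTypeGuesser, its method names, the extension table, and the handful of magic numbers are all hypothetical and are not part of the existing indexer. The order of the heuristics (magic numbers first, then node name) is likewise an assumption, not something this issue prescribes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: guesses a MIME type when jcr:mimeType is missing or generic. */
public class MimeTypeGuesser {

    static final String GENERIC = "application/octet-stream";

    // Illustrative extension table; a real indexer would need a much larger one.
    private static final Map<String, String> BY_EXTENSION = Map.of(
            "pdf", "application/pdf",
            "txt", "text/plain",
            "html", "text/html",
            "xml", "application/xml");

    /**
     * @param declaredType value of jcr:mimeType, or null if the property is absent
     * @param nodeName     name of the node holding the binary, may be null
     * @param head         first few bytes of the binary stream, may be null
     */
    public static String guess(String declaredType, String nodeName, byte[] head) {
        // A specific declared type always wins.
        if (declaredType != null && !declaredType.isEmpty()
                && !GENERIC.equalsIgnoreCase(declaredType)) {
            return declaredType;
        }
        // Assumed order: content sniffing first, then the node name.
        String byMagic = fromMagic(head);
        if (byMagic != null) {
            return byMagic;
        }
        String byName = fromName(nodeName);
        return byName != null ? byName : GENERIC;
    }

    /** Checks a few well-known file signatures; returns null if none match. */
    static String fromMagic(byte[] head) {
        if (head == null) {
            return null;
        }
        if (startsWith(head, "%PDF-".getBytes(StandardCharsets.US_ASCII))) {
            return "application/pdf";
        }
        if (startsWith(head, new byte[] {(byte) 0x89, 'P', 'N', 'G'})) {
            return "image/png";
        }
        if (startsWith(head, "GIF8".getBytes(StandardCharsets.US_ASCII))) {
            return "image/gif";
        }
        return null;
    }

    /** Maps the extension of the node name to a MIME type; returns null if unknown. */
    static String fromName(String nodeName) {
        if (nodeName == null) {
            return null;
        }
        int dot = nodeName.lastIndexOf('.');
        if (dot < 0 || dot == nodeName.length() - 1) {
            return null;
        }
        String ext = nodeName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.get(ext);
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, a node named "report.PDF" whose jcr:mimeType is "application/octet-stream" would resolve to "application/pdf" through the node name, while a specific declared type such as "text/plain" would be returned unchanged.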