Description
We are using Tika to implement our MIME type detection module. The library does not appear to detect Mathematica files, although the documentation lists the format as supported [1]. The Tika detector always returns `text/plain` instead of `application/mathematica`, which is the type given in the documentation and in the unit tests [2].
Performing the same check with the Python code below returns the correct MIME type for any Mathematica file downloaded from the Wolfram Library Archive [3].
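#!/usr/bin/python3
# Print the MIME type that Python's mimetypes module guesses for the given file.
# guess_type() works from the file name's extension; it does not read the file contents.
import mimetypes
import sys
test_file = sys.argv[1]  # path to the Mathematica file under test
print(mimetypes.MimeTypes().guess_type(test_file)[0])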
We therefore suspect a bug in the way the Tika detector guesses the MIME type of Mathematica files.
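One caveat that may help when triaging: Python's `mimetypes` module decides from the file extension alone, so the Python result shows that the extension is mapped, not that the file content is recognized. The following is a minimal sketch of that behavior, not a description of how Tika works. The names `guess_by_extension` and `extra_types` are hypothetical, the `.nb` extension is assumed to be the one used by the test files, and the mapping is registered explicitly because whether it is present by default may depend on the Python version and platform.

```python
#!/usr/bin/python3
import mimetypes


def guess_by_extension(filename, extra_types=None):
    """Return the MIME type that mimetypes derives from the file name alone.

    extra_types is a hypothetical helper parameter: a dict mapping an
    extension (with leading dot) to a MIME type to register before guessing.
    """
    db = mimetypes.MimeTypes()
    for ext, mime in (extra_types or {}).items():
        db.add_type(mime, ext)
    return db.guess_type(filename)[0]


if __name__ == "__main__":
    # The path does not need to exist: only its extension is consulted.
    # The ".nb" mapping below is an assumption made for this illustration.
    print(guess_by_extension("does-not-exist.nb",
                             {".nb": "application/mathematica"}))
```

If Tika is given the file content as well as the name, the two tools may not be answering the same question, so it could be worth checking whether Tika returns `application/mathematica` when only the file name is supplied.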
There is also an existing ticket requesting a Mathematica file detector: https://issues.apache.org/jira/browse/TIKA-1520
References:
[1] https://tika.apache.org/1.23/formats.html
[3] https://library.wolfram.com/infocenter/Courseware/4706/