Details
Type: Task
Status: Resolved
Priority: Minor
Resolution: Fixed
Description
In 1.x, users could send a URL, including a file URL, to tika-server and have tika-server fetch the bytes. In 2.x, we created the tika-pipes modules, included a file fetcher in tika-core, and put an http-fetcher in its own module because of its dependency on httpclient.
To smooth the transition to 2.x, it might be useful to add a URLFetcher that uses Java's built-in URL.openConnection() functionality. I'd want to prohibit the file protocol because of its history as a vulnerability. If folks want to fetch files, they would have to explicitly choose a different fetcher and specify a base path.
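A minimal sketch of the idea, not the fetcher as implemented in Tika: it parses the incoming string, refuses the file protocol, and otherwise opens the stream through the JDK's URL.openConnection(). The class name UrlFetchSketch, the method names checkAllowed and fetch, and the timeout values are all hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical illustration only; this is not the Tika fetcher API.
final class UrlFetchSketch {

    // Parses the string into a URL and rejects the file protocol.
    static URL checkAllowed(String spec) throws IOException {
        URL url;
        try {
            // URI.create and toURL throw IllegalArgumentException for
            // syntactically invalid or non-absolute input.
            url = URI.create(spec).toURL();
        } catch (IllegalArgumentException e) {
            throw new IOException("Not a valid absolute URL: " + spec, e);
        }
        if ("file".equalsIgnoreCase(url.getProtocol())) {
            throw new IOException("The file protocol is not allowed: " + spec);
        }
        return url;
    }

    // Opens a stream with the JDK's built-in URL handling.
    // The timeout values are assumptions, not taken from the issue.
    static InputStream fetch(String spec) throws IOException {
        URL url = checkAllowed(spec);
        URLConnection connection = url.openConnection();
        connection.setConnectTimeout(10_000);
        connection.setReadTimeout(30_000);
        return connection.getInputStream();
    }
}
```

With this sketch, UrlFetchSketch.fetch("file:///etc/passwd") fails with an IOException before any connection is opened, while an http or https URL is passed through to the JDK. A stricter variant could accept only an explicit allow-list of protocols instead of rejecting a single one.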
Issue Links
- is depended upon by: TIKA-3523 A replacement for enableFileUrl or Support for Google Cloud (Open)