- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 1.20
- Fix Version/s: 1.21
- Component/s: None
- Labels: None
When calling "tika.detect(stream, name)" with a stream that is a "TikaInputStream", execution reaches "TikaInputStream#getPath", which does a "Files.copy(in, path, REPLACE_EXISTING);" and so copies the whole stream to a local path. That is very, very bad: the input stream could be, as in our case, a stream over a network file that is tens or hundreds of gigabytes in size, and copying it locally is a huge waste of resources, to say the least. Why does it do that, and can I make it not do it? Or is this something that has to be fixed in Tika?
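To illustrate the alternative being asked for, here is a minimal JDK-only sketch of inspecting a bounded prefix of a stream and then rewinding it, instead of copying the whole stream to disk. It is not Tika's implementation or API: the class `PrefixPeek` and the methods `peek` and `looksLikeZip` are hypothetical names introduced for this example, and the only detection shown is a check for the ZIP local file header signature ("PK" followed by bytes 3 and 4).

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Tika.
final class PrefixPeek {

    /**
     * Reads at most 'limit' bytes from the stream and then resets it,
     * so the caller can still consume the stream from the beginning.
     * The stream must support mark/reset (e.g. a BufferedInputStream).
     */
    public static byte[] peek(InputStream in, int limit) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException(
                    "wrap the stream in a BufferedInputStream first");
        }
        in.mark(limit);
        try {
            byte[] buf = new byte[limit];
            int total = 0;
            // read() may return fewer bytes than requested, so loop
            // until the limit is reached or the stream ends.
            while (total < limit) {
                int n = in.read(buf, total, limit - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
            return Arrays.copyOf(buf, total);
        } finally {
            // Rewind to the marked position; nothing is written to disk.
            in.reset();
        }
    }

    /** Returns true if the prefix starts with the ZIP local file header signature. */
    public static boolean looksLikeZip(byte[] head) {
        return head.length >= 4
                && head[0] == 'P' && head[1] == 'K'
                && head[2] == 3 && head[3] == 4;
    }
}
```

With a network-backed stream wrapped in a "BufferedInputStream", a call such as "PrefixPeek.peek(in, 4)" pulls only the first few bytes over the wire, and memory use is bounded by the chosen limit rather than by the file size. Whether a prefix alone is enough depends on the format; container formats may need more than a header check to be identified precisely.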
- is related to: TIKA-2853 Consider applying NaiveBayes or similar simple ML to streaming zip detector (Open)