Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
This improvement wraps Apache Tika using an updated version of Tim Spann's ExtractTextProcessor. I contacted Tim via LinkedIn, and he agreed to make it part of the NiFi code base going forward. In addition, this ticket adds the include-media profile, which makes it easy to add the NiFi media bundle to a custom build of NiFi.
Issue Links
- relates to: NIFI-10218 ExtractDocumentText processor does not handle certain characters when extracting from a PDF (Open)