Details
Type: New Feature
Status: Open
Priority: Minor
Resolution: Unresolved
Description
Using the boilerpipe library (https://code.google.com/archive/p/boilerpipe/), I created a simple processor that reads the content of a URL and extracts its text into a flowfile.
I think it would be a good complement to the HTML NAR bundle.
Link to my implementation: https://github.com/paulvid/nifi/tree/NIFI-5534/nifi-nar-bundles/nifi-html-bundle/nifi-html-processors/