Apache NiFi / NIFI-5534

Create a NiFi Processor using Boilerpipe Article Extractor


    Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      Using the boilerpipe library (https://code.google.com/archive/p/boilerpipe/), I created a simple processor that reads the content of a URL and extracts its text into a FlowFile.

      I think it is a good complement to the HTML NAR bundle.

      Link to my implementation: https://github.com/paulvid/nifi/tree/NIFI-5534/nifi-nar-bundles/nifi-html-bundle/nifi-html-processors/
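
      To illustrate the extraction step described above, here is a minimal, dependency-free sketch. It is not taken from the linked implementation: the class and method names (ArticleTextSketch, stripTags, toFlowFileContent) are hypothetical, and a naive tag stripper stands in for boilerpipe so the example runs on the JDK alone. With boilerpipe on the classpath, the stand-in could presumably be swapped for ArticleExtractor.INSTANCE.getText(html).

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

/** Hypothetical sketch; none of these names come from the linked implementation. */
final class ArticleTextSketch {

    /**
     * Naive stand-in for an article extractor: drops script/style blocks and
     * all remaining tags, then collapses whitespace. Unlike boilerpipe, it does
     * not try to discard navigation or other page boilerplate.
     */
    static String stripTags(String html) {
        String noScripts = html.replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ");
        String noTags = noScripts.replaceAll("(?s)<[^>]+>", " ");
        return noTags.replaceAll("\\s+", " ").trim();
    }

    /** Applies the extractor and returns the bytes a processor could write as FlowFile content. */
    static byte[] toFlowFileContent(String html, Function<String, String> extractor) {
        return extractor.apply(html).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String html = "<html><head><style>p{color:red}</style></head>"
                + "<body><p>Hello <b>NiFi</b></p></body></html>";
        byte[] content = toFlowFileContent(html, ArticleTextSketch::stripTags);
        System.out.println(new String(content, StandardCharsets.UTF_8)); // prints "Hello NiFi"
    }
}
```

      Passing the extractor as a Function keeps the text-extraction strategy separate from the FlowFile handling, so the stand-in can be replaced without touching the rest of the flow.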

            People

            • Assignee: Unassigned
            • Reporter: Paul Vidal (paulvid3)
            • Votes: 0
            • Watchers: 2

              Dates

              • Created:
                Updated:

                Time Tracking

                • Estimated: 24h
                • Remaining: 24h
                • Logged: Not Specified