Details
Wish
Status: Resolved
Major
Resolution: Not A Problem
ManifoldCF 2.10
None
None
Description
I’ll be crawling a website with the standard Web connector, and I want to extract only the content of certain HTML tags, such as <h1>, <h2> and <p>.
I’ve set up an HTML extractor transformation connector and the internal Tika transformation connector, but I can’t find anywhere to map the content of those tags to the output.
Do I have to write my own transformation connector to extract the content of these tags, or is there a built-in solution?
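To make the desired output concrete, here is a minimal standalone sketch in Python of the kind of extraction I mean. It is not ManifoldCF code and does not use any ManifoldCF API; the names `TagTextExtractor` and `extract_tags` and the sample HTML are hypothetical and exist only for illustration. It uses the standard library's `html.parser` and does not handle nested wanted tags.

```python
from html.parser import HTMLParser


class TagTextExtractor(HTMLParser):
    """Collects the text inside selected tags (hypothetical helper, not a ManifoldCF API)."""

    def __init__(self, wanted=("h1", "h2", "p")):
        super().__init__(convert_charrefs=True)
        self.wanted = set(wanted)
        self.results = []      # (tag, text) pairs in document order
        self._current = None   # the wanted tag that is currently open, if any
        self._buffer = []      # text fragments seen inside the open tag

    def handle_starttag(self, tag, attrs):
        if tag in self.wanted:
            # A new wanted tag implicitly closes an unclosed one (e.g. <p> without </p>).
            if self._current is not None:
                self.flush()
            self._current = tag
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == self._current:
            self.flush()

    def handle_data(self, data):
        # Text inside inline children such as <b> is kept; text outside wanted tags is dropped.
        if self._current is not None:
            self._buffer.append(data)

    def flush(self):
        # Join the fragments and collapse runs of whitespace into single spaces.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.results.append((self._current, text))
        self._current = None
        self._buffer = []


def extract_tags(html, wanted=("h1", "h2", "p")):
    """Return a list of (tag, text) pairs for the wanted tags found in html."""
    parser = TagTextExtractor(wanted)
    parser.feed(html)
    parser.close()
    if parser._current is not None:
        parser.flush()  # emit a wanted tag that was still open at end of input
    return parser.results


if __name__ == "__main__":
    sample = (
        "<html><body><h1>Title</h1><div>skip me</div>"
        "<h2>Sub</h2><p>Some <b>bold</b> text.</p></body></html>"
    )
    for tag, text in extract_tags(sample):
        print(tag, text)
```

In other words, I would like each crawled page to yield only the text of the selected tags, with everything else (such as the `<div>` in the sample) left out.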