Description
The name 'HTMLParserFilter' is slightly confusing: it suggests that implementations of this extension point receive only HTML documents.
The parse-tika plugin calls the HTMLParserFilters and passes them a DOM representation of the XHTML-like documents it gets from the underlying Tika parsers. As a result, the filters receive a DOM representation of documents in any format recognised by Tika, not only HTML.
What about renaming HTMLParserFilter to ParserFilter? Any other suggestions?
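
To make the point concrete, here is a minimal sketch of what a format-neutral filter could look like. It uses only the JDK's DOM classes and does not reproduce the actual Nutch interface: the names ParserFilter, TITLE_FILTER and toFragment, and the filter signature, are hypothetical and chosen for illustration only. The hand-written XHTML string stands in for the XHTML-like output a Tika parser would produce.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ParserFilterSketch {

    /** Hypothetical format-neutral interface; NOT the actual Nutch API. */
    interface ParserFilter {
        String filter(String contentType, DocumentFragment doc);
    }

    /**
     * Builds a DocumentFragment from XHTML-like markup. This stands in for
     * the DOM that a Tika-based parser would hand to the filters.
     */
    static DocumentFragment toFragment(String xhtml) throws Exception {
        Document d = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        DocumentFragment fragment = d.createDocumentFragment();
        // Copy the root element into the fragment, leaving the Document intact.
        fragment.appendChild(d.getDocumentElement().cloneNode(true));
        return fragment;
    }

    /** Example filter: reads the title from the DOM, whatever the source format was. */
    static final ParserFilter TITLE_FILTER = (contentType, doc) -> {
        Element root = (Element) doc.getFirstChild();
        NodeList titles = root.getElementsByTagName("title");
        String title = titles.getLength() > 0 ? titles.item(0).getTextContent() : "";
        return contentType + ": " + title;
    };

    public static void main(String[] args) throws Exception {
        String markup = "<html><head><title>Quarterly report</title></head>"
                + "<body><p>text</p></body></html>";
        // The same filter code runs for a PDF-derived DOM and an HTML-derived DOM.
        System.out.println(TITLE_FILTER.filter("application/pdf", toFragment(markup)));
        System.out.println(TITLE_FILTER.filter("text/html", toFragment(markup)));
    }
}
```

Nothing in the filter body depends on the original document being HTML, which is why a name such as ParserFilter would describe the contract more accurately.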