Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
I am able to use Tika Pipes to extract the text content from a document.
But is it possible to use Tika Pipes to obtain structured output? I believe Tika can produce this as XHTML.
The plain text extracted from a document is great for indexing into a search engine.
But what if you want structured text output such as XHTML instead?