Description
ParserChecker and IndexingFiltersChecker should report when a document is truncated due to
{http,file,ftp}.content.limit.
Truncated content may cause text and metadata extraction to fail for PDF and other binary document formats.
A hint that truncation, rather than a broken plugin, is the likely cause would be useful.
See NUTCH-965 and ParseSegment.isTruncated(content).
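A minimal sketch of the kind of check and hint the checkers could print. It does not use the Nutch API: the class TruncationHint and the methods looksTruncated and describe are hypothetical names chosen for illustration. It assumes truncation can be detected by comparing the Content-Length response header with the number of bytes actually fetched, which may not match exactly what ParseSegment.isTruncated(content) does.

```java
public class TruncationHint {

    /**
     * Hypothetical helper (not part of Nutch): returns true if the declared
     * Content-Length is larger than the number of bytes actually fetched.
     * A missing or unparsable header is treated as "not known to be truncated".
     */
    static boolean looksTruncated(String contentLengthHeader, int fetchedBytes) {
        if (contentLengthHeader == null) {
            return false;
        }
        try {
            long declared = Long.parseLong(contentLengthHeader.trim());
            return declared > fetchedBytes;
        } catch (NumberFormatException e) {
            // Malformed header: we cannot tell, so do not report truncation.
            return false;
        }
    }

    /**
     * Hypothetical helper: builds the hint a checker could print.
     * Returns an empty string when no truncation is detected.
     */
    static String describe(String url, String contentLengthHeader, int fetchedBytes) {
        if (!looksTruncated(contentLengthHeader, fetchedBytes)) {
            return "";
        }
        return "Content of " + url + " was truncated to " + fetchedBytes
            + " bytes (Content-Length: " + contentLengthHeader.trim() + "). "
            + "Parsing may fail for this reason; "
            + "check {http,file,ftp}.content.limit.";
    }

    public static void main(String[] args) {
        // Assumed example values: a 2048-byte PDF cut off at 1024 bytes.
        System.out.println(describe("http://example.com/doc.pdf", "2048", 1024));
    }
}
```

Printing the hint before the parse result would let a user see the probable cause first, instead of concluding from a parse error that the parser plugin is broken.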