Description
tika-eval currently reports token counts and unique token (type) counts. We should also report type counts for alphabetic and common words. If one tool incorrectly duplicates or triplicates content, the repeated tokens inflate the "common_tokens" sum for that tool, making its output look better than it is. A count of unique common words would not be inflated by the repetition.
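To illustrate the difference, here is a minimal Python sketch, not tika-eval's actual implementation. The word list, the `count_stats` helper, and every field name other than "common_tokens" are hypothetical and exist only for this example. It compares token sums with type counts for a clean extract and for the same extract repeated three times.

```python
from collections import Counter

# Hypothetical common-word list, assumed for illustration only.
COMMON_WORDS = {"the", "cat", "sat", "on", "mat"}


def count_stats(tokens, common_words=COMMON_WORDS):
    """Return token sums and type (unique token) counts for a token list."""
    counts = Counter(tokens)
    # Keep only purely alphabetic tokens, with their frequencies.
    alphabetic = {t: n for t, n in counts.items() if t.isalpha()}
    # Keep only tokens found in the common-word list.
    common = {t: n for t, n in counts.items() if t in common_words}
    return {
        "tokens": sum(counts.values()),
        "unique_tokens": len(counts),
        "alphabetic_tokens": sum(alphabetic.values()),
        "unique_alphabetic_tokens": len(alphabetic),
        "common_tokens": sum(common.values()),
        "unique_common_tokens": len(common),
    }


clean = "the cat sat on the mat 42".split()
tripled = clean * 3  # simulates a tool that triplicates the content

clean_stats = count_stats(clean)
tripled_stats = count_stats(tripled)

for name in clean_stats:
    print(name, clean_stats[name], tripled_stats[name])
```

In this sketch the "common_tokens" sum triples for the duplicated extract, while the unique counts stay the same, which is the signal the new type counts are meant to provide.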