Description
It would be useful to compare counts of common structure tags in tika-eval. We could also detect and flag malformed structure tags that we may be generating, e.g. improperly nested tags such as <i><u></i></u>
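As a rough illustration of both ideas, here is a minimal sketch that counts tags and flags nesting problems in an XHTML string. The class and method names (TagStructureCheck, countOpenTags, findNestingErrors) are hypothetical and are not part of tika-eval. It assumes a simple regex scan is good enough, since a strict XML parser would reject the mis-nested input we want to detect, and it ignores comments, CDATA and HTML void elements.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not an existing tika-eval class.
final class TagStructureCheck {

    // group 1: "/" for a closing tag, group 2: tag name, group 3: "/" for a self-closing tag
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9:_.-]*)[^>]*?(/?)>");

    // Counts opening and self-closing tags by lower-cased name.
    static Map<String, Integer> countOpenTags(String xhtml) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = TAG.matcher(xhtml);
        while (m.find()) {
            if (m.group(1).isEmpty()) {
                counts.merge(m.group(2).toLowerCase(Locale.ROOT), 1, Integer::sum);
            }
        }
        return counts;
    }

    // Returns one message per nesting problem found; an empty list means well nested.
    static List<String> findNestingErrors(String xhtml) {
        List<String> errors = new ArrayList<>();
        Deque<String> open = new ArrayDeque<>(); // stack of currently open tag names
        Matcher m = TAG.matcher(xhtml);
        while (m.find()) {
            String name = m.group(2).toLowerCase(Locale.ROOT);
            boolean closing = !m.group(1).isEmpty();
            boolean selfClosing = !m.group(3).isEmpty();
            if (selfClosing) {
                continue; // opens and closes itself, nothing to track
            }
            if (!closing) {
                open.push(name);
            } else if (open.isEmpty()) {
                errors.add("unmatched </" + name + ">");
            } else if (open.peek().equals(name)) {
                open.pop();
            } else {
                errors.add("expected </" + open.peek() + "> but found </" + name + ">");
                // Recover by dropping the matching open tag, if any, so that a
                // single swap is reported once rather than cascading.
                open.removeFirstOccurrence(name);
            }
        }
        for (String name : open) {
            errors.add("unclosed <" + name + ">");
        }
        return errors;
    }
}
```

With this sketch, findNestingErrors("<i><u></i></u>") reports a single problem (the </i> that arrives while <u> is still open), and the maps returned by countOpenTags for two extracts could be compared per tag name.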