1. Tika




The language identifier component for use in content detection and analysis.

Issues: Unresolved

Type         Key       Summary                                                                         Due Date
Improvement  TIKA-369  Improve accuracy of language detection
New Feature  TIKA-491  Add language identification support for Norwegian Bokmål and Norwegian Nynorsk
Bug          TIKA-496  Language identifier profile comparison favors large profiles

Issues: Updated recently

Type         Key        Summary                                                                                        Updated
New Feature  TIKA-1696  Language Identification with Text Processing Toolkit from MITLL
Bug          TIKA-1622  Expose Tika LanguageIdentifier via Tika Server
Improvement  TIKA-1625  Add support to Tika Server for parsing remote file URLs and for providing language detection

Versions: Unreleased

Status      Name  Release date
Unreleased  2.0
Unreleased  1.10