Tika

The language identifier component for use in content detection and analysis.

Issues: Unresolved

Type         Key       Summary                                                                               Due Date
Improvement  TIKA-354  ProfilingHandler should take a length-limiting parameter
Improvement  TIKA-369  Improve accuracy of language detection
New Feature  TIKA-491  Add language identification support for Norwegian Bokmål and Norwegian Nynorsk

Issues: Updated recently

Type         Key        Summary                                                          Updated
Bug          TIKA-1549  Two times speed increase of language profile distance calculation
Bug          TIKA-1405  German content detected as French
Improvement  TIKA-1337  LanguageProfile for Persian/Farsi

Versions: Unreleased

Status      Name  Release date
Unreleased  1.8