Details

    • Sub-task
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • None
    • Bundlelists
    • None

    Description

      After comparing the documentation and supported languages of both engines, I think we should switch from the LangId Engine (based on Apache Tika language detection) to the Langdetect Engine (based on http://code.google.com/p/language-detection/).

      Normal users should not notice any difference, as both engines create the same annotations. However, the latter supports considerably more languages.

      This switch will require many changes to the integration tests, because they check for the LangId Engine in many places. Those checks need to be changed to the Langdetect Engine.

          People

            rwesten Rupert Westenthaler
            rwesten Rupert Westenthaler
