  1. Stanbol (Retired)
  2. STANBOL-980

Add Japanese Language support by using the Solr/Lucene Kuromoji Analyzer

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: commons-0.11.0
    • Fix Version/s: 0.12.0
    • Component/s: Commons, Enhancer
    • Labels: None

    Description

      The most recent Solr/Lucene versions added the Kuromoji Analyzer for Japanese. Integrating it into Stanbol will make it possible to:

      • index and search Entities with Japanese-language labels and texts
      • tokenize Japanese text
      • POS-tag Japanese text
      • perform NER for Persons, Organizations and Places
      • lemmatize Japanese text
      • tokenize labels correctly, as required for linking Japanese labels of Entities

      This will require three modules:

      • an extension to the commons.solr.core module that provides the Kuromoji Analyzer as a bundle
      • an NLP processing Engine
      • a LabelTokenizer implementation

      In addition, a dedicated bundlelist that includes these three modules is needed. This bundlelist should be added by default to the Full Stanbol Launcher.
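
      As a rough illustration of the LabelTokenizer module listed above, the following sketch shows what such a contract might look like. The interface and class names here are hypothetical and are not taken from the Stanbol code base. The JDK's BreakIterator is used only as a stand-in so the example runs without extra dependencies; an actual implementation would delegate to the Kuromoji tokenizer instead, and its segmentation would differ.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical contract, named after the module in this issue; not Stanbol's actual API.
interface LabelTokenizer {
    /** Splits a label into tokens, or returns null if the language is not supported. */
    String[] tokenize(String label, String language);
}

// Stand-in implementation: uses the JDK word BreakIterator where a real
// implementation would call the Kuromoji tokenizer.
final class JapaneseLabelTokenizerSketch implements LabelTokenizer {
    @Override
    public String[] tokenize(String label, String language) {
        if (label == null || !"ja".equals(language)) {
            return null; // assumption: null signals "language not handled here"
        }
        BreakIterator words = BreakIterator.getWordInstance(Locale.JAPANESE);
        words.setText(label);
        List<String> tokens = new ArrayList<>();
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            String token = label.substring(start, end);
            if (!token.trim().isEmpty()) { // drop whitespace-only segments
                tokens.add(token);
            }
        }
        return tokens.toArray(new String[0]);
    }
}

public class LabelTokenizerDemo {
    public static void main(String[] args) {
        LabelTokenizer tokenizer = new JapaneseLabelTokenizerSketch();
        // "Tokyo Tower" written in Japanese, given as Unicode escapes
        String label = "\u6771\u4eac\u30bf\u30ef\u30fc";
        System.out.println(String.join(" | ", tokenizer.tokenize(label, "ja")));
    }
}
```

      Returning null for unsupported languages is one possible design: it would let an entity linking component fall back to a default tokenizer for labels in other languages.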

          People

            Assignee: Rupert Westenthaler (rwesten)
            Reporter: Rupert Westenthaler (rwesten)
            Votes: 0
            Watchers: 1