  1. Stanbol (Retired)
  2. STANBOL-855 Add basic language support for Chinese
  3. STANBOL-854

Add optional support for Chinese via the Solr/Lucene smartcn module

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: commons-0.11.0
    • Component/s: Commons
    • Labels: None

    Description

      Create a Bundle for the Solr/Lucene Smart Chinese analyzer modules

      If this bundle is installed in Stanbol, then users will be able to use Solr field configurations such as

      <!--
        CHINESE (http://wiki.apache.org/solr/LanguageAnalysis#Chinese.2C_Japanese.2C_Korean)
        This requires the {instanceDir}/lib/lucene-smartcn-3.6.1.jar file to be present
      -->
      <fieldType name="text_zh" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
          <tokenizer class="solr.SmartChineseSentenceTokenizerFactory"/>
          <filter class="solr.SmartChineseWordTokenFilterFactory"/>
          <filter class="solr.LowerCaseFilterFactory"/>
          <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        </analyzer>
        <analyzer type="query">
          <tokenizer class="solr.SmartChineseSentenceTokenizerFactory"/>
          <filter class="solr.SmartChineseWordTokenFilterFactory"/>
          <filter class="solr.LowerCaseFilterFactory"/>
          <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
          <filter class="solr.PositionFilterFactory" />
        </analyzer>
      </fieldType>

      with the Stanbol Commons Solr Core module. This will also allow the Entityhub SolrYard to use indexes whose schemas contain such fieldType definitions.
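
      As an illustration only (a sketch, not part of the original issue): once the bundle is installed, a schema.xml used with the Commons Solr Core module could bind a field to the text_zh type roughly as follows. The field name label_zh and the indexed/stored/multiValued settings are assumptions chosen for this example.

      ```xml
      <!-- Hypothetical field definition; the name and flags are illustrative assumptions -->
      <field name="label_zh" type="text_zh" indexed="true" stored="true" multiValued="true"/>
      ```

      Values indexed into such a field would pass through the index analyzer chain of text_zh, and queries against it through the query analyzer chain.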

          People

            Assignee: Rupert Westenthaler (rwesten)
            Reporter: Rupert Westenthaler (rwesten)