[STANBOL-854] Add optional support for Chinese via the Solr/lucene smartcn module - ASF JIRA

Details

Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: commons-0.11.0
Component/s: Commons
Labels: None

Description

Create a Bundle for the Solr/Lucene Smart Chinese analyzer modules

If this Bundle is installed in Stanbol, then users will be able to use Solr field configurations such as

<!--
CHINESE (http://wiki.apache.org/solr/LanguageAnalysis#Chinese.2C_Japanese.2C_Korean)
This requires the {instanceDir}/lib/lucene-smartcn-3.6.1.jar file to be present
-->
<fieldType name="text_zh" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.SmartChineseSentenceTokenizerFactory"/>
    <filter class="solr.SmartChineseWordTokenFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.SmartChineseSentenceTokenizerFactory"/>
    <filter class="solr.SmartChineseWordTokenFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.PositionFilterFactory" />
  </analyzer>
</fieldType>

with the Stanbol Commons Solr Core module. This will also allow the Entityhub SolrYard to use indexes whose schemas contain such fieldType definitions.
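
For illustration, a field in the same schema.xml could reference the text_zh type roughly as follows. This is only a sketch: the field name label_zh is hypothetical and not taken from this issue, and it assumes the fieldType above is already defined in the same schema.

```xml
<!-- Hypothetical field name; assumes the text_zh fieldType is defined in this schema.xml -->
<field name="label_zh" type="text_zh" indexed="true" stored="true" multiValued="true"/>
```

Values indexed into such a field would be segmented by the Smart Chinese tokenizer and word filter at both index and query time.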

People

Assignee: Rupert Westenthaler

Reporter: Rupert Westenthaler

Votes: 0

Watchers: 1

Dates

Created: 22/Dec/12 06:12

Updated: 22/Jan/13 12:17

Resolved: 22/Dec/12 12:43