[STANBOL-1229] Convert all OpenNLP Enhancement Engines to Configuration Factories - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 0.12.0
Fix Version/s: 0.12.0
Component/s: Enhancement Engines
Labels: None

Description

Currently the OpenNLP Sentence Detection and Tokenizer Enhancement Engines do not support OSGi Configuration Factories. As a result, only a single instance of each engine can be configured.

However, this can create problems if one wants to configure multiple Enhancement Chains with different NLP frameworks.

Here is an example:

Chain1:

OpenNLP for English, German and Spanish

Chain2:

Stanford NLP for English
OpenNLP for German
Freeling NLP for Spanish

Since OpenNLP supports all three of these languages, a user would want to set up the following engine configurations:

1. OpenNLP engines for sentence detection, tokenization, POS tagging and chunking that cover all three languages.
2. OpenNLP engines for sentence detection, tokenization, POS tagging and chunking that only process German language texts.
3. RESTful NLP Analysis Engine calling Stanford NLP for English language texts.
4. RESTful NLP Analysis Engine calling Freeling for Spanish language texts.

Chain1 would use the OpenNLP engines configured to process all languages, while Chain2 would use the engine configurations listed under points 2 to 4.

However, because the OpenNLP Tokenizer and Sentence Detection engines do not support OSGi Configuration Factories, this is currently not possible: only a single instance of each of those two engines can be configured.

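Because of that, English and Spanish text sent to Chain2 would be processed by two Sentence Detectors and Tokenizers (the single OpenNLP instance, which has to cover all three languages for Chain1, plus Stanford NLP or Freeling), and this results in duplicate Sentence and Token annotations.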
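The duplication can be illustrated with a small stand-alone model. This is only a sketch and does not use any Stanbol or OSGi API: the `Engine` class, the engine names and the `sentenceDetectorsFor` helper are hypothetical and exist only to count how many sentence-detecting engines of a chain would process a text of a given language.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ChainSketch {

    /** Hypothetical stand-in for one configured engine instance that detects sentences. */
    static final class Engine {
        final String name;
        final Set<String> languages;

        Engine(String name, String... languages) {
            this.name = name;
            this.languages = new HashSet<>(Arrays.asList(languages));
        }
    }

    /** Counts the engines in the chain that would detect sentences for the given language. */
    static long sentenceDetectorsFor(List<Engine> chain, String language) {
        return chain.stream()
                .filter(engine -> engine.languages.contains(language))
                .count();
    }

    /** Chain2 today: the only OpenNLP instance must cover en, de and es (it is shared with Chain1). */
    static List<Engine> chain2WithSingleInstance() {
        return Arrays.asList(
                new Engine("opennlp-sentence", "en", "de", "es"),
                new Engine("stanford-nlp", "en"),
                new Engine("freeling", "es"));
    }

    /** Chain2 with a second OpenNLP instance that is restricted to German. */
    static List<Engine> chain2WithGermanOnlyInstance() {
        return Arrays.asList(
                new Engine("opennlp-sentence-de", "de"),
                new Engine("stanford-nlp", "en"),
                new Engine("freeling", "es"));
    }

    public static void main(String[] args) {
        for (String language : Arrays.asList("en", "de", "es")) {
            System.out.println(language
                    + ": single instance = "
                    + sentenceDetectorsFor(chain2WithSingleInstance(), language)
                    + ", German-only instance = "
                    + sentenceDetectorsFor(chain2WithGermanOnlyInstance(), language));
        }
    }
}
```

In this model English and Spanish texts are handled by two sentence detectors when only one OpenNLP instance exists, and by exactly one once a German-only instance can be configured.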

Adding support for OSGi Configuration Factories to all OpenNLP EnhancementEngines will solve this issue. Existing configurations will not be affected, as all engines already use "ConfigurationPolicy.OPTIONAL" - meaning that a default instance with the default configuration is created automatically.

This issue affects both trunk and the 0.12 release branch.

People

Assignee: Rupert Westenthaler

Reporter: Rupert Westenthaler

Votes: 0

Watchers: 1

Dates

Created: 03/Dec/13 08:02

Updated: 03/Dec/13 09:14

Resolved: 03/Dec/13 09:14