Details
Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Description
As the Stanbol Community manages more and more Enhancement Engines, the currently used default Enhancement Chain (all active EnhancementEngines) becomes increasingly problematic.
To give only some recent examples:
STANBOL-707: a second language identification engine was added, so two Engines now add language annotations with the current default chain.
STANBOL-706 will bring support for DBpedia Spotlight. This includes Engines for Spotting, Entity Candidates and full DBpedia Spotlight annotations. With the current default chain all of those Engines would be included, although one would typically want only one of them within a single Enhancement Chain. In addition, the results of those Engines would be expected to mostly duplicate those produced by the NER and EntityTagging Engines working with the DBpedia default data included with the Stanbol Launcher.
To work around this, the proposal is to:
1. explicitly configure the default EnhancementChain used by the Stanbol Launchers
2. keep the current default chain, which includes all active EnhancementEngines, but ensure that it is no longer used as the default. This chain should be named "all-active".
Those configuration changes should be provided by the "org.apache.stanbol.data.defaultconfig" module.
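The difference between the two chains can be illustrated with a minimal Python sketch. This is not Stanbol code: the function names, the alphabetical ordering of the "all-active" chain and the engine name "langid2" are assumptions made only for illustration.

```python
def all_active_chain(active_engines):
    """Every active engine is part of the chain.

    Sorting alphabetically is an assumption made here to get a stable order.
    """
    return sorted(active_engines)


def engines_for(chain_name, active_engines, configured_chains):
    """Return the engines a chain would run.

    "all-active" is derived from whatever is currently active, while any
    other chain uses only its explicitly configured engine list.
    """
    if chain_name == "all-active":
        return all_active_chain(active_engines)
    return [e for e in configured_chains[chain_name] if e in active_engines]


configured = {"default": ["langid", "ner"]}
active = {"langid", "ner"}

# Activating a second language identification engine
# ("langid2" is a hypothetical name) changes "all-active" only.
active_after = active | {"langid2"}
print(engines_for("all-active", active_after, configured))
print(engines_for("default", active_after, configured))
```

With an explicitly configured default, newly installed engines no longer change the results of requests that rely on the default chain.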
Default Chain configuration:
The Default Chain configuration should include the following Engines:
metaxa;optional
tika;optional
langid
ner
dbpediaLinking
entityhubExtraction
This represents the typical configuration as it already was in the 0.9.0-incubating release.
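How such an engine list might be interpreted can be sketched in Python. This is an illustration under stated assumptions, not the Stanbol implementation: it assumes that engines run in list order, that an entry marked ";optional" is skipped when the engine is not active, and that a missing unmarked engine is an error. `ChainEntry`, `parse_entry` and `resolve` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainEntry:
    name: str
    optional: bool


def parse_entry(line):
    """Parse 'name' or 'name;optional' into a ChainEntry."""
    parts = [p.strip() for p in line.split(";")]
    name, params = parts[0], parts[1:]
    if not name:
        raise ValueError(f"missing engine name in {line!r}")
    return ChainEntry(name=name, optional="optional" in params)


# The engine list proposed in this issue.
DEFAULT_CHAIN = [
    "metaxa;optional",
    "tika;optional",
    "langid",
    "ner",
    "dbpediaLinking",
    "entityhubExtraction",
]


def resolve(chain, active_engines):
    """Return the engines to run, in list order.

    Missing optional engines are skipped (assumed semantics);
    a missing required engine raises LookupError.
    """
    selected = []
    for entry in map(parse_entry, chain):
        if entry.name in active_engines:
            selected.append(entry.name)
        elif not entry.optional:
            raise LookupError(f"required engine not active: {entry.name}")
    return selected


# Example: metaxa is not installed, tika is.
active = {"tika", "langid", "ner", "dbpediaLinking", "entityhubExtraction"}
print(resolve(DEFAULT_CHAIN, active))
```

Marking metaxa and tika as optional lets the same configuration work in launchers that do not bundle those engines.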
Issue Links
- blocks: STANBOL-706 DBpedia Spotlight EnhancementEngines integration (Closed)
- relates to: STANBOL-707 Language detection for CJK languages (Closed)