Details
- Type: New Feature
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Entity Co-Mention Engine
======
The goal of this engine is to extract co-mentions of Entities already detected for a document. The typical example is a person mentioned only by family name after an initial mention with the full name, e.g.
... Barack Obama gave a talk to members of the Labor Union ... Obama specially mentioned ...
Alternate names used to refer to Entities might also be used for extracting co-mentions.
NOTE: this Engine does not use NLP-level co-reference resolution (e.g. linking a pronoun with the Entity it stands for).
Implementation
This Engine will be implemented based on existing Entity linking functionality as implemented by the EntityLinkingEngine. The main difference is that an in-memory EntityMentionIndex will be used as controlled vocabulary to link against. This EntityMentionIndex will implement the EntitySearcher interface as used by the EntityLinkingEngine to search for Entities.
The EntityMentionIndex will contain both fise:TextAnnotations (such as NamedEntities) and fise:EntityAnnotations (Entities suggested for fise:TextAnnotations).
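The idea of an in-memory index used as a controlled vocabulary can be illustrated with a minimal Python sketch. It does not use the Stanbol API: the `Mention` and `MentionIndex` types, the token-based lookup, and the `urn:example:` URI are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mention:
    """An entity mention already detected in the document (hypothetical type)."""
    name: str                         # full surface form, e.g. "Barack Obama"
    start: int                        # character offset where the mention starts
    end: int                          # character offset where the mention ends
    entity_uri: Optional[str] = None  # suggested entity, if one exists


class MentionIndex:
    """In-memory index from lower-cased name tokens to detected mentions."""

    def __init__(self):
        self._by_token = defaultdict(list)

    def add(self, mention):
        # Index every token of the name so that "Obama" finds "Barack Obama".
        for token in mention.name.lower().split():
            self._by_token[token].append(mention)

    def search(self, token, before):
        """Return mentions containing `token` that end at or before offset `before`."""
        candidates = self._by_token.get(token.lower(), [])
        return [m for m in candidates if m.end <= before]


text = ("Barack Obama gave a talk to members of the Labor Union. "
        "Obama specially mentioned the vote.")
name = "Barack Obama"
start = text.index(name)

index = MentionIndex()
index.add(Mention(name, start, start + len(name), "urn:example:barack-obama"))

# Look up the later, family-name-only mention against the index.
offset = text.index("Obama specially")
hits = index.search("Obama", before=offset)
```

Restricting results to mentions that end before the current offset keeps the initial mention from matching itself, so only later occurrences are treated as co-mentions.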
Writing results of the co-mention extraction will involve:
- creating new fise:TextAnnotations with suggested fise:EntityAnnotations (e.g. for additional mentions not previously detected by any other engine)
- modifying existing suggestions for fise:TextAnnotations. For example, if 'Svenson' was linked with "Svenson" (http://rdf.freebase.com/ns/m.0n5rh_s), a fictional character from the 1930 film The Silver Horde, but "Peter Svenson" (http://rdf.freebase.com/ns/m.05wxvv9), an author, was mentioned earlier in the document, the latter would be added as an additional suggestion to the existing TextAnnotation and the confidence values would be adapted accordingly.
- creating relations between enhancements to express entity co-mentions (most likely dc:relation from the co-mention to the initial mention of an Entity).
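The second and third steps above can be sketched in Python using plain dictionaries in place of RDF enhancements. This is only an illustration: the dictionary layout, the `urn:example:` identifiers, and the `penalty` rule for adapting confidence values are assumptions, since the description does not specify how confidences are adjusted.

```python
def add_co_mention_suggestion(annotation, entity_uri, confidence,
                              initial_mention_id, penalty=0.5):
    """Add a co-mention suggestion to an existing text annotation (a dict)."""
    suggestions = annotation.setdefault("suggestions", {})

    # Assumed rule: down-weight suggestions that compete with the co-mention.
    for uri in suggestions:
        if uri != entity_uri:
            suggestions[uri] *= penalty

    # Keep the higher value if the entity was already suggested.
    suggestions[entity_uri] = max(confidence, suggestions.get(entity_uri, 0.0))

    # Stand-in for a dc:relation from the co-mention to the initial mention.
    relations = annotation.setdefault("relations", [])
    if initial_mention_id not in relations:
        relations.append(initial_mention_id)
    return annotation


SVENSON = "http://rdf.freebase.com/ns/m.0n5rh_s"         # fictional character
PETER_SVENSON = "http://rdf.freebase.com/ns/m.05wxvv9"   # author mentioned earlier

annotation = {
    "id": "urn:example:text-annotation-2",   # hypothetical identifier
    "selected_text": "Svenson",
    "suggestions": {SVENSON: 0.8},
}

add_co_mention_suggestion(annotation, PETER_SVENSON, 0.9,
                          initial_mention_id="urn:example:text-annotation-1")

# Pick the suggestion with the highest confidence after the update.
best = max(annotation["suggestions"], key=annotation["suggestions"].get)
```

After the update the earlier-mentioned author ranks above the previously linked entity, and the annotation records a relation back to the initial mention.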
Issue Links
- is broken by: STANBOL-1091 EntityLinking Engine should not process the same tokens twice (Resolved)
- is related to: STANBOL-1179 Extend Entity Co-Mention engine to support alternate names of detected Entities (Open)