[STANBOL-1121] Event extraction Enhancement Engine - ASF JIRA

Details

Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: Enhancement Engines
Labels:
- extraction
- triple

Description

Functionality
=========

Develop an Enhancement Engine that constructs a formal knowledge representation from natural language text. The extracted knowledge takes the form of triples (Subject-Verb-Object). The engine is mainly concerned with representing real-world events.

Example:
Given the text "Google buys Youtube", the extracted triple is: Google = subject, buys = verb, Youtube = object.

Implementation
===========

Triple Extraction
-----------------------
The following steps will be applied to the natural language text in order to extract the triples:
+ Named entity extraction
+ Co-reference resolution of those named entities
+ POS tagging or dependency trees to determine which verbs and objects are connected to the named entities.

Based on the last step we would have the set of triples.
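
As a rough illustration of the last step, the sketch below pairs each verb with the nearest named entity on either side. It is only a toy: the `Token`, `Triple` and `extract_triples` names and the coarse POS tags are assumptions made for this example, the tokens are tagged by hand rather than by an NLP pipeline, and co-reference resolution is left out. A real engine would most likely rely on dependency trees rather than token adjacency.

```python
from typing import List, NamedTuple, Set


class Token(NamedTuple):
    text: str
    pos: str  # coarse POS tag such as "PROPN" or "VERB" (hypothetical tag set)


class Triple(NamedTuple):
    subject: str
    verb: str
    obj: str


def extract_triples(tokens: List[Token], entities: Set[str]) -> List[Triple]:
    """Pair each verb with the nearest named entity before and after it."""
    triples = []
    for i, tok in enumerate(tokens):
        if tok.pos != "VERB":
            continue
        # Nearest entity to the left of the verb is taken as the subject.
        left = next((t.text for t in reversed(tokens[:i]) if t.text in entities), None)
        # Nearest entity to the right of the verb is taken as the object.
        right = next((t.text for t in tokens[i + 1:] if t.text in entities), None)
        if left is not None and right is not None:
            triples.append(Triple(left, tok.text, right))
    return triples


# Hand-tagged input standing in for the output of the NER and POS steps.
tokens = [Token("Google", "PROPN"), Token("buys", "VERB"), Token("Youtube", "PROPN")]
entities = {"Google", "Youtube"}
print(extract_triples(tokens, entities))
```

The adjacency heuristic breaks down on passive voice or nested clauses, which is one reason the dependency tree support mentioned above matters.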

Formal representation of triples
---------------------------------------------
The formal representation of the triples will be based on the DOLCE foundational ontology, using the following data structures:

* fise:SettingAnnotation
  * {fise:Enhancement} metadata

Describes the context of the data.

* fise:ParticipantAnnotation
  * {fise:Enhancement} metadata
  * fise:inSetting {settingAnnotation}
  * fise:hasMention {textAnnotation}
  * fise:suggestion {entityAnnotation} (multiple if there are more suggestions)
  * dc:type one of fise:Agent, fise:Patient, fise:Instrument, fise:Cause

Describes the participants from the context. In our example these would be "Google" and "Youtube". In the DOLCE ontology these would be the Endurants.

* fise:OccurrentAnnotation
  * {fise:Enhancement} metadata
  * fise:inSetting {settingAnnotation}
  * fise:hasMention {textAnnotation}
  * dc:type set to fise:Activity
  * ??:hasRelations (describes the participants linked to this occurrent - TBD)

Describes the action performed by the participants. In our example this would be "buys". In the DOLCE ontology this would be the Perdurant.
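
To make the proposed structure more concrete, here is a minimal sketch of the annotations for "Google buys Youtube" written as plain (subject, predicate, object) statements. The `build_event_annotations` function, the `urn:example:` identifiers and the use of prefixed names as plain strings are assumptions made for illustration only and are not part of the Stanbol API. The `fise:suggestion` link and the still undecided `??:hasRelations` property are left out.

```python
from typing import List, Tuple

Statement = Tuple[str, str, str]


def build_event_annotations(agent: str, activity: str, patient: str) -> List[Statement]:
    """Build setting, occurrent and participant annotations for one event."""
    setting = "urn:example:setting-1"      # hypothetical identifier
    occurrent = "urn:example:occurrent-1"  # hypothetical identifier
    statements: List[Statement] = [
        (setting, "rdf:type", "fise:SettingAnnotation"),
        (occurrent, "rdf:type", "fise:OccurrentAnnotation"),
        (occurrent, "fise:inSetting", setting),
        (occurrent, "fise:hasMention", f"urn:example:text-annotation:{activity}"),
        (occurrent, "dc:type", "fise:Activity"),
    ]
    roles = [(agent, "fise:Agent"), (patient, "fise:Patient")]
    for index, (mention, role) in enumerate(roles, start=1):
        participant = f"urn:example:participant-{index}"
        statements += [
            (participant, "rdf:type", "fise:ParticipantAnnotation"),
            (participant, "fise:inSetting", setting),
            (participant, "fise:hasMention", f"urn:example:text-annotation:{mention}"),
            (participant, "dc:type", role),
        ]
    return statements


for statement in build_event_annotations("Google", "buys", "Youtube"):
    print(statement)
```

Both participants and the occurrent point at the same setting, which is what groups them into one event until the relation property is defined.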

For further information see also the Mail Thread related to this Issue: http://markmail.org/message/qed6y5avbymvmmgu

Issue Links

depends upon
- STANBOL-1132 Add co-reference resolution and dependency tree support in the Stanbol NLP processing API (Resolved)
- STANBOL-1133 Extend the Stanford NLP API with support for creating coref resolution and dependency tree info (Closed)

relates to
- STANBOL-1295 YAGO Integration in Stanbol (Resolved)

People

Assignee: Rupert Westenthaler

Reporter: Cristian Aurelian Petroaca

Votes: 1

Watchers: 3

Dates

Created: 23/Jun/13 11:26

Updated: 22/Jul/15 08:54