[STANBOL-1157] Freebase Disambiguation Algorithm - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: None
Component/s: Enhancement Engines, Enhancer, Entityhub
Labels: None

Description

The disambiguation algorithm should take into account two scores: a local disambiguation score (comparing in some way the document context with the contexts provided by the Wikilinks resource) and a global disambiguation score computed by a graph-based algorithm over the Freebase graph imported into a Neo4j database. Each score would carry a different weight in the final disambiguation score for each entity. For each TextAnnotation, the algorithm's steps can be the following:

1. Local score: for each EntityAnnotation, retrieve from the Wikilinks database all the contexts associated with the referenced entity. Compare the mention context (selected-context) with the Wikilinks contexts using a similarity or distance measure.

2. Global score: build a subgraph with all the possible entities and their relations in Freebase. Extract a set of possible solutions from that graph (note: a solution should include only one entity annotation for each text annotation). Compute the Dijkstra distance between each pair of entities belonging to a possible solution.

3. Weights normalization and confidence values refinement.
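
The three steps above could be sketched roughly as follows. This is only an illustration under several assumptions that the issue does not fix: bag-of-words cosine similarity stands in for the unspecified context comparison, a plain in-memory dictionary stands in for the Freebase subgraph stored in Neo4j, the coherence formula `1 / (1 + mean pairwise distance)` and the equal default weights are arbitrary choices, and all function names and entity identifiers are hypothetical. Enumerating every solution is exponential in the number of text annotations, so a real engine would need to prune candidates.

```python
import heapq
import math
from collections import Counter
from itertools import combinations, product


def cosine(a, b):
    """Bag-of-words cosine similarity (assumed measure, not from the issue)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def local_score(mention_context, entity_contexts):
    """Step 1: best similarity between the mention context and any stored context."""
    return max((cosine(mention_context, c) for c in entity_contexts), default=0.0)


def dijkstra(graph, source):
    """Shortest-path distances from source; graph is {node: {neighbour: weight}}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, math.inf):
            continue  # stale heap entry
        for neighbour, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbour, math.inf):
                dist[neighbour] = nd
                heapq.heappush(heap, (nd, neighbour))
    return dist


def global_scores(graph, candidates):
    """Step 2: score each entity by the most coherent solution that contains it.

    candidates maps each text annotation to its candidate entities; a solution
    picks exactly one entity per text annotation.
    """
    dist = {e: dijkstra(graph, e) for ents in candidates.values() for e in ents}
    best = {e: 0.0 for e in dist}
    for solution in product(*candidates.values()):
        pairs = list(combinations(solution, 2))
        total = sum(dist[a].get(b, math.inf) for a, b in pairs)
        # Unreachable pairs give an infinite total and therefore coherence 0.
        coherence = 1.0 / (1.0 + total / len(pairs)) if pairs else 0.0
        for e in solution:
            best[e] = max(best[e], coherence)
    return best


def disambiguate(graph, candidates, local, w_local=0.5, w_global=0.5):
    """Step 3: weighted sum, normalised so each annotation's candidates sum to 1."""
    glob = global_scores(graph, candidates)
    result = {}
    for mention, ents in candidates.items():
        raw = {e: w_local * local[e] + w_global * glob[e] for e in ents}
        total = sum(raw.values())
        result[mention] = {e: (s / total if total else 0.0) for e, s in raw.items()}
    return result


# Toy data: hypothetical entity ids and an undirected graph with unit weights.
graph = {
    "paris_city": {"france": 1},
    "france": {"paris_city": 1},
    "paris_hilton": {},
}
candidates = {"Paris": ["paris_city", "paris_hilton"], "France": ["france"]}
local = {"paris_city": 0.6, "paris_hilton": 0.4, "france": 1.0}
confidences = disambiguate(graph, candidates, local)
```

In this toy input the candidate `paris_city` is connected to `france` while `paris_hilton` is not, so the global score raises the former's confidence for the mention "Paris" above what the local score alone would give.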

Attachments

gsoc-freebase-disambiguation-engine-1.0-SNAPSHOT.zip (32 kB), 27/Sep/13 12:17, Antonio David Pérez Morales

People

Assignee: Unassigned

Reporter: Rafa Haro

Votes: 0

Watchers: 3

Dates

Created: 11/Sep/13 16:28

Updated: 03/Oct/13 15:12

Resolved: 03/Oct/13 15:12

Time Tracking

Estimated: 672h

Remaining: 672h

Logged: Not Specified