Details

    Description

      The disambiguation algorithm should take into account a local disambiguation score (comparing in some way the document context with the contexts provided by Wikilinks resource) and a global disambiguation score computed by a graph based algorithm using the Freebase graph imported in a Neo4j database. Each disambiguation score would have a different weight in the final disambiguation store for each entity. The algorithm's steps, for each TextAnnotation, can be the following:

      1. Local score: for each EntityAnnotation, retrieves from Wikilinks database all the contexts associated to the referenced entity. Compare (similarity, distance....) the mention context (selected-context) with the wikilinks contexts.

      2. Global score: build a subgraph with all the possible entities and its relations in Freebase. Extract a set of possibles solutions from such graph (note: a solution should include only one entity annotation for each text annotation). Compute the Dijsktra distance between each pair of entities belonging to a possible solution.

      3. Weights normalization and confidence values refinement.

      Attachments

        1. gsoc-freebase-disambiguation-engine-1.0-SNAPSHOT.zip
          32 kB
          Antonio David Pérez Morales

        Activity

          People

            Unassigned Unassigned
            rafaharo Rafa Haro
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated:
                Original Estimate - 672h
                672h
                Remaining:
                Remaining Estimate - 672h
                672h
                Logged:
                Time Spent - Not Specified
                Not Specified