  1. Stanbol (Retired)
  2. STANBOL-509

TextAnnotations should use PlainLiterals instead of TypedLiterals for the selected-text and context

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0-incubating
    • Component/s: Enhancer
    • Labels: None

    Description

      Currently, all EnhancementEngines that create TextAnnotations use TypedLiterals of type xsd:string for the values of the fise:selected-text and fise:selection-context properties. However, both values are natural-language text, so it would be better to use PlainLiterals and to add the language detected for the parsed content.

      Example:

      Parsed content: "The Stanbol enhancer can detect famous cities such as Paris and people such as Bob Marley."
      Detected language: "en"
      Text Annotations: "Paris" and "Bob Marley"

      Currently, the selection context and the selected text are represented like this:

      <fise:selection-context rdf:datatype="http://www.w3.org/2001/XMLSchema#string">The Stanbol enhancer can detect famous cities such as Paris and people such as Bob Marley.</fise:selection-context>
      <fise:selected-text rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Bob Marley</fise:selected-text>

      After this issue is resolved, the same information would be represented like this:

      <fise:selection-context xml:lang="en">The Stanbol enhancer can detect famous cities such as Paris and people such as Bob Marley.</fise:selection-context>
      <fise:selected-text xml:lang="en">Bob Marley</fise:selected-text>

      Advantages:

      • The suggested representation is more in line with the semantic meaning of the values, which are natural-language text.
      • Engines that consume text selections could use the language provided by the current TextAnnotation. This would make it possible to search correctly for entities in documents that contain passages in multiple languages.
      • Such engines could still use the language annotation for the document as a fallback if a TextAnnotation provides no language (backward compatibility).
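
      The fallback described in the last two points could look roughly like the sketch below. It is an illustration in Python using only the standard library, not the actual engine implementation. The function name selection_language, the wrapper element ann, and the fise namespace URI used here are assumptions made for the example.

```python
# Sketch only: choose the language for a text selection, preferring the
# xml:lang on fise:selected-text and falling back to the document language.
import xml.etree.ElementTree as ET

# Assumed namespace URI for the fise prefix; adjust to the real ontology URI.
FISE_NS = "http://fise.iks-project.eu/ontology/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def selection_language(annotation_xml: str, document_lang: str) -> str:
    """Return the selection's own language if present, else the document's."""
    root = ET.fromstring(annotation_xml)
    selected = root.find(f"{{{FISE_NS}}}selected-text")
    if selected is not None:
        lang = selected.get(XML_LANG)
        if lang:
            return lang
    # Old-style typed literals carry no language: use the document annotation.
    return document_lang


# "ann" is a hypothetical wrapper element used only to declare the prefixes.
NEW_STYLE = (
    f'<ann xmlns:fise="{FISE_NS}">'
    '<fise:selected-text xml:lang="en">Bob Marley</fise:selected-text>'
    "</ann>"
)
OLD_STYLE = (
    f'<ann xmlns:fise="{FISE_NS}" xmlns:rdf="{RDF_NS}">'
    '<fise:selected-text rdf:datatype="http://www.w3.org/2001/XMLSchema#string">'
    "Bob Marley</fise:selected-text>"
    "</ann>"
)

if __name__ == "__main__":
    print(selection_language(NEW_STYLE, "de"))
    print(selection_language(OLD_STYLE, "de"))
```

      With a plain literal the selection's own language wins, while an old-style typed literal falls through to the document language, so existing annotations keep working.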

          People

            rwesten Rupert Westenthaler
