Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
The SPARQL query:
PREFIX enhancer: <http://fise.iks-project.eu/ontology/>
PREFIX dc: <http://purl.org/dc/terms/>
SELECT ?textAnnotation ?text ?entity ?entity_label ?confidence
WHERE
.
?textAnnotation enhancer:selected-text ?text .
OPTIONAL
}
ORDER BY ?text
becomes very inefficient on the in-memory RDF model returned by ContentItem.getMetadata.
On a Content enhanced with about 150 TextAnnotations and 200 EntityAnnotations, executing this query for all types supported by the UI (Person, Organizations, Places, Concepts and Others) took about 20 seconds, while the enhancement process itself required about 1 second.
I know this part is only used for the HTTP interface at "http://{stanbol-instance}/engines" and therefore does not influence the performance of the RESTful services.
However, exactly this interface is usually the first point of contact for potential users of Apache Stanbol. It is therefore very likely that people get a very "wrong" impression of Stanbol's performance if they try to parse longer texts that result in a lot of Enhancements.
Because of that, I will replace the current implementation with another one that does not require the use of SPARQL.
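As a rough illustration of what a SPARQL-free lookup could look like, the sketch below walks the triples once and builds hash-based indexes instead of evaluating a query per type. It is not the actual Stanbol change: Triple and AnnotationIndex are hypothetical stand-ins rather than Clerezza or Stanbol classes, and the use of dc:relation to link an EntityAnnotation to its TextAnnotation is an assumption, since that part of the query is not shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for an RDF triple; NOT the Clerezza/Stanbol API.
final class Triple {
    final String subject;
    final String predicate;
    final String object;

    Triple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

// Hypothetical helper: groups entity annotations by selected text in one pass.
final class AnnotationIndex {
    static final String SELECTED_TEXT =
            "http://fise.iks-project.eu/ontology/selected-text";
    // Assumption: entity annotations point to their text annotation via dc:relation.
    static final String RELATION = "http://purl.org/dc/terms/relation";

    static Map<String, List<String>> entitiesByText(List<Triple> graph) {
        Map<String, String> textOf = new HashMap<>();           // textAnnotation -> text
        Map<String, List<String>> entitiesOf = new HashMap<>(); // textAnnotation -> entities

        // Single pass over the graph instead of a join evaluated by a query engine.
        for (Triple t : graph) {
            if (SELECTED_TEXT.equals(t.predicate)) {
                textOf.put(t.subject, t.object);
            } else if (RELATION.equals(t.predicate)) {
                entitiesOf.computeIfAbsent(t.object, k -> new ArrayList<>())
                          .add(t.subject);
            }
        }

        // TreeMap keeps keys sorted, mirroring ORDER BY ?text.
        Map<String, List<String>> result = new TreeMap<>();
        for (Map.Entry<String, String> e : textOf.entrySet()) {
            // Text annotations without entities keep an empty list (like OPTIONAL).
            List<String> entities =
                    entitiesOf.getOrDefault(e.getKey(), Collections.<String>emptyList());
            result.computeIfAbsent(e.getValue(), k -> new ArrayList<>())
                  .addAll(entities);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Triple> graph = Arrays.asList(
                new Triple("urn:ta1", SELECTED_TEXT, "Paris"),
                new Triple("urn:ta2", SELECTED_TEXT, "Bob"),
                new Triple("urn:ea1", RELATION, "urn:ta1"));
        System.out.println(entitiesByText(graph));
    }
}
```

The cost of this approach grows linearly with the number of triples, which is the property that matters for content items with several hundred annotations.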