Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Description
STANBOL-607 changed the encoding of natural language constraints that consist of multiple words: they are now encoded as a quoted phrase ("Frankfurt am Main") instead of as a conjunction of single words (Frankfurt AND am AND Main).
However, the implementation does not correctly put quotes around multi-word tokens.
Because of that, a query for the rdfs:label "Frankfurt am Main" is encoded as
(_!@/rdfs\:label/:Frankfurt am Main)
instead of
(_!@/rdfs\:label/:"Frankfurt am Main")
which causes Solr to search for
- "Frankfurt" in the values of rdfs:label OR
- "am" in the full text field OR
- "Main" in the full text field
instead of "Frankfurt am Main" in the values of rdfs:label.
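The intended quoting behaviour can be sketched as follows. This is only an illustration, not the actual SolrYard code: the class PhraseQuoting and the methods encodeValue and encodeConstraint are hypothetical names, and the sketch assumes the field name has already been escaped by the caller. It shows only the step that was missing: a value containing whitespace is wrapped in double quotes (with embedded quotes and backslashes escaped) so that Solr treats it as one phrase within the field.

```java
public final class PhraseQuoting {

    private PhraseQuoting() {}

    /**
     * Hypothetical helper (not part of Stanbol): wraps a constraint value in
     * double quotes when it contains whitespace, so that the query parser
     * treats it as a single phrase. Single-word values are returned as-is;
     * escaping of other special characters is out of scope for this sketch.
     */
    public static String encodeValue(String value) {
        String trimmed = value.trim();
        boolean multiWord = false;
        for (int i = 0; i < trimmed.length(); i++) {
            if (Character.isWhitespace(trimmed.charAt(i))) {
                multiWord = true;
                break;
            }
        }
        if (!multiWord) {
            return trimmed;
        }
        // Escape backslashes first, then embedded double quotes.
        String escaped = trimmed.replace("\\", "\\\\").replace("\"", "\\\"");
        return '"' + escaped + '"';
    }

    /**
     * Builds "field:value". The field name is assumed to be escaped already
     * (for example "_!@/rdfs\:label/" as in this issue).
     */
    public static String encodeConstraint(String encodedField, String value) {
        return encodedField + ':' + encodeValue(value);
    }

    public static void main(String[] args) {
        // Prints the field/value pair with the multi-word value quoted.
        System.out.println(encodeConstraint("_!@/rdfs\\:label/", "Frankfurt am Main"));
    }
}
```

With the quotes in place, all three words stay bound to the rdfs:label field instead of "am" and "Main" falling through to the full text field.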
Sadly, all unit tests pass because, for the DBpedia test data used, Solr ranking "ensures" that the wrongly encoded query returns the same result as a correctly encoded one.
However, on bigger data sets with more data in the full text field, this has a big impact on query results.
NOTE: the released 0.9.0-incubating version is NOT affected, because this bug was only introduced in trunk while working on 0.10.0!
Issue Links
- is broken by: STANBOL-607 SolrYard should use quotes instead of AND for multi word TextConstraints (Closed)