Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
1.4
-
None
-
None
Description
Symptom: searching for a string like a domain name containing a '.', the Solr 1.4 analyzer tells me that I will get a match, but when I enter the search either in the client or directly in Solr, the search fails.
test string: Identi.ca
queries that fail: IdentiCa, Identi.ca, Identi-ca
query that matches: Identi ca
schema in use is:
http://drupalcode.org/viewvc/drupal/contributions/modules/apachesolr/schema.xml?revision=1.1.2.1.2.34&content-type=text%2Fplain&view=co&pathrev=DRUPAL-6--1
Screen shots:
analysis: http://img.skitch.com/20100327-nt1uc1ctykgny28n8bgu99h923.png
dismax search: http://img.skitch.com/20100327-byiduuiry78caka7q5smsw7fp.png
dismax search: http://img.skitch.com/20100327-gckm8uhjx3t7px31ygfqc2ugdq.png
standard search: http://img.skitch.com/20100327-usqyqju1d12ymcpb2cfbtdwyh.png
Whether or not the bug appears is determined by the surrounding text:
"would be great to have support for Identi.ca on the follow block"
fails to match "Identi.ca", but putting the content on its own or in another sentence:
"Support Identi.ca"
the search matches. Testing suggests the word "for" is the problem, and it looks like the bug occurs when a stop word preceeds a word that is split up using the word delimiter filter.
Setting enablePositionIncrements="false" in the stop filter and reindexing causes the searches to match.
According to Mark Miller in #solr, this bug appears to be fixed already in Solr trunk, either due to the upgraded lucene or changes to the WordDelimiterFactory