Type: Bug
Status: Closed
Priority: Major
Resolution: Won't Fix
Affects Version/s: 2.3.1, 2.3.2
Fix Version/s: None
Component/s: modules/analysis
Labels: None
Environment: Java 5
Lucene Fields: New

As of 2.3.1, the documentation for StandardTokenizer states that it "Recognizes email addresses and internet hostnames as one token."
However, hostnames such as "my-host.com" are recognized as two tokens, "my" and "host.com", rather than one.
Any hostname with a dash in it is affected in the same way.
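
The behavior described above can be illustrated without Lucene on the classpath. The sketch below is not StandardTokenizer's code. It assumes, for illustration only, a host rule of the form "alphanumeric runs joined by dots", in which a dash is not a valid host character and therefore acts as a separator. The class name HyphenHostSketch, the TOKEN pattern, and the tokenize helper are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HyphenHostSketch {
    // Hypothetical stand-in for a host rule: alphanumeric runs joined by dots.
    // The dash is not part of the pattern, so it ends the current token.
    private static final Pattern TOKEN =
        Pattern.compile("[A-Za-z0-9]+(?:\\.[A-Za-z0-9]+)*");

    // Returns every maximal match of TOKEN in the input, in order.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("my-host.com")); // prints [my, host.com]
        System.out.println(tokenize("myhost.com"));  // prints [myhost.com]
    }
}
```

Under this assumed rule, a hostname without a dash stays in one piece, while "my-host.com" comes out as "my" and "host.com", matching the tokens reported in this issue.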

is related to: LUCENE-1373 Most of the contributed Analyzers suffer from invalid recognition of acronyms. (Resolved)