Details
- Task
- Status: Closed
- Major
- Resolution: Fixed
- 4.0
- None
- New
Description
LUCENE-4220 added nekohtml as the new parser for HTML files in the benchmark module. Unfortunately, the nekohtml parser had the well-known lowercase dotless-i bug when running under the Turkish locale.
Version 1.9.17 of nekohtml fixes this bug and was released a few days ago (http://nekohtml.sourceforge.net/changes.html). This issue will update the dependency to that version and remove the workaround.
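For readers unfamiliar with the dotless-i bug, here is a minimal sketch of the general problem in plain Java. It is not nekohtml's actual code, and the class and method names (`DotlessIDemo`, `lowerWithLocale`, `lowerRoot`) are made up for illustration. It only shows why locale-sensitive lowercasing of an identifier such as an HTML tag name goes wrong under the Turkish locale, where uppercase `I` lowercases to dotless `ı` (U+0131).

```java
import java.util.Locale;

public class DotlessIDemo {
    // Locale-sensitive lowercasing: the result depends on the locale passed in.
    public static String lowerWithLocale(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    // Locale-independent lowercasing, the usual choice for identifiers
    // such as HTML tag or attribute names.
    public static String lowerRoot(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        String tag = "TITLE";

        // Under Turkish rules the 'I' becomes dotless 'ı' (U+0131),
        // so the result no longer matches the ASCII tag name.
        System.out.println("title".equals(lowerWithLocale(tag, turkish)));

        // With Locale.ROOT the result is the expected ASCII string.
        System.out.println("title".equals(lowerRoot(tag)));
    }
}
```

Running `main` prints `false` and then `true`. Code that calls `toLowerCase()` without an explicit locale picks up the JVM default, which is how this kind of bug typically shows up only on machines configured for Turkish.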
Attachments
Issue Links
- relates to: LUCENE-4220 Replace benchmarks crazy HTML parser by a nekohtml 10-liner (Closed)