  1. Lucene - Core
  2. LUCENE-4589

Upgrade the benchmark module's nekohtml and remove the Turkish HTML element lowercasing workaround

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0
    • Fix Version/s: 4.1, 6.0
    • Component/s: modules/benchmark
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      LUCENE-4220 added nekohtml as the new parser for HTML files in the benchmark module. Unfortunately, the nekohtml parser had the well-known lowercase dotless-i bug when running under the Turkish locale.

      Version 1.9.17 of nekohtml fixes this bug and was released a few days ago (http://nekohtml.sourceforge.net/changes.html). This issue will update it and remove the workaround.
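
      For readers unfamiliar with the dotless-i problem, the following is a minimal, self-contained sketch of the general pitfall in Java. It is not the nekohtml code nor the workaround removed by this issue. The class and method names (TurkishLowercaseDemo, lowerIn, lowerRoot, isTitleTag) are hypothetical and exist only for illustration. The sketch assumes an HTML element name is lowercased before being compared with a known tag name.

```java
import java.util.Locale;

// Hypothetical demo class; not part of Lucene or nekohtml.
public class TurkishLowercaseDemo {

    // Lowercases using an explicit locale, as locale-sensitive code effectively does.
    static String lowerIn(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    // Locale-independent lowercasing, the usual fix for identifiers such as tag names.
    static String lowerRoot(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    // Simulates matching a raw element name against the known tag "title".
    static boolean isTitleTag(String rawName, Locale locale) {
        return "title".equals(lowerIn(rawName, locale));
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Under the Turkish locale, 'I' lowercases to dotless 'ı' (U+0131),
        // so "TITLE" no longer matches "title".
        System.out.println(isTitleTag("TITLE", turkish));

        // Under the root locale, the match succeeds.
        System.out.println(isTitleTag("TITLE", Locale.ROOT));
        System.out.println(lowerRoot("TITLE").equals("title"));
    }
}
```

      Passing the locale explicitly keeps the sketch independent of the JVM default locale, so the behavior can be observed without changing system settings.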

              People

              • Assignee:
                thetaphi Uwe Schindler
                Reporter:
                thetaphi Uwe Schindler
              • Votes:
                0
                Watchers:
                1