Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5, 4.0-ALPHA
    • Fix Version/s: 3.6, 4.0-ALPHA
    • Component/s: modules/analysis
    • Labels: None
    • Lucene Fields: New, Patch Available

    Description

      A JFlex-based HTMLStripCharFilter replacement would be more performant and easier to understand and maintain.

      Attachments

        1. BaselineWarcTest.java
          3 kB
          Steven Rowe
        2. HTMLStripCharFilterWarcTest.java
          4 kB
          Steven Rowe
        3. jenkins_test.patch
          1 kB
          Robert Muir
        4. JFlexHTMLStripCharFilterWarcTest.java
          4 kB
          Steven Rowe
        5. LUCENE-3690.patch
          2.41 MB
          Steven Rowe
        6. LUCENE-3690.patch
          2.36 MB
          Steven Rowe
        7. LUCENE-3690.patch
          924 kB
          Steven Rowe
        8. LUCENE-3690.patch
          230 kB
          Steven Rowe
        9. LUCENE-3690.patch
          229 kB
          Steven Rowe
        10. LUCENE-3690-handle-utf16-surrogates.patch
          13 kB
          Steven Rowe

            People

              Assignee: Steven Rowe (sarowe)
              Reporter: Steven Rowe (sarowe)
              Votes: 3
              Watchers: 2

