Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description


      On Solr trunk, all CharFilters have been non-functional since LUCENE-3396 was committed in r1175297 on 25 Sept 2011, until Yonik's fix today in r1235810; Solr 3.x was not affected - CharFilters have been working there all along.
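
      To illustrate the symptom, here is a minimal Python sketch of an analysis chain in which char filters are supposed to run on the raw text before the tokenizer sees it. This is a toy model, not Solr or Lucene code: the names `strip_html`, `tokenize`, and `analyze`, and the `apply_char_filters` flag, are hypothetical, and the regex-based stripping is far simpler than what HTMLStripCharFilterFactory actually does.

      ```python
      import re


      def strip_html(text):
          """Toy char filter (assumption, not Solr's implementation):
          replace tags and character entities such as &reg; with spaces."""
          text = re.sub(r"<[^>]+>", " ", text)
          return re.sub(r"&#?\w+;", " ", text)


      def tokenize(text):
          """Toy tokenizer: lowercase, then split on whitespace."""
          return text.lower().split()


      def analyze(text, char_filters=(), apply_char_filters=True):
          """Run char filters on the raw text, then tokenize.

          apply_char_filters=False mimics the reported behavior, where the
          configured char filters are never invoked.
          """
          if apply_char_filters:
              for char_filter in char_filters:
                  text = char_filter(text)
          return tokenize(text)


      raw = "<b>Acme&reg;</b> Widgets&trade;"
      working = analyze(raw, [strip_html])
      broken = analyze(raw, [strip_html], apply_char_filters=False)
      ```

      In this sketch `working` contains only the plain terms, while `broken` still carries the markup and entity text into the token stream, which is how such values could end up in the index and in faceted results.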

        Issue Links

          Activity

          Mike Hugo created issue -
          Mike Hugo made changes -
          Field | Original Value | New Value
          Attachment | (none) | htmlstripfilter_test.patch [ 12511724 ]
          Robert Muir added a comment -

          According to the mailing thread, this has nothing to do with htmlstripcharfilter.

          Robert Muir made changes -
          Status | Open [ 1 ] | Resolved [ 5 ]
          Resolution | (none) | Not A Problem [ 8 ]
          Mike Hugo added a comment -

          Robert, does the attached test case work for you? If not, do you know how I can get it running the way it used to?

          Mike Hugo added a comment -

          Also, I'm happy to change the title if this isn't related to HTMLStripCharFilterFactory. I'm just seeing that the behavior we saw in Solr 3.x is different (with the same configuration) in Solr 4, and I'm trying to track down how to get it to work the way it used to. Thanks for the help!

          Yonik Seeley added a comment -

          OK, so it looks like all CharFilters were broken in Solr by LUCENE-3396 (since last Sept!).
          I just checked in a fix and added a test.
          Thanks for bringing this to our attention Mike!

          Yonik Seeley made changes -
          Resolution | Not A Problem [ 8 ] | (none)
          Status | Resolved [ 5 ] | Reopened [ 4 ]
          Yonik Seeley made changes -
          Status | Reopened [ 4 ] | Resolved [ 5 ]
          Fix Version/s | (none) | 4.0 [ 12314025 ]
          Resolution | (none) | Fixed [ 1 ]
          Steve Rowe added a comment -

          "I just checked in a fix and added a test."

          Mike's test succeeds for me with the fix.

          Thanks Yonik.

          Steve Rowe added a comment -

          I committed BasicFunctionalityTest.testHTMLStrip() to branch_3x - it succeeds for me with no changes required.

          Hoss Man added a comment -

          updated summary and description

          Hoss Man made changes -
          Summary
            Original Value: HTMLStripCharFilterFactory behavior is different in Solr4 than it was in Solr 3.x
            New Value: CharFilters not being invoked in Solr
          Assignee
            Original Value: (none)
            New Value: Yonik Seeley [ yseeley@gmail.com ]
          Description
            Original Value:
            In Solr3, using the attached configuration, HTML entities like trademark and registered were being stripped (and NOT indexed) using the HTMLStripCharFilterFactory. In Solr4 it looks like those values are still making it through to the index and are then appearing in faceted results (we'd like them not to)

            see http://lucene.472066.n3.nabble.com/HTMLStripCharFilterFactory-not-working-in-Solr4-td3685599.html for background

            possibly related to this https://issues.apache.org/jira/browse/LUCENE-3690

            New Value:
            On Solr trunk, *all* CharFilters have been non-functional since LUCENE-3396 was committed in r1175297 on 25 Sept 2011, until Yonik's fix today in r1235810; Solr 3.x was not affected - CharFilters have been working there all along.
          Hoss Man made changes -
          Link | (none) | This issue is broken by LUCENE-3396 [ LUCENE-3396 ]
          Uwe Schindler made changes -
          Status | Resolved [ 5 ] | Closed [ 6 ]

            People

            • Assignee: Yonik Seeley
            • Reporter: Mike Hugo
            • Votes: 0
            • Watchers: 3
