Solr
  1. Solr
  2. SOLR-3359

SynonymFilterFactory should accept analyzer attribute

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.0, 6.0
    • Component/s: Schema and Analysis
    • Labels:
      None

      Description

      I've not been realized that CJKTokenizer and its factory classes was marked deprecated in 3.6/4.0 (the ticket is LUCENE-2906) until someone talked to me.

       * @deprecated Use StandardTokenizer, CJKWidthFilter, CJKBigramFilter, and LowerCaseFilter instead.           
      

      I agree with the idea of using the chain of the Tokenizer and TokenFilters instead of CJKTokenizer, but it could be a problem for the existing users of SynonymFilterFactory with CJKTokenizerFactory.

      So this ticket comes to my mind again.

        Activity

        Hide
        Hoss Man added a comment -

        bulk fixing the version info for 4.0-ALPHA and 4.0 all affected issues have "hoss20120711-bulk-40-change" in comment

        Show
        Hoss Man added a comment - bulk fixing the version info for 4.0-ALPHA and 4.0 all affected issues have "hoss20120711-bulk-40-change" in comment
        Hide
        Robert Muir added a comment -

        rmuir20120906-bulk-40-change

        Show
        Robert Muir added a comment - rmuir20120906-bulk-40-change
        Hide
        Hoss Man added a comment -

        There is no indication that anyone is actively working on this issue, so removing 4.0 from the fixVersion.

        Show
        Hoss Man added a comment - There is no indication that anyone is actively working on this issue, so removing 4.0 from the fixVersion.
        Hide
        Ryo Onodera added a comment -

        Hi, I wrote a patch indirectly fixing this bug.

        As far as I read the source code, it seems that making SynonymFilterFactory accept fieldType is hard to realize.

        However, I want to use SynonymFilterFactory with CJK-friendly bigram.

        So, I made SynonymFilterFactory accept analyzer attribute so that I can specify CJKAnalyzer.

        Show
        Ryo Onodera added a comment - Hi, I wrote a patch indirectly fixing this bug. As far as I read the source code, it seems that making SynonymFilterFactory accept fieldType is hard to realize. However, I want to use SynonymFilterFactory with CJK-friendly bigram. So, I made SynonymFilterFactory accept analyzer attribute so that I can specify CJKAnalyzer.
        Hide
        Koji Sekiguchi added a comment -

        So, I made SynonymFilterFactory accept analyzer attribute so that I can specify CJKAnalyzer.

        I've never thought up analyzer. Interesting idea.

        Show
        Koji Sekiguchi added a comment - So, I made SynonymFilterFactory accept analyzer attribute so that I can specify CJKAnalyzer. I've never thought up analyzer. Interesting idea.
        Hide
        Ryo Onodera added a comment -

        Thanks for quick reply!

        My patch is working as expected in my test environment.

        How could I do for my patch to be included into the trunk from now on?

        Show
        Ryo Onodera added a comment - Thanks for quick reply! My patch is working as expected in my test environment. How could I do for my patch to be included into the trunk from now on?
        Hide
        Koji Sekiguchi added a comment -

        When I opened the ticket, I thought SynonymFilterFactory should accept (Solr's) fieldType attribute as I told in the title.

        But today, as SynonymFilterFactory is in Lucene land, I think analyzer attribute is more natural than (Solr's) fieldType attribute.

        I'd like to commit the patch in a few days if no one objects.

        Show
        Koji Sekiguchi added a comment - When I opened the ticket, I thought SynonymFilterFactory should accept (Solr's) fieldType attribute as I told in the title. But today, as SynonymFilterFactory is in Lucene land, I think analyzer attribute is more natural than (Solr's) fieldType attribute. I'd like to commit the patch in a few days if no one objects.
        Hide
        ASF subversion and git services added a comment -

        Commit 1504037 from Koji Sekiguchi in branch 'dev/trunk'
        [ https://svn.apache.org/r1504037 ]

        SOLR-3359: add analyzer attribute/property to SynonymFilterFactory

        Show
        ASF subversion and git services added a comment - Commit 1504037 from Koji Sekiguchi in branch 'dev/trunk' [ https://svn.apache.org/r1504037 ] SOLR-3359 : add analyzer attribute/property to SynonymFilterFactory
        Hide
        ASF subversion and git services added a comment -

        Commit 1504055 from Koji Sekiguchi in branch 'dev/trunk'
        [ https://svn.apache.org/r1504055 ]

        SOLR-3359: oops. This is for 5.0

        Show
        ASF subversion and git services added a comment - Commit 1504055 from Koji Sekiguchi in branch 'dev/trunk' [ https://svn.apache.org/r1504055 ] SOLR-3359 : oops. This is for 5.0
        Hide
        Koji Sekiguchi added a comment -

        Thanks, Onodera-san!

        Show
        Koji Sekiguchi added a comment - Thanks, Onodera-san!
        Hide
        Jack Krupansky added a comment -

        Is there any particular reason why this enhancement is not targeted at 4.x as well?

        Also, could the title summary be updated to reflect the fact that the change specifies the analyzer class name rather than "fieldType"?

        Show
        Jack Krupansky added a comment - Is there any particular reason why this enhancement is not targeted at 4.x as well? Also, could the title summary be updated to reflect the fact that the change specifies the analyzer class name rather than "fieldType"?
        Hide
        Koji Sekiguchi added a comment -

        Is there any particular reason why this enhancement is not targeted at 4.x as well?

        Well, my motivation was that CJKTokenizer(Factory) marked as deprecated and it would be gone at 5.0. If someone provide a patch for 4.x, I'm happy to commit it.

        Also, could the title summary be updated to reflect the fact that the change specifies the analyzer class name rather than "fieldType"?

        Done.

        Show
        Koji Sekiguchi added a comment - Is there any particular reason why this enhancement is not targeted at 4.x as well? Well, my motivation was that CJKTokenizer(Factory) marked as deprecated and it would be gone at 5.0. If someone provide a patch for 4.x, I'm happy to commit it. Also, could the title summary be updated to reflect the fact that the change specifies the analyzer class name rather than "fieldType"? Done.
        Hide
        Ryo Onodera added a comment -

        Thank you very much, Koji-san!!

        I'm very happy for your quick response.

        Show
        Ryo Onodera added a comment - Thank you very much, Koji-san!! I'm very happy for your quick response.
        Hide
        Anshum Gupta added a comment -

        Bulk close after 5.0 release.

        Show
        Anshum Gupta added a comment - Bulk close after 5.0 release.

          People

          • Assignee:
            Koji Sekiguchi
            Reporter:
            Koji Sekiguchi
          • Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development