Lucene - Core
  1. Lucene - Core
  2. LUCENE-5153

Allow wrapping Reader from AnalyzerWrapper

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.5, 6.0
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      It can be useful to allow AnalyzerWrapper extensions to wrap the Reader given to initReader, e.g. with a CharFilter.

      1. LUCENE-5153.patch
        5 kB
        Robert Muir
      2. LUCENE-5153.patch
        6 kB
        Shai Erera
      3. LUCENE-5153.patch
        1 kB
        Shai Erera
      4. LUCENE-5153.patch
        1 kB
        Shai Erera

        Activity

        Hide
        Shai Erera added a comment -

        simple patch

        Show
        Shai Erera added a comment - simple patch
        Hide
        Shai Erera added a comment -

        If there are no objections, I'll commit it.

        Show
        Shai Erera added a comment - If there are no objections, I'll commit it.
        Hide
        Robert Muir added a comment -

        One odd thing is that wrapComponents adds to the end of the TokenStream chain, but with this patch wrapReader inserts into the beginning of the charfilter chain.

        Not saying its wrong, but is it the right thing?

        Show
        Robert Muir added a comment - One odd thing is that wrapComponents adds to the end of the TokenStream chain, but with this patch wrapReader inserts into the beginning of the charfilter chain. Not saying its wrong, but is it the right thing?
        Hide
        Adrien Grand added a comment -

        I think this is the right thing? On the opposite, if wrapReader inserted char filters at the end of the charfilter chain, the behavior of the wrapper analyzer would be altered (it would allow to insert something between the first CharFilter and the last TokenFilter of the wrapped analyzer).

        Show
        Adrien Grand added a comment - I think this is the right thing? On the opposite, if wrapReader inserted char filters at the end of the charfilter chain, the behavior of the wrapper analyzer would be altered (it would allow to insert something between the first CharFilter and the last TokenFilter of the wrapped analyzer).
        Hide
        Shai Erera added a comment -

        I think this is the right thing?

        I tend to agree. If by wrapping we look at the wrapped object as a black box, then we should only allow intervention on its fronts – before its char filters and after its token stream.

        Show
        Shai Erera added a comment - I think this is the right thing? I tend to agree. If by wrapping we look at the wrapped object as a black box, then we should only allow intervention on its fronts – before its char filters and after its token stream.
        Hide
        Robert Muir added a comment -

        Sounds good to me! +1 to the patch, though we might want to add a test.

        Show
        Robert Muir added a comment - Sounds good to me! +1 to the patch, though we might want to add a test.
        Hide
        Hoss Man added a comment -

        FWIW: a recent thread on this very point...

        http://mail-archives.apache.org/mod_mbox/lucene-java-user/201306.mbox/%3CAD079BD2-E01E-4E00-B8F6-17594B6C4D5C@likeness.com%3E

        +1 to the wrapReader semantics in the patch.

        Show
        Hoss Man added a comment - FWIW: a recent thread on this very point... http://mail-archives.apache.org/mod_mbox/lucene-java-user/201306.mbox/%3CAD079BD2-E01E-4E00-B8F6-17594B6C4D5C@likeness.com%3E +1 to the wrapReader semantics in the patch.
        Hide
        Shai Erera added a comment -

        Added a test to TestingAnalyzers which is under lucene/analysis/common. Is there a suitable test under lucene/core?

        Also, now that someone can override either components or reader, maybe wrapComponents should also not be abstract, returning the passed components? Or we make both of them abstract?

        Another question (unrelated to this issue) – why do we need getWrappedAnalyzer vs taking the wrapped analyzer in the ctor, like all of our Filter classes do?

        Show
        Shai Erera added a comment - Added a test to TestingAnalyzers which is under lucene/analysis/common. Is there a suitable test under lucene/core? Also, now that someone can override either components or reader, maybe wrapComponents should also not be abstract, returning the passed components? Or we make both of them abstract? Another question (unrelated to this issue) – why do we need getWrappedAnalyzer vs taking the wrapped analyzer in the ctor, like all of our Filter classes do?
        Hide
        Robert Muir added a comment -

        I dont see the test in the patch... but I think it should be under lucene/core (and just wrap MockAnalyzer with MockCharFilter or something).

        I think its good to make wrapComponents just return the components as a default. This will make PerFieldAnalyzerWrapper look less stupid

        the getWrappedAnalyzer is explained by its javadocs. You might want a different analyzer for different fields.

        Show
        Robert Muir added a comment - I dont see the test in the patch... but I think it should be under lucene/core (and just wrap MockAnalyzer with MockCharFilter or something). I think its good to make wrapComponents just return the components as a default. This will make PerFieldAnalyzerWrapper look less stupid the getWrappedAnalyzer is explained by its javadocs. You might want a different analyzer for different fields.
        Hide
        Shai Erera added a comment -

        I dont see the test in the patch

        Hmm, I was sure I created a new patch. Will upload one soon, after I move the test under lucene/core.

        I think its good to make wrapComponents just return the components as a default.

        Ok, will do.

        the getWrappedAnalyzer is explained by its javadocs

        Duh, I should have read them before.

        Show
        Shai Erera added a comment - I dont see the test in the patch Hmm, I was sure I created a new patch. Will upload one soon, after I move the test under lucene/core. I think its good to make wrapComponents just return the components as a default. Ok, will do. the getWrappedAnalyzer is explained by its javadocs Duh, I should have read them before.
        Hide
        Shai Erera added a comment -

        Patch with discussed fixes and test.

        Show
        Shai Erera added a comment - Patch with discussed fixes and test.
        Hide
        Robert Muir added a comment -

        just a tiny improvement to the test (uses basetokenstream assert)

        Show
        Robert Muir added a comment - just a tiny improvement to the test (uses basetokenstream assert)
        Hide
        ASF subversion and git services added a comment -

        Commit 1508622 from Shai Erera in branch 'dev/trunk'
        [ https://svn.apache.org/r1508622 ]

        LUCENE-5153: Allow wrapping Reader from AnalyzerWrapper

        Show
        ASF subversion and git services added a comment - Commit 1508622 from Shai Erera in branch 'dev/trunk' [ https://svn.apache.org/r1508622 ] LUCENE-5153 : Allow wrapping Reader from AnalyzerWrapper
        Hide
        ASF subversion and git services added a comment -

        Commit 1508623 from Shai Erera in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1508623 ]

        LUCENE-5153: Allow wrapping Reader from AnalyzerWrapper

        Show
        ASF subversion and git services added a comment - Commit 1508623 from Shai Erera in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1508623 ] LUCENE-5153 : Allow wrapping Reader from AnalyzerWrapper
        Hide
        Shai Erera added a comment -

        Thanks Rob. I applied your improvement and committed.

        Show
        Shai Erera added a comment - Thanks Rob. I applied your improvement and committed.
        Hide
        Uwe Schindler added a comment -

        Thanks!

        Show
        Uwe Schindler added a comment - Thanks!
        Hide
        Adrien Grand added a comment -

        4.5 release -> bulk close

        Show
        Adrien Grand added a comment - 4.5 release -> bulk close

          People

          • Assignee:
            Shai Erera
            Reporter:
            Shai Erera
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development