Lucene - Core > LUCENE-7355

Leverage MultiTermAwareComponent in query parsers

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.2, 7.0
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      MultiTermAwareComponent is designed to make it possible for query parsers to do the right thing when it comes to analysis of multi-term queries. However, since query parsers just take an analyzer, and since analyzers do not propagate information about what to do for multi-term analysis, query parsers cannot do the right thing out of the box.

      Attachments

      1. LUCENE-7355.patch
        154 kB
        Adrien Grand
      2. LUCENE-7355.patch
        154 kB
        Adrien Grand
      3. LUCENE-7355.patch
        154 kB
        Adrien Grand
      4. LUCENE-7355.patch
        310 kB
        Adrien Grand
      5. LUCENE-7355.patch
        149 kB
        Adrien Grand
      6. LUCENE-7355.patch
        149 kB
        Adrien Grand
      7. LUCENE-7355.patch
        16 kB
        Adrien Grand
      8. LUCENE-7355.patch
        17 kB
        Adrien Grand

          Activity

          jpountz Adrien Grand added a comment -

          I propose the following plan (a rough sketch follows the list):

          • add TokenStream tokenStreamMultiTerm(String fieldName, String text) to Analyzer.
          • change Analyzer.createComponents to take an additional boolean multiTerm parameter to know which parts of the analysis chain it should use when analyzing multi-term queries. For instance, the standard analyzer would apply a keyword tokenizer rather than a standard tokenizer, and only apply the standard and lowercase filters (no stop words). CustomAnalyzer would only apply the factories that implement MultiTermAwareComponent and pass them through MultiTermAwareComponent.getMultiTermComponent().
          • change query parsers to call tokenStreamMultiTerm rather than tokenStream when analyzing text for wildcard, regexp or fuzzy queries.
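
          For illustration, here is a minimal sketch of the shape this plan describes. This is hypothetical: the method names come from the bullets above, but this exact API never shipped, since the discussion below converges on Analyzer#normalize instead.

            import org.apache.lucene.analysis.TokenStream;

            // Hypothetical sketch of the first proposal (not the API that shipped).
            public abstract class MultiTermAwareAnalyzerSketch {

              // Existing behavior: the full analysis chain, used for indexing and
              // regular query analysis.
              public abstract TokenStream tokenStream(String fieldName, String text);

              // Proposed addition: a reduced chain for wildcard/regexp/fuzzy terms,
              // e.g. a keyword tokenizer plus normalizing filters such as lowercasing,
              // with transformations like stop-word removal left out.
              public abstract TokenStream tokenStreamMultiTerm(String fieldName, String text);
            }
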
          jpountz Adrien Grand added a comment -

          Here is what the above plan would look like on Analyzer/StandardAnalyzer/CustomAnalyzer. Please comment if you do not like the idea or if you have suggestions, as it would take time to update all analyzers.

          rcmuir Robert Muir added a comment -

          Instead of passing a boolean to createComponents, can we just have a separate method? This would avoid lots of if-then-else logic (which is ripe for bugs).

          jpountz Adrien Grand added a comment -

          Thanks for having a look. Does it look better this way? I also made Analyzer hold two storedValues to make ReusableStrategy less complicated.

          rcmuir Robert Muir added a comment -

          OK, my other suggestion would be to default the implementation to KeywordTokenizer. This is already what is happening today, and I feel this is corner-case functionality; we shouldn't make it any harder to write a new analyzer?

          jpountz Adrien Grand added a comment -

          This sounds good to me.

          thetaphi Uwe Schindler added a comment -

          Hi,
          I have to think about this! Do we really need to change Analyzer's API? To me it sounds a bit strange to replace the Tokenizer with a KeywordTokenizer by default...

          jpountz Adrien Grand added a comment -

          We want a way to tell the analyzer to normalize a piece of text, so it should not tokenize (this is why it replaces the tokenizer) and should apply all normalization filters (lowercasing, ASCII folding, etc.) but not transformations (stop word removal, stemming, etc.). I don't think we can do it without adding a new API to the Analyzer class (or at least a parameter to an existing method)?

          The main use-case is the parsing of multi-term queries in query parsers. Once we have such an API, query parsers would not need the lowercaseExpandedTerms parameter as they could directly use this new method, which would do the right thing out of the box, including not only lowercasing but also e.g. ASCII folding, which there is currently no way to do. Now that I am thinking about it more, I don't think we need the low-level TokenStream API as a return value for this new method, so maybe we could make it just String normalize(String field, String text). That would probably make it easier to use?

          thetaphi Uwe Schindler added a comment -

          I don't think we need the low-level TokenStream API as a return value for this new method, so maybe we could make it just String normalize(String field, String text). That would probably make it easier to use?

          I was thinking about the same. Then we won't even need a KeywordTokenizer! We could just populate the termAttribute with the full term and call the filters. This would allow removing the dependency on analysis-common from Analyzer (core). Just use the one from the document/field API to generate a single-value token stream (we use it for non-tokenized fields). Of course this can only work if the token filters don't split terms, which a multi-term aware filter should never do.

          These are just thoughts! We can implement the normalize method as final (like tokenStream), taking a string and returning a string.

          jpountz Adrien Grand added a comment -

          This sounded appealing so I gave it a try, but I hit a couple of problems:

          • some analyzers need to apply char filters too, so we cannot expect to have a String in all cases; we need some sort of KeywordTokenizer
          • some consumers need to get the binary representation of terms, which depends on the AttributeFactory (LUCENE-4176). So maybe we should return a TokenStream rather than a String and let consumers decide whether they want to add a CharTermAttribute or a TermToBytesRefAttribute. Is there a better option?
          jpountz Adrien Grand added a comment -

          I think I have something better now (a usage sketch follows the list):

          • the method is BytesRef normalize(String field, String text); it can be configured with a subset of the char filters / token filters of the default analysis chain, and uses the same AttributeFactory as the default analysis chain
          • setLowerCaseExpandedTerms has been removed from query parsers, which now use Analyzer.normalize to process range/prefix/fuzzy/wildcard/regexp queries
          • AnalyzingQueryParser and the classic QueryParser have been merged together
          • both SimpleQueryParser and the classic QueryParser now work with a non-default AttributeFactory that e.g. uses a different encoding for terms (previously this was only the case for wildcard queries, and for the classic QueryParser when analyzeRangeTerms was true). Other query parsers could be fixed too, but it will require more work as they are using String representations for terms rather than binary.
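
          To make the new entry point concrete, here is a hedged, self-contained example against the API as it shipped. The field name and input are made up; the lowercase and asciifolding factories are multi-term aware, so normalize() applies them, while stop is not and gets skipped:

            import org.apache.lucene.analysis.custom.CustomAnalyzer;
            import org.apache.lucene.util.BytesRef;

            public class NormalizeDemo {
              public static void main(String[] args) throws Exception {
                CustomAnalyzer analyzer = CustomAnalyzer.builder()
                    .withTokenizer("standard")
                    .addTokenFilter("lowercase")    // multi-term aware: applied by normalize()
                    .addTokenFilter("asciifolding") // multi-term aware: applied by normalize()
                    .addTokenFilter("stop")         // not multi-term aware: skipped by normalize()
                    .build();

                // No tokenization, no stop-word removal: "Déjà" -> "deja"
                BytesRef term = analyzer.normalize("body", "Déjà");
                System.out.println(term.utf8ToString());
              }
            }
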
          rcmuir Robert Muir added a comment -

          I think normalize should call `end` for consistency? It's defined on TokenStream, and it's always going to be called in the ordinary case, so it's strange if it's not called for wildcards; I can see bugs coming from that.
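
          For context, this refers to the usual TokenStream consumption contract, where end() is always called after the last token. A typical consumer loop looks like the following sketch (analyzer, field and text are placeholders):

            import java.io.IOException;
            import org.apache.lucene.analysis.Analyzer;
            import org.apache.lucene.analysis.TokenStream;
            import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

            public class ConsumeDemo {
              // The ordinary TokenStream lifecycle: reset, incrementToken loop, end, close.
              // The point above is that normalize() should follow the same contract and
              // call end() too, so filters that rely on it behave the same for wildcards.
              static void consume(Analyzer analyzer, String field, String text) throws IOException {
                try (TokenStream ts = analyzer.tokenStream(field, text)) {
                  CharTermAttribute termAtt = ts.addAttribute(CharTermAttribute.class);
                  ts.reset();
                  while (ts.incrementToken()) {
                    System.out.println(termAtt);
                  }
                  ts.end(); // always called after the last token in the ordinary case
                }
              }
            }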

          jpountz Adrien Grand added a comment -

          Updated patch that calls TokenStream.end() in Analyzer.normalize().

          jpountz Adrien Grand added a comment -

          Any objections to the latest patch?

          dsmiley David Smiley added a comment -

          I suspect there may be a problem with your regexp:

          Pattern.compile("(\\.)|([?*]+)");

          Shouldn't those backslashes be doubled up yet again, once to escape the Java String and once to escape the regexp:

          Pattern.compile("(\\\\.)|([?*]+)");

          When I applied the patch and set a breakpoint at the "continue" in the condition that looks for group(1), it never hit for TestQueryParser. When I updated the regexp as above, it did. It's getting late for me so maybe I'm missing something obvious.
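
          A quick way to see the difference with plain JDK regexes (the input string is made up): the Java literal "\\." is the regex \. and matches a literal dot, while "\\\\." is the regex \\. and matches a backslash followed by any character, i.e. an escaped character in the query string.

            import java.util.regex.Matcher;
            import java.util.regex.Pattern;

            public class EscapeDemo {
              public static void main(String[] args) {
                String query = "te\\*xt*"; // actual value: te\*xt* (an escaped '*' and a real wildcard)

                // "(\\.)" is the regex (\.): a literal dot, so group(1) never matches here
                Matcher wrong = Pattern.compile("(\\.)|([?*]+)").matcher(query);
                // "(\\\\.)" is the regex (\\.): a backslash plus any char, so "\*" hits group(1)
                Matcher right = Pattern.compile("(\\\\.)|([?*]+)").matcher(query);

                while (wrong.find()) {
                  System.out.println("wrong: group(1)=" + wrong.group(1) + " group(2)=" + wrong.group(2));
                }
                while (right.find()) {
                  System.out.println("right: group(1)=" + right.group(1) + " group(2)=" + right.group(2));
                }
              }
            }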

          The only other thing is that I'm surprised that the javadocs for normalize() don't mention anything about wildcard/multi-term queries. Shouldn't they, to clarify its intended use?

          jpountz Adrien Grand added a comment -

          Thanks David for having a look. This regular expression comes from AnalyzingQueryParser. I'll check it but I suspect you're right and it's been broken for a long time. I'll add the javadocs too.

          jpountz Adrien Grand added a comment -

          I fixed the regular expression and added a test. Regarding javadocs, they were already mentioning wildcard queries; maybe you were looking at the wrong #normalize method (there is a public one for external consumption and a protected one that analyzers need to override in order to set the list of token filters to apply). While I was looking at it I also added a mention of fuzzy queries, to make it clearer that it is not only about wildcards.
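
          To illustrate the two methods mentioned, here is a minimal sketch of an analyzer that overrides the protected hook (package locations assume Lucene 7.x, where LowerCaseFilter lives in core; in 6.x it is org.apache.lucene.analysis.core.LowerCaseFilter in analyzers-common):

            import org.apache.lucene.analysis.Analyzer;
            import org.apache.lucene.analysis.LowerCaseFilter;
            import org.apache.lucene.analysis.TokenStream;
            import org.apache.lucene.analysis.standard.StandardTokenizer;

            public class LowercasingAnalyzer extends Analyzer {

              // Full chain, used by tokenStream() for indexing and regular queries.
              @Override
              protected TokenStreamComponents createComponents(String fieldName) {
                StandardTokenizer source = new StandardTokenizer();
                return new TokenStreamComponents(source, new LowerCaseFilter(source));
              }

              // The protected hook: selects which filters run during normalization.
              // Callers use the public final BytesRef normalize(String, String),
              // which wires this up without performing any tokenization.
              @Override
              protected TokenStream normalize(String fieldName, TokenStream in) {
                return new LowerCaseFilter(in);
              }
            }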

          dsmiley David Smiley added a comment -

          Those changes look good Adrien.

          The patch grew by a lot; it appears you accidentally included other WIP in various places (benchmark module, some ivy files, ...)

          Looking at Analyzer.normalize()...

          • Why create a StringTokenStream; isn't KeywordTokenizer fine? Oh, I see, that's in another module... it kinda seems like a generic utility that should be in core, IMO.
          • An easy optimization is to check if initReaderForNormalization returns the input StringReader. If so, simply set filteredText to text.
          • It's a shame to call createComponents just to get the AttributeFactory. Perhaps some future TODO issue could be to add a createAttributeFactory method used here and by createComponents' impls? But then if some AnalyzerWrapper is in play then it's perhaps very cheap.

          I suppose a separate issue might be for Solr to do this when someone configures a custom Analyzer.

          No blockers really; just feedback/questions.

          jpountz Adrien Grand added a comment -

          it appears you accidentally included other WIP

          Sorry I probably generated the patch against the wrong base commit, hence these unrelated changes.

          Why create a StringTokenStream; isn't KeywordTokenizer fine? Oh I see that's in another module... kinda seems like a generic utility that should be in core to me IMO.

          I'd be fine with having KeywordTokenizer in core too; let's discuss it in another issue and then potentially cut over to it if it makes it to core?

          An easy optimization is to check if initReaderForNormalization returns the input StringReader. If so, simply set filteredText to text.

          The way #normalize works is indeed not very efficient at the moment. In addition to this, it does not cache its analysis chain like we do for #tokenStream. But it's probably ok since this method should not be called as intensively as #tokenStream? (we can still improve in the future if this proves to be a bottleneck)

          It's a shame to call createComponents just to get the AttributeFactory

          Agreed, this one annoys me too. I initially wanted to add a method, but it felt like a pity since this information is already available. That said, maybe the method approach is better, since borrowing the attribute factory from the regular analysis chain makes us close the token stream before it has been consumed, which some analysis chains might not like. I updated the patch.

          I suppose a separate issue might be for Solr to do this when someone configures a custom Analyzer.

          Solr already solves this problem in a different way, by having a separate analyzer for multi-term queries which is computed using MultiTermAwareComponent. I agree it would be nice for it to switch to Analyzer#normalize, but this would have a drawback: it would either require dropping support for configuring a custom multi-term analyzer, or the integration would be a bit weird, i.e. it would have to use Analyzer.tokenStream on the multi-term analyzer if one is configured and fall back to Analyzer.normalize on the default analyzer otherwise - which might be controversial.

          dsmiley David Smiley added a comment -

          I like the new Analyzer.attributeFactory() method, but I don't like that it documents that it's for #normalize, as if it should only be used for normalize. Wouldn't it be useful for createComponents() too? That would be a bigger change, however, since there are lots of places where a Tokenizer is created within the context of an Analyzer that would ideally be updated to call this method. That seems like it deserves its own issue? Or maybe for the time being we will accept that it's currently only used by normalize. It would be nice to see CustomAnalyzer have a customizable AttributeFactory for its TokenStream, returned by this proposed method.
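
          For readers following along, a sketch of the hook being discussed. The exact signature of Analyzer#attributeFactory has varied across versions; this assumes the variant that takes the field name, and the factory returned below is only a placeholder:

            import org.apache.lucene.analysis.Analyzer;
            import org.apache.lucene.analysis.standard.StandardTokenizer;
            import org.apache.lucene.util.AttributeFactory;

            public class FactoryAwareAnalyzer extends Analyzer {

              // Used by normalize(); returning a custom factory changes how term
              // attributes are backed (e.g. a different binary encoding of terms).
              @Override
              protected AttributeFactory attributeFactory(String fieldName) {
                return AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY; // placeholder choice
              }

              @Override
              protected TokenStreamComponents createComponents(String fieldName) {
                // David's point above: ideally the tokenizer would use the same factory,
                // so tokenStream() and normalize() agree on attribute implementations.
                return new TokenStreamComponents(new StandardTokenizer(attributeFactory(fieldName)));
              }
            }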

          That said, maybe the method approach is better since borrowing the attribute factory from the regular analysis chain makes us close the token stream before it has been consumed, which some analysis chains might not like.

          I think token streams should be tolerant of this or something in the TS chain is broken IMO.

          RE Solr, I only mean if there is an <analyzer class="..." type="query"> and thus the actual chain is opaque to Solr, so it can't use its normal means of determining the default multi-term analysis chain. This is a bit of a fringe issue anyway, since in my experience setting class= is rare.

          BTW nice work on this issue; it's nice to see AnalyzingQueryParser go away and the lowercase options get removed.

          jpountz Adrien Grand added a comment -

          I'll fix the docs to not be specific to #normalize. I agree using attributeFactory() in tokenStream() has a large scope and probably deserves its own issue...

          BTW nice work on this issue; it's nice to see AnalyzingQueryParser go away and the lowercase options get removed.

          Thanks!

          jpountz Adrien Grand added a comment -

          Patch that updates javadocs of #attributeFactory to not be specific to normalization (even though it is only used for normalization in practice for now).

          jpountz Adrien Grand added a comment -

          Fixing a typo.

          dsmiley David Smiley added a comment -

          +1

          jira-bot ASF subversion and git services added a comment -

          Commit e92a38af90d12e51390b4307ccbe0c24ac7b6b4e in lucene-solr's branch refs/heads/master from Adrien Grand
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e92a38a ]

          LUCENE-7355: Add Analyzer#normalize() and use it in query parsers.

          jira-bot ASF subversion and git services added a comment -

          Commit 7c2e7a0fb80a5bf733cf710aee6cbf01d02629eb in lucene-solr's branch refs/heads/branch_6x from Adrien Grand
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=7c2e7a0 ]

          LUCENE-7355: Add Analyzer#normalize() and use it in query parsers.

          jpountz Adrien Grand added a comment -

          Thanks David for helping me iterate on this issue.

          On the 6.x branch, AnalyzingQueryParser uses the new Analyzer#normalize functionality while the classic QueryParser still relies on the lowercaseExpandedTerms option.

          thetaphi Uwe Schindler added a comment - (edited)

          This broke the usage of the default attribute factory, see LUCENE-7382. I will fix this in a later commit. The default should be the same as the default used by Tokenizers. The AttributeFactory defined as the default here is just "slow" and causes problems (e.g., LUCENE-7382), because it is not the one Lucene uses as the default otherwise. Sorry for not seeing the problem earlier!

          thetaphi Uwe Schindler added a comment -

          I posted a patch to fix this on LUCENE-7382.

          jira-bot ASF subversion and git services added a comment -

          Commit 2585c9f3ff750b8e551f261412625aef0e7d4a4b in lucene-solr's branch refs/heads/master from Uwe Schindler
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=2585c9f ]

          LUCENE-7382: Fix bug introduced by LUCENE-7355 that used the wrong default AttributeFactory for new Tokenizers

          jira-bot ASF subversion and git services added a comment -

          Commit d71a358601ad7438d9052861b816d151d11d471b in lucene-solr's branch refs/heads/branch_6x from Uwe Schindler
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d71a358 ]

          LUCENE-7382: Fix bug introduced by LUCENE-7355 that used the wrong default AttributeFactory for new Tokenizers

          thetaphi Uwe Schindler added a comment -

          I fixed the bug with the AttributeFactory in LUCENE-7382.

          mikemccand Michael McCandless added a comment -

          Bulk close resolved issues after 6.2.0 release.


            People

            • Assignee: jpountz Adrien Grand
            • Reporter: jpountz Adrien Grand
            • Votes: 0
            • Watchers: 4
