Details

    • Type: Sub-task Sub-task
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      While converting consumers over to using reusableTokenStream, I notice many don't call end() or close() on TokenStreams.

      Even if they are test TSs only created once, we should follow the defined usage pattern.

      1. LUCENE-3456.patch
        69 kB
        Robert Muir
      2. LUCENE-3456.patch
        113 kB
        Robert Muir

        Activity

        Hide
        Steve Rowe added a comment -

        Robert, Chris, Uwe: Can this be resolved now?

        Show
        Steve Rowe added a comment - Robert, Chris, Uwe: Can this be resolved now?
        Hide
        Robert Muir added a comment -

        ok, i committed my stuff, just test bugs (ill backport though)... go ahead!

        Show
        Robert Muir added a comment - ok, i committed my stuff, just test bugs (ill backport though)... go ahead!
        Hide
        Chris Male added a comment -

        Alright I'll stick to Lucene and modules for a bit.

        Show
        Chris Male added a comment - Alright I'll stick to Lucene and modules for a bit.
        Hide
        Robert Muir added a comment -

        i have a new patch i made on the airplane, but it found more bugs... ill fix them tonight and commit!

        Show
        Robert Muir added a comment - i have a new patch i made on the airplane, but it found more bugs... ill fix them tonight and commit!
        Hide
        Chris Male added a comment -

        Where are we at with this Robert? I'm about to complete converting over to reusableTokenStream() in Solr. Should I work on that or wait for you to finish here?

        Show
        Chris Male added a comment - Where are we at with this Robert? I'm about to complete converting over to reusableTokenStream() in Solr. Should I work on that or wait for you to finish here?
        Hide
        Robert Muir added a comment -

        attached is a patch cutting over all uses of whitespace->mocktokenizer in solr test configs, and fixing the errors this found.

        i'd like to commit shortly, we can improve the coverage more by looking at other configs and cutting over things like keywordtokenizer

        Show
        Robert Muir added a comment - attached is a patch cutting over all uses of whitespace->mocktokenizer in solr test configs, and fixing the errors this found. i'd like to commit shortly, we can improve the coverage more by looking at other configs and cutting over things like keywordtokenizer
        Hide
        Robert Muir added a comment -

        I'm iterating on this patch a bit, so we don't do duplicate work.

        it probably won't find all the problems, but it will fix some.

        Show
        Robert Muir added a comment - I'm iterating on this patch a bit, so we don't do duplicate work. it probably won't find all the problems, but it will fix some.
        Hide
        Robert Muir added a comment -

        here is a patch cutting over solr/core...

        this looks to have uncovered some bugs, i didnt fix any of them.

        additionally i think we should do the same thing for solrj, contrib/*, etc.

        Show
        Robert Muir added a comment - here is a patch cutting over solr/core... this looks to have uncovered some bugs, i didnt fix any of them. additionally i think we should do the same thing for solrj, contrib/*, etc.
        Hide
        Chris Male added a comment -

        Sorting this out for the current codebase isn't a huge challenge. I work from incrementToken() / reuseable/TokenStream() calls, and see whats being called in their vicinity. But going forward, I agree using Mock* in Solr more makes sense.

        Show
        Chris Male added a comment - Sorting this out for the current codebase isn't a huge challenge. I work from incrementToken() / reuseable/TokenStream() calls, and see whats being called in their vicinity. But going forward, I agree using Mock* in Solr more makes sense.
        Hide
        Robert Muir added a comment -

        The problem my approach would solve is that Solr would be asserted all the time, not only if it uses Mock*, which is rarely does?

        But perhaps a better solution to this problem would be to use MockTokenizer etc in Solr test configs (there are already enough tests testing the example config)

        Show
        Robert Muir added a comment - The problem my approach would solve is that Solr would be asserted all the time, not only if it uses Mock*, which is rarely does? But perhaps a better solution to this problem would be to use MockTokenizer etc in Solr test configs (there are already enough tests testing the example config)
        Hide
        Uwe Schindler added a comment -

        Yes it's down low, my idea was to do it top-level. But you are right, it's not really needed. And we cannot 100% handle end/close unless we reuse.

        The problem my approach would solve is that Solr would be asserted all the time, not only if it uses Mock*, which is rarely does?

        Show
        Uwe Schindler added a comment - Yes it's down low, my idea was to do it top-level. But you are right, it's not really needed. And we cannot 100% handle end/close unless we reuse. The problem my approach would solve is that Solr would be asserted all the time, not only if it uses Mock*, which is rarely does?
        Hide
        Robert Muir added a comment -

        I could hack something together (yes, it is a hack).

        We already do this in MockTokenizer?

        The problem is that many disable the assertions (MockTokenizer.setEnableChecks)

        So, I dont think we need a new tokenstream to test this...

        Show
        Robert Muir added a comment - I could hack something together (yes, it is a hack). We already do this in MockTokenizer? The problem is that many disable the assertions (MockTokenizer.setEnableChecks) So, I dont think we need a new tokenstream to test this...
        Hide
        Chris Male added a comment -

        How would you track the end and close invocations? (or lack thereof)?

        Show
        Chris Male added a comment - How would you track the end and close invocations? (or lack thereof)?
        Hide
        Uwe Schindler added a comment - - edited

        I have some idea how to assert this is tests (now that Analyzer.(reuseable)tokenStream is final everywhere):
        We can wrap the TokenStream reaturned by those two methods in Analyzer using a AssertingTokenFilter that simply tracks in its methods the correct usage of reset() [throw AssertionFailed if missing], increment(), end(), close() [this last one is hard to track].

        The idea is:
        If assertions are enabled, the tokenstream-returning methods in Analyzer.java should check the desiredAssertionStatus of this' class and wrap the TokenStream using that TokenFilter described before. If a consumer then forgets to call any of these methods oir does this in wrong order, an AssertionError is thrown.

        I could hack something together (yes, it is a hack).

        Show
        Uwe Schindler added a comment - - edited I have some idea how to assert this is tests (now that Analyzer.(reuseable)tokenStream is final everywhere): We can wrap the TokenStream reaturned by those two methods in Analyzer using a AssertingTokenFilter that simply tracks in its methods the correct usage of reset() [throw AssertionFailed if missing] , increment(), end(), close() [this last one is hard to track] . The idea is: If assertions are enabled, the tokenstream-returning methods in Analyzer.java should check the desiredAssertionStatus of this' class and wrap the TokenStream using that TokenFilter described before. If a consumer then forgets to call any of these methods oir does this in wrong order, an AssertionError is thrown. I could hack something together (yes, it is a hack).

          People

          • Assignee:
            Unassigned
            Reporter:
            Chris Male
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:

              Development