Lucene - Core
  1. Lucene - Core
  2. LUCENE-5235

throw illegalstate from Tokenizer (instead of NPE/IIOBE) if reset not called

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.6, 5.0
    • Component/s: modules/analysis
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      We added these best effort checks, but it would be much better if we somehow gave a clear exception... this comes up often

      1. LUCENE-5235.patch
        134 kB
        Robert Muir
      2. LUCENE-5235.patch
        133 kB
        Robert Muir
      3. LUCENE-5235.patch
        21 kB
        Uwe Schindler
      4. LUCENE-5235.patch
        13 kB
        Uwe Schindler
      5. LUCENE-5235.patch
        10 kB
        Robert Muir
      6. LUCENE-5235_test.patch
        3 kB
        Robert Muir
      7. LUCENE-5235.patch
        9 kB
        Uwe Schindler

        Activity

        Hide
        Michael McCandless added a comment -

        +1

        Show
        Michael McCandless added a comment - +1
        Hide
        Robert Muir added a comment -

        modification of BaseTokenSTreamTestCase to check for this... I think its correct...

        Show
        Robert Muir added a comment - modification of BaseTokenSTreamTestCase to check for this... I think its correct...
        Hide
        Uwe Schindler added a comment -

        First stab with fixing common Tokenizers. Robert will upload a test patch soon.

        Show
        Uwe Schindler added a comment - First stab with fixing common Tokenizers. Robert will upload a test patch soon.
        Hide
        Robert Muir added a comment -

        here's an improved patch. I think this is easier and works for custom tokenizers too.

        some tests should fail: not all tokenizers fixed yet (we have to revert the null/AIOOBE hacks and call super.reset)

        Show
        Robert Muir added a comment - here's an improved patch. I think this is easier and works for custom tokenizers too. some tests should fail: not all tokenizers fixed yet (we have to revert the null/AIOOBE hacks and call super.reset)
        Hide
        Michael McCandless added a comment -

        +1, what a delightful solution!!

        Show
        Michael McCandless added a comment - +1, what a delightful solution!!
        Hide
        Uwe Schindler added a comment -

        I fixed some more Tokenizers + improved tests. We can now also detect double reset() or things like missing close().

        I am now stuck on errors with MockTokenizer sometimes complaining with IllegalState but on other times failing to report IllegalState. Just run tests in analyzers/common. Need more debugging but the corrumtion beer was too heavy... https://twitter.com/rcmuir/status/381508554798018560

        Show
        Uwe Schindler added a comment - I fixed some more Tokenizers + improved tests. We can now also detect double reset() or things like missing close(). I am now stuck on errors with MockTokenizer sometimes complaining with IllegalState but on other times failing to report IllegalState. Just run tests in analyzers/common. Need more debugging but the corrumtion beer was too heavy... https://twitter.com/rcmuir/status/381508554798018560
        Hide
        Uwe Schindler added a comment -

        More Tokenizers fixed.

        Show
        Uwe Schindler added a comment - More Tokenizers fixed.
        Hide
        Robert Muir added a comment -

        current patch, now tests are passing in lucene, but there are problems in solr tests.

        The test is big because we got confused with the assertAnalyzesToReuse vs assertAnalyzesTo (which is now useless) so we merged these.

        Show
        Robert Muir added a comment - current patch, now tests are passing in lucene, but there are problems in solr tests. The test is big because we got confused with the assertAnalyzesToReuse vs assertAnalyzesTo (which is now useless) so we merged these.
        Hide
        Robert Muir added a comment -

        updated patch: all tests pass

        patch here: http://pastebin.com/raw.php?i=SyvM6dCZ

        Show
        Robert Muir added a comment - updated patch: all tests pass patch here: http://pastebin.com/raw.php?i=SyvM6dCZ
        Hide
        ASF subversion and git services added a comment -

        Commit 1525362 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1525362 ]

        LUCENE-5235: Tokenizers now throw an IllegalStateException if the consumer does not call reset() before consuming the stream. Previous versions throwed NullPointerException or ArrayIndexOutOfBoundsException on best effort which was not user-friendly.

        Show
        ASF subversion and git services added a comment - Commit 1525362 from Uwe Schindler in branch 'dev/trunk' [ https://svn.apache.org/r1525362 ] LUCENE-5235 : Tokenizers now throw an IllegalStateException if the consumer does not call reset() before consuming the stream. Previous versions throwed NullPointerException or ArrayIndexOutOfBoundsException on best effort which was not user-friendly.
        Hide
        ASF subversion and git services added a comment -

        Commit 1525377 from Uwe Schindler in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1525377 ]

        Merged revision(s) 1525376 from lucene/dev/trunk:
        LUCENE-5235: Add more information to backwards break changes section.

        Show
        ASF subversion and git services added a comment - Commit 1525377 from Uwe Schindler in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1525377 ] Merged revision(s) 1525376 from lucene/dev/trunk: LUCENE-5235 : Add more information to backwards break changes section.
        Hide
        ASF subversion and git services added a comment -

        Commit 1525825 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1525825 ]

        LUCENE-5235: Update Javadocs to make it explicit that super.reset(), super.close() and super.end() must be called.

        Show
        ASF subversion and git services added a comment - Commit 1525825 from Uwe Schindler in branch 'dev/trunk' [ https://svn.apache.org/r1525825 ] LUCENE-5235 : Update Javadocs to make it explicit that super.reset(), super.close() and super.end() must be called.
        Hide
        ASF subversion and git services added a comment -

        Commit 1525826 from Uwe Schindler in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1525826 ]

        Merged revision(s) 1525825 from lucene/dev/trunk:
        LUCENE-5235: Update Javadocs to make it explicit that super.reset(), super.close() and super.end() must be called.

        Show
        ASF subversion and git services added a comment - Commit 1525826 from Uwe Schindler in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1525826 ] Merged revision(s) 1525825 from lucene/dev/trunk: LUCENE-5235 : Update Javadocs to make it explicit that super.reset(), super.close() and super.end() must be called.
        Hide
        ASF subversion and git services added a comment -

        Commit 1526155 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1526155 ]

        LUCENE-5235: This is according to Clover never hit, so be more strict, may help Robert

        Show
        ASF subversion and git services added a comment - Commit 1526155 from Uwe Schindler in branch 'dev/trunk' [ https://svn.apache.org/r1526155 ] LUCENE-5235 : This is according to Clover never hit, so be more strict, may help Robert
        Hide
        ASF subversion and git services added a comment -

        Commit 1526158 from Uwe Schindler in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1526158 ]

        Merged revision(s) 1526155 from lucene/dev/trunk:
        LUCENE-5235: This is according to Clover never hit, so be more strict, may help Robert

        Show
        ASF subversion and git services added a comment - Commit 1526158 from Uwe Schindler in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1526158 ] Merged revision(s) 1526155 from lucene/dev/trunk: LUCENE-5235 : This is according to Clover never hit, so be more strict, may help Robert

          People

          • Assignee:
            Uwe Schindler
            Reporter:
            Robert Muir
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development