Lucene - Core / LUCENE-1163

CharArraySet.contains(char[] text, int off, int len) does not work

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3
    • Fix Version/s: 2.4
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      I am trying to use CharArraySet in a filter I am writing. I rely heavily on char arrays in my code to speed things up, and I stumbled upon a bug in CharArraySet while doing so.

      The method public boolean contains(char[] text, int off, int len) does not seem to work.

      When I do

      if (set.contains(buffer, offset, length)) {
        ...
      }
      

      my code fails.

      But when I do

      if (set.contains(new String(buffer, offset, length))) {
         ...
      }
      

      everything works as expected.

      Both variants should behave the same way. I have attached a small piece of code that shows the problem.

      1. LUCENE-1163.patch
        3 kB
        Michael McCandless
      2. CharArraySetShowBug.java
        0.6 kB
        Thomas Peuss

        Activity

        tpeuss Thomas Peuss added a comment -

        A simple piece of code that shows the problem.

        mikemccand Michael McCandless added a comment -

        Indeed it's really a bug – thank you for finding this & reporting it Thomas!

        We were ignoring the offset when computing the hash code internally.

        Lucene itself always passes 0 for this offset (the method is currently only used in StopFilter), so none of the existing Lucene test cases hit the bug.

        I turned your example into a test case in the attached patch. I will commit shortly.
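
        To illustrate the kind of bug described above, here is a minimal, self-contained sketch. It is not Lucene's CharArraySet code: the class name OffsetHashDemo, the honorOffset flag, the fixed 16-slot table, and the multiply-by-31 hash are all assumptions made for the example. It only shows how hashing from index 0 instead of from the given offset can make a lookup miss an entry that is actually in the set.

```java
// Toy open-addressing set of char sequences. Hypothetical illustration only;
// this is NOT the Lucene implementation.
final class OffsetHashDemo {
    // Fixed power-of-two table; the demo never stores enough entries to need resizing.
    private final char[][] slots = new char[16][];
    // false reproduces the bug (offset ignored while hashing), true is the fixed behavior.
    private final boolean honorOffset;

    OffsetHashDemo(boolean honorOffset) {
        this.honorOffset = honorOffset;
    }

    // Hash of len chars. The buggy variant starts at index 0 rather than at off.
    private int hash(char[] text, int off, int len) {
        int start = honorOffset ? off : 0;
        int code = 0;
        for (int i = start; i < start + len; i++) {
            code = code * 31 + text[i];
        }
        return code;
    }

    // Compares a stored entry against text[off .. off+len); always uses the real offset.
    private static boolean equalsRange(char[] entry, char[] text, int off, int len) {
        if (entry.length != len) {
            return false;
        }
        for (int i = 0; i < len; i++) {
            if (entry[i] != text[off + i]) {
                return false;
            }
        }
        return true;
    }

    // Linear probing: returns the slot holding the sequence, or the first empty slot.
    private int slotFor(char[] text, int off, int len) {
        int mask = slots.length - 1;
        int pos = hash(text, off, len) & mask;
        while (slots[pos] != null && !equalsRange(slots[pos], text, off, len)) {
            pos = (pos + 1) & mask;
        }
        return pos;
    }

    void add(String s) {
        char[] chars = s.toCharArray();
        slots[slotFor(chars, 0, chars.length)] = chars;
    }

    boolean contains(char[] text, int off, int len) {
        return slots[slotFor(text, off, len)] != null;
    }

    public static void main(String[] args) {
        char[] buffer = "a test".toCharArray(); // "test" starts at offset 2

        OffsetHashDemo buggy = new OffsetHashDemo(false);
        buggy.add("test");
        System.out.println(buggy.contains(buffer, 2, 4)); // false: "a te" was hashed

        OffsetHashDemo fixed = new OffsetHashDemo(true);
        fixed.add("test");
        System.out.println(fixed.contains(buffer, 2, 4)); // true
    }
}
```

        With an offset of 0 both variants agree, which is consistent with the observation that callers passing 0 never noticed the problem.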

        tpeuss Thomas Peuss added a comment -

        Thanks for the quick response. I can confirm that the patch fixes the problem.

        mikemccand Michael McCandless added a comment -

        Super, thanks Thomas! I just committed this.

        mikemccand Michael McCandless added a comment -

        I'll port this one to 2.3.1 as well.

        mikemccand Michael McCandless added a comment -

        Backported to 2.3


          People

          • Assignee:
            mikemccand Michael McCandless
            Reporter:
            tpeuss Thomas Peuss