Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA
    • Component/s: modules/other
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      • Changes RegexCapabilities match(String) to match(BytesRef).
      • The jakarta and jdk impls use CharacterIterator/CharSequence matching against the utf16result instead.
      • I also reuse the matcher for jdk. I don't see why we didn't do this before, but it makes sense, especially since we reuse the CharSequence (CSQ).
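
The matcher reuse described above can be sketched roughly as follows. This is a minimal illustration using only the JDK, not the code from the patch: the class name ReusableUtf8Matcher is hypothetical, a raw byte[] slice stands in for BytesRef, and a java.nio CharBuffer stands in for the reused UTF-16 result. It assumes well-formed UTF-8 input and ignores decoder errors.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a per-enum matcher; not the Lucene class.
final class ReusableUtf8Matcher {
  private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
  private CharBuffer chars = CharBuffer.allocate(16); // reused UTF-16 scratch buffer
  private final Matcher matcher;                      // created once, reset per term

  ReusableUtf8Matcher(String regex) {
    // CharBuffer implements CharSequence, so the matcher can read it directly.
    matcher = Pattern.compile(regex).matcher(chars);
  }

  // Matches the UTF-8 bytes in utf8[offset, offset + length) against the regex.
  boolean match(byte[] utf8, int offset, int length) {
    // n UTF-8 bytes never decode to more than n UTF-16 chars.
    if (chars.capacity() < length) {
      chars = CharBuffer.allocate(length);
      matcher.reset(chars); // point the matcher at the new buffer
    }
    chars.clear();
    decoder.reset();
    decoder.decode(ByteBuffer.wrap(utf8, offset, length), chars, true);
    decoder.flush(chars);
    chars.flip();
    // reset() re-reads the sequence length, so the same Matcher sees the new term.
    return matcher.reset().matches();
  }
}

final class ReusableUtf8MatcherDemo {
  public static void main(String[] args) {
    ReusableUtf8Matcher m = new ReusableUtf8Matcher("lu.*e");
    byte[] a = "lucene".getBytes(StandardCharsets.UTF_8);
    byte[] b = "solr".getBytes(StandardCharsets.UTF_8);
    System.out.println(m.match(a, 0, a.length)); // true
    System.out.println(m.match(b, 0, b.length)); // false
  }
}
```

No String is created per term here; the only per-term work is decoding into the shared buffer and resetting the matcher.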
      1. LUCENE-2606.patch
        14 kB
        Robert Muir
      2. LUCENE-2606.patch
        7 kB
        Robert Muir

        Activity

        Robert Muir added a comment -

        Simple patch. We will have to list the break (matches(String) -> matches(BytesRef)) in
        contrib/changes because RegexCapabilities is an interface, so there is no way to do any back compat.

        Robert Muir added a comment -

        attached is another iteration:

        • Because the Query stores RegexCapabilities, I pulled the 'matcher' stuff out so the enum just calls matcher = capability.compile(pattern);
          This way the capabilities object stores no real state; the only state is the matcher, which is created in the TermsEnum.
        • RegexCapabilities is also marked serializable (LUCENE-961).
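
The split described above might look something like the sketch below. All names here (CapabilitiesSketch, MatcherSketch, JdkCapabilitiesSketch) are hypothetical and only illustrate the shape of the design: the capabilities object holds configuration and acts as a factory, while each compile(pattern) call returns a fresh matcher that owns the mutable state. A plain byte[] stands in for BytesRef, and the term is decoded to a String for brevity.

```java
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical names throughout; the real interface is in the attached patch.
interface CapabilitiesSketch extends Serializable {
  MatcherSketch compile(String pattern);
}

interface MatcherSketch {
  boolean match(byte[] utf8Term);
}

final class JdkCapabilitiesSketch implements CapabilitiesSketch {
  private static final long serialVersionUID = 1L;
  private final int flags; // configuration only, no per-search state

  JdkCapabilitiesSketch(int flags) {
    this.flags = flags;
  }

  @Override
  public MatcherSketch compile(String pattern) {
    // The mutable Matcher lives in the returned object, not in the capabilities.
    final Matcher matcher = Pattern.compile(pattern, flags).matcher("");
    return utf8Term ->
        matcher.reset(new String(utf8Term, StandardCharsets.UTF_8)).matches();
  }
}

final class CapabilitiesDemo {
  public static void main(String[] args) {
    CapabilitiesSketch caps = new JdkCapabilitiesSketch(Pattern.CASE_INSENSITIVE);
    MatcherSketch m1 = caps.compile("foo.*"); // e.g. one per enum
    MatcherSketch m2 = caps.compile("bar.*"); // independent of m1
    System.out.println(m1.match("FOObar".getBytes(StandardCharsets.UTF_8))); // true
    System.out.println(m2.match("foobar".getBytes(StandardCharsets.UTF_8))); // false
  }
}
```

Because the capabilities object in this sketch has only a final int field, sharing it through a query or serializing it carries no matcher state along.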
        Uwe Schindler added a comment -

        Looks good! The thing was broken in 3.x and 3.0 too, as it was not thread-safe if the same capabilities object was used in multiple threads.
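
To illustrate the thread-safety point with plain JDK classes: java.util.regex.Pattern is documented as safe to share between threads, while Matcher is not, so any shared object that holds a Matcher inherits the problem. The sketch below (hypothetical class and method names, not Lucene code) shares one Pattern across a thread pool and lets each task create its own Matcher.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PerThreadMatcherSketch {
  // Counts matching terms; every call builds its own Matcher from the shared Pattern.
  static int countMatches(Pattern shared, String[] terms) {
    Matcher matcher = shared.matcher(""); // local to this call, never shared
    int hits = 0;
    for (String term : terms) {
      if (matcher.reset(term).matches()) {
        hits++;
      }
    }
    return hits;
  }

  // Runs countMatches on several threads and sums the results.
  static int runConcurrently(int threads) {
    final Pattern shared = Pattern.compile("ab+c"); // immutable, safe to share
    final String[] terms = {"abc", "abbbc", "ac", "xyz"};
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Integer>> futures = new ArrayList<>();
      for (int i = 0; i < threads; i++) {
        Callable<Integer> task = () -> countMatches(shared, terms);
        futures.add(pool.submit(task));
      }
      int total = 0;
      for (Future<Integer> f : futures) {
        total += f.get();
      }
      return total;
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    // Two of the four terms match, on each of four threads.
    System.out.println(runConcurrently(4)); // 8
  }
}
```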

        Robert Muir added a comment -

        > Looks good! The thing was broken in 3.x and 3.0 too, as it was not thread-safe if the same capabilities object was used in multiple threads.

        True, I think we have the opportunity to fix it in 4.x since we have to break the interface anyway.

        Should we do anything about 3.x? It seems good to fix bugs, but it would be frustrating (if someone has a custom RegexCapabilities) to break the API in 3.x, then in 4.x again!

        Robert Muir added a comment -

        Committed revision 987129.


          People

          • Assignee:
            Unassigned
            Reporter:
            Robert Muir
          • Votes:
            0
            Watchers:
            0
