Lucene - Core
  1. Lucene - Core
  2. LUCENE-4585

Spatial RecursivePrefixTreeFilter has some bugs with indexing non-point shapes

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 4.0
    • Fix Version/s: 4.1, 5.0
    • Component/s: modules/spatial
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      RecursivePrefixTreeFilter has some bugs that can occur when searching indexed shapes. One bug is an unpositioned termsEnum. It through an exception in testing; I'm not sure what its effects would be in production. The other couple bugs are hard to describe here but were rare to occur in extensive testing. The effects were probably a slim chance of matching an indexed shape near the query shape. And SpatialPrefixTree does not support an indexed shape that covers the entire globe.

      These bugs were discovered during development of tests for RPTF LUCENE-4419 which I will submit shortly.

      1. LUCENE-4585_PrefixTree_bugs.patch
        11 kB
        David Smiley
      2. LUCENE-4585_PrefixTree_bugs.patch
        16 kB
        David Smiley

        Issue Links

          Activity

          Gavin made changes -
          Link This issue is depended upon by LUCENE-4419 [ LUCENE-4419 ]
          Gavin made changes -
          Link This issue blocks LUCENE-4419 [ LUCENE-4419 ]
          Hide
          Commit Tag Bot added a comment -

          [branch_4x commit] David Wayne Smiley
          http://svn.apache.org/viewvc?view=revision&revision=1418009

          LUCENE-4585: PrefixTree bugs

          Show
          Commit Tag Bot added a comment - [branch_4x commit] David Wayne Smiley http://svn.apache.org/viewvc?view=revision&revision=1418009 LUCENE-4585 : PrefixTree bugs
          Steve Rowe made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Hide
          Commit Tag Bot added a comment -

          [trunk commit] Doron Cohen
          http://svn.apache.org/viewvc?view=revision&revision=1418834

          LUCENE-4585: (Spatial PrefixTree) bring back few CHANGES.txt entries.

          Show
          Commit Tag Bot added a comment - [trunk commit] Doron Cohen http://svn.apache.org/viewvc?view=revision&revision=1418834 LUCENE-4585 : (Spatial PrefixTree) bring back few CHANGES.txt entries.
          Hide
          Commit Tag Bot added a comment -

          [trunk commit] David Wayne Smiley
          http://svn.apache.org/viewvc?view=revision&revision=1418006

          LUCENE-4585: PrefixTree bugs

          Show
          Commit Tag Bot added a comment - [trunk commit] David Wayne Smiley http://svn.apache.org/viewvc?view=revision&revision=1418006 LUCENE-4585 : PrefixTree bugs
          Hide
          Commit Tag Bot added a comment -

          [trunk commit] David Wayne Smiley
          http://svn.apache.org/viewvc?view=revision&revision=1418008

          LUCENE-4585: Forgot another CHANGES.txt entry

          Show
          Commit Tag Bot added a comment - [trunk commit] David Wayne Smiley http://svn.apache.org/viewvc?view=revision&revision=1418008 LUCENE-4585 : Forgot another CHANGES.txt entry
          Hide
          Commit Tag Bot added a comment -

          [branch_4x commit] David Wayne Smiley
          http://svn.apache.org/viewvc?view=revision&revision=1418009

          LUCENE-4585: PrefixTree bugs

          Show
          Commit Tag Bot added a comment - [branch_4x commit] David Wayne Smiley http://svn.apache.org/viewvc?view=revision&revision=1418009 LUCENE-4585 : PrefixTree bugs
          David Smiley made changes -
          Fix Version/s 4.1 [ 12321140 ]
          Fix Version/s 5.0 [ 12321663 ]
          David Smiley made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Hide
          David Smiley added a comment -

          I'll commit this tomorrow if there are no objections.

          For those of you that didn't look at the patch, note that it adds an entry up in the API compatibility section:

          • LUCENE-4585: Spatial PrefixTree based Strategies (either TermQuery or
            RecursivePrefix based) MAY want to re-index if used for point data. If a
            re-index is not done, then an indexed point is ~1/2 the smallest grid cell
            larger and as such is slightly more likely to match a query shape.
            (David Smiley)

          Strictly speaking this is not an API change but there wasn't a upgrade section. This note I added is also quite minor (in fact the behavior of a point matching its entire grid cell might be desirable) so arguably I could simply not mention it; but I'm err'ing on the side of full disclosure.

          Show
          David Smiley added a comment - I'll commit this tomorrow if there are no objections. For those of you that didn't look at the patch, note that it adds an entry up in the API compatibility section: LUCENE-4585 : Spatial PrefixTree based Strategies (either TermQuery or RecursivePrefix based) MAY want to re-index if used for point data. If a re-index is not done, then an indexed point is ~1/2 the smallest grid cell larger and as such is slightly more likely to match a query shape. (David Smiley) Strictly speaking this is not an API change but there wasn't a upgrade section. This note I added is also quite minor (in fact the behavior of a point matching its entire grid cell might be desirable) so arguably I could simply not mention it; but I'm err'ing on the side of full disclosure.
          David Smiley made changes -
          Assignee David Smiley [ dsmiley ]
          David Smiley made changes -
          Link This issue blocks LUCENE-4419 [ LUCENE-4419 ]
          David Smiley made changes -
          Attachment LUCENE-4585_PrefixTree_bugs.patch [ 12555959 ]
          Hide
          David Smiley added a comment -

          Updated patch with the correct set of files.

          Show
          David Smiley added a comment - Updated patch with the correct set of files.
          David Smiley made changes -
          Field Original Value New Value
          Attachment LUCENE-4585_PrefixTree_bugs.patch [ 12555950 ]
          Hide
          David Smiley added a comment - - edited

          The attached patch resolves the aforementioned problems.

          However note that when a point is indexed, the final full-length token is not indexed with a trailing leaf '+' byte variant; it was before. The up-side is that we save one token per indexed point (~1/12th savings if 11 maxLevels). The semantics of that '+' are intended to be that the entire grid cell represents an indexed shape for matching, so compare the rectangle for it with the query shape. But for points, it should be the center point of the cell, so no '+'.

          If a user doesn't re-index, then an indexed point is 1/2 the smallest grid cell closer to a query shape and as such might match when it didn't before. Quite minor I think but worth mentioning.

          Show
          David Smiley added a comment - - edited The attached patch resolves the aforementioned problems. However note that when a point is indexed, the final full-length token is not indexed with a trailing leaf '+' byte variant; it was before. The up-side is that we save one token per indexed point (~1/12th savings if 11 maxLevels). The semantics of that '+' are intended to be that the entire grid cell represents an indexed shape for matching, so compare the rectangle for it with the query shape. But for points, it should be the center point of the cell, so no '+'. If a user doesn't re-index, then an indexed point is 1/2 the smallest grid cell closer to a query shape and as such might match when it didn't before. Quite minor I think but worth mentioning.
          David Smiley created issue -

            People

            • Assignee:
              David Smiley
              Reporter:
              David Smiley
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development