Lucene - Core
  1. Lucene - Core
  2. LUCENE-6704

GeoPointInBBox/Distance queries can throw OOME

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.3, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      After investigating LUCENE-6685 the performance issues and improvements related to GeoPointInBBox/Distance queries could be categorized into two separate issues:

      1. OOME caused by an unnecessary number of ranges computed for Point Distance Queries (bug in the GeoPointTermEnum base class where the bounding box was used for intersections instead of the point radius)
      2. API improvements providing configurable detail parameters.

      This issue addresses 1. LUCENE-6685 will further investigate the need for 2 (after working 1 and correcting for unnecessary range detail, it looks like 2 may not be needed)

      1. LUCENE-6704.patch
        19 kB
        Nicholas Knize
      2. LUCENE-6704.patch
        18 kB
        Nicholas Knize

        Activity

        Hide
        Nicholas Knize added a comment -

        Patch attached with the following enhancements:

        1. Leaves DETAIL_LEVEL fixed at 16 w/ GeoPointField.PRECISION_STEP fixed at 6.
        2. Updates relation logic to relate candidate cells to the appropriate shape.
        3. Changes mortonEncoding logic to use full 32 bits per lat/lon (full 64 bit long precision). This fixes an accuracy bug where the max lon/lat correctly decodes to 180,90 (instead of 179.999999, 89.999999).

        Note that this patch will break the current patch for LUCENE-6647. Updated patch for that coming next.

        Show
        Nicholas Knize added a comment - Patch attached with the following enhancements: 1. Leaves DETAIL_LEVEL fixed at 16 w/ GeoPointField.PRECISION_STEP fixed at 6. 2. Updates relation logic to relate candidate cells to the appropriate shape. 3. Changes mortonEncoding logic to use full 32 bits per lat/lon (full 64 bit long precision). This fixes an accuracy bug where the max lon/lat correctly decodes to 180,90 (instead of 179.999999, 89.999999). Note that this patch will break the current patch for LUCENE-6647 . Updated patch for that coming next.
        Hide
        Michael McCandless added a comment -

        I'm a little lost on this issue. It's focusing on fixing the OOME? But the patch seems to do much more (good things!).

        Can you add a dedicated standalone test that today hits OOME and then with this patch doesn't? To confirm the patch is fixing OOMEs in both GeoPointInBBoxQuery and GeoPointDistanceQuery...

        Why is GeoPointDistanceQuery not final anymore? Seems like nothing wants to extend it...

        I like the added javadocs, and BigInteger usage is always fun

        Show
        Michael McCandless added a comment - I'm a little lost on this issue. It's focusing on fixing the OOME? But the patch seems to do much more (good things!). Can you add a dedicated standalone test that today hits OOME and then with this patch doesn't? To confirm the patch is fixing OOMEs in both GeoPointInBBoxQuery and GeoPointDistanceQuery... Why is GeoPointDistanceQuery not final anymore? Seems like nothing wants to extend it... I like the added javadocs, and BigInteger usage is always fun
        Hide
        Nicholas Knize added a comment -

        Can you add a dedicated standalone test that today hits OOME and then with this patch doesn't?

        Sure thing! I'll isolate one of the OOME failures from testRandom() so before/after patch validation can be run.

        Why is GeoPointDistanceQuery not final anymore? Seems like nothing wants to extend it...

        That was a change intended for a new issue. I've added a GeoPointDistanceRangeQuery that extends GeoPointDistanceQuery. I'll remove it from this patch and apply it in a new issue. Sorry for that confusion

        Show
        Nicholas Knize added a comment - Can you add a dedicated standalone test that today hits OOME and then with this patch doesn't? Sure thing! I'll isolate one of the OOME failures from testRandom() so before/after patch validation can be run. Why is GeoPointDistanceQuery not final anymore? Seems like nothing wants to extend it... That was a change intended for a new issue. I've added a GeoPointDistanceRangeQuery that extends GeoPointDistanceQuery. I'll remove it from this patch and apply it in a new issue. Sorry for that confusion
        Hide
        Nicholas Knize added a comment - - edited

        Updated patch includes the following:

        • reduces number of ranges by up to 100x (for very large queries)
        • adds unit test that causes an OOME on current trunk. This test is marked as NIGHTLY because it is a large query that can slow each beast iteration by a few seconds.
        • moves 32 bit encoding changes to LUCENE-6710
        • resets GeoPointDistanceQuery to final
        Show
        Nicholas Knize added a comment - - edited Updated patch includes the following: reduces number of ranges by up to 100x (for very large queries) adds unit test that causes an OOME on current trunk. This test is marked as NIGHTLY because it is a large query that can slow each beast iteration by a few seconds. moves 32 bit encoding changes to LUCENE-6710 resets GeoPointDistanceQuery to final
        Hide
        Michael McCandless added a comment -

        Thanks Nicholas Knize, patch looks good, and thank you for the added test and breaking out LUCENE-6710 ... I'll commit shortly.

        Show
        Michael McCandless added a comment - Thanks Nicholas Knize , patch looks good, and thank you for the added test and breaking out LUCENE-6710 ... I'll commit shortly.
        Hide
        ASF subversion and git services added a comment -

        Commit 1693690 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1693690 ]

        LUCENE-6704: GeoPointDistanceQuery was visiting too many term ranges

        Show
        ASF subversion and git services added a comment - Commit 1693690 from Michael McCandless in branch 'dev/trunk' [ https://svn.apache.org/r1693690 ] LUCENE-6704 : GeoPointDistanceQuery was visiting too many term ranges
        Hide
        ASF subversion and git services added a comment -

        Commit 1693692 from Michael McCandless in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1693692 ]

        LUCENE-6704: GeoPointDistanceQuery was visiting too many term ranges

        Show
        ASF subversion and git services added a comment - Commit 1693692 from Michael McCandless in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1693692 ] LUCENE-6704 : GeoPointDistanceQuery was visiting too many term ranges
        Hide
        Shalin Shekhar Mangar added a comment -

        Bulk close for 5.3.0 release

        Show
        Shalin Shekhar Mangar added a comment - Bulk close for 5.3.0 release

          People

          • Assignee:
            Unassigned
            Reporter:
            Nicholas Knize
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development