Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.1, 7.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      GeoUtils contains a lot of common spatial mathematics that can be reused across multiple packages. As discussed in LUCENE-7150 this issue will refactor GeoUtils to a new o.a.l.util.geo package in core that can be the home for other reusable spatial utility classes required by field and query implementations.

      1. LUCENE-7152.patch
        43 kB
        Nicholas Knize
      2. LUCENE-7152.patch
        42 kB
        Nicholas Knize

        Activity

        Hide
        dsmiley David Smiley added a comment -

        -1 sorry. I've been following LUCENE-7150 and it's unclear to me why GeoUtils should be added to core. It seems reasonable that the spatial3d module have a dependency on the spatial module. Likewise I expect spatial-extras will depend on the spatial module in the future too when a GeoPointSpatialStrategy gets written. If in the future for reasons we might not forsee today we would like to use GeoUtils in non-spatial modules (perhaps expressions?), then I could understand putting it in core. I just think it's better organized to keep spatial code in our spatial modules.

        Show
        dsmiley David Smiley added a comment - -1 sorry. I've been following LUCENE-7150 and it's unclear to me why GeoUtils should be added to core. It seems reasonable that the spatial3d module have a dependency on the spatial module. Likewise I expect spatial-extras will depend on the spatial module in the future too when a GeoPointSpatialStrategy gets written. If in the future for reasons we might not forsee today we would like to use GeoUtils in non-spatial modules (perhaps expressions?), then I could understand putting it in core. I just think it's better organized to keep spatial code in our spatial modules.
        Hide
        mikemccand Michael McCandless added a comment -

        David Smiley what if we limit it to very basic argument validation? I would start with just checkLatitude and checkLongitude, in core.

        I think that's the lesser evil, vs. additional dependencies between modules?

        Show
        mikemccand Michael McCandless added a comment - David Smiley what if we limit it to very basic argument validation? I would start with just checkLatitude and checkLongitude , in core. I think that's the lesser evil, vs. additional dependencies between modules?
        Hide
        dsmiley David Smiley added a comment -

        -0 ok and mark @lucene.internal Possibly rename to LatLonCheck?

        Show
        dsmiley David Smiley added a comment - -0 ok and mark @lucene.internal Possibly rename to LatLonCheck?
        Hide
        nknize Nicholas Knize added a comment -

        Initial quick patch includes the following:

        • refactors o.a.l.spatial.util.GeoUtils from spatial to core module in package o.a.l.geo
        • Keeps GeoRect in spatial module and refactors circleToBBox polyToBBox to GeoRect.fromPointDistance GeoRect.fromPolygon, respectively
        • Refactors axisLat and error constant to GeoRect
        • Removes some unnecessary sloppy math (e.g., sloppySin)
        Show
        nknize Nicholas Knize added a comment - Initial quick patch includes the following: refactors o.a.l.spatial.util.GeoUtils from spatial to core module in package o.a.l.geo Keeps GeoRect in spatial module and refactors circleToBBox polyToBBox to GeoRect.fromPointDistance GeoRect.fromPolygon , respectively Refactors axisLat and error constant to GeoRect Removes some unnecessary sloppy math (e.g., sloppySin)
        Hide
        rcmuir Robert Muir added a comment -

        I dont think we need to bring complexity to the user just to have sane apis. Making sandbox/ and geo3d/ and who knows what else depend on spatial/ is confusing.

        The main advantage of stuff like checkLatitude and checkLongitude is that it helps detect and prevent a very common trap caused by the immaturity of the geospatial world. I know it makes sense to them somehow in their insane brains, that some things should take lat/lon and others lon/lat, but in practice this causes major issues. So adding these parameter checks is a huge win for usability, and I do not think we should have to incur a huge usability loss (crazy module dependencies) to have it.

        I don't think we are talking about a lot of code here. For example the entirety of LatLonPoint/GeoPointField 2D geo just needs to be able to do basic stuff like form a bounding box (for distance query and efficient sorting), compute if something is in a polygon, etc. Its really just a few methods and actually the api surface area can be reduced more: GeoPointField can be refactored to look more like LatLonPoint: just one public class to do everything, and we can make all its queries etc package-private too.

        All in all, what are we talking about here? like 5 public classes to have this in a .geo package in core? I think in this day and age it is reasonable to treat this stuff as a core capability, it covers the majority use case and extra modules can have more esoteric features? I also think its important that other modules can do basic things efficiently without all of them depending on spatial. For example expressions lets you incorporate distance but it does not do it as efficiently as LatLonPoint's distance sort, but maybe it could with some refactoring. Why should it have to depend on spatial/ to compute a simple bounding box? This kind of thing leads to inconsistencies and code duplication.

        Show
        rcmuir Robert Muir added a comment - I dont think we need to bring complexity to the user just to have sane apis. Making sandbox/ and geo3d/ and who knows what else depend on spatial/ is confusing. The main advantage of stuff like checkLatitude and checkLongitude is that it helps detect and prevent a very common trap caused by the immaturity of the geospatial world. I know it makes sense to them somehow in their insane brains, that some things should take lat/lon and others lon/lat, but in practice this causes major issues. So adding these parameter checks is a huge win for usability, and I do not think we should have to incur a huge usability loss (crazy module dependencies) to have it. I don't think we are talking about a lot of code here. For example the entirety of LatLonPoint/GeoPointField 2D geo just needs to be able to do basic stuff like form a bounding box (for distance query and efficient sorting), compute if something is in a polygon, etc. Its really just a few methods and actually the api surface area can be reduced more: GeoPointField can be refactored to look more like LatLonPoint: just one public class to do everything, and we can make all its queries etc package-private too. All in all, what are we talking about here? like 5 public classes to have this in a .geo package in core? I think in this day and age it is reasonable to treat this stuff as a core capability, it covers the majority use case and extra modules can have more esoteric features? I also think its important that other modules can do basic things efficiently without all of them depending on spatial. For example expressions lets you incorporate distance but it does not do it as efficiently as LatLonPoint's distance sort, but maybe it could with some refactoring. Why should it have to depend on spatial/ to compute a simple bounding box? This kind of thing leads to inconsistencies and code duplication.
        Hide
        mikemccand Michael McCandless added a comment -

        I think in this day and age it is reasonable to treat this stuff as a core capability

        +1

        Show
        mikemccand Michael McCandless added a comment - I think in this day and age it is reasonable to treat this stuff as a core capability +1
        Hide
        mikemccand Michael McCandless added a comment -

        Initial quick patch includes the following:

        +1, thanks Nicholas Knize! This is really a bare minimal geo methods and one constant.

        Show
        mikemccand Michael McCandless added a comment - Initial quick patch includes the following: +1, thanks Nicholas Knize ! This is really a bare minimal geo methods and one constant.
        Hide
        mikemccand Michael McCandless added a comment -

        Removes some unnecessary sloppy math (e.g., sloppySin)

        Hmm, it looks like you replaced sloppySin with Math.sin? Should we maybe do that separately, and keep this change rote? I'm just not sure if there are performance implications with that change ...

        Show
        mikemccand Michael McCandless added a comment - Removes some unnecessary sloppy math (e.g., sloppySin) Hmm, it looks like you replaced sloppySin with Math.sin ? Should we maybe do that separately, and keep this change rote? I'm just not sure if there are performance implications with that change ...
        Hide
        rcmuir Robert Muir added a comment -

        I think we can do it separately, actually a "proper" sloppy sin is not really that bad, its more about api visibility, i hate that cos is already public and i don't think we should expose a trig library If e.g. bounding box code is in the same package as haversin, then this can be done package-private and its another story. But yeah, followup I think?

        Show
        rcmuir Robert Muir added a comment - I think we can do it separately, actually a "proper" sloppy sin is not really that bad, its more about api visibility, i hate that cos is already public and i don't think we should expose a trig library If e.g. bounding box code is in the same package as haversin, then this can be done package-private and its another story. But yeah, followup I think?
        Hide
        rcmuir Robert Muir added a comment -

        Just wasted some fun time fighting the build system because my patch in LUCENE-7153 exposes a class in a sandbox/ public api from spatial/

        These kinds of things are important to think about when considering "spatial module madness" versus just e.g. having 5 key classes in core. It is not just harder on the user, it is harder on us too. Fortunately, I know the necessary voodoo.

        Show
        rcmuir Robert Muir added a comment - Just wasted some fun time fighting the build system because my patch in LUCENE-7153 exposes a class in a sandbox/ public api from spatial/ These kinds of things are important to think about when considering "spatial module madness" versus just e.g. having 5 key classes in core. It is not just harder on the user, it is harder on us too. Fortunately, I know the necessary voodoo.
        Hide
        nknize Nicholas Knize added a comment -

        Updated patch:

        • rebased with polygon updates
        • rebased with mean radius
        • added back sloppy sin (to be changed in a separate issue)
        Show
        nknize Nicholas Knize added a comment - Updated patch: rebased with polygon updates rebased with mean radius added back sloppy sin (to be changed in a separate issue)
        Hide
        dsmiley David Smiley added a comment -

        +1 to your patch Nick.

        I think in this day and age it is reasonable to treat this stuff as a core capability

        If highlighting, suggesting, etc. aren't a "core" part of search (thus Lucene core) yet spatial somehow is... well, ... I'm rendered speechless. Spatial is cool and important; many (but not all!) users want it. But this is so for just about everything else we have. I'm trying to keep things organized. It'd be confusing to have some spatial in core and some not in core. And if somehow it makes things harder for spandbox particularly, I think that's no reason to base decisions on.

        "spatial module madness"

        I don't think the current situation is "madness" but you do... okay... yet isn't it self-imposed? I was always a fan of keeping all our spatial code in one spatial module. Clearly that's simple for users to grok too.

        Show
        dsmiley David Smiley added a comment - +1 to your patch Nick. I think in this day and age it is reasonable to treat this stuff as a core capability If highlighting, suggesting, etc. aren't a "core" part of search (thus Lucene core) yet spatial somehow is... well, ... I'm rendered speechless. Spatial is cool and important; many (but not all!) users want it. But this is so for just about everything else we have. I'm trying to keep things organized. It'd be confusing to have some spatial in core and some not in core. And if somehow it makes things harder for spandbox particularly, I think that's no reason to base decisions on. "spatial module madness" I don't think the current situation is "madness" but you do... okay... yet isn't it self-imposed? I was always a fan of keeping all our spatial code in one spatial module. Clearly that's simple for users to grok too.
        Hide
        rcmuir Robert Muir added a comment -

        I don't think the current situation is "madness" but you do... okay... yet isn't it self-imposed? I was always a fan of keeping all our spatial code in one spatial module. Clearly that's simple for users to grok too.

        Because putting all stuff (both 99% case and more esoteric stuff) in a module does not solve the problem. It means as i just mentioned above, things like expressions have to rely on spatial/ just to do very basic things like make a bounding box.

        This is why i think the 99% basic use case should be in core. It keeps things from being a rat's nest.

        Show
        rcmuir Robert Muir added a comment - I don't think the current situation is "madness" but you do... okay... yet isn't it self-imposed? I was always a fan of keeping all our spatial code in one spatial module. Clearly that's simple for users to grok too. Because putting all stuff (both 99% case and more esoteric stuff) in a module does not solve the problem. It means as i just mentioned above, things like expressions have to rely on spatial/ just to do very basic things like make a bounding box. This is why i think the 99% basic use case should be in core. It keeps things from being a rat's nest.
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 3f217aba6d4422d829be5ad77b02068c130dc7d3 in lucene-solr's branch refs/heads/master from nknize
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3f217ab ]

        LUCENE-7152: Refactor GeoUtils from lucene-spatial to core module.

        Show
        jira-bot ASF subversion and git services added a comment - Commit 3f217aba6d4422d829be5ad77b02068c130dc7d3 in lucene-solr's branch refs/heads/master from nknize [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3f217ab ] LUCENE-7152 : Refactor GeoUtils from lucene-spatial to core module.
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 637dce83e29fe23735256f25f0c9881ed2546520 in lucene-solr's branch refs/heads/branch_6x from nknize
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=637dce8 ]

        LUCENE-7152: Refactor GeoUtils from lucene-spatial to core module.

        Show
        jira-bot ASF subversion and git services added a comment - Commit 637dce83e29fe23735256f25f0c9881ed2546520 in lucene-solr's branch refs/heads/branch_6x from nknize [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=637dce8 ] LUCENE-7152 : Refactor GeoUtils from lucene-spatial to core module.
        Hide
        hossman Hoss Man added a comment -

        Manually correcting fixVersion per Step #S5 of LUCENE-7271

        Show
        hossman Hoss Man added a comment - Manually correcting fixVersion per Step #S5 of LUCENE-7271

          People

          • Assignee:
            Unassigned
            Reporter:
            nknize Nicholas Knize
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development