Lucene - Core
  1. Lucene - Core
  2. LUCENE-2139

Cleanup and Improvement of Spatial Contrib

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Won't Fix
    • Affects Version/s: 3.1
    • Fix Version/s: None
    • Component/s: modules/spatial
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The current spatial contrib can be improved by adding documentation, tests, removing unused classes and code, repackaging the classes and improving the performance of the distance filtering. The latter will incorporate the multi-threaded functionality introduced in LUCENE-1732.

      Other improvements involve adding better support for different distance units, different distance calculators and different data formats (whether it be lat/long fields, geohashes, or something else in the future).

      Patch to be added soon.

      1. LUCENE-2139.patch
        252 kB
        Chris Male
      2. LUCENE-2139-Java5Only.patch
        254 kB
        Chris Male
      3. LUCENE-2139-svnScript.sh
        4 kB
        Chris Male

        Issue Links

          Activity

          Gavin made changes -
          Link This issue depends upon LUCENE-2149 [ LUCENE-2149 ]
          Gavin made changes -
          Link This issue depends on LUCENE-2149 [ LUCENE-2149 ]
          Gavin made changes -
          Link This issue depends upon LUCENE-2151 [ LUCENE-2151 ]
          Gavin made changes -
          Link This issue depends on LUCENE-2151 [ LUCENE-2151 ]
          Gavin made changes -
          Link This issue depends upon LUCENE-2148 [ LUCENE-2148 ]
          Gavin made changes -
          Link This issue depends on LUCENE-2148 [ LUCENE-2148 ]
          Gavin made changes -
          Link This issue depends upon LUCENE-2152 [ LUCENE-2152 ]
          Gavin made changes -
          Link This issue depends on LUCENE-2152 [ LUCENE-2152 ]
          Gavin made changes -
          Link This issue depends upon LUCENE-2174 [ LUCENE-2174 ]
          Gavin made changes -
          Link This issue depends on LUCENE-2174 [ LUCENE-2174 ]
          Gavin made changes -
          Link This issue depends upon LUCENE-2173 [ LUCENE-2173 ]
          Gavin made changes -
          Link This issue depends on LUCENE-2173 [ LUCENE-2173 ]
          Gavin made changes -
          Link This issue depends upon LUCENE-2175 [ LUCENE-2175 ]
          Gavin made changes -
          Link This issue depends on LUCENE-2175 [ LUCENE-2175 ]
          Mark Thomas made changes -
          Workflow Default workflow, editable Closed status [ 12564667 ] jira [ 12584666 ]
          Mark Thomas made changes -
          Workflow jira [ 12484200 ] Default workflow, editable Closed status [ 12564667 ]
          Hide
          Chris Male added a comment -

          See LUCENE-2599 (which I'd opened up awhile back apparently) for work on deprecating / removing the code.

          Show
          Chris Male added a comment - See LUCENE-2599 (which I'd opened up awhile back apparently) for work on deprecating / removing the code.
          Chris Male made changes -
          Status Open [ 1 ] Closed [ 6 ]
          Assignee Simon Willnauer [ simonw ] Chris Male [ cmale ]
          Resolution Won't Fix [ 2 ]
          Hide
          Chris Male added a comment -

          Closing as we will look into improvements in the future spatial module.

          Show
          Chris Male added a comment - Closing as we will look into improvements in the future spatial module.
          Hide
          Chris Male added a comment -

          I have to echo Simon's comments here. Furthermore, we are deprecating in 3x so we do give users time to upgrade.

          I'll close this out now.

          Show
          Chris Male added a comment - I have to echo Simon's comments here. Furthermore, we are deprecating in 3x so we do give users time to upgrade. I'll close this out now.
          Hide
          Simon Willnauer added a comment -

          Pity to remove code before replacing one is still not here : what will the users do ?

          While I can understand you reasons why this is a pity i have to say that the code you are referring to is a big mess currently. In 4.0 we have the possibility to get rid of that stuff an let folks develop something cleaner, maybe better, tested, documented and already integrated (Solr). AFAIK there is already a whole bunch of stuff in solr related to geo search and we can move that stuff out into a module to make it available to Lucene users too.

          so here is my +1 for closing the issues, deprecate everything in 3.0 nuke spatial in 4.0. Chris, do you want to go ahead and create the issues?

          simon

          Show
          Simon Willnauer added a comment - Pity to remove code before replacing one is still not here : what will the users do ? While I can understand you reasons why this is a pity i have to say that the code you are referring to is a big mess currently. In 4.0 we have the possibility to get rid of that stuff an let folks develop something cleaner, maybe better, tested, documented and already integrated (Solr). AFAIK there is already a whole bunch of stuff in solr related to geo search and we can move that stuff out into a module to make it available to Lucene users too. so here is my +1 for closing the issues, deprecate everything in 3.0 nuke spatial in 4.0. Chris, do you want to go ahead and create the issues? simon
          Hide
          Nicolas Helleringer added a comment -

          Yep - but they are normally tri-range (i.e. Lucene Numeric) ranges, so pretty efficient anyway.

          From what I saw of the existing tier code, I can't say that it would have been more efficient... and I can say that at least for some spatial queries, it would have been less efficient (since it always seemed to use one tier level for a query, not multiple).

          From personnal experience I do not aggre as I deployed lucene tier spatial to solve performance issue with a double range query on two fields
          But that is not factual : i do aggree with Chris that the better is to work on benchmarking to 'prove' this and movve forward.

          @deprecate the contrib in 3x, nuke in 4x

          Pity to remove code before replacing one is still not here : what will the users do ?

          The more important thus is to move forward : I will be pleased to help on trying to port tier support if anybody is interessed in it to the new section where spatial lives

          Show
          Nicolas Helleringer added a comment - Yep - but they are normally tri-range (i.e. Lucene Numeric) ranges, so pretty efficient anyway. From what I saw of the existing tier code, I can't say that it would have been more efficient... and I can say that at least for some spatial queries, it would have been less efficient (since it always seemed to use one tier level for a query, not multiple). From personnal experience I do not aggre as I deployed lucene tier spatial to solve performance issue with a double range query on two fields But that is not factual : i do aggree with Chris that the better is to work on benchmarking to 'prove' this and movve forward. @deprecate the contrib in 3x, nuke in 4x Pity to remove code before replacing one is still not here : what will the users do ? The more important thus is to move forward : I will be pleased to help on trying to port tier support if anybody is interessed in it to the new section where spatial lives
          Hide
          Chris Male added a comment -

          I'd say that thats what we want to determine some way as part of a redo of the spatial code. Maybe we need to include some benchmarking functionality.

          Show
          Chris Male added a comment - I'd say that thats what we want to determine some way as part of a redo of the spatial code. Maybe we need to include some benchmarking functionality.
          Hide
          Yonik Seeley added a comment -

          As far as I understand the spatial search is done by two range queries ANDed

          Yep - but they are normally tri-range (i.e. Lucene Numeric) ranges, so pretty efficient anyway.
          From what I saw of the existing tier code, I can't say that it would have been more efficient... and I can say that at least for some spatial queries, it would have been less efficient (since it always seemed to use one tier level for a query, not multiple).

          Show
          Yonik Seeley added a comment - As far as I understand the spatial search is done by two range queries ANDed Yep - but they are normally tri-range (i.e. Lucene Numeric) ranges, so pretty efficient anyway. From what I saw of the existing tier code, I can't say that it would have been more efficient... and I can say that at least for some spatial queries, it would have been less efficient (since it always seemed to use one tier level for a query, not multiple).
          Hide
          Chris Male added a comment -

          So we're good with closing? I'll close all associated issues too and open a new one for the deprecation/nuking/re-birth work.

          Show
          Chris Male added a comment - So we're good with closing? I'll close all associated issues too and open a new one for the deprecation/nuking/re-birth work.
          Hide
          Chris Male added a comment -

          +1 on deprecation (I thought we already decided that!)

          Ha yes quite. Good to be clear about it again

          As for tier search, there is an open issue to the use the more or less standard (or, at least, shall we say the well documented) military grid system.

          LUCENE-2646

          Show
          Chris Male added a comment - +1 on deprecation (I thought we already decided that!) Ha yes quite. Good to be clear about it again As for tier search, there is an open issue to the use the more or less standard (or, at least, shall we say the well documented) military grid system. LUCENE-2646
          Hide
          Grant Ingersoll added a comment -

          +1 on deprecation (I thought we already decided that!)

          As for tier search, there is an open issue to the use the more or less standard (or, at least, shall we say the well documented) military grid system.

          Show
          Grant Ingersoll added a comment - +1 on deprecation (I thought we already decided that!) As for tier search, there is an open issue to the use the more or less standard (or, at least, shall we say the well documented) military grid system.
          Hide
          Chris Male added a comment -

          Nicolas,

          Its true that Solr does not support spatial tier search but this was due to the code in spatial lucene being so buggy. I think everybody is happy to revisit the issue and add spatial tier support to Solr if we can come up with a clear, test and bug-free underlying implementation.

          I think whatever work we do we should do in a spatial module.

          Simon,
          Perhaps we should @deprecate the contrib in 3x, nuke in 4x and start up the module?

          Show
          Chris Male added a comment - Nicolas, Its true that Solr does not support spatial tier search but this was due to the code in spatial lucene being so buggy. I think everybody is happy to revisit the issue and add spatial tier support to Solr if we can come up with a clear, test and bug-free underlying implementation. I think whatever work we do we should do in a spatial module. Simon, Perhaps we should @deprecate the contrib in 3x, nuke in 4x and start up the module?
          Hide
          Nicolas Helleringer added a comment -

          What I do not see today in Solr spatial awareness is spatial indexation with tiers

          As far as I understand the spatial search is done by two range queries ANDed

          Such a search could lead to less performance on big set of data

          I am no sure if a two phase filtering is possible on the Solr lvl with phase 1 been a Tier search a phase 2 a distance refining.

          Show
          Nicolas Helleringer added a comment - What I do not see today in Solr spatial awareness is spatial indexation with tiers As far as I understand the spatial search is done by two range queries ANDed Such a search could lead to less performance on big set of data I am no sure if a two phase filtering is possible on the Solr lvl with phase 1 been a Tier search a phase 2 a distance refining.
          Hide
          Simon Willnauer added a comment -

          I say we close these issues and begin moving what code we feel is salvagable to a spatial module where we can work closely with the Solr code.

          I was actually hoping somebody says that before I write to the list and get my head cut off I think that is the best idea. we should nuke the spatial contrib entirely and start from zero in the modules area. IMO we should move all existing stuff from solr into there and let glue code reside under solr.

          I hope I have time to write that up and propose it on the list.

          simon

          Show
          Simon Willnauer added a comment - I say we close these issues and begin moving what code we feel is salvagable to a spatial module where we can work closely with the Solr code. I was actually hoping somebody says that before I write to the list and get my head cut off I think that is the best idea. we should nuke the spatial contrib entirely and start from zero in the modules area. IMO we should move all existing stuff from solr into there and let glue code reside under solr. I hope I have time to write that up and propose it on the list. simon
          Hide
          Chris Male added a comment -

          I think the overall direction of Spatial Lucene has changed considerably since this work was done. There have been considerable issues found with the Cartesian Tier code which this patch doesnt address. Equally, issues of stateful Filters has brought to attention that the SpatialFilter breaks alot of Filter principals.

          I say we close these issues and begin moving what code we feel is salvagable to a spatial module where we can work closely with the Solr code.

          Show
          Chris Male added a comment - I think the overall direction of Spatial Lucene has changed considerably since this work was done. There have been considerable issues found with the Cartesian Tier code which this patch doesnt address. Equally, issues of stateful Filters has brought to attention that the SpatialFilter breaks alot of Filter principals. I say we close these issues and begin moving what code we feel is salvagable to a spatial module where we can work closely with the Solr code.
          Hide
          Nicolas Helleringer added a comment -

          Simon,

          I am willing to work on this and i did.

          I did post some patches in April that were commited by Grant and then reverted waiting for a global status.

          Some isssues are still marked as resolved but the code is not commited anymore (LUCENE-2366).

          Fact : I did apply the patchs I had proposed to one of my client spatial lucene installtion and it is still in production with success.

          I do want help on that spatial subject

          Just tell me want you want to be addressed and I ll work on it.

          Should I update the patches I did propose ?
          It is the good way to do ? Grant seemed not to be happy with them back in April.

          Show
          Nicolas Helleringer added a comment - Simon, I am willing to work on this and i did. I did post some patches in April that were commited by Grant and then reverted waiting for a global status. Some isssues are still marked as resolved but the code is not commited anymore ( LUCENE-2366 ). Fact : I did apply the patchs I had proposed to one of my client spatial lucene installtion and it is still in production with success. I do want help on that spatial subject Just tell me want you want to be addressed and I ll work on it. Should I update the patches I did propose ? It is the good way to do ? Grant seemed not to be happy with them back in April.
          Hide
          Simon Willnauer added a comment -

          Are we still working on this issue? It seems to be somewhat outdated though and I am not sure about the geospatial status in solr - any ideas? If we do not work on this I would like to close as won't fix.

          Show
          Simon Willnauer added a comment - Are we still working on this issue? It seems to be somewhat outdated though and I am not sure about the geospatial status in solr - any ideas? If we do not work on this I would like to close as won't fix.
          Grant Ingersoll made changes -
          Link This issue relates to SOLR-773 [ SOLR-773 ]
          Uwe Schindler made changes -
          Affects Version/s 3.1 [ 12314822 ]
          Affects Version/s 4.0 [ 12314025 ]
          Hide
          Chris Male added a comment -

          Linking deprecations issue.

          Show
          Chris Male added a comment - Linking deprecations issue.
          Chris Male made changes -
          Link This issue depends on LUCENE-2175 [ LUCENE-2175 ]
          Hide
          Chris Male added a comment -

          Linking SpatialFilter and SortSource issue.

          Show
          Chris Male added a comment - Linking SpatialFilter and SortSource issue.
          Chris Male made changes -
          Link This issue depends on LUCENE-2174 [ LUCENE-2174 ]
          Hide
          Chris Male added a comment -

          Linked cartesian tier code issue.

          Show
          Chris Male added a comment - Linked cartesian tier code issue.
          Chris Male made changes -
          Link This issue depends on LUCENE-2173 [ LUCENE-2173 ]
          Hide
          Chris Male added a comment -

          Linked in distance filtering and data format issue.

          Show
          Chris Male added a comment - Linked in distance filtering and data format issue.
          Chris Male made changes -
          Link This issue depends on LUCENE-2152 [ LUCENE-2152 ]
          Hide
          Chris Male added a comment -

          Linked distance calculator issue.

          Show
          Chris Male added a comment - Linked distance calculator issue.
          Chris Male made changes -
          Link This issue depends on LUCENE-2151 [ LUCENE-2151 ]
          Hide
          Chris Male added a comment -

          Linked LatLng issue.

          Show
          Chris Male added a comment - Linked LatLng issue.
          Chris Male made changes -
          Link This issue depends on LUCENE-2149 [ LUCENE-2149 ]
          Hide
          Chris Male added a comment -

          Linked Point2D and Rectangle issue

          Show
          Chris Male added a comment - Linked Point2D and Rectangle issue
          Chris Male made changes -
          Link This issue depends on LUCENE-2148 [ LUCENE-2148 ]
          Hide
          Chris Male added a comment -

          Linked util issue

          Show
          Chris Male added a comment - Linked util issue
          Chris Male made changes -
          Link This issue depends upon LUCENE-2147 [ LUCENE-2147 ]
          Hide
          Chris Male added a comment -

          I will now create a version of the patch that does not move, rename or delete any classes as that dilutes the patches. I will break the patch down into some logical pieces with separate sub issues and link them here.

          Show
          Chris Male added a comment - I will now create a version of the patch that does not move, rename or delete any classes as that dilutes the patches. I will break the patch down into some logical pieces with separate sub issues and link them here.
          Chris Male made changes -
          Attachment LUCENE-2139-svnScript.sh [ 12427704 ]
          Hide
          Chris Male added a comment -

          Attaching simple bash script which executes a series of SVN commands to clean out the existing module and rename most of the files to what they will be in the next patch. The only file that still needs to be renamed later in TestCartesian.java that will be renamed to TestSpatialFilter.java once SpatialFilter exists.

          Script assumes you are existing it from the root directory of lucene checkout.

          Show
          Chris Male added a comment - Attaching simple bash script which executes a series of SVN commands to clean out the existing module and rename most of the files to what they will be in the next patch. The only file that still needs to be renamed later in TestCartesian.java that will be renamed to TestSpatialFilter.java once SpatialFilter exists. Script assumes you are existing it from the root directory of lucene checkout.
          Hide
          Michael McCandless added a comment -

          This looks like a great step forward for spatial; thanks Chris!

          Show
          Michael McCandless added a comment - This looks like a great step forward for spatial; thanks Chris!
          Hide
          Chris Male added a comment -

          Hi Simon,

          Alright that sounds great. I will work on creating a list of SVN move / copies commands followed by a patch to get the existing contrib into the structure that I want. I will then break this big patch down into a handful of smaller patches.

          Show
          Chris Male added a comment - Hi Simon, Alright that sounds great. I will work on creating a list of SVN move / copies commands followed by a patch to get the existing contrib into the structure that I want. I will then break this big patch down into a handful of smaller patches.
          Hide
          Simon Willnauer added a comment - - edited

          Thank you Chris!
          Lets go with small issues from here. I suggest we start with a svn move / copy issue that bring names and classes in the right place. With the size and the number of changes in patch this make the most sense to me. The good thing with spatial is that we can change class names and locations / packages as all the API is marked experimental. Once we have this in place people can easily apply the patch and join the issues.
          If this is done I suggest to start from top to bottom package wise which will guarantee that we have a solid package strucuture and will make patches and dependency handling way easier.
          I want to drive towards a solid contrib module (you code is a huge step towards) so making small steps will eventually bring us that way. IMO it will also speed things up as it is easier to review and to commit.

          Once we fix the smaller issues we can add entries to CHANGES.TXT for each one and combine them towards the end into just a couple. That way we do not miss anything which is important for users even if the API is still in flux.
          We did similar things for moving from Java 1.4 to Java 1.5 and with all the Unicode features.

          Thoughts?

          Show
          Simon Willnauer added a comment - - edited Thank you Chris! Lets go with small issues from here. I suggest we start with a svn move / copy issue that bring names and classes in the right place. With the size and the number of changes in patch this make the most sense to me. The good thing with spatial is that we can change class names and locations / packages as all the API is marked experimental. Once we have this in place people can easily apply the patch and join the issues. If this is done I suggest to start from top to bottom package wise which will guarantee that we have a solid package strucuture and will make patches and dependency handling way easier. I want to drive towards a solid contrib module (you code is a huge step towards) so making small steps will eventually bring us that way. IMO it will also speed things up as it is easier to review and to commit. Once we fix the smaller issues we can add entries to CHANGES.TXT for each one and combine them towards the end into just a couple. That way we do not miss anything which is important for users even if the API is still in flux. We did similar things for moving from Java 1.4 to Java 1.5 and with all the Unicode features. Thoughts?
          Chris Male made changes -
          Attachment LUCENE-2139-Java5Only.patch [ 12427594 ]
          Hide
          Chris Male added a comment -

          Attached updated version of the patch which is now compiled and tested with Java 5. Difference is replacing the LinkedBlockingDeque with a LinkedBlockingQueue, and fixing some weirdness with a single threaded ThreadPoolExecutor.

          Will tackle the structural problems now.

          Show
          Chris Male added a comment - Attached updated version of the patch which is now compiled and tested with Java 5. Difference is replacing the LinkedBlockingDeque with a LinkedBlockingQueue, and fixing some weirdness with a single threaded ThreadPoolExecutor. Will tackle the structural problems now.
          Hide
          Simon Willnauer added a comment -

          Chris, I have a couple of issues with your patch. It seems that you renamed a couple of files which doesn't work well with patches for some reason. I will comment on this later again.
          The other thing is that you use 1.6 classes like [http://java.sun.com/javase/6/docs/api/java/util/concurrent/LinkedBlockingDeque.html|LinkedBlockingDeque] but we should try to keep the contrib 1.5 dependent.
          could you fix those 1.6 references please.

          simon

          Show
          Simon Willnauer added a comment - Chris, I have a couple of issues with your patch. It seems that you renamed a couple of files which doesn't work well with patches for some reason. I will comment on this later again. The other thing is that you use 1.6 classes like [http://java.sun.com/javase/6/docs/api/java/util/concurrent/LinkedBlockingDeque.html|LinkedBlockingDeque] but we should try to keep the contrib 1.5 dependent. could you fix those 1.6 references please. simon
          Chris Male made changes -
          Link This issue is related to LUCENE-1930 [ LUCENE-1930 ]
          Chris Male made changes -
          Link This issue is related to LUCENE-1934 [ LUCENE-1934 ]
          Hide
          Chris Male added a comment -

          I have also included LUCENE-1934 in this, and tried to include LUCENE-1930 but was unable to get 1930 to work.

          Show
          Chris Male added a comment - I have also included LUCENE-1934 in this, and tried to include LUCENE-1930 but was unable to get 1930 to work.
          Chris Male made changes -
          Attachment LUCENE-2139.patch [ 12427487 ]
          Hide
          Chris Male added a comment -

          Added patch.

          Couple of TODOs still noted in the patch related to distances. We need to decide what distances we are going to use for the radius and circumference of the Earth and then use them in SpatialConstants. Currently the SpatialConstants values are taken from Wikipedia and other sites, yet differ from some of the distances in the coded.

          Also the patch doesn't seem to remove a couple of empty packages. Too many changes in 1 patch confusing the IDE I think. Help cleaning this up would be appreciated.

          Show
          Chris Male added a comment - Added patch. Couple of TODOs still noted in the patch related to distances. We need to decide what distances we are going to use for the radius and circumference of the Earth and then use them in SpatialConstants. Currently the SpatialConstants values are taken from Wikipedia and other sites, yet differ from some of the distances in the coded. Also the patch doesn't seem to remove a couple of empty packages. Too many changes in 1 patch confusing the IDE I think. Help cleaning this up would be appreciated.
          Simon Willnauer made changes -
          Link This issue is related to SOLR-1568 [ SOLR-1568 ]
          Simon Willnauer made changes -
          Link This issue is related to LUCENE-2081 [ LUCENE-2081 ]
          Hide
          Simon Willnauer added a comment -

          linked related issues

          Show
          Simon Willnauer added a comment - linked related issues
          Simon Willnauer made changes -
          Link This issue is related to LUCENE-1747 [ LUCENE-1747 ]
          Hide
          Simon Willnauer added a comment -

          cant wait to see you patch - its gonna be huge I guess

          I will be here to help you splitting it apart and get you good work into contrib/spatial

          Show
          Simon Willnauer added a comment - cant wait to see you patch - its gonna be huge I guess I will be here to help you splitting it apart and get you good work into contrib/spatial
          Simon Willnauer made changes -
          Field Original Value New Value
          Assignee Simon Willnauer [ simonw ]
          Chris Male created issue -

            People

            • Assignee:
              Chris Male
              Reporter:
              Chris Male
            • Votes:
              1 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development