Lucene - Core / LUCENE-6647

Add GeoHash String Utilities to core GeoUtils

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.3, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      GeoPointField uses Morton encoding to efficiently pack lat/lon values into a single long. Geohashing effectively does the same thing, but uses base-32 encoding to represent this long value as a human-readable string. Many user applications already use the string representation of the hash. This issue simply adds the base-32 string representation of the already computed Morton code.
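The relationship between a Morton code and its geohash string can be sketched as follows. This is a minimal illustration, not the code from the attached patches: the class name `GeoHashSketch`, its method names, and the choice of 30 bits per dimension (60 bits, i.e. 12 base-32 characters) are assumptions made for this example. It follows the usual geohash convention of starting with a longitude bit and using the alphabet `0123456789bcdefghjkmnpqrstuvwxyz`.

```java
final class GeoHashSketch {
    // Standard geohash alphabet (digits plus lowercase letters without a, i, l, o).
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";
    // Assumed precision for this sketch: 30 bits per dimension = 60 bits = 12 characters.
    private static final int BITS = 30;

    /** Maps value in [min, max] to an integer cell index in [0, 2^BITS - 1]. */
    static long quantize(double value, double min, double max) {
        long cells = 1L << BITS;
        long q = (long) Math.floor((value - min) / (max - min) * cells);
        return Math.max(0L, Math.min(q, cells - 1)); // clamp so value == max stays in range
    }

    /** Moves bit i of v to bit 2*i, leaving a zero between consecutive bits. */
    static long spread(long v) {
        long r = 0L;
        for (int i = 0; i < BITS; i++) {
            r |= ((v >>> i) & 1L) << (2 * i);
        }
        return r;
    }

    /** Interleaves lon and lat bits; the longitude bit is the higher bit of each pair. */
    static long morton(double lat, double lon) {
        return (spread(quantize(lon, -180.0, 180.0)) << 1)
                | spread(quantize(lat, -90.0, 90.0));
    }

    /** Renders the top 5 * length bits of a 60-bit Morton code as base-32 characters. */
    static String toGeoHash(long morton, int length) {
        if (length < 1 || length > 12) {
            throw new IllegalArgumentException("length must be in 1..12");
        }
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            int shift = 2 * BITS - 5 * (i + 1);
            sb.append(BASE32.charAt((int) ((morton >>> shift) & 0x1F)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toGeoHash(morton(0.0, 0.0), 12)); // prints "s00000000000"
    }
}
```

Because the string is just the leading bits of the interleaved value in 5-bit groups, a shorter geohash is always a prefix of a longer one for the same point.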

      1. LUCENE-6647.patch
        16 kB
        Nicholas Knize
      2. LUCENE-6647.patch
        16 kB
        Nicholas Knize
      3. LUCENE-6647.patch
        17 kB
        Nicholas Knize
      4. LUCENE-6647.patch
        4 kB
        Nicholas Knize

        Activity

        Nicholas Knize added a comment -

        Initial patch that adds GeoHash string utilities to GeoUtils.java

        Currently only tested and validated against Elasticsearch. Will add unit tests to next patch.

        Nicholas Knize added a comment -

        Updated GeoHash patch with unit tests.

        Michael McCandless added a comment -

        Thanks Nicholas Knize, the geohash utilities and tests look good.

        But I hit this test failure:

           [junit4] Suite: org.apache.lucene.search.TestGeoPointQuery
           [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestGeoPointQuery -Dtests.method=testWholeMap -Dtests.seed=4949D67148502A2 -Dtests.locale=it -Dtests.timezone=Australia/Canberra -Dtests.asserts=true -Dtests.file.encoding=UTF-8
           [junit4] FAILURE 1.79s J3 | TestGeoPointQuery.testWholeMap <<<
           [junit4]    > Throwable #1: java.lang.AssertionError: testWholeMap failed expected:<15> but was:<16>
           [junit4]    > 	at __randomizedtesting.SeedInfo.seed([4949D67148502A2:825F170DAFB39C04]:0)
           [junit4]    > 	at org.apache.lucene.search.TestGeoPointQuery.testWholeMap(TestGeoPointQuery.java:181)
           [junit4]    > 	at java.lang.Thread.run(Thread.java:745)
           [junit4] IGNOR/A 0.02s J3 | TestGeoPointQuery.testRandomBig
           [junit4]    > Assumption #1: 'nightly' test group is disabled (@Nightly())
           [junit4]   2> NOTE: test params are: codec=Asserting(Lucene53): {id=BlockTreeOrds(blocksize=128), geoField=Lucene50(blocksize=128)}, docValues:{id=DocValuesFormat(name=Lucene50)}, sim=RandomSimilarityProvider(queryNorm=false,coord=crazy): {}, locale=it, timezone=Australia/Canberra
           [junit4]   2> NOTE: Linux 3.13.0-46-generic amd64/Oracle Corporation 1.8.0_40 (64-bit)/cpus=8,threads=1,free=310567944,total=451936256
           [junit4]   2> NOTE: All tests run in this JVM: [TestSlowFuzzyQuery, TestDocValuesNumbersQuery, TestJakartaRegexpCapabilities, TestDocValuesTermsQuery, TestGeoPointQuery]
           [junit4] Completed [14/15] on J3 in 3.40s, 12 tests, 1 failure, 1 skipped <<< FAILURES!
        
        Nicholas Knize added a comment (edited) -

        The latest patch for LUCENE-6704 changes the Morton encoding to use full 32-bit precision for lat/lon values. This fixes the issue where the max lat/lon was not decoding to the correct precision, which led to the failure posted above. A patch compatible with the changes from LUCENE-6704 will be posted here.

        Nicholas Knize added a comment -

        Updated patch that depends on LUCENE-6704 (changes Morton encoding to use the full 64 bits, 32 bits each for lat and lon).
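For illustration only, a Morton encoding that spends the full 64 bits of a long, 32 bits per dimension, might look like the sketch below. It is not taken from the LUCENE-6704 or LUCENE-6647 patches: the class `Morton64Sketch`, its method names, the bit layout (longitude in the odd bit positions), and decoding to the lower edge of a cell are all assumptions. The clamp in `quantize` shows one way to keep the maximum lat/lon inside the last cell.

```java
final class Morton64Sketch {
    private static final double CELLS = 4294967296.0; // 2^32 cells per dimension

    /** Maps value in [min, max] to a 32-bit cell index, clamping value == max into the last cell. */
    static long quantize(double value, double min, double max) {
        long q = (long) Math.floor((value - min) / (max - min) * CELLS);
        return Math.max(0L, Math.min(q, 0xFFFFFFFFL));
    }

    /** Spreads the low 32 bits of v into the even bit positions of a long. */
    static long spread(long v) {
        v &= 0xFFFFFFFFL;
        v = (v | (v << 16)) & 0x0000FFFF0000FFFFL;
        v = (v | (v << 8)) & 0x00FF00FF00FF00FFL;
        v = (v | (v << 4)) & 0x0F0F0F0F0F0F0F0FL;
        v = (v | (v << 2)) & 0x3333333333333333L;
        v = (v | (v << 1)) & 0x5555555555555555L;
        return v;
    }

    /** Inverse of spread: gathers the even bit positions back into the low 32 bits. */
    static long compact(long v) {
        v &= 0x5555555555555555L;
        v = (v | (v >>> 1)) & 0x3333333333333333L;
        v = (v | (v >>> 2)) & 0x0F0F0F0F0F0F0F0FL;
        v = (v | (v >>> 4)) & 0x00FF00FF00FF00FFL;
        v = (v | (v >>> 8)) & 0x0000FFFF0000FFFFL;
        v = (v | (v >>> 16)) & 0x00000000FFFFFFFFL;
        return v;
    }

    /** Packs lat/lon into one long: lon in the odd bit positions, lat in the even ones. */
    static long encode(double lat, double lon) {
        return (spread(quantize(lon, -180.0, 180.0)) << 1)
                | spread(quantize(lat, -90.0, 90.0));
    }

    /** Decodes to the lower edge of the latitude cell. */
    static double decodeLat(long morton) {
        return -90.0 + compact(morton) / CELLS * 180.0;
    }

    /** Decodes to the lower edge of the longitude cell. */
    static double decodeLon(long morton) {
        return -180.0 + compact(morton >>> 1) / CELLS * 360.0;
    }
}
```

With 32 bits per dimension, the decoded value differs from the input by at most one cell width: 360 / 2^32 degrees of longitude and 180 / 2^32 degrees of latitude.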

        Michael McCandless added a comment -

        "This fixes the issue where the max lat/lon was not decoding to the correct precision leading to the failure posted above."

        Hmm, can you open a new issue whose sole purpose is to cut over to full 32-bit precision for lat/lon? LUCENE-6704 is about avoiding OOME (or is the full 32-bit precision necessary to avoid OOME?) ... then we can decouple these issues? It's hard enough keeping track of all the in-flight patches without some depending on others...

        Nicholas Knize added a comment -

        "can you open a new issue whose sole purpose is to cutover to full 32 bit precision for lat/lon?"

        LUCENE-6710 adds full 32-bit precision, decoupling this issue from LUCENE-6704.

        Patch attached to make GeoHashUtils independent of bit precision. Unit test provided.

        Michael McCandless added a comment -

        Thanks Nicholas Knize, I'll commit shortly...

        ASF subversion and git services added a comment -

        Commit 1693700 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1693700 ]

        LUCENE-6647: add GeoHash string utility APIs

        ASF subversion and git services added a comment -

        Commit 1693702 from Michael McCandless in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1693702 ]

        LUCENE-6647: add GeoHash string utility APIs

        Shalin Shekhar Mangar added a comment -

        Bulk close for 5.3.0 release


          People

          • Assignee:
            Unassigned
            Reporter:
            Nicholas Knize