Lucene - Core / LUCENE-1815

Geohash encode/decode floating point problems

    Details

    • Type: Bug
    • Status: Reopened
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.9
    • Fix Version/s: None
    • Component/s: modules/spatial
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      I'm finding the Geohash support in the spatial package to be rather unreliable.
      Here is the output of a test that repeatedly decodes and re-encodes the same lat/lon and geohash.
      The format:
      action geohash=(latitude, longitude)

      The result:
      encode u173zq37x014=(52.3738007,4.8909347)
      decode u173zq37x014=(52.373799999999996,4.890934)
      encode u173zq37rpbw=(52.373799999999996,4.890934)
      decode u173zq37rpbw=(52.373799999999996,4.8909329999999995)
      encode u173zq37qzzy=(52.373799999999996,4.8909329999999995)

      If I now switch to the Google Code implementation:

      encode u173zq37x014=(52.3738007,4.8909347)
      decode u173zq37x014=(52.37380061298609,4.890934377908707)
      encode u173zq37x014=(52.37380061298609,4.890934377908707)
      decode u173zq37x014=(52.37380061298609,4.890934377908707)
      encode u173zq37x014=(52.37380061298609,4.890934377908707)

      Note how both the geohashes and the lat/lon values differ between the two implementations!
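The round trip tested above can be sketched in a few lines. The following is a minimal, hedged sketch of the standard geohash bisection scheme, assuming decoding to the centre of the cell with no rounding. The class and method names are hypothetical; this is neither the Lucene `GeoHashUtils` code nor the Google Code `Geohash.java`.

```java
import java.util.Arrays;

// Hypothetical sketch class; not the Lucene or Google Code implementation.
final class GeohashRoundTrip {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a point by repeated bisection, longitude bit first. */
    static String encode(double lat, double lon, int length) {
        double[] latRange = {-90.0, 90.0};
        double[] lonRange = {-180.0, 180.0};
        StringBuilder hash = new StringBuilder(length);
        boolean lonTurn = true;
        int bits = 0;
        int value = 0;
        while (hash.length() < length) {
            double[] range = lonTurn ? lonRange : latRange;
            double coord = lonTurn ? lon : lat;
            double mid = (range[0] + range[1]) / 2;
            value <<= 1;
            if (coord >= mid) {
                value |= 1;       // upper half of the current range
                range[0] = mid;
            } else {
                range[1] = mid;   // lower half of the current range
            }
            lonTurn = !lonTurn;
            if (++bits == 5) {    // five bits make one base-32 character
                hash.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    /** Decodes to the centre of the cell, with no rounding. Returns {lat, lon}. */
    static double[] decode(String hash) {
        double[] latRange = {-90.0, 90.0};
        double[] lonRange = {-180.0, 180.0};
        boolean lonTurn = true;
        for (char c : hash.toCharArray()) {
            int value = BASE32.indexOf(c);
            if (value < 0) {
                throw new IllegalArgumentException("not a geohash character: " + c);
            }
            for (int mask = 16; mask > 0; mask >>= 1) {
                double[] range = lonTurn ? lonRange : latRange;
                double mid = (range[0] + range[1]) / 2;
                if ((value & mask) != 0) range[0] = mid; else range[1] = mid;
                lonTurn = !lonTurn;
            }
        }
        return new double[] {(latRange[0] + latRange[1]) / 2,
                             (lonRange[0] + lonRange[1]) / 2};
    }

    public static void main(String[] args) {
        String hash = encode(52.3738007, 4.8909347, 12);
        for (int i = 0; i < 3; i++) {
            double[] p = decode(hash);
            String again = encode(p[0], p[1], 12);
            System.out.println(hash + " -> " + Arrays.toString(p) + " -> " + again);
            hash = again;
        }
    }
}
```

Because the decoded point is the exact centre of the cell, re-encoding it retraces the same bisection steps, so the hash should stay the same on every pass. Any rounding applied after decoding gives up that guarantee.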
      Now things get worse if you work on low-precision geohashes:

      decode u173=(52.0,4.0)
      encode u14zg429yy84=(52.0,4.0)
      decode u14zg429yy84=(52.0,3.999999)
      encode u14zg429ywx6=(52.0,3.999999)

      And with the Google Code implementation:

      decode u173=(52.20703125,4.5703125)
      encode u17300000000=(52.20703125,4.5703125)
      decode u17300000000=(52.20703125,4.5703125)
      encode u17300000000=(52.20703125,4.5703125)
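To see why the low-precision case behaves so differently, it helps to compute the bounding box of the cell that `u173` names. The sketch below assumes the standard geohash bit layout (longitude bit first, base-32 alphabet without a, i, l, o); the class name and the return layout are hypothetical and chosen for illustration only.

```java
// Hypothetical helper; the name and return layout are assumptions for illustration.
final class GeohashCell {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Returns {latMin, latMax, lonMin, lonMax} of the cell named by the hash. */
    static double[] bounds(String hash) {
        double[] lat = {-90.0, 90.0};
        double[] lon = {-180.0, 180.0};
        boolean lonTurn = true;
        for (char c : hash.toCharArray()) {
            int value = BASE32.indexOf(c);
            if (value < 0) {
                throw new IllegalArgumentException("not a geohash character: " + c);
            }
            // Each character carries five bits, most significant first.
            for (int mask = 16; mask > 0; mask >>= 1) {
                double[] range = lonTurn ? lon : lat;
                double mid = (range[0] + range[1]) / 2;
                if ((value & mask) != 0) range[0] = mid; else range[1] = mid;
                lonTurn = !lonTurn;
            }
        }
        return new double[] {lat[0], lat[1], lon[0], lon[1]};
    }

    public static void main(String[] args) {
        double[] b = bounds("u173");
        System.out.printf("lat [%s, %s], lon [%s, %s]%n", b[0], b[1], b[2], b[3]);
        // prints: lat [52.20703125, 52.3828125], lon [4.5703125, 4.921875]
    }
}
```

The Google Code result (52.20703125, 4.5703125) coincides with the south-west corner of this cell, so re-encoding it stays inside `u173`. The value (52.0, 4.0) lies outside the cell in both latitude and longitude, which is consistent with the re-encoded hash starting with `u14z` instead.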

      We use geohashes extensively, so unfortunately we will now use the Google Code version.

        Activity

        Michael McCandless added a comment -

        Wouter, or anyone, do you have an idea on where the problem is, or how to fix it?

        Simon Willnauer added a comment -

        Michael McCandless wrote: "Wouter, or anyone, do you have an idea on where the problem is, or how to fix it?"

        I'm not sure there is anything to fix. Spatial applies error correction if you use GeoHashUtils#decode: it calculates a precision value and rounds the result accordingly. If you need a very precise result, GeoHashUtils#decode_exactly gives much better results.

        I don't know if this is a huge issue. I could change the implementation to ignore the decode and encode precision, which might bring our implementation closer to the one on Google Code. Again, I don't know if that is really an issue.
        The lat values 52.3738007 and 52.373799999999996 are very close, so I guess you wouldn't even notice the difference on a map.

        simon
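The effect of rounding a decoded value can be illustrated with a small check. This is a hedged sketch and not Lucene's actual precision logic: `roundTo` is a hypothetical stand-in that rounds to a fixed number of decimal places, and the cell bounds are hard-coded values for the 4-character hash `u173`, obtained by standard geohash bisection.

```java
// Hypothetical illustration; roundTo is NOT the rounding used by GeoHashUtils#decode.
final class RoundingDrift {
    /** Rounds a coordinate to the given number of decimal places. */
    static double roundTo(double value, int decimals) {
        double scale = Math.pow(10, decimals);
        return Math.round(value * scale) / scale;
    }

    /** cell is {latMin, latMax, lonMin, lonMax}; upper edges are exclusive. */
    static boolean inside(double[] cell, double lat, double lon) {
        return lat >= cell[0] && lat < cell[1] && lon >= cell[2] && lon < cell[3];
    }

    public static void main(String[] args) {
        // Assumed bounds of the geohash cell "u173" (standard bisection).
        double[] u173 = {52.20703125, 52.3828125, 4.5703125, 4.921875};
        // Centre of that cell.
        double lat = 52.294921875;
        double lon = 4.74609375;
        for (int decimals = 0; decimals <= 2; decimals++) {
            double rLat = roundTo(lat, decimals);
            double rLon = roundTo(lon, decimals);
            System.out.println(decimals + " decimals: (" + rLat + ", " + rLon
                    + ") inside cell: " + inside(u173, rLat, rLon));
        }
    }
}
```

With zero decimals the rounded point leaves the cell, so re-encoding it would yield a different geohash. With one decimal it stays inside. Whether a rounding step is safe therefore depends on how its granularity compares with the cell size at the given hash length.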

        Wouter Heijke added a comment -

        No, I don't have a solution, but I've noticed that 'decode_exactly' is less 'lossy' than 'decode'; still, the Google Code version is 'lossless'.

        Simon Willnauer added a comment -

        I don't think this should be major!

        Wouter Heijke added a comment -

        To me it was major, since geohashes are THE way for us to search for a location among the millions of records in our index, and small numbers do count!

        I see it this way: if a JPEG picture did not decode the way it was encoded, would you accept it, even if it were only slightly different?

        Right now I don't want to spend my time finding the cause of the issue, since I have working (Google) code and I prefer doing cooler stuff, like implementing a solution for the 'greenwich' geohash problem.

        patrick o'leary added a comment -

        What google code are you working with?

        Wouter Heijke added a comment -

        I've been happily using this for some time now: http://code.google.com/p/geospatialweb/source/browse/trunk/geohash/src/Geohash.java
        Erick Erickson added a comment -

        2013 Old JIRA cleanup


          People

          • Assignee: Unassigned
          • Reporter: Wouter Heijke
          • Votes: 0
          • Watchers: 2