Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: master (7.0)
    • Fix Version/s: 6.1, master (7.0)
    • Component/s: modules/spatial3d
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Here's the failure:

      ant test  -Dtestcase=TestGeo3DPoint -Dtests.method=testEncodeDecodeRoundsDown -Dtests.seed=7046405B94C1716E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=da-DK -Dtests.timezone=America/Detroit -Dtests.asserts=true -Dtests.file.encoding=UTF-8
      
      1. LUCENE-7312.patch
        8 kB
        Michael McCandless

        Activity

        Hide
        kwright@metacarta.com Karl Wright added a comment -

        Michael McCandless: This is reproducible, and it's your test. Any comments?

        Show
        kwright@metacarta.com Karl Wright added a comment - Michael McCandless : This is reproducible, and it's your test. Any comments?
        Hide
        mikemccand Michael McCandless added a comment -

        Hmm I'll dig ...

        Show
        mikemccand Michael McCandless added a comment - Hmm I'll dig ...
        Hide
        mikemccand Michael McCandless added a comment -

        The test just encodes and then decodes and checks that the decoded version indeed rounded down, yet for this particular value, for some reason, the Y dimension failed to round down:

           [junit4]   1> iter=2881 i=243 lat=-61.000000000001734 lon=-139.2156822602591
           [junit4]   1>   point=[lat=-1.0646508437165714, lon=-2.4297720258517823([X=-0.36655223771437473, Y=-0.31622436580292346, Z=-0.8733499201429061])]
           [junit4]   1>   pointEnc=[X=-0.3665522380425261, Y=-0.3162243658029234, Z=-0.8733499202383181]
        
        Show
        mikemccand Michael McCandless added a comment - The test just encodes and then decodes and checks that the decoded version indeed rounded down, yet for this particular value, for some reason, the Y dimension failed to round down: [junit4] 1> iter=2881 i=243 lat=-61.000000000001734 lon=-139.2156822602591 [junit4] 1> point=[lat=-1.0646508437165714, lon=-2.4297720258517823([X=-0.36655223771437473, Y=-0.31622436580292346, Z=-0.8733499201429061])] [junit4] 1> pointEnc=[X=-0.3665522380425261, Y=-0.3162243658029234, Z=-0.8733499202383181]
        Hide
        kwright@metacarta.com Karl Wright added a comment -

        How odd... I don't know why this would be, but it seems like maybe we have to dump the numbers in question as hex values in order to see what is in fact happening?

        Show
        kwright@metacarta.com Karl Wright added a comment - How odd... I don't know why this would be, but it seems like maybe we have to dump the numbers in question as hex values in order to see what is in fact happening?
        Hide
        mikemccand Michael McCandless added a comment -

        OK indeed this is a double precision issue ... I boiled it down to this small test:

          public void testOneValue() throws Exception {
            double DECODE = 4.661822942981865E-10;
            double x = -0.31622436580292346;
            int xEnc = (int) Math.floor(x / DECODE);  // -678327705
            double xDec = xEnc * DECODE;
            // because we floor'd on encoding to xEnc, this should be true:
            assertTrue("x=" + x + " xDec=" + xDec, xDec <= x);
          }
        

        which fails with this:

          java.lang.AssertionError: x=-0.31622436580292346 xDec=-0.3162243658029234
        

        The reason is that the value x / DECODE is very, very close to an
        integer value, just a hair below it, such that when quantized, it
        jumps just a bit above that int value, causing the floor to return the
        "wrong" value.

        You can also see it with Python's rational number module (fractions) too:

        >>> DECODE = fractions.Fraction(4.661822942981865E-10)
        >>> x = fractions.Fraction(-0.31622436580292346)
        >>> math.floor(x / DECODE)
        -678327706
        >>> math.floor(float(x / DECODE))
        -678327705
        

        I.e. the true floor in this case is -678327706, but if you first
        quantize to 64 bits and take the floor, you get one higher.

        I think we should go back to the "safe" doubles solution we used to
        have, where DECODE is the next higher double that doesn't use any of
        its lower 32 bits.

        I'll also port over a nice 2D test Rob pointed me to, which should be
        more efficient for finding quantization issues.

        Show
        mikemccand Michael McCandless added a comment - OK indeed this is a double precision issue ... I boiled it down to this small test: public void testOneValue() throws Exception { double DECODE = 4.661822942981865E-10; double x = -0.31622436580292346; int xEnc = (int) Math.floor(x / DECODE); // -678327705 double xDec = xEnc * DECODE; // because we floor'd on encoding to xEnc, this should be true: assertTrue("x=" + x + " xDec=" + xDec, xDec <= x); } which fails with this: java.lang.AssertionError: x=-0.31622436580292346 xDec=-0.3162243658029234 The reason is that the value x / DECODE is very, very close to an integer value, just a hair below it, such that when quantized, it jumps just a bit above that int value, causing the floor to return the "wrong" value. You can also see it with Python's rational number module ( fractions ) too: >>> DECODE = fractions.Fraction(4.661822942981865E-10) >>> x = fractions.Fraction(-0.31622436580292346) >>> math.floor(x / DECODE) -678327706 >>> math.floor(float(x / DECODE)) -678327705 I.e. the true floor in this case is -678327706, but if you first quantize to 64 bits and take the floor, you get one higher. I think we should go back to the "safe" doubles solution we used to have, where DECODE is the next higher double that doesn't use any of its lower 32 bits. I'll also port over a nice 2D test Rob pointed me to, which should be more efficient for finding quantization issues.
        Hide
        mikemccand Michael McCandless added a comment -

        Patch, going back to "safe" doubles, and adding a better quantization test ... tests seem to survive some beasting.

        Show
        mikemccand Michael McCandless added a comment - Patch, going back to "safe" doubles, and adding a better quantization test ... tests seem to survive some beasting.
        Hide
        kwright@metacarta.com Karl Wright added a comment -

        This looks reasonable, except for some debugging code:

        @@ -1172,6 +1173,11 @@ public class TestGeo3DPoint extends LuceneTestCase {
                 GeoPoint pointEnc = new GeoPoint(Geo3DUtil.decodeValueFloor(Geo3DUtil.encodeValue(point.x)),
                                                  Geo3DUtil.decodeValueFloor(Geo3DUtil.encodeValue(point.y)),
                                                  Geo3DUtil.decodeValueFloor(Geo3DUtil.encodeValue(point.z)));
        +        if (iter >= 2881) {
        +          System.out.println("iter=" + iter + " i=" + i + " lat=" + lat + " lon=" + lon);
        +          System.out.println("  point=" + point);
        +          System.out.println("  pointEnc=" + pointEnc);
        +        }
                 assertTrue(pointEnc.x <= point.x);
                 assertTrue(pointEnc.y <= point.y);
                 assertTrue(pointEnc.z <= point.z);
        

        I don't actually understand what getNextSafeDouble() is intended to do, though?

        Show
        kwright@metacarta.com Karl Wright added a comment - This looks reasonable, except for some debugging code: @@ -1172,6 +1173,11 @@ public class TestGeo3DPoint extends LuceneTestCase { GeoPoint pointEnc = new GeoPoint(Geo3DUtil.decodeValueFloor(Geo3DUtil.encodeValue(point.x)), Geo3DUtil.decodeValueFloor(Geo3DUtil.encodeValue(point.y)), Geo3DUtil.decodeValueFloor(Geo3DUtil.encodeValue(point.z))); + if (iter >= 2881) { + System .out.println( "iter=" + iter + " i=" + i + " lat=" + lat + " lon=" + lon); + System .out.println( " point=" + point); + System .out.println( " pointEnc=" + pointEnc); + } assertTrue(pointEnc.x <= point.x); assertTrue(pointEnc.y <= point.y); assertTrue(pointEnc.z <= point.z); I don't actually understand what getNextSafeDouble() is intended to do, though?
        Hide
        kwright@metacarta.com Karl Wright added a comment - - edited

        Ok, NVM, in the context of your comment above I see what it's trying to do, and it looks like it does that. So I think this solution is fine.

        Show
        kwright@metacarta.com Karl Wright added a comment - - edited Ok, NVM, in the context of your comment above I see what it's trying to do, and it looks like it does that. So I think this solution is fine.
        Hide
        kwright@metacarta.com Karl Wright added a comment -

        Hmm, looking at what I did for Geo3DDocValuesField, and trying to see whether it would suffer from the same issue, it looks like I did something somewhat different:

          private static int encodeX(final double x) {
            if (x > PlanetModel.WGS84.getMaximumXValue()) {
              throw new IllegalArgumentException("x value exceeds WGS84 maximum");
            } else if (x < PlanetModel.WGS84.getMinimumXValue()) {
              throw new IllegalArgumentException("x value less than WGS84 minimum");
            }
            return (int)Math.floor((x - PlanetModel.WGS84.getMinimumXValue()) * xFactor + 0.5);
          }
          
          private static double decodeX(final int x) {
            return x * inverseXFactor + PlanetModel.WGS84.getMinimumXValue();
          }
        

        ... where xFactor and inverseXFactor are related such that inverseXFactor = 1.0/xFactor. Basically, by adding the 0.5 right before floor time I made sure no such effects could happen? I think?? This stuff makes my head hurt...

        Show
        kwright@metacarta.com Karl Wright added a comment - Hmm, looking at what I did for Geo3DDocValuesField, and trying to see whether it would suffer from the same issue, it looks like I did something somewhat different: private static int encodeX( final double x) { if (x > PlanetModel.WGS84.getMaximumXValue()) { throw new IllegalArgumentException( "x value exceeds WGS84 maximum" ); } else if (x < PlanetModel.WGS84.getMinimumXValue()) { throw new IllegalArgumentException( "x value less than WGS84 minimum" ); } return ( int ) Math .floor((x - PlanetModel.WGS84.getMinimumXValue()) * xFactor + 0.5); } private static double decodeX( final int x) { return x * inverseXFactor + PlanetModel.WGS84.getMinimumXValue(); } ... where xFactor and inverseXFactor are related such that inverseXFactor = 1.0/xFactor. Basically, by adding the 0.5 right before floor time I made sure no such effects could happen? I think?? This stuff makes my head hurt...
        Hide
        mikemccand Michael McCandless added a comment -

        except for some debugging code

        Woops, I'll remove.

        Basically, by adding the 0.5 right before floor time I made sure no such effects could happen?

        Hmm, I think you can still see these effects with this math, unfortunately. Probably if you ported this new test case over the this Geo3DDVField's encoding it would fail?

        Floating point precision issues are a nightmare when you want tests to be precise

        Show
        mikemccand Michael McCandless added a comment - except for some debugging code Woops, I'll remove. Basically, by adding the 0.5 right before floor time I made sure no such effects could happen? Hmm, I think you can still see these effects with this math, unfortunately. Probably if you ported this new test case over the this Geo3DDVField's encoding it would fail? Floating point precision issues are a nightmare when you want tests to be precise
        Hide
        kwright@metacarta.com Karl Wright added a comment -

        Ok, when I have some time, I'll try to port the test.

        Show
        kwright@metacarta.com Karl Wright added a comment - Ok, when I have some time, I'll try to port the test.
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit dee57ce23fbed7bc1950b50eb71f6b0a3b0baf65 in lucene-solr's branch refs/heads/master from Mike McCandless
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=dee57ce ]

        LUCENE-7312: fix geo3d's encoding to always round down

        Show
        jira-bot ASF subversion and git services added a comment - Commit dee57ce23fbed7bc1950b50eb71f6b0a3b0baf65 in lucene-solr's branch refs/heads/master from Mike McCandless [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=dee57ce ] LUCENE-7312 : fix geo3d's encoding to always round down
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 724f7424eaf19b59acce28edc61a40d6086a3d70 in lucene-solr's branch refs/heads/branch_6x from Mike McCandless
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=724f742 ]

        LUCENE-7312: fix geo3d's encoding to always round down

        Show
        jira-bot ASF subversion and git services added a comment - Commit 724f7424eaf19b59acce28edc61a40d6086a3d70 in lucene-solr's branch refs/heads/branch_6x from Mike McCandless [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=724f742 ] LUCENE-7312 : fix geo3d's encoding to always round down

          People

          • Assignee:
            kwright@metacarta.com Karl Wright
            Reporter:
            kwright@metacarta.com Karl Wright
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development