Commons Math
  1. Commons Math
  2. MATH-632

NaN: Method "equals" in Complex not consistent with "==" for "double" primitive type

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Not A Problem
    • Affects Version/s: None
    • Fix Version/s: 3.0
    • Labels:
      None

      Description

      The following tests show several contradictions:

      final double a = Double.NaN;
      final double b = Double.NaN;
      Assert.assertFalse("a == b", a == b); // (1)
      Assert.assertEquals("a != b", a, b, Double.MIN_VALUE); // (2)
      Assert.assertFalse("a == b", MathUtils.equals(a, b, Double.MIN_VALUE)); // (3)
      Assert.assertFalse("a == b", MathUtils.equals(a, b, Double.MIN_VALUE)); // (4)
      final Double dA = Double.valueOf(a);
      final Double dB = Double.valueOf(b);
      Assert.assertFalse("dA == dB", dA.doubleValue() == dB.doubleValue()); // (5)
      Assert.assertTrue("!dA.equals(dB)", dA.equals(dB)); // (6)
      final Complex cA = new Complex(a, 0);
      final Complex cB = new Complex(b, 0);
      Assert.assertTrue("!cA.equals(cB)", cA.equals(cB));  // (7)
      

      They all pass; thus:

      1. "Double" does not behave as "double": (1) and (5) vs (6)
      2. Two NaNs are almost equal for Junit: (2)
      3. Two NaNs are never equal for MathUtils: (3) and (4)
      4. Complex.NaN is consistent with Object "Double.valueOf(NaN)" (hence not with primitive "Double.NaN"): (7)

      This is quite confusing.

      In MathUtils, we chose to follow IEEE754 (and Java for primitive "double"), i.e. it is "correct" that assertion (1) is false. Do we want "Complex" to conform with this or with the inconsistent behaviour of "Double"?

        Activity

        Luc Maisonobe made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Hide
        Luc Maisonobe added a comment -

        changing status to closed as 3.0 has been released

        Show
        Luc Maisonobe added a comment - changing status to closed as 3.0 has been released
        Gilles made changes -
        Field Original Value New Value
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Not A Problem [ 8 ]
        Hide
        Gilles added a comment -

        Clarifying note added in revision 1154392.

        Show
        Gilles added a comment - Clarifying note added in revision 1154392.
        Hide
        Gilles added a comment -

        The javadoc for equals in Colt's Complex (implemented the same way we do) points out that implementing equals this way makes Complex instances work correctly in hashtables.

        Actually they probably copied the similar comment from here.

        I'm convinced that the ambiguity (Object/primitive) cannot be lifted, and I probably should not worry anymore that CM is not more consistent than Java itself...

        As Sebb mentions, one should not search the archive to get an explanation for such seemingly contradictory behaviour. It suffices to say that we follow IEEE754 for "MathUtils.equals" and that we don't for any "Object" even though it would represent a (tuplet of) real numbers, on a par with what Java does.

        Show
        Gilles added a comment - The javadoc for equals in Colt's Complex (implemented the same way we do) points out that implementing equals this way makes Complex instances work correctly in hashtables. Actually they probably copied the similar comment from here . I'm convinced that the ambiguity (Object/primitive) cannot be lifted, and I probably should not worry anymore that CM is not more consistent than Java itself... As Sebb mentions, one should not search the archive to get an explanation for such seemingly contradictory behaviour. It suffices to say that we follow IEEE754 for "MathUtils.equals" and that we don't for any "Object" even though it would represent a (tuplet of) real numbers, on a par with what Java does.
        Hide
        Sebb added a comment -

        For future reference, we should mention these design decisions in the code comments.

        Show
        Sebb added a comment - For future reference, we should mention these design decisions in the code comments.
        Hide
        Phil Steitz added a comment -

        I forgot about a couple of things that we considered in relation to this in the past. The first is the status of the relevant "spec," which is Annex G of the C99x spec for the C language. We decided not to try to strictly adhere to this spec for reasons documented in MATH-5: a) it is a C language spec b) it is not open or freely available (we could not even get blanket approval to quote from the spec in our javadoc) c) when we last checked it was still not normative and d) trying to adhere strictly to the spec for arithmetic operations would kill performance. Search the commons-dev archives for discussion.

        The second thing I forgot is that in this and some other decisions, we were influenced by Colt (the old com.imsl.math package), which took a practical approach. The javadoc for equals in Colt's Complex (implemented the same way we do) points out that implementing equals this way makes Complex instances work correctly in hashtables. This is another reason that we used the definition that we did.

        Show
        Phil Steitz added a comment - I forgot about a couple of things that we considered in relation to this in the past. The first is the status of the relevant "spec," which is Annex G of the C99x spec for the C language. We decided not to try to strictly adhere to this spec for reasons documented in MATH-5 : a) it is a C language spec b) it is not open or freely available (we could not even get blanket approval to quote from the spec in our javadoc) c) when we last checked it was still not normative and d) trying to adhere strictly to the spec for arithmetic operations would kill performance. Search the commons-dev archives for discussion. The second thing I forgot is that in this and some other decisions, we were influenced by Colt (the old com.imsl.math package), which took a practical approach. The javadoc for equals in Colt's Complex (implemented the same way we do) points out that implementing equals this way makes Complex instances work correctly in hashtables. This is another reason that we used the definition that we did.
        Hide
        Phil Steitz added a comment -

        The definition above is not reflexive, so can't be implemented as equals in Java. We would have to add a reference equality check to make it a legitimate equals implementation, making every NaN-infected instance an equals singleton. Unless someone can provide a practical reason why this is better in applications, we should not change it, since it will break any applications that depend on the current implementation.

        Show
        Phil Steitz added a comment - The definition above is not reflexive, so can't be implemented as equals in Java. We would have to add a reference equality check to make it a legitimate equals implementation, making every NaN-infected instance an equals singleton. Unless someone can provide a practical reason why this is better in applications, we should not change it, since it will break any applications that depend on the current implementation.
        Hide
        Gilles added a comment -

        The alternative implementation was reported by Luc from the C++ standard:

          (a.real == b.real) && (a.imaginary == b.imaginary)
        

        It is better simply because it is consistent with whatever is defined by the constituents of the complex number.

        As I said, I don't really care whether NaN == NaN or NaN != NaN. But, I repeat, I'm not knowledgeable enough in floating point intricacies to say that the standard is wrong or useless. Are you?

        Maybe we shouldn't care about IEEE574, and decide to ban any algorithm that would rely on NaN != NaN. And in that case, to be consistent, I would also propose to change "MathUtils.equals" so that it treats two NaNs as equal (and do away with the "equalsIncludingNaN" methods).

        Some time ago, we decided to conform with IEEE754; but if it is more convenient to not conform and if it doesn't hurt anyone...

        Show
        Gilles added a comment - The alternative implementation was reported by Luc from the C++ standard: (a.real == b.real) && (a.imaginary == b.imaginary) It is better simply because it is consistent with whatever is defined by the constituents of the complex number. As I said, I don't really care whether NaN == NaN or NaN != NaN. But, I repeat, I'm not knowledgeable enough in floating point intricacies to say that the standard is wrong or useless. Are you? Maybe we shouldn't care about IEEE574, and decide to ban any algorithm that would rely on NaN != NaN. And in that case, to be consistent, I would also propose to change "MathUtils.equals" so that it treats two NaNs as equal (and do away with the "equalsIncludingNaN" methods). Some time ago, we decided to conform with IEEE754; but if it is more convenient to not conform and if it doesn't hurt anyone...
        Hide
        Phil Steitz added a comment -

        Complex is an object, so has to have an equals implementation that is reflexive, symmetric and transitive. Do you have an alternative implementation in mind? And an explanation of why the alternative is better?

        My reference to collections comes from my own use in simulating complex dynamical systems. Like floating point equals comparisons among doubles, I don't use Complex.equals much, and when I do it is either in test code or comparing results of divergent processes. When processes diverge, it is convenient to be able to compare results wrt infinities and NaNs. In that case, it is convenient to have equals return true for NaNs.

        Show
        Phil Steitz added a comment - Complex is an object, so has to have an equals implementation that is reflexive, symmetric and transitive. Do you have an alternative implementation in mind? And an explanation of why the alternative is better? My reference to collections comes from my own use in simulating complex dynamical systems. Like floating point equals comparisons among doubles, I don't use Complex.equals much, and when I do it is either in test code or comparing results of divergent processes. When processes diverge, it is convenient to be able to compare results wrt infinities and NaNs. In that case, it is convenient to have equals return true for NaNs.
        Hide
        Gilles added a comment -

        [...] equals has to be an equivalence relation [...]

        "==" is an equivalence relation for real numbers. Nevertheless the IEEE574 deemed it important that it is broken when the floating point representation of a real number is NaN (which is not the representation of any real number).
        IIUC correctly (and probably very partially), NaNs are supposedly useful in the implementations of algorithms (which is what CM does).

        The difference between "Object" and primitive is not relevant for deciding which is more important:

        • The purpose of NaNs and conformance with a standard for the representation of real numbers, or
        • the equivalence relation of the representations of real numbers ("Double", "double", "Complex").

        To view it differently: "double" would/should not have existed in Java (as an OO language) if it had been possible to deal exclusively with objects without a significant loss of performance. CM uses (almost?) exclusively "double" and not "Double"; so I think that it makes sense to ask whether or not "Complex" should be consistent with "double" (making the examples quite relevant).

        I don't understand how collections interfere with this issue...

        Show
        Gilles added a comment - [...] equals has to be an equivalence relation [...] "==" is an equivalence relation for real numbers. Nevertheless the IEEE574 deemed it important that it is broken when the floating point representation of a real number is NaN (which is not the representation of any real number). IIUC correctly (and probably very partially), NaNs are supposedly useful in the implementations of algorithms (which is what CM does). The difference between "Object" and primitive is not relevant for deciding which is more important: The purpose of NaNs and conformance with a standard for the representation of real numbers, or the equivalence relation of the representations of real numbers ("Double", "double", "Complex"). To view it differently: "double" would/should not have existed in Java (as an OO language) if it had been possible to deal exclusively with objects without a significant loss of performance. CM uses (almost?) exclusively "double" and not "Double"; so I think that it makes sense to ask whether or not "Complex" should be consistent with "double" (making the examples quite relevant). I don't understand how collections interfere with this issue...
        Hide
        Phil Steitz added a comment -

        The difference between Double and double, and Complex vs double is that the things with capital letters are objects. Jave had to define Double equals to make NaN equal to NaN because equals has to be an equivalence relation, which means it has to be reflexive. Personally, I don't see any problem with Complex equals behaving differently from what we now have in MathUtils.equals, which is a specialized method not intended to represent an equivalence relation on objects. If we did for some reason define our own "real" object, I would expect it to behave as Double does - making NaN equal to itself. The only reasonable alternative to the current implementation of Complex equals (and ArrayRealVector equals, for that matter) is to isolate every NaN-infected instance into its own equivalence class. That is less convenient when dealing with collections of results. Note that, like Double, we will still in any case have to make equals reflexive, which will make it appear "inconsistent" with the (in my mind irrelevant) examples above.

        Show
        Phil Steitz added a comment - The difference between Double and double, and Complex vs double is that the things with capital letters are objects . Jave had to define Double equals to make NaN equal to NaN because equals has to be an equivalence relation, which means it has to be reflexive. Personally, I don't see any problem with Complex equals behaving differently from what we now have in MathUtils.equals, which is a specialized method not intended to represent an equivalence relation on objects. If we did for some reason define our own "real" object, I would expect it to behave as Double does - making NaN equal to itself. The only reasonable alternative to the current implementation of Complex equals (and ArrayRealVector equals, for that matter) is to isolate every NaN-infected instance into its own equivalence class. That is less convenient when dealing with collections of results. Note that, like Double, we will still in any case have to make equals reflexive, which will make it appear "inconsistent" with the (in my mind irrelevant) examples above.
        Hide
        Gilles added a comment -

        The problem starts here:

         (x == x) is true, except when (x == NaN)
        

        Some people advocate that this should not be so, i.e.

         (x == x) is true, always
        

        but IEEE574 has chosen the former. And Java primitive "double" conforms to this.
        I don't understand the rationale of having decided that "Double" should behave differently than "double", but this is beyond our reach anyway . But, if you'd have to define "equals" for "Double" wouldn't you check the equality of the "double" returned by "doubleValue()"? In the example reported by Luc, this is done similarly and two complex NaNs won't be equal.

        What bothers me, slightly [1], is that CM is not consistent with itself: A "Complex" instance is an abstraction/approximation of a complex number, in the same way that a "double" is an abstraction/approximation of a real number. So why would you consider that

        [...] it makes sense to lump all instances with NaN parts into one equivalence class modulo equals.

        and at the same time that CM should also follow IEEE754 for "double" (cf. "MathUtils.equals"), which is the opposite of the previous statement?

        [1] Personally, I never had to use NaN beyond being a signal that there was a bug in my code.

        Show
        Gilles added a comment - The problem starts here: (x == x) is true, except when (x == NaN) Some people advocate that this should not be so, i.e. (x == x) is true, always but IEEE574 has chosen the former. And Java primitive "double" conforms to this. I don't understand the rationale of having decided that "Double" should behave differently than "double", but this is beyond our reach anyway . But, if you'd have to define "equals" for "Double" wouldn't you check the equality of the "double" returned by "doubleValue()"? In the example reported by Luc, this is done similarly and two complex NaNs won't be equal. What bothers me, slightly [1] , is that CM is not consistent with itself: A "Complex" instance is an abstraction/approximation of a complex number, in the same way that a "double" is an abstraction/approximation of a real number. So why would you consider that [...] it makes sense to lump all instances with NaN parts into one equivalence class modulo equals. and at the same time that CM should also follow IEEE754 for "double" (cf. "MathUtils.equals"), which is the opposite of the previous statement? [1] Personally, I never had to use NaN beyond being a signal that there was a bug in my code.
        Hide
        Luc Maisonobe added a comment -

        I have tried to find again some external references from C++ standard and C99 standard. They seem to only specify behavior of == as a logical (a.real == b.real) && (a.imaginary == b.imaginary), which would lead to numbers with NaN never been equal to anything, including themselves.

        We don't use complex internally in [math] yet (at least I am not aware of it). However, some existing users do (at least Arne seems to, as he asked for several changes). Perhaps we should ask on the users list (not dev) for users comments on this, as most users will not be aware of this Jira issue here.

        We may need to use complex in [math] by ourselves for some algorithms we want to implement. I see at least to difference use cases: eigen value decomposition for non-symmetric matrices and root solvers. Do they use some standard definition ?

        Show
        Luc Maisonobe added a comment - I have tried to find again some external references from C++ standard and C99 standard. They seem to only specify behavior of == as a logical (a.real == b.real) && (a.imaginary == b.imaginary), which would lead to numbers with NaN never been equal to anything, including themselves. We don't use complex internally in [math] yet (at least I am not aware of it). However, some existing users do (at least Arne seems to, as he asked for several changes). Perhaps we should ask on the users list (not dev) for users comments on this, as most users will not be aware of this Jira issue here. We may need to use complex in [math] by ourselves for some algorithms we want to implement. I see at least to difference use cases: eigen value decomposition for non-symmetric matrices and root solvers. Do they use some standard definition ?
        Hide
        Phil Steitz added a comment - - edited

        Can anyone present a practical argument for changing the current documented behavior of Complex.equals? There is no perfect solution here, given the way equals is defined for doubles in Java. The current behavior is simple, well-documented and has been defined this way since version 1.0. Changing it may break some code that depends on it, so we need to have good practical reasons to change.

        From my perspective, the current implementation of Complex equals, which is consistent with what we also do for ArrayRealVectors, is natural and convenient. I don't see the examples as particularly relevant, since a Complex instance is not a pair of doubles, but an object that has two double-valued attributes. Once a Complex number has a NaN part, it is for all practical purposes NaN, so it makes sense to lump all instances with NaN parts into one equivalence class modulo equals. IIRC, this was the rationale used to define the current implementation of Complex equals.

        Show
        Phil Steitz added a comment - - edited Can anyone present a practical argument for changing the current documented behavior of Complex.equals? There is no perfect solution here, given the way equals is defined for doubles in Java. The current behavior is simple, well-documented and has been defined this way since version 1.0. Changing it may break some code that depends on it, so we need to have good practical reasons to change. From my perspective, the current implementation of Complex equals, which is consistent with what we also do for ArrayRealVectors, is natural and convenient. I don't see the examples as particularly relevant, since a Complex instance is not a pair of doubles, but an object that has two double-valued attributes. Once a Complex number has a NaN part, it is for all practical purposes NaN, so it makes sense to lump all instances with NaN parts into one equivalence class modulo equals. IIRC, this was the rationale used to define the current implementation of Complex equals.
        Hide
        Gilles added a comment -

        Tests committed in revision 1150496.

        Show
        Gilles added a comment - Tests committed in revision 1150496.
        Gilles created issue -

          People

          • Assignee:
            Unassigned
            Reporter:
            Gilles
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development