Calcite / CALCITE-3021

Equality of nested ROWs returns false for identical values

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.19.0
    • Fix Version/s: 1.20.0
    • Component/s: None

      Description

      Problem can be reproduced via:

      select distinct * from (values
          (1, ROW(1,1)),
          (1, ROW(1,1)),
          (2, ROW(2,2))) as v(id,struct);
      

      This incorrectly returns a duplicated row; the two (1, ROW(1,1)) rows should collapse into one:

      +----+--------+
      | ID | STRUCT |
      +----+--------+
      |  1 | {1, 1} |
      |  1 | {1, 1} |
      |  2 | {2, 2} |
      +----+--------+
      (3 rows)
      

      The root cause is that ArrayEqualityComparer (the comparer used for JavaRowFormat.ARRAY) currently compares arrays with Arrays#equals and Arrays#hashCode (see Functions.java):

        private static class ArrayEqualityComparer implements EqualityComparer<Object[]> {
          public boolean equal(Object[] v1, Object[] v2) {
            return Arrays.equals(v1, v2);
          }
          public int hashCode(Object[] t) {
            return Arrays.hashCode(t);
          }
        }
      

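      This leads to incorrect comparisons for nested arrays, e.g. a row (an array) with a struct field (another array) inside: Arrays#equals compares the inner arrays with Object#equals, i.e. by reference, and Arrays#hashCode hashes them with Object#hashCode. Two rows with identical contents are therefore treated as different. To fix the issue, Arrays#deepEquals / Arrays#deepHashCode should be used: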

        private static class ArrayEqualityComparer implements EqualityComparer<Object[]> {
          public boolean equal(Object[] v1, Object[] v2) {
            return Arrays.deepEquals(v1, v2);
          }
          public int hashCode(Object[] t) {
            return Arrays.deepHashCode(t);
          }
        }
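
The difference between the shallow and deep variants can be reproduced outside Calcite with plain JDK calls. The sketch below is only an illustration: the class NestedRowEquality, the wrapper DeepKey and the helpers distinctDeep and sampleRows are hypothetical names that do not exist in Calcite, and the distinct logic is a simplified stand-in, not how Calcite evaluates DISTINCT.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class NestedRowEquality {

  /** Hypothetical wrapper that gives an Object[] row deep equals/hashCode semantics. */
  static final class DeepKey {
    final Object[] row;

    DeepKey(Object[] row) {
      this.row = row;
    }

    @Override public boolean equals(Object o) {
      return o instanceof DeepKey && Arrays.deepEquals(row, ((DeepKey) o).row);
    }

    @Override public int hashCode() {
      return Arrays.deepHashCode(row);
    }
  }

  /** Returns the distinct rows in first-seen order, comparing nested arrays by content. */
  static List<Object[]> distinctDeep(List<Object[]> rows) {
    Set<DeepKey> seen = new LinkedHashSet<>();
    List<Object[]> out = new ArrayList<>();
    for (Object[] r : rows) {
      if (seen.add(new DeepKey(r))) {
        out.add(r);
      }
    }
    return out;
  }

  /** The three rows from the reproduction query: (id, struct). */
  static List<Object[]> sampleRows() {
    List<Object[]> rows = new ArrayList<>();
    rows.add(new Object[] {1, new Object[] {1, 1}});
    rows.add(new Object[] {1, new Object[] {1, 1}});
    rows.add(new Object[] {2, new Object[] {2, 2}});
    return rows;
  }

  public static void main(String[] args) {
    Object[] a = {1, new Object[] {1, 1}};
    Object[] b = {1, new Object[] {1, 1}};

    // Shallow: the inner arrays are compared by reference.
    System.out.println(Arrays.equals(a, b));      // false
    // Deep: the inner arrays are compared element by element.
    System.out.println(Arrays.deepEquals(a, b));  // true

    System.out.println(distinctDeep(sampleRows()).size());  // 2
  }
}
```

With deep comparison the two (1, {1, 1}) rows collapse into one, which is the result the query is expected to return.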
      

              People

              • Assignee:
                rubenql Ruben Q L
              • Reporter:
                rubenql Ruben Q L
              • Votes:
                0
              • Watchers:
                5

                  Time Tracking

                  • Estimated: Not Specified
                  • Remaining: 0h
                  • Logged: 50m