Spark / SPARK-26878

QueryTest.compare does not handle maps with array keys correctly

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 3.0.0
    • Component/s: SQL, Tests
    • Labels: None

    Description

      The current strategy for comparing Maps is to sort the (key, value) tuples of each map by _.toString, zip the two sorted sequences together, and then compare the tuples within each resulting pair.

      See: https://github.com/apache/spark/blob/ac9c0536bc518f173f2ff53bee42b7a89d28ee20/sql/core/src/test/scala/org/apache/spark/sql/QueryTest.scala#L344-L346
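
      As a rough illustration of that strategy, here is a minimal Scala sketch. The object name `SortZipCompare` and the helper `compareMaps` are hypothetical, and plain `==` stands in for the recursive comparison that QueryTest.compare would apply to each element. It is a simplification, not the code behind the link above.

```scala
object SortZipCompare {
  // Hypothetical stand-in for the map branch described above:
  // sort entries by their string form, zip them, compare pair by pair.
  def compareMaps[K, V](a: Map[K, V], b: Map[K, V]): Boolean = {
    val aSorted = a.toSeq.sortBy(_.toString)
    val bSorted = b.toSeq.sortBy(_.toString)
    // zip silently truncates to the shorter side, so check sizes first.
    aSorted.size == bSorted.size &&
      aSorted.zip(bSorted).forall { case ((k1, v1), (k2, v2)) =>
        k1 == k2 && v1 == v2 // assumption: == replaces the recursive compare
      }
  }
}
```

      For keys whose toString reflects their content, such as strings, this works regardless of insertion order: `SortZipCompare.compareMaps(Map("b" -> 2, "a" -> 1), Map("a" -> 1, "b" -> 2))` is true.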

      This does not work for byte arrays. The string representation of a byte array looks like “[B@7d263ddc” and has nothing to do with the values actually contained in the array.

      Hence, if a map has byte array keys, the sort order is effectively arbitrary and unrelated entries get compared with each other, which can make the comparison fail even though the two maps are equal (a false negative).
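
      One way to avoid depending on toString is to match keys by content instead of by sorted position. The Scala sketch below is only an illustration under stated assumptions: the names `ByteArrayKeys`, `sameValue`, and `compareMaps` are hypothetical, only `Array[Byte]` gets special handling, and it is not necessarily how the issue was resolved in Spark.

```scala
object ByteArrayKeys {
  // Content-based equality; only byte arrays are special-cased in this sketch.
  def sameValue(x: Any, y: Any): Boolean = (x, y) match {
    case (a: Array[Byte], b: Array[Byte]) => java.util.Arrays.equals(a, b)
    case _ => x == y
  }

  // Order-independent comparison: each entry of `a` must have a
  // content-equal entry in `b`.
  // Assumption: keys within each map are distinct by content.
  def compareMaps[K, V](a: Map[K, V], b: Map[K, V]): Boolean =
    a.size == b.size && a.forall { case (ak, av) =>
      b.exists { case (bk, bv) => sameValue(ak, bk) && sameValue(av, bv) }
    }
}
```

      With this helper, `Map(Array[Byte](1) -> "one", Array[Byte](2) -> "two")` and `Map(Array[Byte](2) -> "two", Array[Byte](1) -> "one")` compare as equal, whereas sorting their entries by toString would order each map by the arrays' identity hashes rather than by their contents. The pairwise search is quadratic in the map size, which is usually acceptable for test data.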

            People

              Assignee: Ala Luszczak (ala.luszczak)
              Reporter: Ala Luszczak (ala.luszczak)