Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-38542

UnsafeHashedRelation should serialize numKeys out

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • 3.2.0
    • 3.3.0, 3.2.2
    • SQL

    Description

      At present, UnsafeHashedRelation does not write out numKeys during serialization, so the numKeys of UnsafeHashedRelation obtained by deserialization is equal to 0. The numFields of UnsafeRows returned by UnsafeHashedRelation.keys() are all 0, which can lead to missing or incorrect data.

       

      For example, in SubqueryBroadcastExec, the HashedRelation.keys() function is called.

      val broadcastRelation = child.executeBroadcast[HashedRelation]().value
      val (iter, expr) = if (broadcastRelation.isInstanceOf[LongHashedRelation]) {
        (broadcastRelation.keys(), HashJoin.extractKeyExprAt(buildKeys, index))
      } else {
        (broadcastRelation.keys(),
          BoundReference(index, buildKeys(index).dataType, buildKeys(index).nullable))
      }

       

      Attachments

        Activity

          People

            Unassigned Unassigned
            mcdull_zhang mcdull_zhang
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: