Spark / SPARK-30993

GenerateUnsafeRowJoiner corrupts the value if the data type is a UDT and its SQL type has fixed length

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.5, 3.0.0
    • Fix Version/s: 2.4.6, 3.0.0
    • Component/s: SQL

      Description

      This was reported on the user mailing list, although the mail thread originally suspected the behavior of mapGroupsWithState.

      https://lists.apache.org/thread.html/r08b44a7afac4e4c971633d30b4e5d11bd7c0d6e28180e03b874ea58b%40%3Cuser.spark.apache.org%3E

      The actual culprit is a couple of methods that do not handle UDTs, which causes GenerateUnsafeRowJoiner to generate incorrect code. Specifically, the issue occurs when the SQL type of a UDT has fixed length. GenerateUnsafeRowJoiner has logic that updates the offset position of all variable-length data; because of this bug, a fixed-length UDT field is treated as variable-length data and its value is modified.
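
      The mechanism described above can be illustrated with a small toy model. This is a sketch in plain Python, not Spark's actual code: the names SqlType, UserDefinedType, is_fixed_length_buggy, is_fixed_length_fixed and join_rows are hypothetical, and the layout (one 8-byte word per field, with the offset of variable-length data in the upper 32 bits) is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SqlType:
    """Hypothetical stand-in for a plain SQL data type."""
    name: str
    fixed_length: bool


@dataclass(frozen=True)
class UserDefinedType:
    """Hypothetical stand-in for a UDT backed by an underlying SQL type."""
    name: str
    sql_type: SqlType


LONG = SqlType("long", fixed_length=True)
STRING = SqlType("string", fixed_length=False)


def is_fixed_length_buggy(dt):
    # Only plain SQL types are recognised; a UDT falls through and is
    # reported as variable-length even when its SQL type is fixed-length.
    return isinstance(dt, SqlType) and dt.fixed_length


def is_fixed_length_fixed(dt):
    # Unwrap the UDT and decide based on its underlying SQL type.
    if isinstance(dt, UserDefinedType):
        return is_fixed_length_fixed(dt.sql_type)
    return isinstance(dt, SqlType) and dt.fixed_length


def join_rows(schema, words, shift, is_fixed_length):
    """Copy each field word; for variable-length fields, move the offset
    (assumed to live in the upper 32 bits) by `shift` bytes."""
    out = []
    for dt, word in zip(schema, words):
        if is_fixed_length(dt):
            out.append(word)                  # value stored inline, untouched
        else:
            out.append(word + (shift << 32))  # offset adjusted in place
    return out


if __name__ == "__main__":
    schema = [UserDefinedType("WrappedLong", LONG)]
    words = [1234]  # the inline value of the fixed-length UDT field
    print("buggy:", join_rows(schema, words, 16, is_fixed_length_buggy))
    print("fixed:", join_rows(schema, words, 16, is_fixed_length_fixed))
```

      In this model the buggy check classifies the UDT field as variable-length, so the joiner adds the shift to a word that actually holds the value itself and the value changes. With the check that consults the UDT's SQL type, the word is copied unchanged.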

              People

              • Assignee: Jungtaek Lim (kabhwan)
              • Reporter: Jungtaek Lim (kabhwan)