SPARK-30993: GenerateUnsafeRowJoiner corrupts the value if the datatype is UDT and its SQL type has fixed length

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.5, 3.0.0
    • Fix Version/s: 2.4.6, 3.0.0
    • Component/s: SQL

    Description

      This was reported on the user mailing list, although the mail thread is about suspected misbehavior of mapGroupsWithState.

      https://lists.apache.org/thread.html/r08b44a7afac4e4c971633d30b4e5d11bd7c0d6e28180e03b874ea58b%40%3Cuser.spark.apache.org%3E

      The actual culprit is that a couple of methods do not handle UDT, which causes GenerateUnsafeRowJoiner to generate incorrect code. Specifically, the issue occurs when the SQL type of the UDT has a fixed length: GenerateUnsafeRowJoiner has logic to update the offset position of all variable-length data, and because of this bug a fixed-length UDT field is treated as variable-length data, so its value is modified.
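
      To make the failure mode concrete, here is a minimal, self-contained Python sketch of the mechanism. It is not Spark code: the type classes, `is_fixed_length_buggy`, `is_fixed_length_fixed`, and `shift_offsets` are hypothetical names, and the layout is a simplified model that assumes each field owns one 64-bit word, with fixed-length values stored inline and variable-length fields stored as (offset << 32) | size.

```python
from dataclasses import dataclass


# Toy type model (hypothetical; these are not Spark classes).
@dataclass(frozen=True)
class LongType:
    pass


@dataclass(frozen=True)
class StringType:
    pass


@dataclass(frozen=True)
class UDT:
    sql_type: object  # underlying storage type of the user-defined type


def is_fixed_length_buggy(dt):
    # Looks only at the outer type, so a UDT is never seen as fixed-length.
    return isinstance(dt, LongType)


def is_fixed_length_fixed(dt):
    # Unwrap the UDT and decide based on its underlying SQL type.
    if isinstance(dt, UDT):
        return is_fixed_length_fixed(dt.sql_type)
    return isinstance(dt, LongType)


def shift_offsets(words, schema, shift, is_fixed_length):
    """Mimic the joiner step that relocates variable-length data.

    Fixed-length words are copied as-is; for every other field the offset
    stored in the upper 32 bits is moved by `shift` bytes.
    """
    out = []
    for word, dt in zip(words, schema):
        if is_fixed_length(dt):
            out.append(word)  # inline value: leave untouched
        else:
            out.append((word + (shift << 32)) & 0xFFFFFFFFFFFFFFFF)
    return out


schema = [UDT(LongType()), StringType()]
words = [42, (24 << 32) | 5]  # inline value 42; string at offset 24, size 5
shift = 16                    # bytes by which the variable-length region moves

buggy = shift_offsets(words, schema, shift, is_fixed_length_buggy)
fixed = shift_offsets(words, schema, shift, is_fixed_length_fixed)

print("buggy UDT value:", buggy[0])
print("fixed UDT value:", fixed[0])
```

      In this model the buggy check adds the shift to the upper 32 bits of the UDT's inline value, which is the kind of corruption described above, while the check that unwraps the UDT leaves the value intact and still relocates the string field.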

            People

              Assignee: Jungtaek Lim (kabhwan)
              Reporter: Jungtaek Lim (kabhwan)