SPARK-36918

unionByName shouldn't consider types when comparing structs

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.3.0
    • Component/s: SQL
    • Labels: None

    Description

      Improvement/follow-on of https://issues.apache.org/jira/browse/SPARK-35290.

      We use StructType.sameType to decide whether a struct needs to be recreated, but this check is type-sensitive: it reports a mismatch even when the two structs have the same structure (the same field names) and differ only in their field types. In that case we simply create a new struct that is exactly the same as the original. This can cause significant overhead when unioning multiple deeply nested nullable structs, because each time a struct is recreated it gets wrapped in an If(IsNull(...)). Comparing only the field names avoids these unnecessary rebuilds and can lead to more efficient plans.
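
      To make the difference between the two checks concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's implementation: Field, Struct, same_type and same_field_names are hypothetical names invented for illustration, leaf types are plain strings, and nullability and case sensitivity are ignored.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Field:
    name: str
    dtype: "DataType"


@dataclass(frozen=True)
class Struct:
    fields: Tuple[Field, ...]


# Toy assumption: a data type is either a leaf name such as "int" or a Struct.
DataType = Union[str, Struct]


def same_type(a: DataType, b: DataType) -> bool:
    """Type-sensitive check: field names and leaf types must all match."""
    if isinstance(a, Struct) and isinstance(b, Struct):
        return len(a.fields) == len(b.fields) and all(
            x.name == y.name and same_type(x.dtype, y.dtype)
            for x, y in zip(a.fields, b.fields)
        )
    return a == b


def same_field_names(a: DataType, b: DataType) -> bool:
    """Name-only check: leaf types are ignored, nested structs are recursed into."""
    if isinstance(a, Struct) and isinstance(b, Struct):
        return len(a.fields) == len(b.fields) and all(
            x.name == y.name and same_field_names(x.dtype, y.dtype)
            for x, y in zip(a.fields, b.fields)
        )
    # Not both structs: the shapes agree only if both sides are leaves.
    return not isinstance(a, Struct) and not isinstance(b, Struct)


inner = Struct((Field("b", "int"),))
left = Struct((Field("a", "int"), Field("nested", inner)))
right = Struct((Field("a", "long"), Field("nested", inner)))      # only a leaf type differs
reordered = Struct((Field("nested", inner), Field("a", "int")))   # field order differs

# The type-sensitive check would request a rebuild for left vs. right,
# while the name-only check sees that the structure already lines up.
needs_rebuild_by_type = not same_type(left, right)
needs_rebuild_by_name = not same_field_names(left, right)
```

      In this model, left and right differ only in the leaf type of a, so the type-sensitive check asks for a rebuild while the name-only check does not. The reordered struct still fails the name-only check, which is the case where recreating the struct is actually needed.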

            People

              Assignee: kimahriman Adam Binford
              Reporter: kimahriman Adam Binford