Spark / SPARK-41498

Union does not propagate Metadata output

Details

    • Type: Bug
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.2, 3.2.0, 3.1.3, 3.2.1, 3.3.0, 3.2.2, 3.3.1
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      Currently, the Union operator does not propagate any metadata output. As a result, metadata columns cannot be accessed once a Union operator is used, even when all of its children have the exact same metadata output.
      Example:

       

      val df1 = spark.read.load(path1)
      val df2 = spark.read.load(path2)
      df1.union(df2).select("_metadata.file_path") // <-- fails
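
      Until Union propagates metadata output, one possible workaround is to materialize the metadata field as a regular column on each child before the union. The following is only a sketch under assumptions: both inputs are file-based sources that expose the hidden `_metadata` column, and the helper `withFilePath`, the object name, and the output column name `file_path` are hypothetical names introduced here for illustration, not part of the original report.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object UnionMetadataWorkaround {
  // Hypothetical helper: copies the hidden _metadata.file_path field into an
  // ordinary column so that it is part of the regular output of the plan.
  def withFilePath(df: DataFrame): DataFrame =
    df.select(col("*"), col("_metadata.file_path").as("file_path"))

  def main(args: Array[String]): Unit = {
    require(args.length == 2, "usage: UnionMetadataWorkaround <path1> <path2>")
    val path1 = args(0)
    val path2 = args(1)

    val spark = SparkSession.builder()
      .appName("union-metadata-workaround")
      .master("local[*]")
      .getOrCreate()

    // Assumption: both paths hold data in the default source format with
    // identical schemas, as in the example from the report.
    val df1 = withFilePath(spark.read.load(path1))
    val df2 = withFilePath(spark.read.load(path2))

    // file_path is now a normal column, so it is carried through the union.
    df1.union(df2).select("file_path").show(truncate = false)

    spark.stop()
  }
}
```

      Because the metadata field is selected below the Union, this sketch does not depend on Union exposing any metadata output itself; the cost is an extra column in the unioned schema.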

          People

            Assignee: Unassigned
            Reporter: Fredrik Klauß (fred-db)
            Votes: 0
            Watchers: 4
