SPARK-42683

Automatically rename metadata columns that conflict with data schema columns

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.2, 3.4.0
    • Fix Version/s: 3.5.0
    • Component/s: Spark Core
    • Labels: None

    Description

      Today, if a datasource already has a column called `_metadata`, queries cannot access the file-source metadata column that normally carries that name. We can address this conflict with two changes to metadata column handling:

      1. Automatically rename any metadata column whose name conflicts with a data schema column.
      2. Add a facility to reliably find metadata columns by their original/logical name, even if they were renamed.
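
The two changes above can be sketched in plain Python, without any Spark dependency. This is an illustration only, not Spark's implementation: the names `MetadataColumn`, `make_unique`, `resolve_metadata_columns`, and `find_metadata_column` are hypothetical, and the strategy of prepending underscores until the name is free is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataColumn:
    logical_name: str   # the name the metadata column normally carries
    physical_name: str  # the name actually exposed after conflict resolution


def make_unique(name, data_columns):
    """Return `name`, prefixed with underscores until it no longer
    collides with any data schema column (assumed renaming strategy)."""
    taken = set(data_columns)
    candidate = name
    while candidate in taken:
        candidate = "_" + candidate
    return candidate


def resolve_metadata_columns(metadata_names, data_columns):
    """Change 1: rename every metadata column that conflicts with a data column,
    remembering the original (logical) name as the lookup key."""
    return {
        name: MetadataColumn(name, make_unique(name, data_columns))
        for name in metadata_names
    }


def find_metadata_column(resolved, logical_name):
    """Change 2: look a metadata column up by its logical name,
    regardless of whether it was renamed."""
    return resolved[logical_name]


if __name__ == "__main__":
    # The data source already has a data column called `_metadata`.
    resolved = resolve_metadata_columns(["_metadata"], ["id", "_metadata"])
    col = find_metadata_column(resolved, "_metadata")
    print(col.physical_name)  # prints "__metadata"
```

Keying the lookup table by logical name means callers never need to know whether a rename happened; they ask for `_metadata` and receive whichever physical name was assigned.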

          People

            Assignee: Ryan Johnson (ryan.johnson@databricks.com)
            Reporter: Ryan Johnson (ryan.johnson@databricks.com)