SPARK-3851

Support for reading Parquet files with different but compatible schemas

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0
    • Component/s: SQL
    • Labels: None
    • Target Version/s:

      Description

      Currently, all Parquet files read together must have exactly the same schema. It would be useful to support a safe subset of cases where the schemas of the files differ. For example:

      • Adding or removing nullable columns.
      • Widening types (a column that is of Int type in some files and Long type in others).
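
      To make the two cases above concrete, here is a minimal sketch of how such a merge could behave. It is not Spark's implementation: the schema representation (a dict of column name to type name and nullability), the `WIDENING_ORDER` list, and the functions `widen` and `merge_schemas` are all hypothetical names introduced for illustration only.

```python
# Hypothetical sketch: none of these names come from Spark.
# A schema is a dict mapping column name -> (type_name, nullable).

# Assumed widening order: a type may be widened to any type to its right.
WIDENING_ORDER = ["int", "long"]


def widen(left, right):
    """Return the wider of two type names, or raise if they are incompatible."""
    if left == right:
        return left
    if left in WIDENING_ORDER and right in WIDENING_ORDER:
        return max(left, right, key=WIDENING_ORDER.index)
    raise ValueError(f"incompatible types: {left} vs {right}")


def merge_schemas(a, b):
    """Merge two file schemas under the two rules listed above."""
    merged = {}
    # Keep the column order of `a`, then append columns that only `b` has.
    for name in list(a) + [n for n in b if n not in a]:
        if name in a and name in b:
            (type_a, null_a), (type_b, null_b) = a[name], b[name]
            merged[name] = (widen(type_a, type_b), null_a or null_b)
        else:
            # Column present in only one file: it has to be nullable,
            # because rows from the other file have no value for it.
            type_name, nullable = a.get(name) or b[name]
            if not nullable:
                raise ValueError(f"column {name!r} is missing from one file but is not nullable")
            merged[name] = (type_name, True)
    return merged


old_file = {"id": ("int", False), "name": ("string", True)}
new_file = {"id": ("long", False), "email": ("string", True)}
merged = merge_schemas(old_file, new_file)
```

      In this sketch, `id` is widened to `long`, while `name` and `email` are both kept as nullable columns; a type pair outside the assumed widening order (for example `int` and `string`) is rejected rather than silently coerced.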

            People

            • Assignee: Cheng Lian (lian cheng)
            • Reporter: Michael Armbrust (marmbrus)
            • Votes: 5
            • Watchers: 12

              Dates

              • Created:
                Updated:
                Resolved: