  1. Apache Hudi
  2. HUDI-2682

Spark schema not updated with new columns on hive sync

Details

    Description

      When Hive sync updates the table schema, new columns added in the source dataset are not propagated to the `spark.sql.sources.schema` metadata on the Hive table. As a result, those columns are not available when querying the dataset via Spark SQL.

      Tested with both the Spark data writer and DeltaStreamer.

      The column we observed this on was a struct column, but the behavior appears to be independent of data type.
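
One way to confirm the mismatch described above is to compare the Hive column list with the schema Spark has stored in the table properties. The sketch below assumes the usual Spark layout, where the schema JSON is split across `spark.sql.sources.schema.numParts` and `spark.sql.sources.schema.part.N`; the issue itself only names the `spark.sql.sources.schema` prefix. The helper names, the example properties, and the `address` column are all hypothetical, and on a real table the properties would come from something like `SHOW TBLPROPERTIES`.

```python
import json


def spark_schema_fields(tbl_props):
    """Reassemble the Spark schema JSON from table properties and
    return its top-level field names.

    Assumes the schema is stored as numParts + part.N chunks."""
    num_parts = int(tbl_props["spark.sql.sources.schema.numParts"])
    schema_json = "".join(
        tbl_props[f"spark.sql.sources.schema.part.{i}"] for i in range(num_parts)
    )
    return [field["name"] for field in json.loads(schema_json)["fields"]]


def columns_missing_from_spark_schema(hive_columns, tbl_props):
    """Return Hive columns that are absent from the stored Spark schema
    (compared case-insensitively, preserving Hive column order)."""
    spark_fields = {name.lower() for name in spark_schema_fields(tbl_props)}
    return [col for col in hive_columns if col.lower() not in spark_fields]


# Hypothetical table state: Hive sync has added the struct column
# "address", but the Spark schema property still lists only two fields.
stale_schema = {
    "type": "struct",
    "fields": [
        {"name": "id", "type": "long", "nullable": True, "metadata": {}},
        {"name": "name", "type": "string", "nullable": True, "metadata": {}},
    ],
}
serialized = json.dumps(stale_schema)
half = len(serialized) // 2
example_props = {
    "spark.sql.sources.schema.numParts": "2",
    "spark.sql.sources.schema.part.0": serialized[:half],
    "spark.sql.sources.schema.part.1": serialized[half:],
}
hive_columns = ["id", "name", "address"]

missing = columns_missing_from_spark_schema(hive_columns, example_props)
print(missing)  # prints ['address']
```

A non-empty result means the Hive table has columns that Spark SQL will not see until the stored schema property is refreshed.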

            People

              dongkelun 董可伦
              charlie.briggs Charlie Briggs
              Tao Meng