[HUDI-2682] Spark schema not updated with new columns on hive sync - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Affects Version/s: 0.9.0
Fix Version/s: 0.10.1
Component/s: hive, spark
Labels:
- pull-request-available

Description

When syncing the Hive schema, new columns added to the source dataset are not propagated to the `spark.sql.sources.schema` metadata on the Hive table. As a result, those columns are unavailable when the dataset is queried via Spark SQL.

Tested with both the Spark data writer and DeltaStreamer.

We observed this on a struct column, but the behavior appears to be independent of data type.
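
One way to confirm the symptom is to compare the Hive table's columns against the schema Spark has stored in the table properties. The sketch below is illustrative only: it assumes the Spark schema JSON is split across `spark.sql.sources.schema.numParts` and `spark.sql.sources.schema.part.N` properties (the layout may differ by Spark version), and the helper names, sample columns, and sample properties are hypothetical, not taken from this issue.

```python
import json


def spark_schema_fields(tbl_props):
    """Reassemble the Spark schema JSON from table properties and return field names.

    Assumes the schema is stored as numParts + part.N chunks (an assumption,
    not confirmed by the issue text).
    """
    num_parts = int(tbl_props.get("spark.sql.sources.schema.numParts", "0"))
    raw = "".join(
        tbl_props[f"spark.sql.sources.schema.part.{i}"] for i in range(num_parts)
    )
    if not raw:
        return []
    return [field["name"] for field in json.loads(raw)["fields"]]


def columns_missing_from_spark_schema(hive_columns, tbl_props):
    """Return Hive columns that do not appear in the stored Spark schema."""
    spark_fields = {name.lower() for name in spark_schema_fields(tbl_props)}
    return [col for col in hive_columns if col.lower() not in spark_fields]


# Hypothetical stale Spark schema: written before the struct column was added.
stale_schema = json.dumps(
    {
        "type": "struct",
        "fields": [
            {"name": "id", "type": "long", "nullable": True, "metadata": {}},
            {"name": "name", "type": "string", "nullable": True, "metadata": {}},
        ],
    }
)

# Split the JSON into two chunks to mimic a multi-part table property.
stale_props = {
    "spark.sql.sources.schema.numParts": "2",
    "spark.sql.sources.schema.part.0": stale_schema[:40],
    "spark.sql.sources.schema.part.1": stale_schema[40:],
}

# Hypothetical Hive columns after sync: the new struct column is present.
hive_columns = ["id", "name", "address"]

missing = columns_missing_from_spark_schema(hive_columns, stale_props)
print(missing)  # ['address']
```

In practice the property map and column list would come from the metastore (for example via `SHOW TBLPROPERTIES` and `DESCRIBE` on the synced table); a non-empty result indicates the Spark schema metadata lags behind the Hive schema.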

Issue Links

links to

GitHub Pull Request #4533

People

Assignee: 董可伦
Reporter: Charlie Briggs
Reviewers: Tao Meng
Votes: 0
Watchers: 0

Dates

Created: 03/Nov/21 16:47
Updated: 09/Jan/22 00:03
Resolved: 09/Jan/22 00:03