Details
Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Fix Version/s: 0.11.0
Component/s: None
Description
We've recently discovered that `TableSchemaResolver` does a lot of throw-away work during initialization and during the basic schema reading performed by the Spark Datasource (see screenshot).
This is a problem for large tables, where `HoodieCommitMetadata` can be of non-trivial size (hundreds of MBs).
We should minimize the amount of throw-away work done by `TableSchemaResolver` and reuse the read/parsed commit metadata as much as possible; a sketch of the idea follows.
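One way to avoid repeatedly re-reading and re-parsing the same commit file is to memoize the parsed metadata after the first read. The sketch below is illustrative only, not the actual fix: `CachedCommitMetadataResolver` and the assumption that the raw commit-file bytes are already in hand are hypothetical, while `HoodieCommitMetadata` and its `fromBytes` parser are real Hudi classes.

```java
// Illustrative sketch only -- not Hudi's actual implementation.
// HoodieCommitMetadata.fromBytes is Hudi's real parser; the wrapper
// class and its fields are hypothetical, for illustration.
import java.io.IOException;

import org.apache.hudi.common.model.HoodieCommitMetadata;

public class CachedCommitMetadataResolver {

  // Raw contents of the commit file, e.g. read once from storage.
  private final byte[] rawMetadataBytes;

  // Parsed metadata is cached after the first access so repeated
  // schema lookups don't re-parse hundreds of MBs of metadata.
  private volatile HoodieCommitMetadata cachedMetadata;

  public CachedCommitMetadataResolver(byte[] rawMetadataBytes) {
    this.rawMetadataBytes = rawMetadataBytes;
  }

  public HoodieCommitMetadata getCommitMetadata() throws IOException {
    // Double-checked locking: parse at most once, reuse thereafter.
    HoodieCommitMetadata local = cachedMetadata;
    if (local == null) {
      synchronized (this) {
        local = cachedMetadata;
        if (local == null) {
          local = HoodieCommitMetadata.fromBytes(rawMetadataBytes, HoodieCommitMetadata.class);
          cachedMetadata = local;
        }
      }
    }
    return local;
  }
}
```

The double-checked locking on a volatile field ensures the expensive parse happens at most once even when multiple schema queries race on the same resolver instance.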