Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
Expose a public dataset, along with its schema details and instructions on how to use it.
For example:
- We could host a Parquet dump somewhere that users can read from to generate their own Hudi tables.
- We could add a playbook for creating the different types of Hudi tables (COW/MOR) by reading from this source.
- We could add a playbook that uses DeltaStreamer to read from this source one file at a time and ingest each file into a Hudi table.
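
As a rough illustration of what such a playbook could contain, here is a minimal PySpark sketch. It assumes a Spark session with the Hudi Spark bundle on the classpath. The column names (`uuid`, `ts`, `partition`), the table name, and the helper names `hudi_write_options` and `ingest_files` are hypothetical and not taken from any existing dataset or playbook. The per-file loop is a plain Spark datasource stand-in for one-file-at-a-time ingestion, not DeltaStreamer itself.

```python
# Sketch only: the helper names and the column names used in the usage note
# (uuid, ts, partition) are hypothetical placeholders for the public dataset.
COW = "COPY_ON_WRITE"
MOR = "MERGE_ON_READ"


def hudi_write_options(table_name, table_type, record_key,
                       precombine_field, partition_field):
    """Build Spark datasource options for upserting into a Hudi table."""
    if table_type not in (COW, MOR):
        raise ValueError(f"unknown table type: {table_type}")
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.table.type": table_type,
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.partitionpath.field": partition_field,
    }


def ingest_files(spark, parquet_files, base_path, options):
    """Read each Parquet file in sorted order and upsert it into the table.

    Each file becomes its own write, so the table receives one commit per file.
    """
    for path in sorted(parquet_files):
        df = spark.read.parquet(path)
        df.write.format("hudi").options(**options).mode("append").save(base_path)
```

A playbook could then call, for instance, `ingest_files(spark, files, "/tmp/hudi/trips_mor", hudi_write_options("trips_mor", MOR, "uuid", "ts", "partition"))`, and repeat the call with `COW` and a different base path to produce the other table type from the same source.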