Details
-
Epic
-
Status: Closed
-
Major
-
Resolution: Implemented
-
0.9.0
-
None
-
Insert Overwrite API
Description
Use cases:
- Tables where the majority of records change every cycle, so writing new data is likely more efficient than doing upserts.
- Operational tasks to fix a specific corrupted partition. We can do 'insert overwrite' on that partition with records from the source. This can be much faster than restore and replay for some data sources.
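To make the difference from upsert concrete, here is a minimal in-memory sketch of the intended semantics. It is only an illustration: the table is modeled as a dict of partitions, and the names `upsert`, `insert_overwrite`, and the record fields `partition`, `key`, and `value` are hypothetical, not part of any existing Hoodie API.

```python
from collections import defaultdict


def upsert(table, records):
    """Merge records into the table by key; rows not mentioned are kept."""
    result = {p: dict(rows) for p, rows in table.items()}
    for r in records:
        result.setdefault(r["partition"], {})[r["key"]] = r
    return result


def insert_overwrite(table, records):
    """Replace every partition that receives at least one incoming record.

    Partitions with no incoming records are carried over unchanged.
    """
    incoming = defaultdict(dict)
    for r in records:
        incoming[r["partition"]][r["key"]] = r
    # Keep only the partitions that the incoming batch does not touch.
    result = {p: dict(rows) for p, rows in table.items() if p not in incoming}
    # Touched partitions contain exactly the incoming rows, nothing older.
    result.update({p: dict(rows) for p, rows in incoming.items()})
    return result


def rec(partition, key, value):
    # Hypothetical record shape used only for this sketch.
    return {"partition": partition, "key": key, "value": value}


table = {
    "2020-01-01": {"a": rec("2020-01-01", "a", 1), "b": rec("2020-01-01", "b", 2)},
    "2020-01-02": {"c": rec("2020-01-02", "c", 3)},
}
fix = [rec("2020-01-01", "a", 10)]

upserted = upsert(table, fix)
overwritten = insert_overwrite(table, fix)
```

With upsert, the stale row `b` survives in partition `2020-01-01`; with insert overwrite, that partition holds only the rows from the source, which is what the corrupted-partition use case needs.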
The functionality will be similar to Hive's definition of 'insert overwrite', but doing this in Hoodie will provide better isolation between writers and readers. I can share possible implementation choices and some nuances if the community thinks this is a useful feature to add.
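One way to picture the writer/reader isolation mentioned above is a copy-on-write snapshot: the writer builds a new version of the table and publishes it in a single step, while readers that already hold the previous version keep seeing it unchanged. The sketch below is a generic illustration under that assumption; `VersionedTable` and its methods are hypothetical names, and this is not a description of how Hoodie or Hive actually implement the operation.

```python
import threading


class VersionedTable:
    """Toy table whose published snapshot is never mutated in place."""

    def __init__(self, partitions=None):
        self._lock = threading.Lock()  # serializes writers only
        self._snapshot = dict(partitions or {})

    def snapshot(self):
        # Readers grab a reference to the current published version.
        return self._snapshot

    def insert_overwrite(self, partition, rows):
        with self._lock:
            # Build the next version off to the side.
            new_snapshot = dict(self._snapshot)
            new_snapshot[partition] = tuple(rows)
            # Publish with a single reference swap.
            self._snapshot = new_snapshot


table = VersionedTable({"p1": ("old1", "old2"), "p2": ("keep",)})
reader_view = table.snapshot()           # a reader starts before the write
table.insert_overwrite("p1", ["new1"])   # the writer replaces partition p1
latest_view = table.snapshot()           # a reader starting after the write
```

Here `reader_view` still shows the old contents of `p1`, while `latest_view` shows only the new rows; no reader ever observes a half-replaced partition.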