Details
Type: Bug
Status: Open
Priority: Critical
Resolution: Unresolved
Description
As a follow-up to the recent discussions in the community regarding out-of-the-box (OOB) configuration (DB blog), I think we should adjust some aspects of our OOB configuration to stay in line with other formats, since people will inevitably compare Hudi's performance against Delta and Iceberg.
For example, whenever someone creates a table from scratch we should always use "bulk_insert" instead of "upsert": there is no reason to incur the overhead of upserting when we know the table is empty.
The logic could roughly go as follows:
- If the table is empty, and
- There's no explicit operation configured, and
- There's no pre-combining configured
then we treat it as a "bulk_insert" case.
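
To make the proposed rule concrete, here is a minimal sketch of the decision in Python. It is illustrative only and not Hudi's actual implementation: the function name `resolve_write_operation` and the `table_is_empty` flag are hypothetical, and it assumes the operation and pre-combine field are passed via the `hoodie.datasource.write.operation` and `hoodie.datasource.write.precombine.field` write options, with "upsert" as the fallback.

```python
from typing import Mapping

# Write option keys assumed for this sketch.
OPERATION_KEY = "hoodie.datasource.write.operation"
PRECOMBINE_KEY = "hoodie.datasource.write.precombine.field"


def resolve_write_operation(table_is_empty: bool, options: Mapping[str, str]) -> str:
    """Hypothetical helper: pick the write operation per the rule proposed above."""
    # An explicitly configured operation always wins.
    explicit = options.get(OPERATION_KEY)
    if explicit:
        return explicit

    # Empty table, no explicit operation, no pre-combining: nothing to merge
    # against, so the cheaper bulk_insert path is sufficient.
    has_precombine = bool(options.get(PRECOMBINE_KEY))
    if table_is_empty and not has_precombine:
        return "bulk_insert"

    # Otherwise fall back to upsert (assumed here to be the default).
    return "upsert"
```

With this rule, a first write to a new table with no options resolves to "bulk_insert", while setting a pre-combine field or writing to a non-empty table keeps "upsert".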