Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
With delete support added for RLI (https://github.com/apache/hudi/pull/9058/files), the HBase index needs some fixes.
The failing test is TestSparkHoodieHBaseIndex.testTagLocationAndPartitionPathUpdateWithExplicitRollback.
Root cause:
When update partition path is set to true, the same batch can contain both a deleted record and a new insert record. Both records are sent to HBase, and for some of them the insert takes precedence, while for others the delete takes precedence.
We need to fix SparkHoodieHBaseIndex.updateLocation to make one pass over the WriteStatus and de-duplicate when there are two records, one deleted and one inserted.
However, a batch may contain only deletes; in that case we need to ensure the deletes are still routed to HBase.
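
A minimal sketch of the de-duplication pass described above, in plain Java. It is not the actual Hudi code: IndexUpdate and dedupe are hypothetical names that stand in for the records carried by a WriteStatus. It also assumes that the two records share a record key and that the insert should win over the delete. A key that has only a delete keeps it, so the delete still reaches HBase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HBaseIndexDedupSketch {

    // Hypothetical stand-in for one written record; not a Hudi class.
    static final class IndexUpdate {
        final String recordKey;
        final String partitionPath;
        final boolean isDelete;

        IndexUpdate(String recordKey, String partitionPath, boolean isDelete) {
            this.recordKey = recordKey;
            this.partitionPath = partitionPath;
            this.isDelete = isDelete;
        }
    }

    // Single pass: keep at most one update per record key.
    // Assumption: when a key has both a delete and an insert, the insert wins.
    static List<IndexUpdate> dedupe(List<IndexUpdate> updates) {
        Map<String, IndexUpdate> byKey = new LinkedHashMap<>();
        for (IndexUpdate update : updates) {
            IndexUpdate previous = byKey.get(update.recordKey);
            // Store the first update seen for a key; replace it only when
            // the stored one is a delete and the new one is an insert.
            if (previous == null || (previous.isDelete && !update.isDelete)) {
                byKey.put(update.recordKey, update);
            }
        }
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<IndexUpdate> batch = Arrays.asList(
            new IndexUpdate("key1", "2023/01", true),   // delete from the old partition
            new IndexUpdate("key1", "2023/02", false),  // insert into the new partition
            new IndexUpdate("key2", "2023/01", true));  // delete with no matching insert
        for (IndexUpdate update : dedupe(batch)) {
            System.out.println(update.recordKey + " "
                + (update.isDelete ? "DELETE" : "PUT " + update.partitionPath));
        }
    }
}
```

With this shape, the caller would issue one HBase mutation per surviving entry, so the outcome for a key no longer depends on the order in which the delete and the insert arrive.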