Apache Hudi / HUDI-6460

Fix HBase index for deletes

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: index
    • Labels: None

    Description

      With delete support added for RLI in https://github.com/apache/hudi/pull/9058/files, the HBase index needs some fixes.

      The failing test is:

      TestSparkHoodieHBaseIndex.testTagLocationAndPartitionPathUpdateWithExplicitRollback

      Root cause:

      When update partition path is set to true, the same batch can contain both a deleted record and a new insert record. Both records are sent to HBase, and for some of them the insert takes precedence while for others the delete takes precedence.

      We need to fix SparkHoodieHBaseIndex.updateLocation to make one pass over the WriteStatus and de-duplicate whenever there are two records where one is a delete and the other is an insert.

      However, it is possible that only deletes are present; in that case we need to ensure the deletes are still routed to HBase.
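
      The de-duplication pass described above could look roughly like the sketch below. It is not the Hudi implementation: IndexUpdate and dedup are hypothetical stand-ins for the records carried by a WriteStatus, a null location is used here to mark a delete, and letting the insert win over a delete of the same key is an assumption based on the partition-path-update case.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins; these are NOT Hudi classes.
class HBaseIndexDedupSketch {

    /** One index update: a record key plus its new location (null marks a delete). */
    static final class IndexUpdate {
        final String recordKey;
        final String newLocation; // null means "remove this key from the index"

        IndexUpdate(String recordKey, String newLocation) {
            this.recordKey = recordKey;
            this.newLocation = newLocation;
        }

        boolean isDelete() {
            return newLocation == null;
        }
    }

    /**
     * Single pass over the updates, keeping one entry per record key.
     * Assumption: if a key has both a delete and an insert, the insert wins.
     * A key that only has a delete keeps it, so the delete still reaches HBase.
     */
    static Map<String, IndexUpdate> dedup(List<IndexUpdate> updates) {
        Map<String, IndexUpdate> byKey = new LinkedHashMap<>();
        for (IndexUpdate update : updates) {
            IndexUpdate existing = byKey.get(update.recordKey);
            // Store the update unless it is a delete arriving after an insert.
            if (existing == null || existing.isDelete() || !update.isDelete()) {
                byKey.put(update.recordKey, update);
            }
        }
        return byKey;
    }

    public static void main(String[] args) {
        List<IndexUpdate> batch = Arrays.asList(
            new IndexUpdate("k1", null),          // delete from the old partition
            new IndexUpdate("k1", "p2/file-1"),   // insert into the new partition
            new IndexUpdate("k2", "p1/file-2"),   // insert seen before its delete
            new IndexUpdate("k2", null),
            new IndexUpdate("k3", null));         // delete only
        for (IndexUpdate u : dedup(batch).values()) {
            System.out.println(u.recordKey + " -> " + (u.isDelete() ? "DELETE" : u.newLocation));
        }
    }
}
```

      With this rule the outcome no longer depends on the order in which the delete and the insert arrive, and a key that has only a delete is still forwarded as a delete.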

          People

            Assignee: Prashant Wason (pwason)
            Reporter: sivabalan narayanan (shivnarayan)