  1. Apache Hudi
  2. HUDI-8719 File group reader enhancement - Phase 0
  3. HUDI-7612

HoodieSparkRecordMerger does not handle deletes based on the preCombine/ordering field

Details

    • Sub-task
    • Status: In Progress
    • Blocker
    • Resolution: Unresolved
    • None
    • 1.0.1
    • spark
    • 4

    Description

      The merger handles deletes using overwrite-with-latest semantics, but the rest of its logic behaves like the default payload, which merges on the preCombine/ordering field.

          Activity

            linliu Lin Liu added a comment -

            jonvex, lastarsenal, what logic do we want to support?

            linliu Lin Liu added a comment -

            This is deprioritized for now.

            linliu Lin Liu added a comment -

            I revisited the logic. Most delete records do not have an ordering field. In that case, we fall back to natural ordering, i.e., the ordering field is 0. This lets us compare delete records with non-delete records. When we compare one delete record with another delete record, we still need to fall back to COMMIT_TIME_ORDERING.

            This should be the most natural way to handle deletes for now.
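
The rule described in this comment can be sketched roughly as follows. This is only an illustrative Python model of one literal reading of the comment, not Hudi code: the `Record` class, the `merge` function, and the assumption that ties on the ordering value go to the later-committed record are all hypothetical and do not come from the issue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a record; not a Hudi class."""
    key: str
    ordering: Optional[int] = None  # ordering (preCombine) value; None if absent
    is_delete: bool = False


def ordering_value(rec: Record) -> int:
    # A record without an ordering field falls back to natural ordering, i.e. 0.
    return 0 if rec.ordering is None else rec.ordering


def merge(older: Record, newer: Record) -> Record:
    """Return the winner; `newer` is the record from the later commit."""
    if older.is_delete and newer.is_delete:
        # Delete vs. delete: fall back to commit-time ordering (later commit wins).
        return newer
    # Otherwise compare ordering values, with deletes participating like any record.
    if ordering_value(newer) != ordering_value(older):
        return newer if ordering_value(newer) > ordering_value(older) else older
    # Assumption: on equal ordering values, the later commit wins.
    return newer
```

Under this reading, a later delete that carries no ordering field (treated as 0) loses to an earlier non-delete record with a positive ordering value. Whether that is the intended behavior is the question raised earlier in this thread.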


            People

              linliu Lin Liu
              jonvex Jonathan Vexler