  1. Apache Hudi
  2. HUDI-8544

Fix RLI and SI MDT record generation to be consistent with Spark task retries

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: metadata, writer-core

    Description

      Very rarely, RLI (record-level index) and SI (secondary index) record generation can be inconsistent, because it relies on RDD<WriteStatus>, which Spark may recompute when a task is retried, while all other MDT (metadata table) partitions rely on List<HoodieWriteStat>, which is already computed and held in the driver's memory.

      For now, the goal is to fix RLI and SI only. Eventually we want to move to a full DAG rewrite that is streaming friendly.
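
      The difference between the two sources can be shown with a toy model. This is only a sketch and does not use Spark or Hudi: LazyDataset, write_records, and the file names are hypothetical stand-ins, and the assumption is that a re-evaluated task may place records differently from the attempt whose results were committed.

```python
import itertools


class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (not Spark's RDD API)."""

    def __init__(self, compute):
        self._compute = compute

    def collect(self):
        # Every call re-runs the computation, like a recomputed or retried task.
        return self._compute()


_attempt = itertools.count(1)


def write_records():
    """Hypothetical write task: the file a record lands in depends on the attempt."""
    attempt = next(_attempt)
    return [(key, f"file-{attempt}") for key in ("k1", "k2")]


write_statuses = LazyDataset(write_records)

# Driver-side snapshot, computed once (plays the role of the in-memory list).
committed = write_statuses.collect()

# Deriving index records from the lazy dataset triggers a second evaluation.
index_from_lazy = write_statuses.collect()

# Deriving index records from the snapshot reuses the committed result.
index_from_snapshot = list(committed)

inconsistent = index_from_lazy != committed
consistent = index_from_snapshot == committed
```

      In this model the records derived from the second evaluation point at a different file than the committed ones, while records derived from the snapshot always agree with it.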

       

          People

            Assignee: shivnarayan sivabalan narayanan
            Reporter: shivnarayan sivabalan narayanan

              Time Tracking

                Original Estimate: 20m
                Remaining Estimate: 20m
                Time Spent: Not Specified