  1. Apache Hudi
  2. HUDI-7101

File slice instantiation for MDT file groups

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: metadata
    • Labels: None

    Description

      Here is what a typical file group instantiation of an MDT (metadata table) partition looks like:

      t10: Create a dummy commit with base commit time "0000000".

      This creates a log file with a dummy delete block.

      Immediately after this, we run the bulk_insert, which creates a new file slice but with the same commit time:

      base_file_00000.parquet

      Theoretically, the log file and the base file belong to different file slices, and when the latest snapshot is read, only the latest base file should be read. As of now, however, we also treat the log file as part of the latest file slice and read it. Since it contains only a dummy delete log block, there is no correctness issue here.
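
      The behavior described above can be illustrated with a minimal sketch. This is a simplified model, not Hudi's actual file system view code: the types `StoredFile` and `FileSlice`, the helper functions, the file group id `files-0000`, and the log file name are all hypothetical. The sketch assumes that files are grouped into slices by file group id and base commit time, which is why the dummy log file and the bulk_insert base file end up in the same slice when they share the commit time "0000000".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class StoredFile:
    """Hypothetical stand-in for a file in an MDT file group."""
    name: str
    file_group_id: str
    base_instant: str  # base commit time the file is associated with
    is_log: bool


@dataclass
class FileSlice:
    """Hypothetical stand-in for a file slice: one base file plus log files."""
    base_instant: str
    base_file: Optional[StoredFile] = None
    log_files: List[StoredFile] = field(default_factory=list)


def build_file_slices(files: List[StoredFile]) -> Dict[Tuple[str, str], FileSlice]:
    """Group files into slices keyed by (file group id, base commit time)."""
    slices: Dict[Tuple[str, str], FileSlice] = {}
    for f in files:
        key = (f.file_group_id, f.base_instant)
        file_slice = slices.setdefault(key, FileSlice(f.base_instant))
        if f.is_log:
            file_slice.log_files.append(f)
        else:
            file_slice.base_file = f
    return slices


def latest_slice(slices: Dict[Tuple[str, str], FileSlice], file_group_id: str) -> FileSlice:
    """Return the slice with the greatest base commit time for a file group."""
    candidates = [s for (fg, _), s in slices.items() if fg == file_group_id]
    return max(candidates, key=lambda s: s.base_instant)


# Fresh table: the dummy commit writes a log file, then bulk_insert writes a
# base file with the same commit time. The log file name is made up.
files = [
    StoredFile(".files-0000_0000000.log.1", "files-0000", "0000000", is_log=True),
    StoredFile("base_file_00000.parquet", "files-0000", "0000000", is_log=False),
]
slices = build_file_slices(files)
latest = latest_slice(slices, "files-0000")
```

      Under this assumed grouping rule there is only one slice, and the latest slice carries both the base file and the dummy log file, so a snapshot read would also open the log file even though it contributes no records.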

       

      Only some code cleanup is required.

      This is an issue only with a fresh table.

       

          People

            Assignee: Unassigned
            Reporter: sivabalan narayanan (shivnarayan)