Apache Hudi / HUDI-1628

[Umbrella] Improve data locality during ingestion

Details

    • Epic
    • Status: In Progress
    • Major
    • Resolution: Unresolved
    • writer-core

    Description

      Today the upsert partitioner handles file sizing and bin-packing for
      inserts, and then routes some inserts to existing file groups to
      maintain file size.
      We could abstract all of this into strategies behind some kind of
      pipeline abstraction, and have it also consider "affinity" to an
      existing file group, based on, say, information stored in the metadata
      table.

      See http://mail-archives.apache.org/mod_mbox/hudi-dev/202102.mbox/browser
      for more details
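
      To make the proposal concrete, here is a minimal sketch of what a
      strategy pipeline with an affinity step could look like. It is not
      Hudi's API: FileGroup, affinity_strategy, small_file_strategy and
      assign_inserts are hypothetical names, the byte sizes are toy values,
      and the per-file-group key range (min_key/max_key) stands in for
      whatever statistics the metadata table might expose. Each strategy
      either picks an existing file group for an insert or passes; the first
      match wins, and a new file group is opened if none matches.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class FileGroup:
    file_id: str
    size_bytes: int
    # Hypothetical key-range stats; assumed to come from the metadata table.
    min_key: str = ""
    max_key: str = ""


# A strategy looks at one insert (key, size) and the current file groups,
# and returns a target file group or None to defer to the next strategy.
Strategy = Callable[[str, int, List[FileGroup]], Optional[FileGroup]]


def affinity_strategy(max_file_size: int) -> Strategy:
    """Prefer a file group whose key range already covers the record key."""
    def pick(key: str, size: int, groups: List[FileGroup]) -> Optional[FileGroup]:
        for g in groups:
            in_range = bool(g.min_key) and g.min_key <= key <= g.max_key
            if in_range and g.size_bytes + size <= max_file_size:
                return g
        return None
    return pick


def small_file_strategy(small_file_limit: int, max_file_size: int) -> Strategy:
    """Bin-pack onto the smallest file group that is still under the limit."""
    def pick(key: str, size: int, groups: List[FileGroup]) -> Optional[FileGroup]:
        candidates = [
            g for g in groups
            if g.size_bytes < small_file_limit
            and g.size_bytes + size <= max_file_size
        ]
        return min(candidates, key=lambda g: g.size_bytes, default=None)
    return pick


def assign_inserts(
    inserts: List[Tuple[str, int]],
    groups: List[FileGroup],
    strategies: List[Strategy],
) -> Dict[str, List[str]]:
    """Run each insert through the strategies in order.

    Note: mutates `groups` (sizes grow, new file groups are appended).
    """
    assignment: Dict[str, List[str]] = {}
    new_count = 0
    for key, size in inserts:
        target = None
        for strategy in strategies:
            target = strategy(key, size, groups)
            if target is not None:
                break
        if target is None:
            # No strategy matched: open a new file group.
            new_count += 1
            target = FileGroup(f"new-{new_count}", 0)
            groups.append(target)
        target.size_bytes += size
        assignment.setdefault(target.file_id, []).append(key)
    return assignment


groups = [
    FileGroup("fg-1", 90, "a", "f"),
    FileGroup("fg-2", 40, "g", "p"),
]
strategies = [
    affinity_strategy(max_file_size=100),
    small_file_strategy(small_file_limit=50, max_file_size=100),
]
inserts = [("c", 10), ("d", 10), ("z", 10), ("h", 60)]
result = assign_inserts(inserts, groups, strategies)
print(result)
```

      Ordering the strategies as a list keeps file sizing and affinity
      independent of each other: placing the affinity step first favors
      locality, while placing the small-file step first favors file size.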

          People

            guoyihua Ethan Guo
            satishkotha satish
