Apache Hudi / HUDI-3895

Make sure Hudi relations do proper file-split packing (on par w/ Spark)

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • None
    • 0.11.0
    • None

    Description

      While investigating HUDI-3891, it was discovered that the introduction of Hudi's own Spark Relation implementations inadvertently subverted the file-split packing algorithm:

      Spark's algorithm does greedy packing, which relies on the list of file splits being ordered by file size in descending order.
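
      To illustrate why the ordering matters, here is a minimal Python sketch of a greedy packer. It is loosely modeled on the behavior described above and is not Spark's or Hudi's actual code: the function name `pack_splits`, its parameters, and the sample sizes are all hypothetical, and the per-file open cost defaults to 0 purely to keep the arithmetic easy to follow.

      ```python
      from typing import List


      def pack_splits(sizes: List[int], max_split_bytes: int,
                      open_cost_bytes: int = 0) -> List[List[int]]:
          """Greedily pack split sizes into partitions, in the order given.

          Hypothetical sketch: a partition is closed as soon as the next split
          would push it past max_split_bytes; closed partitions are never
          revisited, so the input order determines the packing quality.
          """
          partitions: List[List[int]] = []
          current: List[int] = []
          current_size = 0

          for size in sizes:
              # Close the running partition if this split does not fit.
              if current and current_size + size > max_split_bytes:
                  partitions.append(current)
                  current = []
                  current_size = 0
              current_size += size + open_cost_bytes
              current.append(size)

          # Flush the last partition, if any.
          if current:
              partitions.append(current)
          return partitions


      # Assumed sample data: small and large splits interleaved.
      unsorted_sizes = [20, 90, 20, 90, 20, 90, 20, 90]
      sorted_sizes = sorted(unsorted_sizes, reverse=True)

      unsorted_partitions = pack_splits(unsorted_sizes, max_split_bytes=100)
      sorted_partitions = pack_splits(sorted_sizes, max_split_bytes=100)

      print(len(unsorted_partitions))  # 8: every split ends up alone
      print(len(sorted_partitions))    # 5: the four small splits share one partition
      ```

      With this sample, feeding the splits in arbitrary order yields eight single-split partitions, while the size-descending order lets the small splits be packed together. A relation that hands over an unsorted list can therefore produce more, and smaller, partitions than intended.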

            People

              Assignee: alexey.kudinkin Alexey Kudinkin
              Reporter: alexey.kudinkin Alexey Kudinkin
              Votes: 0
              Watchers: 1
