Details
Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Description
While investigating HUDI-3891, it was discovered that the introduction of Hudi's own Spark Relation implementations inadvertently subverted the file-split packing algorithm:
Spark's algorithm does greedy packing, which relies on the list of file splits being ordered by file size (in descending order).
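To illustrate why the ordering matters, here is a minimal sketch of a greedy packer, loosely modeled on the behavior described above. It is not Spark's or Hudi's actual code: the function name `pack_splits`, the `open_cost_bytes` parameter, and the split sizes are all hypothetical, chosen only to show the effect.

```python
def pack_splits(split_sizes, max_split_bytes, open_cost_bytes=0):
    """Greedy packing sketch (hypothetical helper, not Spark's API).

    Walks the splits in the order given and closes the current partition
    as soon as the next split would push it over max_split_bytes.
    """
    partitions = []
    current = []
    current_size = 0
    for size in split_sizes:
        # Close the current partition if this split would overflow it.
        if current and current_size + size > max_split_bytes:
            partitions.append(current)
            current = []
            current_size = 0
        current.append(size)
        # Assumed: each split adds a fixed "open cost" on top of its size.
        current_size += size + open_cost_bytes
    if current:
        partitions.append(current)
    return partitions


# Hypothetical split sizes, in arbitrary units.
sizes = [30, 80, 30, 80, 30, 80]

# Unordered input: every large split forces the preceding small one
# into a partition of its own.
unsorted_parts = pack_splits(sizes, max_split_bytes=100)

# Input ordered by size, descending: the small splits end up adjacent
# and share one partition.
sorted_parts = pack_splits(sorted(sizes, reverse=True), max_split_bytes=100)

print(len(unsorted_parts), len(sorted_parts))
```

In this sketch the same six splits produce six partitions when unordered and four when sorted in descending order, which is the kind of degradation to expect if a relation hands the packer an unsorted list.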
Issue Links
- causes: HUDI-3891 Investigate Hudi vs Raw Parquet table discrepancy (Closed)
- links to