Details
Type: Umbrella
Status: Open
Priority: Major
Resolution: Unresolved
3.2.0
None
None
Description
This umbrella ticket aims to track work on repartitioning data before it is written to data source tables. It covers:
- Repartition by the dynamic partition columns before writing to dynamic partition tables.
- Repartition before writing to normal tables to avoid generating too many small files.
- Improve the local shuffle reader.
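To illustrate why the first item matters, here is a minimal sketch in plain Python (not Spark code). It assumes a simplified model in which every writer task emits one file per distinct partition value it holds; the column name `dt`, the helper names, and the row counts are all hypothetical. In Spark terms, the second strategy roughly corresponds to calling `repartition` on the partition column before the write.

```python
import zlib
from collections import defaultdict


def count_output_files(rows, num_tasks, assign_task):
    """Count files under the assumed model: one file per (task, partition value) pair."""
    values_per_task = defaultdict(set)
    for index, row in enumerate(rows):
        task = assign_task(index, row, num_tasks)
        values_per_task[task].add(row["dt"])
    return sum(len(values) for values in values_per_task.values())


def round_robin(index, row, num_tasks):
    # No repartitioning: rows land on tasks regardless of their partition value.
    return index % num_tasks


def by_partition_column(index, row, num_tasks):
    # Repartition by the partition column: a deterministic hash sends all rows
    # with the same `dt` value to the same task.
    return zlib.crc32(row["dt"].encode("utf-8")) % num_tasks


# Hypothetical input: 200 rows spread evenly over 5 partition values.
dates = ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04", "2021-01-05"]
rows = [{"dt": dates[i % len(dates)], "value": i} for i in range(200)]

files_without = count_output_files(rows, 8, round_robin)
files_with = count_output_files(rows, 8, by_partition_column)
print(files_without, files_with)
```

Under this model, each of the 8 tasks sees every partition value when rows are spread round-robin, so the file count grows with tasks times partition values. After hashing on the partition column, each value is written by exactly one task. The trade-off, which the sketch does not model, is that a skewed partition value concentrates its rows in a single task.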
Attachments
1. Repartition by dynamic partition columns before insert table | In Progress | Unassigned
2. Improve CoalesceShufflePartitions to avoid generating small files | In Progress | Unassigned
3. A not very elegant way to control output small file | In Progress | Unassigned
4. Reduce the output partition of output stage to avoid producing small files. | In Progress | Unassigned