Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
-
Reviewed
Description
Currently the criteria of whether a merge job should be fired on dynamic generated partitions are is the average file size of files across all dynamic partitions. It is very common that some dynamic partitions contains mostly large files and some contains mostly small files. Even though the average size of the total files are larger than the hive.merge.smallfiles.avgsize, we should merge those partitions containing small files only.