Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
Currently, when a list bucketing DML statement involves a merge, Hive uses a single MR job for all skewed directories. If the number of files is large, this can trigger an OOM on the Hive client side because of too many splits. Using one MR job per skewed directory would reduce the OOM risk.
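
To illustrate the idea, here is a minimal sketch (not Hive code) that compares the two planning strategies. It assumes, as a simplification, that the number of splits a job produces grows with the number of input files it covers. The paths, the directory names, and the helper functions plan_single_job, plan_job_per_skewed_dir and max_inputs_per_job are all hypothetical and exist only for this example.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def plan_single_job(files):
    """Behaviour described in this issue: one merge job over every skewed directory."""
    return [sorted(files)]


def plan_job_per_skewed_dir(files):
    """Proposed behaviour: one merge job per skewed directory."""
    by_dir = defaultdict(list)
    for f in files:
        # Group each file under its parent (skewed) directory.
        by_dir[str(PurePosixPath(f).parent)].append(f)
    return [sorted(by_dir[d]) for d in sorted(by_dir)]


def max_inputs_per_job(plan):
    """Largest number of input files any single job has to handle."""
    return max((len(job) for job in plan), default=0)


# Hypothetical file layout: three skewed directories under one partition.
files = [
    "/warehouse/t/ds=1/key=484/000000_0",
    "/warehouse/t/ds=1/key=484/000001_0",
    "/warehouse/t/ds=1/key=484/000002_0",
    "/warehouse/t/ds=1/key=51/000000_0",
    "/warehouse/t/ds=1/key=51/000001_0",
    "/warehouse/t/ds=1/other/000000_0",
]

single = plan_single_job(files)
per_dir = plan_job_per_skewed_dir(files)

print(len(single), max_inputs_per_job(single))
print(len(per_dir), max_inputs_per_job(per_dir))
```

With one job per skewed directory, no single job has to enumerate the files of every directory at once, which is the property the proposal relies on to lower the client-side memory peak. The trade-off is a larger number of jobs to launch.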