Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 2.8.0, 3.0.0-alpha1
- Fix Version/s: None
- Component/s: None
- Labels: None
Description
The mapred archive-logs command currently has no way to throttle the number of containers it requests. For example, we recently saw a busy cluster where the tool hadn't been run for a while and there were about 20,000 apps to process. The tool tried to request 20,000 containers, went through heavy garbage collection, and then ran out of memory (OOM) trying to handle them.
This problem can be mitigated by setting -maxEligibleApps to a more reasonable value, but that requires running the tool multiple times to work through the backlog; plus, the default value is -1 (all), so the tool is unthrottled unless the user intervenes.
We should add a way to limit the number of concurrently running containers that the tool manages. Something like -concurrency <n>, which would allow at most n containers to run at a time.
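To illustrate the idea only: the sketch below caps concurrent work with a counting semaphore, which is one common way to get "at most n at a time" behavior. It is not the tool's code and does not use any YARN API. The class name ConcurrencyThrottleSketch, the method runThrottled, and the sleeping task that stands in for a container are all hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: not part of mapred archive-logs.
public class ConcurrencyThrottleSketch {

    /**
     * Runs totalApps simulated tasks, never more than maxConcurrent at once,
     * and returns the highest number of tasks observed running together.
     */
    public static int runThrottled(int totalApps, int maxConcurrent)
            throws InterruptedException {
        Semaphore permits = new Semaphore(maxConcurrent);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            for (int i = 0; i < totalApps; i++) {
                // Block the submitter until one of the n slots is free.
                permits.acquire();
                pool.execute(() -> {
                    try {
                        int now = running.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(1); // stand-in for one app's archiving work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        // Decrement before releasing so the count never exceeds the cap.
                        running.decrementAndGet();
                        permits.release();
                    }
                });
            }
        } finally {
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 200 simulated apps, at most 10 in flight at any moment.
        System.out.println("peak concurrency: " + runThrottled(200, 10));
    }
}
```

With this shape the backlog size no longer determines how much is in flight: 20,000 pending apps would still be processed in one run, but only n would be active at once.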
Issue Links
- is related to MAPREDUCE-6415 Create a tool to combine aggregated logs into HAR files (Resolved)