Hadoop Map/Reduce: MAPREDUCE-3065

ApplicationMaster killed by NodeManager due to excessive virtual memory consumption

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: None
    • Component/s: nodemanager
    • Labels: None

      Description

      > Hey Vinod,
      >
      > OK, so I have a little more clarity into this.
      >
      > When I bump my resource request for my AM to 4096, it runs. The important line in the NM logs is:
      >
      > 2011-09-21 13:43:44,366 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(402)) - Memory usage of ProcessTree 25656 for container-id container_1316637655278_0001_01_000001 : Virtual 2260938752 bytes, limit : 4294967296 bytes; Physical 120860672 bytes, limit -1 bytes
      >
      > The thing to note is the virtual memory, which is off the charts, even though my physical memory is almost nothing (12 megs). I'm still poking around the code, but I am noticing that there are two checks in the NM, one for virtual mem, and one for physical mem. The virtual memory check appears to be toggle-able, but is presumably defaulted to on.
      >
      > At this point I'm trying to figure out exactly what the VMEM check is for, why YARN thinks my app is taking 2 gigs, and how to fix this.
      >
      > Cheers,
      > Chris
      > ________________________________________
      > From: Chris Riccomini [criccomini@linkedin.com]
      > Sent: Wednesday, September 21, 2011 1:42 PM
      > To: mapreduce-dev@hadoop.apache.org
      > Subject: Re: ApplicationMaster Memory Usage
      >
      > For the record, I bumped to 4096 for memory resource request, and it works.
      >
      >
      >
      > On 9/21/11 1:32 PM, "Chris Riccomini" <criccomini@linkedin.com> wrote:
      >
      >> Hey Vinod,
      >>
      >> So, I ran my application master directly from the CLI. I commented out the
      >> YARN-specific code. It runs fine without leaking memory.
      >>
      >> I then ran it from YARN, with all YARN-specific code commented out. It again
      >> ran fine.
      >>
      >> I then uncommented JUST my registerWithResourceManager call. It then fails
      >> with OOM after a few seconds. I call registerWithResourceManager, and then go
      >> into a while(true) { println("yeh"); sleep(1000) }. Doing this prints:
      >>
      >> yeh
      >> yeh
      >> yeh
      >> yeh
      >> yeh
      >>
      >> At which point it dies, and in the NodeManager I see:
      >>
      >> 2011-09-21 13:24:51,036 WARN monitor.ContainersMonitorImpl
      >> (ContainersMonitorImpl.java:isProcessTreeOverLimit(289)) - Process tree for
      >> container: container_1316626117280_0005_01_000001 has processes older than 1
      >> iteration running over the configured limit. Limit=2147483648, current usage =
      >> 2192773120
      >> 2011-09-21 13:24:51,037 WARN monitor.ContainersMonitorImpl
      >> (ContainersMonitorImpl.java:run(453)) - Container
      >> [pid=23852,containerID=container_1316626117280_0005_01_000001] is running
      >> beyond memory-limits. Current usage : 2192773120bytes. Limit :
      >> 2147483648bytes. Killing container.
      >> Dump of the process-tree for container_1316626117280_0005_01_000001 :
      >> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS)
      >> VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
      >> |- 23852 20570 23852 23852 (bash) 0 0 108638208 303 /bin/bash -c java -Xmx512M
      >> -cp './package/*' kafka.yarn.ApplicationMaster
      >> /home/criccomi/git/kafka-yarn/dist/kafka-streamer.tgz 5 1 1316626117280
      >> com.linkedin.TODO 1
      >> 1>/tmp/logs/application_1316626117280_0005/container_1316626117280_0005_01_000
      >> 001/stdout
      >> 2>/tmp/logs/application_1316626117280_0005/container_1316626117280_0005_01_000
      >> 001/stderr
      >> |- 23855 23852 23852 23852 (java) 81 4 2084134912 14772 java -Xmx512M -cp
      >> ./package/* kafka.yarn.ApplicationMaster
      >> /home/criccomi/git/kafka-yarn/dist/kafka-streamer.tgz 5 1 1316626117280
      >> com.linkedin.TODO 1
      >> 2011-09-21 13:24:51,037 INFO monitor.ContainersMonitorImpl
      >> (ContainersMonitorImpl.java:run(463)) - Removed ProcessTree with root 23852
      >>
      >> Either something is leaking in YARN, or my registerWithResourceManager code
      >> (see below) is doing something funky.
      >>
      >> I'm trying to avoid going through all the pain of attaching a remote debugger.
      >> Presumably things aren't leaking in YARN, which means it's likely that I'm
      >> doing something wrong in my registration code.
      >>
      >> Incidentally, my NodeManager is running with 1000 megs. My application master
      >> memory is set to 2048, and my -Xmx setting is 512M
      >>
      >> Cheers,
      >> Chris
      >> ________________________________________
      >> From: Vinod Kumar Vavilapalli [vinodkv@hortonworks.com]
      >> Sent: Wednesday, September 21, 2011 11:52 AM
      >> To: mapreduce-dev@hadoop.apache.org
      >> Subject: Re: ApplicationMaster Memory Usage
      >>
      >> Actually MAPREDUCE-2998 only applies to MRv2, so that isn't related here.
      >>
      >> Somehow, your JVM itself is taking up that much virtual memory. Are you
      >> loading some native libs?
      >>
      >> And how many containers have already been allocated by the time the AM
      >> crashes? Maybe you are accumulating some per-container data. You can try
      >> dumping the heap via hprof.
      >>
      >> +Vinod
      >>
      >>
      >> On Wed, Sep 21, 2011 at 11:21 PM, Chris Riccomini
      >> <criccomini@linkedin.com>wrote:
      >>
      >>> Hey Vinod,
      >>>
      >>> I svn up'd, and rebuilt. My application's task (container) now runs!
      >>>
      >>> Unfortunately, my application master eventually gets killed by the
      >>> NodeManager anyway, and I'm still not clear as to why. The AM is just
      >>> running a loop, asking for a container, and executing a command in the
      >>> container. It keeps doing this over and over again. After a few iterations,
      >>> it gets killed with something like:
      >>>
      >>> 2011-09-21 10:42:40,869 INFO monitor.ContainersMonitorImpl
      >>> (ContainersMonitorImpl.java:run(402)) - Memory usage of ProcessTree 21666
      >>> for container-id container_1316626117280_0002_01_000001 : Virtual 2260938752
      >>> bytes, limit : 2147483648 bytes; Physical 77398016 bytes, limit -1 bytes
      >>> 2011-09-21 10:42:40,869 WARN monitor.ContainersMonitorImpl
      >>> (ContainersMonitorImpl.java:isProcessTreeOverLimit(289)) - Process tree for
      >>> container: container_1316626117280_0002_01_000001 has processes older than 1
      >>> iteration running over the configured limit. Limit=2147483648, current usage
      >>> = 2260938752
      >>> 2011-09-21 10:42:40,870 WARN monitor.ContainersMonitorImpl
      >>> (ContainersMonitorImpl.java:run(453)) - Container
      >>> [pid=21666,containerID=container_1316626117280_0002_01_000001] is running
      >>> beyond memory-limits. Current usage : 2260938752bytes. Limit :
      >>> 2147483648bytes. Killing container.
      >>> Dump of the process-tree for container_1316626117280_0002_01_000001 :
      >>> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
      >>> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
      >>> |- 21669 21666 21666 21666 (java) 105 4 2152300544 18593 java
      >>> -Xmx512M -cp ./package/* kafka.yarn.ApplicationMaster
      >>> /home/criccomi/git/kafka-yarn/dist/kafka-streamer.tgz 2 1 1316626117280
      >>> com.linkedin.TODO 1
      >>> |- 21666 20570 21666 21666 (bash) 0 0 108638208 303 /bin/bash -c
      >>> java -Xmx512M -cp './package/*' kafka.yarn.ApplicationMaster
      >>> /home/criccomi/git/kafka-yarn/dist/kafka-streamer.tgz 2 1 1316626117280
      >>> com.linkedin.TODO 1
      >>> 1>/tmp/logs/application_1316626117280_0002/container_1316626117280_0002_01_00
      >>> 0001/stdout
      >>> 2>/tmp/logs/application_1316626117280_0002/container_1316626117280_0002_01_00
      >>> 0001/stderr
      >>>
      >>> 2011-09-21 10:42:40,870 INFO monitor.ContainersMonitorImpl
      >>> (ContainersMonitorImpl.java:run(463)) - Removed ProcessTree with root 21666
      >>>
      >>> I don't think that my AM is leaking memory. Full code paste after the break
      >>>
      >>> 1. Do I need to release a container in my AM even if the AM receives it as
      >>> a finished container in the resource request response?
      >>> 2. Do I need to free any other resources after a resource request (e.g.
      >>> ResourceRequest, AllocateRequest, etc)?
      >>>
      >>> Cheers,
      >>> Chris
      >>>
      >>>
      >>> def main(args: Array[String]) {
      >>> // YARN will always give our ApplicationMaster
      >>> // the first four parameters as: <package> <app id> <attempt id>
      >>> <timestamp>
      >>> val packagePath = args(0)
      >>> val appId = args(1).toInt
      >>> val attemptId = args(2).toInt
      >>> val timestamp = args(3).toLong
      >>>
      >>> // these are our application master's parameters
      >>> val streamerClass = args(4)
      >>> val tasks = args(5).toInt
      >>>
      >>> // TODO log params here
      >>>
      >>> // start the application master helper
      >>> val conf = new Configuration
      >>> val applicationMasterHelper = new ApplicationMasterHelper(appId,
      >>> attemptId, timestamp, conf)
      >>> .registerWithResourceManager
      >>>
      >>> // start and manage the slaves
      >>> val noReleases = List[ContainerId]()
      >>> var runningContainers = 0
      >>>
      >>> // keep going forever
      >>> while (true) {
      >>> val nonRunningTasks = tasks - runningContainers
      >>> val response =
      >>> applicationMasterHelper.sendResourceRequest(nonRunningTasks, noReleases)
      >>>
      >>> response.getAllocatedContainers.foreach(container => {
      >>>   new ContainerExecutor(packagePath, container)
      >>>     .addCommand("java -Xmx256M -cp './package/*' kafka.yarn.StreamingTask " + streamerClass + " "
      >>>       + "1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout "
      >>>       + "2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr").execute(conf)
      >>> })
      >>>
      >>> runningContainers += response.getAllocatedContainers.length
      >>> runningContainers -= response.getCompletedContainersStatuses.length
      >>>
      >>> Thread.sleep(1000)
      >>> }
      >>>
      >>> applicationMasterHelper.unregisterWithResourceManager("SUCCESS")
      >>> }
      >>>
      >>>
      >>> class ApplicationMasterHelper(iAppId: Int, iAppAttemptId: Int, lTimestamp:
      >>> Long, conf: Configuration) {
      >>> val rpc = YarnRPC.create(conf)
      >>> val appId = Records.newRecord(classOf[ApplicationId])
      >>> val appAttemptId = Records.newRecord(classOf[ApplicationAttemptId])
      >>> val rmAddress =
      >>> NetUtils.createSocketAddr(conf.get(YarnConfiguration.RM_SCHEDULER_ADDRESS,
      >>> YarnConfiguration.DEFAULT_RM_SCHEDULER_ADDRESS))
      >>> val resourceManager = rpc.getProxy(classOf[AMRMProtocol], rmAddress,
      >>> conf).asInstanceOf[AMRMProtocol]
      >>> var requestId = 0
      >>>
      >>> appId.setClusterTimestamp(lTimestamp)
      >>> appId.setId(iAppId)
      >>> appAttemptId.setApplicationId(appId)
      >>> appAttemptId.setAttemptId(iAppAttemptId)
      >>>
      >>> def registerWithResourceManager(): ApplicationMasterHelper = {
      >>>   val req = Records.newRecord(classOf[RegisterApplicationMasterRequest])
      >>>   req.setApplicationAttemptId(appAttemptId)
      >>>   // TODO not sure why these are blank - this is how Spark does it
      >>>   req.setHost("")
      >>>   req.setRpcPort(1)
      >>>   req.setTrackingUrl("")
      >>>   resourceManager.registerApplicationMaster(req)
      >>>   this
      >>> }
      >>>
      >>> def unregisterWithResourceManager(state: String): ApplicationMasterHelper = {
      >>>   val finReq = Records.newRecord(classOf[FinishApplicationMasterRequest])
      >>>   finReq.setAppAttemptId(appAttemptId)
      >>>   finReq.setFinalState(state)
      >>>   resourceManager.finishApplicationMaster(finReq)
      >>>   this
      >>> }
      >>>
      >>> def sendResourceRequest(containers: Int, release: List[ContainerId]): AMResponse = {
      >>>   // TODO will need to make this more flexible for hostname requests, etc
      >>>   val request = Records.newRecord(classOf[ResourceRequest])
      >>>   val pri = Records.newRecord(classOf[Priority])
      >>>   val capability = Records.newRecord(classOf[Resource])
      >>>   val req = Records.newRecord(classOf[AllocateRequest])
      >>>   request.setHostName("*")
      >>>   request.setNumContainers(containers)
      >>>   pri.setPriority(1)
      >>>   request.setPriority(pri)
      >>>   capability.setMemory(128)
      >>>   request.setCapability(capability)
      >>>   req.setResponseId(requestId)
      >>>   req.setApplicationAttemptId(appAttemptId)
      >>>   req.addAllAsks(Lists.newArrayList(request))
      >>>   req.addAllReleases(release)
      >>>   requestId += 1
      >>>   // TODO we might want to return a list of container executors here instead of AMResponses
      >>>   resourceManager.allocate(req).getAMResponse
      >>> }
      >>> }
      >>>
      >>>
      >>> ________________________________________
      >>> From: Vinod Kumar Vavilapalli [vinodkv@hortonworks.com]
      >>> Sent: Wednesday, September 21, 2011 10:08 AM
      >>> To: mapreduce-dev@hadoop.apache.org
      >>> Subject: Re: ApplicationMaster Memory Usage
      >>>
      >>> Yes, the process-dump clearly tells that this is MAPREDUCE-2998.
      >>>
      >>> +Vinod
      >>> (With a smirk to see his container-memory-monitoring code in action)
      >>>
      >>>
      >>> On Wed, Sep 21, 2011 at 10:26 PM, Arun C Murthy <acm@hortonworks.com>
      >>> wrote:
      >>>
      >>>> I'll bet you are hitting MR-2998.
      >>>>
      >>>> From the changelog:
      >>>>
      >>>> MAPREDUCE-2998. Fixed a bug in TaskAttemptImpl which caused it to fork
      >>>> bin/mapred too many times. Contributed by Vinod K V.
      >>>>
      >>>> Arun
      >>>>
      >>>> On Sep 21, 2011, at 9:52 AM, Chris Riccomini wrote:
      >>>>
      >>>>> Hey Guys,
      >>>>>
      >>>>> My ApplicationMaster is being killed by the NodeManager because of
      >>> memory
      >>>> consumption, and I don't understand why. I'm using -Xmx512M, and setting
      >>> my
      >>>> resource request to 2048.
      >>>>>
      >>>>>
      >>>>> .addCommand("java -Xmx512M -cp './package/*'
      >>>> kafka.yarn.ApplicationMaster " ...
      >>>>>
      >>>>> ...
      >>>>>
      >>>>> private var memory = 2048
      >>>>>
      >>>>> resource.setMemory(memory)
      >>>>> containerCtx.setResource(resource)
      >>>>> containerCtx.setCommands(cmds.toList)
      >>>>> containerCtx.setLocalResources(Collections.singletonMap("package",
      >>>> packageResource))
      >>>>> appCtx.setApplicationId(appId)
      >>>>> appCtx.setUser(user.getShortUserName)
      >>>>> appCtx.setAMContainerSpec(containerCtx)
      >>>>> request.setApplicationSubmissionContext(appCtx)
      >>>>> applicationsManager.submitApplication(request)
      >>>>>
      >>>>> When this runs, I see (in my NodeManager's logs):
      >>>>>
      >>>>>
      >>>>> 2011-09-21 09:35:19,112 INFO monitor.ContainersMonitorImpl
      >>>> (ContainersMonitorImpl.java:run(402)) - Memory usage of ProcessTree 28134
      >>>> for container-id container_1316559026783_0003_01_000001 : Virtual
      >>> 2260938752
      >>>> bytes, limit : 2147483648 bytes; Physical 71540736 bytes, limit -1 bytes
      >>>>> 2011-09-21 09:35:19,112 WARN monitor.ContainersMonitorImpl
      >>>> (ContainersMonitorImpl.java:isProcessTreeOverLimit(289)) - Process tree
      >>> for
      >>>> container: container_1316559026783_0003_01_000001 has processes older
      >>> than 1
      >>>> iteration running over the configured limit. Limit=2147483648, current
      >>> usage
      >>>> = 2260938752
      >>>>> 2011-09-21 09:35:19,113 WARN monitor.ContainersMonitorImpl
      >>>> (ContainersMonitorImpl.java:run(453)) - Container
      >>>> [pid=28134,containerID=container_1316559026783_0003_01_000001] is running
      >>>> beyond memory-limits. Current usage : 2260938752bytes. Limit :
      >>>> 2147483648bytes. Killing container.
      >>>>> Dump of the process-tree for container_1316559026783_0003_01_000001 :
      >>>>> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
      >>>> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
      >>>>> |- 28134 25886 28134 28134 (bash) 0 0 108638208 303 /bin/bash -c
      >>>> java -Xmx512M -cp './package/*' kafka.yarn.ApplicationMaster 3 1
      >>>> 1316559026783 com.linkedin.TODO 1
      >>>>
      >>> 1>/tmp/logs/application_1316559026783_0003/container_1316559026783_0003_01_00
      >>> 0001/stdout
      >>>>
      >>> 2>/tmp/logs/application_1316559026783_0003/container_1316559026783_0003_01_00
      >>> 0001/stderr
      >>>>> |- 28137 28134 28134 28134 (java) 92 3 2152300544 17163 java
      >>>> -Xmx512M -cp ./package/* kafka.yarn.ApplicationMaster 3 1 1316559026783
      >>>> com.linkedin.TODO 1
      >>>>>
      >>>>> 2011-09-21 09:35:19,113 INFO monitor.ContainersMonitorImpl
      >>>> (ContainersMonitorImpl.java:run(463)) - Removed ProcessTree with root
      >>> 28134
      >>>>>
      >>>>> It appears that YARN is honoring my 2048 command, yet my process is
      >>>> somehow taking 2260938752 bytes. I don't think that I'm using nearly that
      >>>> much in permgen, and my heap is limited to 512. I don't have any JNI
      >>> stuff
      >>>> running (that I know of), so it's unclear to me what's going on here. The
      >>>> only thing that I can think of is that Java's Runtime exec is forking,
      >>> and
      >>>> copying its entire JVM memory footprint for the fork.
      >>>>>
      >>>>> Has anyone seen this? Am I doing something dumb?
      >>>>>
      >>>>> Thanks!
      >>>>> Chris
      >>>>
      >>>>
      >>>
      >
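      For context, the two NodeManager checks discussed in the thread above are configuration-driven. A minimal sketch of the relevant knobs, assuming the Hadoop 2.x property names (the 0.23-era snapshot used in this thread predates some of them; its logs show an effective vmem limit equal to the requested container memory, and no physical limit at all):

      import org.apache.hadoop.yarn.conf.YarnConfiguration

      object NmMemoryCheckSketch extends App {
        // Assumed property names (Hadoop 2.x); adjust for the actual build in use.
        val conf = new YarnConfiguration()
        conf.setBoolean("yarn.nodemanager.pmem-check-enabled", true) // physical-memory check ("limit -1 bytes" above means unlimited)
        conf.setBoolean("yarn.nodemanager.vmem-check-enabled", true) // virtual-memory check, the one killing the AM here
        conf.setFloat("yarn.nodemanager.vmem-pmem-ratio", 2.1f)      // later 2.x default: vmem limit = container memory * ratio
        println(conf.get("yarn.nodemanager.vmem-pmem-ratio"))
      }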

        Activity

        Chris Riccomini added a comment -

        I've been doing more digging. With some understanding of what the container monitor is doing now, I went back and checked the process tree dump when my application master was killed. What I see is that there actually appear to be two processes running:

        • PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        • 21669 21666 21666 21666 (java) 105 4 2152300544 18593 java -Xmx512M -cp ./package/* kafka.yarn.ApplicationMaster /home/criccomi/git/kafka-yarn/dist/kafka-streamer.tgz 2 1 1316626117280 com.linkedin.TODO 1
        • 21666 20570 21666 21666 (bash) 0 0 108638208 303 /bin/bash -c java -Xmx512M -cp './package/*' kafka.yarn.ApplicationMaster /home/criccomi/git/kafka-yarn/dist/kafka-streamer.tgz 2 1 1316626117280 com.linkedin.TODO 1 1>/tmp/logs/application_1316626117280_0002/container_1316626117280_0002_01_000001/stdout 2>/tmp/logs/application_1316626117280_0002/container_1316626117280_0002_01_000001/stderr

        So it appears that the bash command (which I assume is coming from task.sh) is taking almost no memory, but the java command that it triggers is taking 2 gigs in virtual memory. No idea why. When I run the bash -c .. command outside of YARN, it's taking about 10 megs of physical memory, and almost no virtual memory.

        Chris Riccomini added a comment -

        cat'd out task.sh for my AM:

        #!/bin/bash

        export YARN_HOME="/tmp/hadoop-mapreduce/hadoop-mapreduce-0.24.0-SNAPSHOT-LATEST/bin/.."
        export YARN_LOCAL_DIRS="/tmp/nm-local-dir/usercache/criccomi/appcache/application_1316638915485_0002"
        export JVM_PID="$$"
        export HADOOP_TOKEN_FILE_LOCATION="/tmp/nm-local-dir/usercache/criccomi/appcache/application_1316638915485_0002/container_1316638915485_0002_01_000001/container_tokens"
        ln -sf /tmp/nm-local-dir/usercache/criccomi/appcache/application_1316638915485_0002/filecache/-3545403967124200139/kafka-streamer.tgz package
        exec setsid /bin/bash -c "cat ./task.sh 1>/tmp/logs/application_1316638915485_0002/container_1316638915485_0002_01_000001/stdout 2>/tmp/logs/application_1316638915485_0002/container_1316638915485_0002_01_000001/stderr "

        Obviously, my java ... command goes where the `cat ./task.sh` is.

        Arun C Murthy added a comment -

        One reason for VMEM spikes is forks; are you seeing any?

        Chris Riccomini added a comment -

        Regarding what I said in the email above, when I comment out the registerWithResourceManager, I don't get killed. I re-checked the process usage when I do this:

        2011-09-21 14:29:24,914 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(402)) - Memory usage of ProcessTree 28080 for container-id container_1316640481414_0001_01_000001 : Virtual 2123530240 bytes, limit : 2147483648 bytes; Physical 61272064 bytes, limit -1 bytes

        It appears that without the registerWithResourceManager call, things are staying just BARELY under the kill limit, so it's not the register call that's leaking memory. Wondering if it's something to do with Java's fork behavior?

        Chris Riccomini added a comment -

        Hey Arun,

        Am I seeing any spikes, or forks? Doesn't it always fork when it execs task.sh? As far as spikes go, it would appear that it's what I'm seeing. The odd thing is that it doesn't seem to be going down. I tried suggesting a System.gc(), but it didn't help. The process is sitting just under the 2 gig kill limit with vmem usage.

        Cheers,
        Chris

        Chris Riccomini added a comment -

        Hmm, I dropped my NM's -Xmx setting to 512m from 1024m. The VMEM usage STILL sits at just under the 2 gig kill limit.

        This seems pretty interesting: http://stackoverflow.com/questions/561245/virtual-memory-usage-from-java-under-linux-too-much-memory-used

        Reading.

        Chris Riccomini added a comment -

        OK, I took a pmap of my application master, and, unsurprisingly, it's taking nearly 2 gigs. When I looked at the individual entries, I saw a lot of normal stuff, but then a few really massive anons:

        00007fedc8021000 65404K ----- [ anon ]
        00007fedcc000000 132K rwx-- [ anon ]
        00007fedcc021000 65404K ----- [ anon ]
        00007fedd0000000 3544K rwx-- [ anon ]
        00007fedd0376000 61992K ----- [ anon ]
        00007fedd4000000 3736K rwx-- [ anon ]
        00007fedd43a6000 61800K ----- [ anon ]
        00007fedd8000000 132K rwx-- [ anon ]
        00007fedd8021000 65404K ----- [ anon ]
        00007fedde16f000 96836K r-x-- /usr/lib/locale/locale-archive
        00007fede4000000 132K rwx-- [ anon ]
        00007fede4021000 65404K ----- [ anon ]
        00007fede8000000 132K rwx-- [ anon ]
        00007fede8021000 65404K ----- [ anon ]
        00007fedec000000 132K rwx-- [ anon ]
        00007fedec021000 65404K ----- [ anon ]
        00007fedf0000000 132K rwx-- [ anon ]
        00007fedf0021000 65404K ----- [ anon ]
        00007fedf4000000 132K rwx-- [ anon ]
        00007fedf4021000 65404K ----- [ anon ]
        00007fedf8000000 132K rwx-- [ anon ]
        00007fedf8021000 65404K ----- [ anon ]
        00007fedfc000000 132K rwx-- [ anon ]
        00007fedfc021000 65404K ----- [ anon ]
        00007fee00000000 132K rwx-- [ anon ]
        00007fee00021000 65404K ----- [ anon ]
        00007fee04000000 132K rwx-- [ anon ]
        00007fee04021000 65404K ----- [ anon ]
        00007fee08000000 132K rwx-- [ anon ]
        00007fee08021000 65404K ----- [ anon ]
        00007fee0c000000 132K rwx-- [ anon ]
        00007fee0c021000 65404K ----- [ anon ]

        ...

        00007fee12f2e000 34376K rwx-- [ anon ]
        00007fee150c0000 64768K rwx-- [ anon ]
        00007fee19000000 526784K rwx-- [ anon ]
        00007fee39270000 51904K rwx-- [ anon ]
        00007fee3c520000 60288K ----- [ anon ]

        Collectively, those guys are taking up well over a gig. Continuing to experiment.

        Any ideas?
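
        For a rough sense of scale, a back-of-the-envelope tally (in Scala, using only the numbers quoted above; not a general pmap parser) of the large mappings:

        object PmapTally extends App {
          // Values in KiB, copied from the pmap excerpt above.
          val arenaReserves = 14 * 65404L                    // the repeated ~64 MB "-----" anon reserves
          val otherAnon     = List(61992L, 61800L, 60288L, 64768L, 51904L, 34376L, 526784L).sum
          val localeArchive = 96836L                         // /usr/lib/locale/locale-archive
          val totalKiB      = arenaReserves + otherAnon + localeArchive
          println("%.2f GiB".format(totalKiB / 1024.0 / 1024.0)) // ~1.79 GiB of the ~2 GiB vmem limit
        }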

        Arun C Murthy added a comment -

        Chris, as Milind mentioned on the list, can you try tuning your MALLOC_ARENA_MAX à la HADOOP-7154?

        Do an 'svn up' to get MAPREDUCE-2880 and set an env for your AM:

        MALLOC_ARENA_MAX=${MALLOC_ARENA_MAX:-4}
        Chris Riccomini added a comment -

        It worked! THANKS!

        That sucked.

        For the record, I just set export MALLOC_ARENA_MAX=${MALLOC_ARENA_MAX:-4}, and made sure echo $MALLOC_ARENA_MAX printed 4 in my RM shell, NM shell, and AM.

        pmap is now showing 1.3megs instead of 2 gigs. WOOT.

        Thanks again!

        Arun C Murthy added a comment -

        Cool.

        Chris, do you want to provide a patch to add MALLOC_ARENA_MAX automatically to containers? We can do this in the NM.
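
        For illustration, a client-side version of the same idea, as a minimal sketch assuming the ContainerLaunchContext#setEnvironment API (the automatic NodeManager-side change is tracked in the follow-up ticket referenced in a later comment):

        import java.util.Collections
        import org.apache.hadoop.yarn.api.records.ContainerLaunchContext
        import org.apache.hadoop.yarn.util.Records

        object MallocArenaEnvSketch extends App {
          // Pass the arena cap into the container's environment when building its launch
          // context, so the launched JVM inherits it without touching shell profiles.
          val containerCtx = Records.newRecord(classOf[ContainerLaunchContext])
          containerCtx.setEnvironment(Collections.singletonMap("MALLOC_ARENA_MAX", "4"))
          // ... set commands, local resources, memory, etc. as in the submission code quoted earlier ...
          println(containerCtx.getEnvironment)
        }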

        Chris Riccomini added a comment -

        Sure thing. Will get on it tomorrow AM. Headed out the door at the moment.

        Chris Riccomini added a comment -

        Per vinod, new ticket to track this: https://issues.apache.org/jira/browse/MAPREDUCE-3068

        Milind Bhandarkar added a comment -

        @Chris, glad to know that it worked! Ironically, I had discovered this issue with RHEL6 when working on another project earlier this year at LinkedIn

        Chris Riccomini added a comment -

        @Milind, lol yea. It's these linux boxes that they threw us on.


          People

          • Assignee: Unassigned
          • Reporter: Chris Riccomini
          • Votes: 0
          • Watchers: 1
