  1. Hadoop Common
  2. HADOOP-7989

[ec2] hadoop Could not create the Java virtual machine

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 0.20.205.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:

      Amazon EC2 m1.large instance (AMI-ID ami-fd589594)

    • Tags:
      hadoop

      Description

      On Amazon EC2, if I set mapred.child.java.opts to "-Xmx512m", job execution fails with "Could not create the Java virtual machine".
      If I leave mapred.child.java.opts unset, everything runs fine. I've tried the same thing on c1.xlarge instances, with the same result.
      I'll add full logs. I need to set mapred.child.java.opts because my reduce tasks are memory hungry.

      I've been trying to set up a MapReduce cluster on EC2 via Whirr for the last two days, and this error kept recurring. At first I thought it might be
      Whirr-specific, but a manually configured EC2 cluster produced the same error.

      <snippet>
      ubuntu@ip-10-84-237-173:~/workspace$ hadoop jar $HADOOP_HOME/examples.jar wordcount in out
      12/01/21 18:08:28 INFO input.FileInputFormat: Total input paths to process : 16
      12/01/21 18:08:28 INFO mapred.JobClient: Running job: job_201201211751_0007
      12/01/21 18:08:29 INFO mapred.JobClient: map 0% reduce 0%
      12/01/21 18:08:35 INFO mapred.JobClient: Task Id : attempt_201201211751_0007_m_000017_0, Status : FAILED
      java.lang.Throwable: Child Error
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
      Caused by: java.io.IOException: Task process exit with nonzero status of 1.
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)

      12/01/21 18:08:35 WARN mapred.JobClient: Error reading task outputhttp://ip-10-12-15-189.ec2.internal:50060/tasklog?plaintext=true&attemptid=attempt_201201211751_0007_m_000017_0&filter=stdout
      12/01/21 18:08:35 WARN mapred.JobClient: Error reading task outputhttp://ip-10-12-15-189.ec2.internal:50060/tasklog?plaintext=true&attemptid=attempt_201201211751_0007_m_000017_0&filter=stderr
      12/01/21 18:08:41 INFO mapred.JobClient: Task Id : attempt_201201211751_0007_r_000002_0, Status : FAILED
      java.lang.Throwable: Child Error
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
      Caused by: java.io.IOException: Task process exit with nonzero status of 1.
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)

      12/01/21 18:08:41 WARN mapred.JobClient: Error reading task outputhttp://ip-10-12-15-189.ec2.internal:50060/tasklog?plaintext=true&attemptid=attempt_201201211751_0007_r_000002_0&filter=stdout
      12/01/21 18:08:41 WARN mapred.JobClient: Error reading task outputhttp://ip-10-12-15-189.ec2.internal:50060/tasklog?plaintext=true&attemptid=attempt_201201211751_0007_r_000002_0&filter=stderr
      12/01/21 18:08:47 INFO mapred.JobClient: Task Id : attempt_201201211751_0007_m_000017_1, Status : FAILED
      java.lang.Throwable: Child Error
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
      Caused by: java.io.IOException: Task process exit with nonzero status of 1.
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)

      12/01/21 18:08:47 WARN mapred.JobClient: Error reading task outputhttp://ip-10-32-50-20.ec2.internal:50060/tasklog?plaintext=true&attemptid=attempt_201201211751_0007_m_000017_1&filter=stdout
      12/01/21 18:08:47 WARN mapred.JobClient: Error reading task outputhttp://ip-10-32-50-20.ec2.internal:50060/tasklog?plaintext=true&attemptid=attempt_201201211751_0007_m_000017_1&filter=stderr
      12/01/21 18:08:53 INFO mapred.JobClient: Task Id : attempt_201201211751_0007_r_000002_1, Status : FAILED
      java.lang.Throwable: Child Error
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
      Caused by: java.io.IOException: Task process exit with nonzero status of 1.
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)

      12/01/21 18:08:53 WARN mapred.JobClient: Error reading task outputhttp://ip-10-32-50-20.ec2.internal:50060/tasklog?plaintext=true&attemptid=attempt_201201211751_0007_r_000002_1&filter=stdout
      12/01/21 18:08:53 WARN mapred.JobClient: Error reading task outputhttp://ip-10-32-50-20.ec2.internal:50060/tasklog?plaintext=true&attemptid=attempt_201201211751_0007_r_000002_1&filter=stderr
      12/01/21 18:09:01 INFO mapred.JobClient: Task Id : attempt_201201211751_0007_m_000017_2, Status : FAILED
      java.lang.Throwable: Child Error
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
      </snippet>

      <snippet>
      ubuntu@ip-10-12-15-189:~/workspace$ cat ~/hadoop/logs/userlogs/job_201201211751_0007/attempt_201201211751_0007_m_000016_1/std*
      Could not create the Java virtual machine.
      Error occurred during initialization of VM
      Could not reserve enough space for object heap
      ubuntu@ip-10-12-15-189:~/workspace$
      </snippet>

        Activity

        flukebox Jai Kumar Singh added a comment -

        Same issue on m2.4xlarge, so it is certainly not related to the instance type.
        Any clues or pointers would be appreciated.
        Thanks

        kaykay.unique Karthik K added a comment -

        What is the (max) number of M-R child processes running on the machine?

        What does 'jps' show in terms of other Java processes running on the machine?

        flukebox Jai Kumar Singh added a comment -

        Okay, it turns out that ulimit was the problem.
        Setting mapred.child.ulimit=unlimited (in mapred-site.xml) solves the problem for me.
        In whirr.properties, you can set it as follows:
        hadoop-mapreduce.mapred.child.ulimit=unlimited
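
        A rough sketch of why the limit matters, assuming mapred.child.ulimit is a virtual memory cap expressed in kilobytes that has to be at least as large as the -Xmx heap passed to the child JVM. The helper names xmx_bytes and ulimit_allows_heap are hypothetical and not part of Hadoop or Whirr. The check is necessary rather than sufficient, since a JVM reserves more virtual memory than its heap alone.

```python
import re

# Multipliers for the size suffixes accepted by -Xmx (no suffix means bytes).
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def xmx_bytes(java_opts):
    """Return the heap size in bytes from the last -Xmx flag, or None if absent."""
    matches = re.findall(r"-Xmx(\d+)([kKmMgG]?)(?=\s|$)", java_opts)
    if not matches:
        return None
    number, unit = matches[-1]
    return int(number) * _UNITS[unit.lower()]


def ulimit_allows_heap(java_opts, ulimit_kb):
    """Check that a ulimit given in KB is at least the requested heap.

    ulimit_kb=None stands for "unlimited" (assumption: no cap is applied).
    A True result does not guarantee the JVM starts; it only rules out
    the case where the cap is smaller than the heap itself.
    """
    heap = xmx_bytes(java_opts)
    if heap is None or ulimit_kb is None:
        return True
    return ulimit_kb * 1024 >= heap


if __name__ == "__main__":
    # Hypothetical cap of 100000 KB against the -Xmx512m from this report.
    print(ulimit_allows_heap("-Xmx512m", 100000))
    print(ulimit_allows_heap("-Xmx512m", None))
```

        With a cap below 524288 KB the check fails for -Xmx512m, which would be consistent with the "Could not reserve enough space for object heap" message in the task logs.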

        flukebox Jai Kumar Singh added a comment -

        It was a configuration problem: Whirr did not set ulimit to unlimited, which caused the JVM to fail to start.


          People

          • Assignee:
            Unassigned
            Reporter:
            flukebox Jai Kumar Singh