Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.8.0
Fix Version/s: None
None
Environment: My local machine runs 64-bit Ubuntu 12.04.
I am trying to create a Hadoop cluster on AWS EC2.
Description
I tried to create a Hadoop cluster on EC2 using Whirr, with an EBS-backed image. In the NameNode web UI, the 'Configured Capacity' is much smaller than expected. It seems that /data0 is not linked to /mnt correctly, so Hadoop does not use the large storage device of the EC2 instance.
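A minimal sketch for checking this on a node, assuming Python is available there. The paths /data0 and /mnt come from this report; the helper name describe_data_dir is hypothetical and not part of Whirr or Hadoop. It reports whether a path is a symlink, where it resolves to, and how large the filesystem behind it is.

```python
import os
import shutil


def describe_data_dir(path="/data0"):
    """Report whether `path` is a symlink, where it resolves to,
    and the total size of the filesystem that backs it."""
    info = {
        "path": path,
        "exists": os.path.exists(path),      # False for a broken symlink
        "is_symlink": os.path.islink(path),
        "resolved": os.path.realpath(path),  # final target after following links
    }
    if info["exists"]:
        usage = shutil.disk_usage(path)      # follows symlinks
        info["total_gib"] = round(usage.total / 2**30, 1)
    return info


if __name__ == "__main__":
    # /data0 and /mnt are the paths named in this report.
    for p in ("/data0", "/mnt"):
        print(describe_data_dir(p))
```

If /data0 is not a symlink, or its total size matches the root volume rather than /mnt, that would be consistent with the suspicion above.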
How to reproduce this problem:
I found the images here: http://cloud.ubuntu.com/ami/
I used the image ami-ce3283cf (Ubuntu Lucid 64-bit, EBS-backed, in ap-northeast-1).
My hadoop.properties file looks like this:
whirr.provider=aws-ec2
whirr.location-id=ap-northeast-1a
whirr.cluster-name=hadoopcluster
whirr.cluster-user=${sys:user.name}
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,12 hadoop-datanode+hadoop-tasktracker
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
whirr.public-key-file=${whirr.private-key-file}.pub
whirr.hardware-id=c1.medium
whirr.image-id=ap-northeast-1/ami-ce3283cf
Then I ran the command:
bin/whirr launch-cluster --config hadoop.properties
Sometimes the error 'java: command not found' appears during the launch. In that case, if you log in to the master instance and run 'hadoop', that command is not found either, and the NameNode web UI is not available.
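A small sketch for confirming the failed case from the master instance, assuming Python is available there. The helper name missing_commands is hypothetical. It only checks the PATH of the current user, so a Hadoop install that exists but is not on the PATH would also be listed as missing.

```python
import shutil


def missing_commands(names=("java", "hadoop")):
    """Return the command names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # An empty list means both commands were found.
    print(missing_commands())
```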
At other times the launch succeeds. I can log in to the master instance, run hadoop commands, and access the NameNode web UI, but the 'Configured Capacity' is only about 80GB.
If I switch to an instance-store image, ami-be3283bf (Ubuntu Lucid 64-bit, instance-store, in ap-northeast-1), and leave all other settings unchanged, everything works and the 'Configured Capacity' becomes 3.91TB.
So I think the problem is related to using an EBS-backed image.
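As a rough cross-check of the two capacity figures, the sketch below divides each by the 12 datanodes from whirr.instance-templates. It assumes that all 12 datanodes were reporting and that the web UI uses 1024-based units; neither is confirmed in this report.

```python
DATANODES = 12  # from whirr.instance-templates in this report


def per_node_gb(total, unit):
    """Convert a 'Configured Capacity' figure to GB per datanode.
    Assumes 1 TB = 1024 GB, as displayed by the web UI (an assumption)."""
    factor = {"GB": 1, "TB": 1024}[unit]
    return total * factor / DATANODES


ebs_image = per_node_gb(80, "GB")         # EBS-backed image
instance_store = per_node_gb(3.91, "TB")  # instance-store image

print(f"EBS-backed image:     {ebs_image:.1f} GB per datanode")
print(f"Instance-store image: {instance_store:.1f} GB per datanode")
```

Under these assumptions the EBS-backed run gives about 6.7 GB per datanode and the instance-store run about 333.7 GB. A few GB per node is the scale of a root volume, while hundreds of GB per node is the scale of the instance's large storage device, which fits the /data0 explanation.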