Details
Improvement
Status: Closed
Major
Resolution: Fixed
None
None
Reviewed
Description
Currently, hadoop-site.xml for EC2 instances is stored as part of the image, and only a few properties (compression, number of map/reduce tasks) can be controlled from the user scripts. Furthermore, the current image does not allow the configuration to be rsynced around the EC2 cluster, so the only way to customize hadoop-site.xml is to rebuild the image, which is time-consuming.
It would be much better to pass an initialization script to the nodes at boot time, so that the configuration can easily be edited before starting a cluster.
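
As a rough illustration of what such a boot-time step could do, the sketch below generates the contents of hadoop-site.xml from a set of properties instead of relying on a copy baked into the image. It is only a sketch under assumptions: the helper name `render_hadoop_site` and the example values are hypothetical, and how the properties would actually reach the node at boot is not specified in this issue.

```python
import xml.etree.ElementTree as ET


def render_hadoop_site(properties):
    """Return hadoop-site.xml content for a dict of property name -> value.

    Hypothetical helper for illustration; not part of the existing EC2 scripts.
    """
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode") + "\n"


# Assumed example settings that an init script might be given at boot time.
example_properties = {
    "mapred.map.tasks": 10,
    "mapred.reduce.tasks": 4,
    "mapred.output.compress": "true",
}

site_xml = render_hadoop_site(example_properties)
print(site_xml)
```

With something along these lines, changing a property would mean editing the values passed to the node rather than rebuilding the image.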