Here's an initial attempt at this (for the Java implementation). Configuration is generated by a HadoopConfigurationBuilder, and is pushed to a file on cluster nodes using jclouds' Statements.createFile call.
HadoopConfigurationBuilder takes care of dynamic properties like fs.default.name and mapred.job.tracker, which depend on the cluster object. It may be extended in future to set mapred.reduce.tasks according to the number of slots in the cluster, or mapred.tasktracker.{map,reduce}.tasks.maximum according to the number of CPUs on each instance.
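To make the idea of dynamic properties concrete, here is a minimal sketch of deriving fs.default.name and mapred.job.tracker from host names and rendering them in Hadoop's site-file XML format. This is not the patch's HadoopConfigurationBuilder: the class name DynamicHadoopProperties, its methods, the example host names and the ports 8020 and 8021 are all assumptions chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the HadoopConfigurationBuilder from the patch.
public class DynamicHadoopProperties {

    // Derive the cluster-dependent properties from the master host names.
    // Ports 8020 and 8021 are assumed values for this sketch.
    static Map<String, String> dynamicProperties(String namenodeHost, String jobtrackerHost) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.default.name", "hdfs://" + namenodeHost + ":8020/");
        props.put("mapred.job.tracker", jobtrackerHost + ":8021");
        return props;
    }

    // Minimal XML escaping for names and values.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Render properties as a Hadoop site file (<configuration> of <property> elements).
    static String toSiteXml(Map<String, String> props) {
        StringBuilder sb = new StringBuilder("<configuration>\n");
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append("  <property>\n")
              .append("    <name>").append(escape(e.getKey())).append("</name>\n")
              .append("    <value>").append(escape(e.getValue())).append("</value>\n")
              .append("  </property>\n");
        }
        return sb.append("</configuration>\n").toString();
    }

    public static void main(String[] args) {
        // Example host names are placeholders.
        System.out.print(toSiteXml(dynamicProperties("nn.example.com", "jt.example.com")));
    }
}
```

The resulting string is the kind of content that would then be written to a file on each node.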
Properties may be overridden by specifying them in the Whirr configuration. For example, to set Hadoop's dfs.replication property to 2, you would add
hadoop-hdfs.dfs.replication=2
to your Whirr properties file. The hadoop-hdfs prefix signifies that the property should go in hdfs-site.xml. (This patch also incorporates
As a simplification, this patch also removes the webserver running on the namenode, since the URLs for the namenode and jobtracker are now logged explicitly:
Namenode web UI available at http://ec2-184-73-89-144.compute-1.amazonaws.com:50070
Jobtracker web UI available at http://ec2-184-73-89-144.compute-1.amazonaws.com:50030
so you can go directly to the web UIs.
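For reference, the prefix-based override described above (a hadoop-hdfs prefix sending a property to hdfs-site.xml) could be sketched roughly as follows. The class OverrideRouter and its route method are hypothetical names, not code from the patch. Only the hadoop-hdfs mapping comes from the description; the hadoop-common and hadoop-mapreduce prefixes and their target files are assumptions added to show how the scheme might generalize.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not code from the patch.
public class OverrideRouter {

    static final Map<String, String> PREFIX_TO_FILE = new LinkedHashMap<>();
    static {
        PREFIX_TO_FILE.put("hadoop-hdfs.", "hdfs-site.xml");        // stated in the description
        PREFIX_TO_FILE.put("hadoop-common.", "core-site.xml");      // assumed
        PREFIX_TO_FILE.put("hadoop-mapreduce.", "mapred-site.xml"); // assumed
    }

    // Group prefixed Whirr properties by target site file, stripping the prefix.
    // Properties without a known prefix are ignored.
    static Map<String, Map<String, String>> route(Map<String, String> whirrProps) {
        Map<String, Map<String, String>> byFile = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : whirrProps.entrySet()) {
            for (Map.Entry<String, String> p : PREFIX_TO_FILE.entrySet()) {
                if (e.getKey().startsWith(p.getKey())) {
                    String hadoopName = e.getKey().substring(p.getKey().length());
                    byFile.computeIfAbsent(p.getValue(), f -> new LinkedHashMap<>())
                          .put(hadoopName, e.getValue());
                    break;
                }
            }
        }
        return byFile;
    }

    public static void main(String[] args) {
        Map<String, String> whirr = new LinkedHashMap<>();
        whirr.put("hadoop-hdfs.dfs.replication", "2");
        System.out.println(route(whirr));
    }
}
```

With this sketch, hadoop-hdfs.dfs.replication=2 ends up as dfs.replication=2 under hdfs-site.xml, and unprefixed Whirr properties are left out of the Hadoop site files.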