Currently, during cluster deployment, puppet categorizes nodes as head_node, worker_nodes, gateway_nodes, standy_node based on user specified info. This functionality gives user control over picking up a particular node as head_node, standy_node, gateway_node and rest others as worker_nodes. But, I woulld like to have more fine-grained control on which deamons should run on which node. For example, I do not want to run namenode, datanode on the same node. This functionality can be introduced with the concept of roles. Each node can be assigned a set of role. For example, Node A can be assigned ["namenode", "resourcemanager"] roles. Node B can be assigned ["datanode", "nodemanager"] and Node C can be assigned ["nodemanager", "hadoop-client"]. Now, each node will only run the specified daemons. Prerequisite for this kind of deployment is that each node should be given the necessary configurations that it needs to know. For example, each datanode should know which is the namenode etc... This functionality will allow users to customize the cluster deployment according to their needs.