Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Won't Fix
- Affects Version/s: 0.22.0
- Fix Version/s: None
- Component/s: None
Description
I'm playing with the conf servlet, trying to build a workflow that:
- pulls down the conf servlet output from a well-known URL (this is trickier when your VMs are dynamic, but possible)
- saves it locally, using the <get> task
- <get>s some info on the machines in the allocated cluster, like their external hostnames
- SCPs in the configuration files and JAR files needed to submit work
- submits work via SSH
I have to SSH in because the VMs have different internal and external addresses; HDFS gets upset otherwise.
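The workflow above can be sketched as an Ant build file; the hostnames, paths, and target names here are hypothetical, and the <scp>/<sshexec> optional tasks need the JSch library on Ant's classpath:

```xml
<project name="cluster-submit" default="submit">
  <!-- hypothetical well-known URL for the conf servlet -->
  <property name="conf.url" value="http://namenode.example.com:50070/conf"/>

  <target name="fetch-conf">
    <!-- pull the cluster configuration down and save it locally -->
    <get src="${conf.url}" dest="build/cluster.xml" usetimestamp="true"/>
  </target>

  <target name="push" depends="fetch-conf">
    <!-- SCP the configuration and job JARs to a gateway host -->
    <scp todir="user@gateway.example.com:work"
         keyfile="${user.home}/.ssh/id_rsa">
      <fileset dir="build" includes="cluster.xml,job.jar"/>
    </scp>
  </target>

  <target name="submit" depends="push">
    <!-- submit via SSH, since internal and external addresses differ -->
    <sshexec host="gateway.example.com" username="user"
             keyfile="${user.home}/.ssh/id_rsa"
             command="hadoop jar work/job.jar --conf work/cluster.xml"/>
  </target>
</project>
```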
Some issues I've found so far:
- It's good to set Expires headers on everything; HADOOP-6607 covers that.
- Having sorted conf values makes it easier to locate properties; otherwise you have to save the file to a text editor and search around.
- The <!-- Loaded from Unknown --> comments make things noisy.
- Saving as a java.util.Properties file would let me pull these things into a build file or other tool very easily. This is easy to test, too.
- Have a comment at the top listing when the conf was generated, and the hostname. Maybe even make them conf values.
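The sorted-values and java.util.Properties ideas above are cheap to prototype. A minimal sketch, using a plain Map of name/value pairs to stand in for Hadoop's Configuration (which iterates as Map.Entry<String,String>); the class and method names are mine, not Hadoop's:

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

public class ConfAsProperties {

    // Copy name/value pairs into a java.util.Properties, which any
    // build file or tool can then load with Properties.load().
    public static Properties toProperties(Map<String, String> conf) {
        Properties props = new Properties();
        props.putAll(conf);
        return props;
    }

    // Render a listing sorted by property name, easy to grep or diff
    // instead of hunting through unsorted servlet output.
    public static String sortedListing(Map<String, String> conf) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : new TreeMap<>(conf).entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```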
Trickier are the conf options that are dynamic, things like:
<property><!--Loaded from Unknown--><name>dfs.datanode.address</name><value>0.0.0.0:0</value></property>
These show what the node was started with, not what it actually got. I'm working around that in my code (setting the actual values in the conf file under live.dfs.datanode.address, etc., and extracting them that way). I don't want to lose the original values, but I do want the real ones.
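The live.* workaround described above can be sketched like this; a plain Map stands in for the Configuration, and the prefix and helper names are mine, assumptions rather than anything in Hadoop:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LiveConf {
    // Hypothetical prefix used by the workaround to record bound addresses.
    static final String LIVE_PREFIX = "live.";

    // Record the address a service actually bound to under "live.<key>",
    // keeping the original value (e.g. 0.0.0.0:0) intact.
    public static void recordLive(Map<String, String> conf,
                                  String key, String actual) {
        conf.put(LIVE_PREFIX + key, actual);
    }

    // Prefer the live value when present, falling back to what the
    // node was configured with at startup.
    public static String effective(Map<String, String> conf, String key) {
        String live = conf.get(LIVE_PREFIX + key);
        return live != null ? live : conf.get(key);
    }
}
```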