-You can look for the xml:space attribute on any element and act on it. When working with documents that have an XSD schema, I think Xerces behaves differently when it hits that attribute, but I don't remember the details.
-Yes, it would cause Windows to behave differently and not allow filenames, or other strings, with trailing spaces. But I don't see that filenames with trailing spaces and carriage returns actually make sense, even on Windows. Spaces mid-path, maybe, but leading or trailing? Danger.
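For what it's worth, the xml:space check mentioned above could be sketched roughly like this with the standard Java DOM API. The class and method names (XmlSpaceAwareText, text, parseRoot) are made up for illustration and are not part of Hadoop. The sketch only looks at the attribute on the element itself, although per the XML spec the setting also applies to descendant elements:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper, not an existing Hadoop class.
final class XmlSpaceAwareText {
    private XmlSpaceAwareText() {}

    // Returns the element text untouched when xml:space="preserve" is set
    // on the element itself; otherwise trims leading/trailing whitespace.
    // Assumption: inheritance of xml:space from ancestors is ignored here.
    static String text(Element element) {
        String raw = element.getTextContent();
        boolean preserve = "preserve".equals(element.getAttribute("xml:space"));
        return preserve ? raw : raw.trim();
    }

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

With this, a value element is trimmed by default, and only an explicit xml:space="preserve" keeps its surrounding whitespace.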
FWIW, I'm not using the XML format for our configurations; we use our own configuration format.
Looking at the current declarations, there's nowhere that whitespace is useful, and there are places (in comma-separated lists) where it may already be harmful and need filtering. There may be some inconsistency between filenames (HADOOP-2366) and user group information, where spaces between words are allowed in hadoop.job.ugi. I would propose:
-consistent filtering of spaces wherever lists are taken (strip leading and trailing whitespace from each entry),
-trimming of leading and trailing whitespace.
What may make sense is to allow quoted whitespace, so you could have a list of directories, and those in quotes would be passed down as is:
<value>/mnt/hstore2/hdfs , "/home/user2/temp hadoop dir"</value>
This would resolve to a list with two entries: ["/mnt/hstore2/hdfs", "/home/user2/temp hadoop dir"]
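As a rough sketch of that proposal (not existing Hadoop code; the class name QuotedListParser and its parse method are hypothetical), the list could be split on commas that fall outside double quotes, with each entry trimmed and one surrounding pair of quotes removed. I'm assuming here that empty unquoted entries are dropped and that there is no escape syntax for a literal quote character:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the proposed list syntax.
final class QuotedListParser {
    private QuotedListParser() {}

    // Splits on commas outside double quotes. Unquoted entries are trimmed;
    // the content of a quoted entry is passed through as is.
    static List<String> parse(String value) {
        List<String> entries = new ArrayList<>();
        StringBuilder raw = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;
            }
            if (c == ',' && !inQuotes) {
                addEntry(entries, raw.toString());
                raw.setLength(0);
            } else {
                raw.append(c);
            }
        }
        if (inQuotes) {
            throw new IllegalArgumentException("Unterminated quote in: " + value);
        }
        addEntry(entries, raw.toString());
        return entries;
    }

    // Trims the raw entry, strips one pair of enclosing quotes if present,
    // and skips entries that are empty after trimming.
    private static void addEntry(List<String> entries, String raw) {
        String trimmed = raw.trim();
        if (trimmed.isEmpty()) {
            return;
        }
        if (trimmed.length() >= 2 && trimmed.startsWith("\"") && trimmed.endsWith("\"")) {
            entries.add(trimmed.substring(1, trimmed.length() - 1));
        } else {
            entries.add(trimmed);
        }
    }
}
```

Calling QuotedListParser.parse on the value from the example above gives the two entries listed, and whitespace inside the quotes survives while whitespace around the comma does not.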