Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Security Level: Public
Environment: CloudStack 3.0.2 KVM agent running on Fedora 14
Description
When an instance is added to a routerVM's DHCP configuration, the KVM agent appears to call /usr/lib64/cloud/agent/scripts/network/domr/dhcp_entry.sh with malformed command-line arguments. As a result, the script fails to add the correct entries (specifically the default router, DNS servers, and static routes) for that instance to the routerVM's /etc/dhcphosts.txt and /etc/dhcpopts.txt.
In particular, setting a specific default gateway fails: the /etc/dhcpopts.txt entry that names the gateway of the instance's default network is missing, so the routerVM always announces itself as the default router. This is especially harmful for an instance's non-default (additional) networks, where it breaks the default routing.
Examples:
Management server log entry:
"2012-11-29 14:36:37,764 DEBUG [resource.virtualnetwork.VirtualRoutingResource] (agentRequest-Handler-3:null) Executing: /usr/lib64/cloud/agent/./scripts/network/domr/dhcp_entry.sh -r 169.254.0.122 -v 172.31.2.233 -m 06:b5:88:00:02:30 -n vmname  -d 172.31.2.1  -N 172.31.2.201 "
Notice the double spaces before -d and -N, and the extra space at the end of the line. WARNING: some renderers collapse consecutive spaces, so you may have to view the source of this description to actually see the double spaces.
After patching /usr/lib64/cloud/agent/scripts/network/domr/dhcp_entry.sh to log its arguments, it is clear that the script is not called with "-d" but with " -d" (and likewise " -N" instead of "-N"), that is, with an extra space before the dash. getopts therefore fails to recognize these two options, and the script passes empty values for $dfltrt and $dns to /root/edithosts.sh, which is called on the routerVM.
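The getopts behaviour described above can be reproduced in isolation. The following is only a minimal sketch: parse_opts is a hypothetical stand-in for the option loop in dhcp_entry.sh, and the option string 'r:v:m:n:d:N:' is an assumption inferred from the flags in the log entry, not copied from the real script.

```shell
#!/bin/sh
# Hypothetical stand-in for the option parsing in dhcp_entry.sh.
# The option string is an assumption based on the flags seen in the log.
parse_opts() {
  OPTIND=1            # restart option parsing on every call
  dfltrt=""
  dns=""
  while getopts 'r:v:m:n:d:N:' opt; do
    case "$opt" in
      d) dfltrt="$OPTARG" ;;
      N) dns="$OPTARG" ;;
      *) ;;           # other options are ignored in this sketch
    esac
  done
  printf 'dfltrt=[%s] dns=[%s]\n' "$dfltrt" "$dns"
}

# Well-formed arguments: both values are picked up.
parse_opts -n vmname -d 172.31.2.1 -N 172.31.2.201
# prints: dfltrt=[172.31.2.1] dns=[172.31.2.201]

# Arguments with a leading space, as the agent seems to pass them:
# " -d" does not start with a dash, so getopts stops parsing there.
parse_opts -n vmname " -d" 172.31.2.1 " -N" 172.31.2.201
# prints: dfltrt=[] dns=[]
```

Because " -d" is the first argument that does not begin with a dash, getopts treats it as the start of the non-option arguments, which is why " -N" is never examined either.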
It's also clear that the CloudStack KVM Java agent itself passes the malformed arguments: if a shell were interpreting this command line, its word splitting would simply discard the extra spaces.
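The difference between a command line parsed by a shell and an argument vector handed over verbatim can be sketched as follows. show_args is a hypothetical helper that only prints each argument in brackets; the second call imitates what a process sees when the caller passes pre-split arguments without going through a shell, which is an assumption about how the agent starts the script.

```shell
#!/bin/sh
# Hypothetical helper: print every argument it receives in brackets.
show_args() {
  for a in "$@"; do
    printf '[%s]' "$a"
  done
  printf '\n'
}

# Case 1: the shell splits the string into words, so repeated and
# trailing spaces disappear (default IFS assumed).
line='-n vmname  -d 172.31.2.1  -N 172.31.2.201 '
show_args $line
# prints: [-n][vmname][-d][172.31.2.1][-N][172.31.2.201]

# Case 2: arguments arrive already split, so a leading space that is
# part of an argument is preserved.
show_args -n vmname " -d" 172.31.2.1 " -N" 172.31.2.201
# prints: [-n][vmname][ -d][172.31.2.1][ -N][172.31.2.201]
```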
I have not been able to track this down to its root cause, but I suspect that one of
./utils/src/com/cloud/utils/script/Script.java:protected String buildCommandLine(String[] command) { }
./utils/src/com/cloud/utils/script/Script.java:public String execute(OutputInterpreter interpreter) { }
corrupts the command line that was built earlier by
./core/src/com/cloud/agent/resource/virtualnetwork/VirtualRoutingResource.java:protected synchronized Answer execute (final DhcpEntryCommand cmd) { }
I have not tried 4.0.0 yet, so I cannot say whether it is affected.
As a workaround, I've patched dhcp_entry.sh to re-evaluate the positional parameters using 'set -- ${@}' (I will attach a patch for this, and another one that adds logging).
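The idea behind the workaround can be sketched like this. It is not the attached patch: parse_with_workaround is a hypothetical function, and the option string is again an assumption based on the flags in the log entry. The unquoted expansion makes the shell split the arguments on whitespace a second time, which strips the stray leading spaces.

```shell
#!/bin/sh
# Hypothetical sketch of the workaround, not the actual patch.
parse_with_workaround() {
  # Intentionally unquoted: re-split all arguments on whitespace.
  # Assumes no argument legitimately contains spaces or glob characters.
  set -- $@

  OPTIND=1
  dfltrt=""
  dns=""
  while getopts 'r:v:m:n:d:N:' opt; do
    case "$opt" in
      d) dfltrt="$OPTARG" ;;
      N) dns="$OPTARG" ;;
      *) ;;           # other options are ignored in this sketch
    esac
  done
  printf 'dfltrt=[%s] dns=[%s]\n' "$dfltrt" "$dns"
}

# The malformed arguments are now parsed as intended.
parse_with_workaround -n vmname " -d" 172.31.2.1 " -N" 172.31.2.201
# prints: dfltrt=[172.31.2.1] dns=[172.31.2.201]
```

The trade-off is that any argument that really does contain whitespace would be broken apart, so this only masks the problem on the script side; the malformed arguments are still produced by the caller.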