  1. VCL
  2. VCL-839

Problems occur when "localhost" is used for a management node name



    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4
    • Fix Version/s: 2.4.1
    • Component/s: vcld (backend)
    • Labels: None


      The vcl-install.sh script uses localhost as the name of the management node by default. The FQDN parameter in /etc/vcl/vcld.conf is set to localhost, as is the managementnode.hostname value.

      The backend code needs to determine the private IP address used by the management node. This address is not stored in the database; the managementnode table stores only the management node's hostname and an ambiguous IPaddress value. The IPaddress value should be set to the public IP address so that management nodes which don't share the same private network can communicate.

      To determine its own private IP address, the management node attempts to resolve its hostname, localhost, which resolves to a loopback address. After this step, the code compares the resolved IP address to the addresses assigned to the management node's interfaces. The loopback interface's IP addresses are explicitly excluded because there would be no reason for the code to ever use a loopback address. As a result, no interface matches the resolved address.
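      The lookup described above can be sketched roughly as follows. This is an illustrative Python model, not the OS.pm code: the function name find_private_interface, the injectable resolve argument, and the sample addresses are all assumptions. The interface dictionary loosely mirrors the structure shown in the warning output.

```python
import ipaddress
import socket


def find_private_interface(hostname, interfaces, resolve=socket.gethostbyname):
    """Return (interface_name, ip) for the interface that holds the address
    the hostname resolves to, or None if no non-loopback interface matches.

    Hypothetical sketch: the names and data layout are assumptions.
    """
    resolved = ipaddress.ip_address(resolve(hostname))
    for name, info in interfaces.items():
        for addr in info.get("ip_address", {}):
            ip = ipaddress.ip_address(addr)
            if ip.is_loopback:
                # Loopback addresses are never candidates.
                continue
            if ip == resolved:
                return name, str(ip)
    return None


# Sample data (made-up addresses) shaped like the structure in the warning.
interfaces = {
    "eth0": {"name": "eth0", "ip_address": {"10.0.0.5": "255.255.255.0"}},
    "eth1": {"name": "eth1", "ip_address": {"192.0.2.10": "255.255.255.0"}},
    "lo": {"name": "lo", "ip_address": {"127.0.0.1": "255.0.0.0"}},
}

# Stub resolvers keep the example independent of the local /etc/hosts.
print(find_private_interface("localhost", interfaces, resolve=lambda h: "127.0.0.1"))
print(find_private_interface("mn1.example.org", interfaces, resolve=lambda h: "10.0.0.5"))
```

      With a hostname that resolves to a loopback address, the sketch finds no match, which corresponds to the "failed to determine private interface name" condition.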

      This introduces the first problem, which is mostly cosmetic at this point. The following warning is generated:

      |30351|3|3|new|OS.pm:get_private_interface_name|1451| ---- WARNING ----
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| 2015-03-18 14:17:32|30351|3|3|new|OS.pm:get_private_interface_name|1451|failed to determine private interface name, no interface is assigned the private IP address for the reservation:
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| : {
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :   "eth0" => {
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :     "broadcast_address" => "10.x.x.x",
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :     "ip_address" => {
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :       "10.x.x.x" => ""
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :     },
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :     "name" => "eth0",
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :     "physical_address" => "00:50:56:23:00:bc"
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :   },
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :   "eth1" => {
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :     "broadcast_address" => "x.x.x.x",
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :     "default_gateway" => "x.x.x.x",
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :     "ip_address" => {
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :       "" => ""
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :     },
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :     "name" => "eth1",
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :     "physical_address" => "00:50:56:23:00:bd"
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :   },
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :   "lo" => {
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :     "ip_address" => {},
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :     "name" => "lo"
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| :   }
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| : }
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| ( 0) OS.pm, get_private_interface_name (line: 1451)
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| (-1) OS.pm, get_private_network_configuration (line: 1695)
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| (-2) (eval 762), (eval) (line: 1)
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| (-3) OS.pm, get_ip_address (line: 1846)
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| (-4) OS.pm, get_private_ip_address (line: 1901)
      |30351|3|3|new|OS.pm:get_private_interface_name|1451| (-5) Linux.pm, post_load (line: 418)

      The next problem occurs when a computer is being loaded. Linux.pm's post_load subroutine attempts to add firewall rules that allow traffic from the management node's private IP address to any port and, specifically, to port 22. This doesn't work as expected because the private IP address could not be determined. As a result, the attempt to allow traffic to any port from the management node's private IP address is skipped:

      |30351|3|3|new|Linux.pm:enable_firewall_port|3655| ---- WARNING ----
      |30351|3|3|new|Linux.pm:enable_firewall_port|3655| 2015-03-18 14:22:44|30351|3|3|new|Linux.pm:enable_firewall_port|3655|firewall not modified, port argument is not restricted to a certain port: 'any', scope argument was not supplied, it must be restricted to certain IP addresses if the port argument is unrestricted

      The attempt to allow traffic to port 22 is completed. However, because no IP address was specified, traffic to port 22 is allowed from any address. At this point, the management node can still control the computer.
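      The two outcomes above (the 'any' port request is refused, while the port 22 request is opened to every address) can be modeled with a small sketch. This is a simplified illustration inferred from the log messages, not the enable_firewall_port code in Linux.pm; the function name plan_firewall_rule and the returned dictionary are assumptions.

```python
def plan_firewall_rule(port, scope=None):
    """Decide what rule a request would produce (hypothetical model).

    - Unrestricted port with no scope: refused, firewall not modified.
    - Specific port with no scope: opened to any source address.
    - Otherwise: opened to the given scope only.
    """
    if not scope:
        if port == "any":
            return None  # refuse: unrestricted port needs a restricted scope
        return {"port": port, "source": "0.0.0.0/0"}
    return {"port": port, "source": scope}


# The private IP address could not be determined, so no scope is passed.
print(plan_firewall_rule("any"))  # refused
print(plan_firewall_rule(22))     # port 22 open to any address
```

      The second call shows why the management node keeps access at this stage only by way of the open port 22 rule.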

      After the computer is reserved and the user connects, the code attempts to lock down the firewall to the user's remote IP address. Existing firewall rules for the specific connect method port are replaced when a user initially connects:

      2015-03-18 14:23:35|31054|3|3|reserved|Linux.pm:enable_firewall_port|3734|overwrite existing argument specified, existing tcp/22 firewall rule(s) will be replaced:
      |31054|3|3|reserved|Linux.pm:enable_firewall_port|3734| existing scope:
      |31054|3|3|reserved|Linux.pm:enable_firewall_port|3734| new scope: y.y.y.y/

      In this example, y.y.y.y is the user's remote IP address.

      Once the firewall is modified, the management node loses control of the computer because the only existing rule which allowed access, port 22 from any IP address, was removed. All commands after this point fail.

      2015-03-18 14:23:35|31054|3|3|reserved|utils.pm:run_ssh_command|4181|executing SSH command on (vm241-1): '/sbin/iptables-save > /etc/sysconfig/iptables'
      |31054|3|3|reserved|utils.pm:run_ssh_command|4291| ---- WARNING ----
      |31054|3|3|reserved|utils.pm:run_ssh_command|4291| 2015-03-18 14:23:35|31054|3|3|reserved|utils.pm:run_ssh_command|4291|attempt 1/3: failed to execute SSH command on (vm241-1): '/sbin/iptables-save > /etc/sysconfig/iptables', exit status: 255, output:
      |31054|3|3|reserved|utils.pm:run_ssh_command|4291| ssh output (/sbin/ipta...): ssh: connect to host port 22: No route to host

      The user isn't affected at this point. Traffic is still allowed from his/her remote IP address. The management node continues to check for a user connection every few minutes, and each check fails. The reservation is not timed out while the management node has no control over the computer.

      Everything is fine for the user as long as he/she does not change location. If the user does change location and clicks the Connect button from another remote IP address, the management node won't be able to open the firewall to the new address and the user will not be able to connect.

      User-initiated image captures will also fail:

      |6680|3|3|image|OS.pm:pre_capture|102| ---- WARNING ----
      |6680|3|3|image|OS.pm:pre_capture|102| 2015-03-18 14:31:22|6680|3|3|image|OS.pm:pre_capture|102|unable to complete capture preparation tasks, vm241-1 is powered on but not responding to SSH
      |6680|3|3|image|OS.pm:pre_capture|102| ( 0) OS.pm, pre_capture (line: 102)
      |6680|3|3|image|OS.pm:pre_capture|102| (-1) Linux.pm, pre_capture (line: 331)
      |6680|3|3|image|OS.pm:pre_capture|102| (-2) VMware.pm, capture (line: 752)
      |6680|3|3|image|OS.pm:pre_capture|102| (-3) image.pm, process (line: 179)
      |6680|3|3|image|OS.pm:pre_capture|102| (-4) vcld, make_new_child (line: 587)
      |6680|3|3|image|OS.pm:pre_capture|102| (-5) vcld, main (line: 348)

      One simple fix is to not use localhost for the management node name. Another fix would be to edit /etc/hosts on the management node and set localhost to the private IP address. I'm not sure if this will cause other problems if something relies on localhost being a loopback address.

      Regardless, the problems with the code need to be resolved. A management node should never lock itself out.
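      One way to express the "never lock itself out" requirement is a guard applied whenever the scope of a port is replaced. The sketch below is hypothetical and is not the fix that was committed for this issue: the function name replace_port_scope, the rule dictionary, and the sample addresses are assumptions made for illustration.

```python
def replace_port_scope(rules, port, new_sources, management_ip):
    """Return a copy of rules with the scope of `port` replaced.

    Hypothetical guard: the management node's address is always kept in
    the new scope, and the change is refused if that address is unknown.
    """
    if not management_ip:
        raise ValueError(
            "management node address unknown, refusing to replace scope of port %s" % port
        )
    updated = dict(rules)
    updated[port] = sorted(set(new_sources) | {management_ip})
    return updated


# Port 22 is open to any address before the user connects.
rules = {22: ["0.0.0.0/0"]}

# Lock port 22 down to the user's address while keeping management access.
print(replace_port_scope(rules, 22, ["198.51.100.7"], "10.0.0.5"))
```

      When the management node's address cannot be determined, as in the localhost case, this sketch leaves the existing rule in place by raising an error instead of replacing it.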




            arkurth Andrew Kurth

