SLIDER-1259

Slider does not work in multi homed environments

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: Slider 0.92
    • Fix Version/s: Slider 1.0.0
    • Component/s: appmaster
    • Labels: None

    Description

      Slider cannot introspect running jobs in an environment where Hadoop worker nodes bind the NodeManager to an interface whose hostname differs from the one returned by socket.getfqdn(). In our test environment, for example, the hostname on the interface for Hadoop/production traffic is f-bcpc-vm3, while socket.getfqdn() returns just bcpc-vm3, which is the hostname bound to the management interface.
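
The mismatch can be confirmed on a worker node with a short script. This is only a diagnostic sketch: the function name hostname_mismatch and the idea of passing in the expected hostname are assumptions made for illustration, not part of the Slider agent.

```python
import socket


def hostname_mismatch(expected_hostname, fqdn_lookup=socket.getfqdn):
    """Return the name reported by fqdn_lookup() if it differs from
    expected_hostname (compared case-insensitively), or None if they match."""
    reported = fqdn_lookup()
    if reported.lower() == expected_hostname.lower():
        return None
    return reported


if __name__ == "__main__":
    # Hostname taken from this report; substitute the name the NodeManager is bound to.
    expected = "f-bcpc-vm3.bcpc.example.com"
    reported = hostname_mismatch(expected)
    if reported is not None:
        print("socket.getfqdn() reports %s, expected %s" % (reported, expected))
```

The lookup is passed in as a parameter only so that the check can be exercised without changing the host's real name resolution.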

       

      For example, running slider registry --name slider_poc --listexp produces the following output in the ResourceManager logs:

      2018-01-26 17:30:32,147 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: ubuntu is accessing unchecked http://bcpc-vm3.bcpc.example.com:46391/ws/v1/slider/publisher/exports which is the app master GUI of application_1516910361403_0094 owned by ubuntu 
      2018-01-26 17:31:13,639 WARN org.mortbay.log: /proxy/application_1516910361403_0094/ws/v1/slider/publisher/exports: java.net.ConnectException: Connection timed out (Connection timed out) 

       

      Note that the redirect goes to http://bcpc-vm3.bcpc.example.com:46391/ws/v1/slider/publisher/exports, whereas it should have gone to http://f-bcpc-vm3.bcpc.example.com:46391/ws/v1/slider/publisher/exports. Renaming the host to f-bcpc-vm3 results in the appropriate behavior.

       

      Perhaps hostname.py can be instructed to look at one of the following properties before registering:

      yarn.nodemanager.address
      yarn.nodemanager.bind-host
      yarn.nodemanager.hostname
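
One way the suggestion above might look is sketched below. Everything here is an assumption made for illustration: the helper name resolve_public_hostname, the lookup order, and the premise that the YARN properties are already available to the agent as a plain dict. It is not the change made in the attached patch.

```python
import socket

# Wildcard addresses say nothing about which hostname clients should use.
_WILDCARDS = ("", "0.0.0.0", "::")

# Property names are the ones listed in this report; the lookup order is an assumption.
_CANDIDATE_KEYS = (
    "yarn.nodemanager.hostname",
    "yarn.nodemanager.bind-host",
    "yarn.nodemanager.address",
)


def resolve_public_hostname(yarn_conf, fallback=socket.getfqdn):
    """Hypothetical helper: return the first usable host found in yarn_conf,
    or the result of fallback() if none of the properties names a host."""
    for key in _CANDIDATE_KEYS:
        value = (yarn_conf.get(key) or "").strip()
        if key == "yarn.nodemanager.address":
            # This property has the form host:port; keep only the host part.
            value = value.rsplit(":", 1)[0]
        # Skip wildcards and unexpanded ${...} placeholders.
        if value not in _WILDCARDS and "${" not in value:
            return value
    return fallback()
```

With a helper of this shape, a node whose configuration names f-bcpc-vm3 would register under that name, and nodes with no explicit setting would keep the current socket.getfqdn() behavior.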

       

      The hostname is picked up when hostname.public_hostname() is called in Register.py:

      register = {
          'responseId': int(id),
          'timestamp': timestamp,
          'label': self.config.getLabel(),
          'publicHostname': hostname.public_hostname(),  # the hostname under discussion
          'agentVersion': version,
          'actualState': actualState,
          'expectedState': expectedState,
          'allocatedPorts': allocated_ports,
          'logFolders': log_folders,
          'tags': tags
      }

      Attachments

        1. SLIDER-1259-001.patch
          1 kB
          Steve Loughran

          People

            Assignee: Steve Loughran (stevel@apache.org)
            Reporter: Lev Bronshtein (lbronshtein)
            Votes: 0
            Watchers: 5
