Mesos / MESOS-9335

LIBPROCESS_ADVERTISE_IP is not passed to mesos-docker-executor


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.7.0
    • Fix Version/s: None
    • Component/s: executor
    • Labels: None
    • Environment: Linux 4.4.0-1069-aws #79-Ubuntu SMP x86_64 x86_64 x86_64 GNU/Linux

      Mesos 1.7.0

    Description

      When I set both LIBPROCESS_IP and LIBPROCESS_ADVERTISE_IP for my mesos-slave, only LIBPROCESS_IP gets propagated to mesos-docker-executor. I noticed this because I have to set both to avoid a hostname lookup, which doesn't work in my environment. LIBPROCESS_IP is set to 0.0.0.0 so that the slave binds to any IP address (and is still reachable locally on port 5051 for metrics gathering), while LIBPROCESS_ADVERTISE_IP is set to my externally reachable IP address so the rest of the cluster can talk to it. With this setup, the executor processes launched by my slave were failing on the dreaded hostname lookup.
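
      The behavior described above can be pictured with a small sketch. This is not libprocess code: `advertised_ip` and its `resolve_hostname` parameter are hypothetical names, and the fallback order is an assumption inferred from this report (an explicit advertise address wins, a specific bind address is advertised as-is, and a wildcard bind address leaves only the hostname lookup).

```python
import socket


def _lookup_local_hostname():
    # The step that fails in the reporter's environment.
    return socket.gethostbyname(socket.gethostname())


def advertised_ip(env, resolve_hostname=_lookup_local_hostname):
    """Pick the address a process would advertise (illustrative only)."""
    advertise = env.get("LIBPROCESS_ADVERTISE_IP")
    if advertise:
        return advertise

    bind = env.get("LIBPROCESS_IP")
    if bind and bind != "0.0.0.0":
        return bind

    # Wildcard or unset bind address: nothing usable to advertise,
    # so fall back to a hostname lookup (assumed behavior).
    return resolve_hostname()
```

      Under these assumptions, an executor that receives only LIBPROCESS_IP=0.0.0.0 reaches the last branch, which would match the failure seen here.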

      I see there is code to inject LIBPROCESS_IP into the executor environment, but no mention of LIBPROCESS_ADVERTISE_IP.

      https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L9974-L9983
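
      A minimal sketch of the change this report implies, written in Python rather than the agent's C++: forward both variables when building the executor environment. `executor_environment` and `FORWARDED_VARS` are hypothetical names used for illustration, not Mesos identifiers, and the real logic in slave.cpp may differ.

```python
# Variables the agent would copy from its own environment (assumed list).
FORWARDED_VARS = ("LIBPROCESS_IP", "LIBPROCESS_ADVERTISE_IP")


def executor_environment(agent_env, base_env=None):
    """Return a copy of base_env plus any forwarded agent variables."""
    env = dict(base_env or {})
    for name in FORWARDED_VARS:
        value = agent_env.get(name)
        if value is not None:
            # Only forward variables the agent actually has set.
            env[name] = value
    return env
```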

      Here's the command line and environment for my slave:

      LIBPROCESS_IP=0.0.0.0

      MASTER=zk://10.33.13.250:2181,10.33.9.108:2181,10.33.7.6:2181/mesos

      LC_ALL=en_US.UTF-8

      LOGS=/var/log/mesos

      LIBPROCESS_ADVERTISE_IP=10.33.15.130

      PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

      PWD=/

      LANG=en_US.UTF-8

      SHLVL=0

      ULIMIT=-n 8192

      /usr/sbin/mesos-slave --master=zk://10.33.13.250:2181,10.33.9.108:2181,10.33.7.6:2181/mesos --log_dir=/var/log/mesos --containerizers=docker,mesos --executor_registration_timeout=5mins --work_dir=/mesos

      And here's the command line and environment for the executor process it attempted to run:

      LIBPROCESS_IP=0.0.0.0

      LIBPROCESS_PORT=0

      MESOS_AGENT_ENDPOINT=10.33.15.130:5051

      MESOS_CHECKPOINT=0

      MESOS_DIRECTORY=/mesos/slaves/7c587a36-c4ed-48ce-bfa2-2b0d6e8274b2-S3864/frameworks/dummy_sleep-func-dadkins-d84e56b1a9/executors/dummy_sleep-func-dadkins-d84e56b1a9-func_0/runs/6b5adff6-c745-49ce-93c3-682bf7a23aca

      MESOS_EXECUTOR_ID=dummy_sleep-func-dadkins-d84e56b1a9-func_0

      MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD=5secs

      MESOS_FRAMEWORK_ID=dummy_sleep-func-dadkins-d84e56b1a9

      MESOS_HTTP_COMMAND_EXECUTOR=0

      MESOS_NATIVE_JAVA_LIBRARY=/usr/lib/libmesos-1.7.0.so

      MESOS_NATIVE_LIBRARY=/usr/lib/libmesos-1.7.0.so

      MESOS_SLAVE_ID=7c587a36-c4ed-48ce-bfa2-2b0d6e8274b2-S3864

      MESOS_SLAVE_PID=slave(1)@10.33.15.130:5051

      PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

      mesos-docker-executor --cgroups_enable_cfs=false --container=mesos-6b5adff6-c745-49ce-93c3-682bf7a23aca --docker=docker --docker_socket=/var/run/docker.sock --help=false --initialize_driver_logging=true --launcher_dir=/usr/libexec/mesos --logbufsecs=0 --logging_level=INFO --mapped_directory=/mnt/mesos/sandbox --quiet=false --sandbox_directory=/mesos/slaves/7c587a36-c4ed-48ce-bfa2-2b0d6e8274b2-S3864/frameworks/dummy_sleep-func-dadkins-d84e56b1a9/executors/dummy_sleep-func-dadkins-d84e56b1a9
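
      One way to collect and compare the two environments on a Linux host is to read /proc/<pid>/environ for each process, which holds NUL-separated NAME=value entries. The sketch below assumes that format; `parse_environ` and `missing_in_executor` are hypothetical helper names.

```python
def parse_environ(raw):
    """Parse the NUL-separated bytes of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            name, _, value = entry.partition(b"=")
            env[name.decode(errors="replace")] = value.decode(errors="replace")
    return env


def missing_in_executor(agent_env, executor_env, prefix="LIBPROCESS_"):
    """List prefixed variables set for the agent but absent from the executor."""
    return sorted(
        name for name in agent_env
        if name.startswith(prefix) and name not in executor_env
    )
```

      For example, `parse_environ(open("/proc/1234/environ", "rb").read())` would load the environment of a process with the placeholder PID 1234, given permission to read that file.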

          People

            Assignee: Unassigned
            Reporter: Dan Adkins (dan-at-teza)
