  1. Mesos
  2. MESOS-3325

Running mesos-slave@0.23 in a container causes slave to be lost after a restart

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Abandoned
    • Affects Version/s: 0.23.0
    • Fix Version/s: None
    • Component/s: agent
    • Labels: None
    • Environment: CoreOS, Container, Docker

      Description

      We are attempting to run mesos-slave 0.23 in a container. However, after the container is restarted, the mesos-slave agent appears to register as a new slave instead of re-registering. As a result, the previous slave is reported as lost while the tasks it launched earlier continue running.

      systemd unit being used:

      ```
      [Unit]
      Description=MesosSlave
      After=docker.service dockercfg.service
      Requires=docker.service dockercfg.service

      [Service]
      Environment=MESOS_IMAGE=mesosphere/mesos-slave:0.23.0-1.0.ubuntu1404
      Environment=ZOOKEEPER=redacted
      User=core
      KillMode=process
      Restart=always
      RestartSec=20
      TimeoutStartSec=0
      ExecStartPre=-/usr/bin/docker kill mesos_slave
      ExecStartPre=-/usr/bin/docker rm mesos_slave
      ExecStartPre=/usr/bin/docker pull ${MESOS_IMAGE}
      ExecStart=/usr/bin/sh -c "sudo /usr/bin/docker run \
      --name=mesos_slave \
      --net=host \
      --pid=host \
      --privileged \
      -v /home/core/.dockercfg:/root/.dockercfg:ro \
      -v /sys:/sys \
      -v /usr/bin/docker:/usr/bin/docker:ro \
      -v /var/run/docker.sock:/var/run/docker.sock \
      -v /lib64/libdevmapper.so.1.02:/lib/libdevmapper.so.1.02:ro \
      -v /var/lib/mesos/slave:/var/lib/mesos/slave \
      ${MESOS_IMAGE} \
      --ip=`curl -s http://169.254.169.254/latest/meta-data/local-ipv4` \
      --attributes=zone:$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)\;os:coreos \
      --containerizers=docker,mesos \
      --executor_registration_timeout=10mins \
      --hostname=`curl -s http://169.254.169.254/latest/meta-data/public-hostname` \
      --log_dir=/var/log/mesos \
      --master=zk://${ZOOKEEPER}/mesos \
      --work_dir=/var/lib/mesos/slave"
      ExecStop=/usr/bin/docker stop mesos_slave

      [Install]
      WantedBy=multi-user.target

      [X-Fleet]
      Global=true
      MachineMetadata=role=worker
      ```

      P.S. Yes, I saw that the coreos-setup repo was deprecated.
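
One way to narrow this down is to check whether the agent's checkpointed ID survives the restart on the host. The sketch below assumes the agent checkpoints its ID as a `latest` symlink under `<work_dir>/meta/slaves/`; that layout is an assumption about the checkpoint directory, not something confirmed in this report, and `checkpointed_slave_id` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the agent (slave) ID that the checkpointed "latest" symlink points to.
# Assumption: the agent stores its ID as <work_dir>/meta/slaves/latest -> <slave_id dir>.
# Prints "none" and returns 1 if no such symlink exists.
checkpointed_slave_id() {
  work_dir="$1"
  latest="$work_dir/meta/slaves/latest"
  if [ -L "$latest" ]; then
    # The symlink target's last path component is taken to be the slave ID.
    basename "$(readlink "$latest")"
  else
    echo "none"
    return 1
  fi
}
```

Running `checkpointed_slave_id /var/lib/mesos/slave` on the host before and after restarting the unit would show whether the ID changes or disappears. If it prints `none` after the restart, the bind-mounted work directory is probably not carrying the checkpoint across container restarts; if the ID is unchanged, the cause likely lies elsewhere.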

            People

            • Assignee: Unassigned
            • Reporter: Chris Fortier (cfortier)
            • Votes: 0
            • Watchers: 4
