  2. MESOS-6989

Docker executor segfaults in ~MesosExecutorDriver()


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • None
    • 1.2.0
    • docker
    • Sprint: Mesosphere Sprint 49
    • 1

    Description

      With Mesos built from the current head of the master branch (commit 42e515bc5c175a318e914d34473016feda4db6ff), the Docker executor segfaults during shutdown.

      Steps to reproduce:

      1) Start master:

      $ ./bin/mesos-master.sh --ip=127.0.0.1 --work_dir=/tmp/jp/mesos
      WARNING: Logging before InitGoogleLogging() is written to STDERR
      I0125 13:41:15.963775 14744 main.cpp:278] Build: 2017-01-25 13:37:42 by jp
      I0125 13:41:15.963868 14744 main.cpp:279] Version: 1.2.0
      I0125 13:41:15.963877 14744 main.cpp:286] Git SHA: 42e515bc5c175a318e914d34473016feda4db6ff
      

      (note that building it at 13:37 is not part of the repro)

      2) Start agent:

      $ ./bin/mesos-slave.sh --containerizers=mesos,docker --master=127.0.0.1:5050 --work_dir=/tmp/jp/mesos
      

      3) Run mesos-execute with the Docker containerizer:

      $ ./src/mesos-execute --master=127.0.0.1:5050 --name=testcommand --containerizer=docker --docker_image=debian --command=env
      I0125 13:43:59.704973 14951 scheduler.cpp:184] Version: 1.2.0
      I0125 13:43:59.706425 14952 scheduler.cpp:470] New master detected at master@127.0.0.1:5050
      Subscribed with ID 57596743-06f4-45f1-a975-348cf70589b1-0000
      Submitted task 'testcommand' to agent '57596743-06f4-45f1-a975-348cf70589b1-S0'
      Received status update TASK_RUNNING for task 'testcommand'
        source: SOURCE_EXECUTOR
      Received status update TASK_FINISHED for task 'testcommand'
        message: 'Container exited with status 0'
        source: SOURCE_EXECUTOR
      

      Relevant agent output that shows the executor segfault:

      [...]
      I0125 13:44:16.249191 14823 slave.cpp:4328] Got exited event for executor(1)@192.99.40.208:33529
      I0125 13:44:16.347095 14830 docker.cpp:2358] Executor for container 396282a9-7bf0-48ee-ba07-3ff2ca801d53 has exited
      I0125 13:44:16.347127 14830 docker.cpp:2052] Destroying container 396282a9-7bf0-48ee-ba07-3ff2ca801d53
      I0125 13:44:16.347439 14830 docker.cpp:2179] Running docker stop on container 396282a9-7bf0-48ee-ba07-3ff2ca801d53
      I0125 13:44:16.349215 14826 slave.cpp:4691] Executor 'testcommand' of framework 57596743-06f4-45f1-a975-348cf70589b1-0000 terminated with signal Segmentation fault (core dumped)
      [...]
      

      The complete task stderr:

      $ cat /tmp/jp/mesos/slaves/57596743-06f4-45f1-a975-348cf70589b1-S0/frameworks/57596743-06f4-45f1-a975-348cf70589b1-0000/executors/testcommand/runs/latest/stderr 
      I0125 13:44:12.850073 15030 exec.cpp:162] Version: 1.2.0
      I0125 13:44:12.864229 15050 exec.cpp:237] Executor registered on agent 57596743-06f4-45f1-a975-348cf70589b1-S0
      I0125 13:44:12.865842 15054 docker.cpp:850] Running docker -H unix:///var/run/docker.sock run --cpu-shares 1024 --memory 134217728 --env-file /tmp/xFZ8G9 -v /tmp/jp/mesos/slaves/57596743-06f4-45f1-a975-348cf70589b1-S0/frameworks/57596743-06f4-45f1-a975-348cf70589b1-0000/executors/testcommand/runs/396282a9-7bf0-48ee-ba07-3ff2ca801d53:/mnt/mesos/sandbox --net host --entrypoint /bin/sh --name mesos-57596743-06f4-45f1-a975-348cf70589b1-S0.396282a9-7bf0-48ee-ba07-3ff2ca801d53 debian -c env
      I0125 13:44:15.248721 15064 exec.cpp:410] Executor asked to shutdown
      *** Aborted at 1485369856 (unix time) try "date -d @1485369856" if you are using GNU date ***
      PC: @     0x7fb38f153dd0 (unknown)
      *** SIGSEGV (@0x68) received by PID 15030 (TID 0x7fb3961a88c0) from PID 104; stack trace: ***
          @     0x7fb38f15b5c0 (unknown)
          @     0x7fb38f153dd0 (unknown)
          @     0x7fb39332c607 __gthread_mutex_lock()
          @     0x7fb39332c657 __gthread_recursive_mutex_lock()
          @     0x7fb39332edca std::recursive_mutex::lock()
          @     0x7fb393337bd8 _ZZ11synchronizeISt15recursive_mutexE12SynchronizedIT_EPS2_ENKUlPS0_E_clES5_
          @     0x7fb393337bf8 _ZZ11synchronizeISt15recursive_mutexE12SynchronizedIT_EPS2_ENUlPS0_E_4_FUNES5_
          @     0x7fb39333ba6b Synchronized<>::Synchronized()
          @     0x7fb393337cac synchronize<>()
          @     0x7fb39492f15c process::ProcessManager::wait()
          @     0x7fb3949353f0 process::wait()
          @     0x55fd63f31fe5 process::wait()
          @     0x7fb39332ce3c mesos::MesosExecutorDriver::~MesosExecutorDriver()
          @     0x55fd63f2bd86 main
          @     0x7fb38e4fc401 __libc_start_main
          @     0x55fd63f2ab5a _start
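
      The abort header in the stderr above carries a few facts that are easy to miss: the crash time as a Unix timestamp, and the faulting address (0x68). The sketch below is a minimal way to decode them; it is not part of Mesos or glog. The helper name `decode_abort_header` and the 4096-byte threshold used to flag an address as "near null" are assumptions made for illustration.

```python
import re
from datetime import datetime, timezone

# The two header lines, copied verbatim from the executor stderr above.
ABORT_HEADER = (
    '*** Aborted at 1485369856 (unix time) try "date -d @1485369856" '
    'if you are using GNU date ***\n'
    '*** SIGSEGV (@0x68) received by PID 15030 (TID 0x7fb3961a88c0) '
    'from PID 104; stack trace: ***\n'
)

# Assumption: addresses below one typical 4 KiB page count as "near null".
NEAR_NULL_LIMIT = 4096


def decode_abort_header(text):
    """Pull the crash time, signal, fault address and PIDs out of the header."""
    aborted = re.search(r"Aborted at (\d+) \(unix time\)", text)
    signal = re.search(
        r"(SIG\w+) \(@(0x[0-9a-fA-F]+)\) received by PID (\d+) "
        r"\(TID (0x[0-9a-fA-F]+)\) from PID (\d+)",
        text,
    )
    if aborted is None or signal is None:
        raise ValueError("text does not look like an abort header")

    crash_time = datetime.fromtimestamp(int(aborted.group(1)), tz=timezone.utc)
    fault_address = int(signal.group(2), 16)
    return {
        "utc": crash_time.isoformat(),
        "signal": signal.group(1),
        "fault_address": fault_address,
        "pid": int(signal.group(3)),
        "from_pid": int(signal.group(5)),
        "near_null": fault_address < NEAR_NULL_LIMIT,
    }


if __name__ == "__main__":
    for key, value in decode_abort_header(ABORT_HEADER).items():
        print(f"{key}: {value}")
```

      The timestamp decodes to 2017-01-25 18:44:16 UTC, which lines up with the agent's 13:44:16 log entries if the host clock is at UTC-5 (an assumption; the report does not state the timezone). The fault address 0x68 is 104 in decimal, the same number printed as "from PID 104", so that field is probably the fault address read through another member of the signal info structure and not a real sender PID. An address that small is consistent with a member access through a null pointer somewhere below process::ProcessManager::wait(), but the trace alone does not confirm which object was null.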
      

          People

            kaysoky Joseph Wu
            jgehrcke Dr. Jan-Philip Gehrcke
            Anand Mazumdar Anand Mazumdar