Aurora / AURORA-1830

Unknown exception initializing sandbox

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 0.16.0
    • Fix Version/s: 0.16.0
    • Component/s: Executor
    • Labels: None

    Description

      When launching a job that uses the Mesos containerizer with a Docker image, sandbox setup fails with the following error:

      FAILED: Unknown exception initializing sandbox: [Errno 2] No such file or directory

      Aurora file:

      # run the script
      python = Process(
        name = 'python',
        cmdline = 'python --version')
      
      # describe the task
      python_task = Task(
        processes = [python],
        resources = Resources(cpu = 1, ram = 1*GB, disk=8*GB))
      
      jobs = [
        Service(cluster = 'MY Cluster',
                environment = 'devel',
                role = 'root',
                name = 'python',
                task = python_task,
                container = Mesos( image = DockerImage (name = 'python', tag = '2')))
      ]
      

      __main__.log:

      Log file created at: 2016/11/24 14:45:44
      Running on machine: gnode1
      [DIWEF]mmdd hh:mm:ss.uuuuuu pid file:line] msg
      Command line: /var/lib/mesos/slave/slaves/195fbdc8-6720-443b-b036-7fa5608b27cc-S24/frameworks/195fbdc8-6720-443b-b036-7fa5608b27cc-0014/executors/thermos-root-devel-python-0-e33ad106-90dd-481a-8d45-c320990b67d8/runs/e25e2e98-0b65-4e9f-a86d-13a18dff01bc/thermos_executor --announcer-ensemble 127.0.0.1:2181
      I1124 14:45:44.041621 25610 executor_base.py:45] Executor [None]: registered() called with:
      I1124 14:45:44.042294 25610 executor_base.py:45] Executor [None]:    ExecutorInfo:  executor_id {
        value: "thermos-root-devel-python-0-e33ad106-90dd-481a-8d45-c320990b67d8"
      }
      resources {
        name: "cpus"
        type: SCALAR
        scalar {
          value: 0.25
        }
        role: "*"
      }
      resources {
        name: "mem"
        type: SCALAR
        scalar {
          value: 128.0
        }
        role: "*"
      }
      command {
        uris {
          value: "/usr/bin/thermos_executor"
          executable: true
        }
        value: "${MESOS_SANDBOX=.}/thermos_executor --announcer-ensemble 127.0.0.1:2181"
      }
      framework_id {
        value: "195fbdc8-6720-443b-b036-7fa5608b27cc-0014"
      }
      name: "AuroraExecutor"
      source: "root.devel.python.0"
      container {
        type: MESOS
        volumes {
          container_path: "taskfs"
          mode: RO
          image {
            type: DOCKER
            docker {
              name: "python:2"
            }
          }
        }
        mesos {
        }
      }
      labels {
        labels {
          key: "source"
          value: "root.devel.python.0"
        }
      }
      
      I1124 14:45:44.042458 25610 executor_base.py:45] Executor [None]:    FrameworkInfo: user: "root"
      name: "Aurora"
      id {
        value: "195fbdc8-6720-443b-b036-7fa5608b27cc-0014"
      }
      failover_timeout: 1814400.0
      checkpoint: true
      hostname: "vnode7"
      capabilities {
        type: GPU_RESOURCES
      }
      
      I1124 14:45:44.043046 25610 executor_base.py:45] Executor [None]:    SlaveInfo:     hostname: "000.000.00.001"
      resources {
        name: "gpus"
        type: SCALAR
        scalar {
          value: 2.0
        }
        role: "*"
      }
      resources {
        name: "ports"
        type: RANGES
        ranges {
          range {
            begin: 1025
            end: 2180
          }
          range {
            begin: 2182
            end: 3887
          }
          range {
            begin: 3889
            end: 5049
          }
          range {
            begin: 5052
            end: 8079
          }
          range {
            begin: 8082
            end: 8180
          }
          range {
            begin: 8182
            end: 32000
          }
        }
        role: "*"
      }
      resources {
        name: "disk"
        type: SCALAR
        scalar {
          value: 428201.0
        }
        role: "*"
      }
      resources {
        name: "cpus"
        type: SCALAR
        scalar {
          value: 8.0
        }
        role: "*"
      }
      resources {
        name: "mem"
        type: SCALAR
        scalar {
          value: 14957.0
        }
        role: "*"
      }
      attributes {
        name: "hostname"
        type: TEXT
        text {
          value: "gnode1"
        }
      }
      attributes {
        name: "ip"
        type: TEXT
        text {
          value: "000.000.00.001"
        }
      }
      attributes {
        name: "rack"
        type: TEXT
        text {
          value: "gpu"
        }
      }
      attributes {
        name: "gputype"
        type: TEXT
        text {
          value: "titanz"
        }
      }
      id {
        value: "195fbdc8-6720-443b-b036-7fa5608b27cc-S24"
      }
      checkpoint: true
      port: 5051
      
      I1124 14:45:44.043673 25610 executor_base.py:45] Executor [None]: launchTask got task: root/devel/python:root-devel-python-0-e33ad106-90dd-481a-8d45-c320990b67d8
      I1124 14:45:44.044601 25610 executor_base.py:45] Executor [195fbdc8-6720-443b-b036-7fa5608b27cc-S24]: Updating root-devel-python-0-e33ad106-90dd-481a-8d45-c320990b67d8 => STARTING
      I1124 14:45:44.044718 25610 executor_base.py:45] Executor [195fbdc8-6720-443b-b036-7fa5608b27cc-S24]:    Reason: Initializing sandbox.
      F1124 14:45:44.049196 25610 aurora_executor.py:85] Unknown exception initializing sandbox: [Errno 2] No such file or directory
      I1124 14:45:44.049439 25610 executor_base.py:45] Executor [195fbdc8-6720-443b-b036-7fa5608b27cc-S24]: Updating root-devel-python-0-e33ad106-90dd-481a-8d45-c320990b67d8 => FAILED
      I1124 14:45:44.049519 25610 executor_base.py:45] Executor [195fbdc8-6720-443b-b036-7fa5608b27cc-S24]:    Reason: Unknown exception initializing sandbox: [Errno 2] No such file or directory
      I1124 14:45:49.152787 25610 thermos_executor_main.py:299] MesosExecutorDriver.run() has finished.
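
When triaging a log like the one above, it can help to pull out only the error and fatal entries. The following is a minimal sketch, assuming the lines follow the header format printed at the top of the log ([DIWEF]mmdd hh:mm:ss.uuuuuu pid file:line] msg). The names GLOG_LINE, LogRecord, parse_line and fatal_records are hypothetical helpers written for this illustration and are not part of Aurora or Thermos.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Pattern for the header documented in the log itself:
#   [DIWEF]mmdd hh:mm:ss.uuuuuu pid file:line] msg
GLOG_LINE = re.compile(
    r"^(?P<level>[DIWEF])(?P<month>\d{2})(?P<day>\d{2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<pid>\d+)\s+(?P<source>[^\s:]+):(?P<line>\d+)\]\s?(?P<msg>.*)$"
)


class LogRecord(NamedTuple):
    level: str   # one of D, I, W, E, F
    time: str    # hh:mm:ss.uuuuuu
    pid: int
    source: str  # file that emitted the message
    line: int    # line number within that file
    msg: str


def parse_line(text: str) -> Optional[LogRecord]:
    """Return a LogRecord, or None for continuation lines (e.g. protobuf dumps)."""
    match = GLOG_LINE.match(text.strip())
    if match is None:
        return None
    return LogRecord(
        level=match["level"],
        time=match["time"],
        pid=int(match["pid"]),
        source=match["source"],
        line=int(match["line"]),
        msg=match["msg"],
    )


def fatal_records(lines: Iterable[str]) -> List[LogRecord]:
    """Keep only entries logged at ERROR (E) or FATAL (F) level."""
    records = (parse_line(line) for line in lines)
    return [r for r in records if r is not None and r.level in ("E", "F")]


# Two lines copied from the log in this report.
sample = [
    "I1124 14:45:44.044718 25610 executor_base.py:45] Executor "
    "[195fbdc8-6720-443b-b036-7fa5608b27cc-S24]:    Reason: Initializing sandbox.",
    "F1124 14:45:44.049196 25610 aurora_executor.py:85] Unknown exception "
    "initializing sandbox: [Errno 2] No such file or directory",
]

for record in fatal_records(sample):
    print(f"{record.level} {record.source}:{record.line} {record.msg}")
```

Run against the full log, this kind of filter would surface only the aurora_executor.py:85 entry. It does not show which path was missing, since the executor logged the exception message without a traceback.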
      

      stderr:

      I1124 14:45:43.559283 25614 fetcher.cpp:498] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/195fbdc8-6720-443b-b036-7fa5608b27cc-S24\/root","items":[{"action":"BYPASS_CACHE","uri":{"executable":true,"extract":true,"value":"\/usr\/bin\/thermos_executor"}}],"sandbox_directory":"\/var\/lib\/mesos\/slave\/slaves\/195fbdc8-6720-443b-b036-7fa5608b27cc-S24\/frameworks\/195fbdc8-6720-443b-b036-7fa5608b27cc-0014\/executors\/thermos-root-devel-python-0-e33ad106-90dd-481a-8d45-c320990b67d8\/runs\/e25e2e98-0b65-4e9f-a86d-13a18dff01bc","user":"root"}
      I1124 14:45:43.561226 25614 fetcher.cpp:409] Fetching URI '/usr/bin/thermos_executor'
      I1124 14:45:43.561242 25614 fetcher.cpp:250] Fetching directly into the sandbox directory
      I1124 14:45:43.561266 25614 fetcher.cpp:187] Fetching URI '/usr/bin/thermos_executor'
      I1124 14:45:43.561285 25614 fetcher.cpp:167] Copying resource with command:cp '/usr/bin/thermos_executor' '/var/lib/mesos/slave/slaves/195fbdc8-6720-443b-b036-7fa5608b27cc-S24/frameworks/195fbdc8-6720-443b-b036-7fa5608b27cc-0014/executors/thermos-root-devel-python-0-e33ad106-90dd-481a-8d45-c320990b67d8/runs/e25e2e98-0b65-4e9f-a86d-13a18dff01bc/thermos_executor'
      I1124 14:45:43.569787 25614 fetcher.cpp:547] Fetched '/usr/bin/thermos_executor' to '/var/lib/mesos/slave/slaves/195fbdc8-6720-443b-b036-7fa5608b27cc-S24/frameworks/195fbdc8-6720-443b-b036-7fa5608b27cc-0014/executors/thermos-root-devel-python-0-e33ad106-90dd-481a-8d45-c320990b67d8/runs/e25e2e98-0b65-4e9f-a86d-13a18dff01bc/thermos_executor'
      twitter.common.app debug: Initializing: twitter.common.log (Logging subsystem.)
      Writing log files to disk in /var/lib/mesos/slave/slaves/195fbdc8-6720-443b-b036-7fa5608b27cc-S24/frameworks/195fbdc8-6720-443b-b036-7fa5608b27cc-0014/executors/thermos-root-devel-python-0-e33ad106-90dd-481a-8d45-c320990b67d8/runs/e25e2e98-0b65-4e9f-a86d-13a18dff01bc
      I1124 14:45:44.033974 25610 exec.cpp:161] Version: 1.0.0
      I1124 14:45:44.040127 25639 exec.cpp:236] Executor registered on agent 195fbdc8-6720-443b-b036-7fa5608b27cc-S24
      FATAL] Unknown exception initializing sandbox: [Errno 2] No such file or directory
      twitter.common.app debug: Shutting application down.
      twitter.common.app debug: Running exit function for twitter.common.log (Logging subsystem.)
      twitter.common.app debug: Finishing up module teardown.
      twitter.common.app debug:   Active thread: <_MainThread(MainThread, started 139772146038592)>
      twitter.common.app debug:   Active thread (daemon): <_DummyThread(Dummy-2, started daemon 139771946940160)>
      twitter.common.app debug: Exiting cleanly.
      

          People

            Assignee: Unassigned
            Reporter: Kostiantyn Bokhan (kr0t)