MESOS-7872

Scheduler hang when registration fails.


Details

    • Mesosphere Sprint 61, Mesosphere Sprint 62
    • 5

    Description

      If framework registration fails, the Mesos scheduler driver hangs indefinitely after logging the following output:

      I0809 20:04:22.479391    73 sched.cpp:1187] Got error ''FrameworkInfo.role' is not a valid role: Role '/test/role/slashes' cannot start with a slash'
      I0809 20:04:22.479658    73 sched.cpp:2055] Asked to abort the driver
      I0809 20:04:22.479843    73 sched.cpp:1233] Aborting framework 
      

      I'd have expected one or both of the following:

      • SchedulerDriver.run() should have returned with a failed Protos.Status of some form
      • Scheduler.error() should have been invoked when the "Got error" message was logged

      Steps to reproduce:

      • Launch a scheduler instance and have it register with a known-bad FrameworkInfo. In this case, a role starting with a slash (/test/role/slashes) was used.
      • Observe that the scheduler stays in the TASK_RUNNING state despite the failed registration. From all appearances, the Scheduler implementation is never invoked.
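
      Until the driver surfaces this error, a framework could reject the bad role itself before starting the driver. The following is a minimal sketch, assuming only the leading-slash rule visible in the log above. RolePreflight and check are hypothetical names, not part of the Mesos API, and the master enforces further role rules that are not reproduced here.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical pre-flight helper; NOT part of the Mesos API.
final class RolePreflight {
    private RolePreflight() {}

    /**
     * Returns a description of the problem if the role starts with a slash,
     * or an empty Optional otherwise. Assumption: only the single rule seen
     * in the "Got error" log line is checked; other master-side rules are not.
     */
    static Optional<String> check(String role) {
        Objects.requireNonNull(role, "role");
        if (role.startsWith("/")) {
            return Optional.of("Role '" + role + "' cannot start with a slash");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Defaults to the role from this report when no argument is given.
        String role = args.length > 0 ? args[0] : "/test/role/slashes";
        Optional<String> problem = check(role);
        if (problem.isPresent()) {
            // Fail fast instead of starting a driver that would hang.
            System.err.println("Refusing to start driver: " + problem.get());
            System.exit(1);
        }
        System.out.println("Role '" + role + "' passed the pre-flight check");
    }
}
```

      Run with no arguments, this exits non-zero for the role used in this report. It only sidesteps the hang for this one input and does not address the underlying driver behavior.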

      I'd guess that because this failure happens before framework registration, there's some error handling that isn't fully initialized at this point.


          People

            alexr Alex R
            tillt Till Toenshoff
            Anand Mazumdar Anand Mazumdar

              Agile

                Completed Sprints:
                Mesosphere Sprint 61 ended 18/Aug/17
                Mesosphere Sprint 62 ended 06/Sep/17