Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
1.4.0
Description
I'm finding that if framework registration fails, the Mesos driver client will hang indefinitely with the following output:
I0809 20:04:22.479391 73 sched.cpp:1187] Got error ''FrameworkInfo.role' is not a valid role: Role '/test/role/slashes' cannot start with a slash'
I0809 20:04:22.479658 73 sched.cpp:2055] Asked to abort the driver
I0809 20:04:22.479843 73 sched.cpp:1233] Aborting framework
I'd have expected one or both of the following:
- SchedulerDriver.run() should have exited with a failed Protos.Status of some form
- Scheduler.error() should have been invoked when the "Got error" occurred
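
To make the expectation concrete, here is a minimal sketch of the behavior I'd have expected. It does not use the real Mesos bindings: `Status`, `Scheduler`, `FakeDriver`, and `RecordingScheduler` are hypothetical stand-ins written for illustration, and the role check is a simplification of whatever validation actually produces the "Got error" message.

```java
import java.util.ArrayList;
import java.util.List;

public class RegistrationFailureSketch {
    // Stand-in for the driver status enum; NOT the real Protos.Status class.
    enum Status { DRIVER_NOT_STARTED, DRIVER_RUNNING, DRIVER_ABORTED, DRIVER_STOPPED }

    // Stand-in for the scheduler callback interface; only error() is modeled.
    interface Scheduler {
        void error(String message);
    }

    // Hypothetical driver that models the expected behavior on a rejected role.
    static final class FakeDriver {
        private final Scheduler scheduler;
        private final String role;

        FakeDriver(Scheduler scheduler, String role) {
            this.scheduler = scheduler;
            this.role = role;
        }

        // Blocks until the driver finishes; here it returns immediately.
        Status run() {
            // Assumption: a leading slash is the only invalid case modeled here.
            if (role.startsWith("/")) {
                // Expectation 2: the scheduler is told about the failure.
                scheduler.error("Role '" + role + "' cannot start with a slash");
                // Expectation 1: run() returns a failed status instead of hanging.
                return Status.DRIVER_ABORTED;
            }
            return Status.DRIVER_STOPPED;
        }
    }

    // Scheduler that records error messages so the caller can inspect them.
    static final class RecordingScheduler implements Scheduler {
        final List<String> errors = new ArrayList<>();

        @Override
        public void error(String message) {
            errors.add(message);
        }
    }

    public static void main(String[] args) {
        RecordingScheduler scheduler = new RecordingScheduler();
        Status status = new FakeDriver(scheduler, "/test/role/slashes").run();
        System.out.println(status + " " + scheduler.errors);
    }
}
```

With either of these in place, the calling code could check the returned status (or react in its error callback) and exit instead of staying up.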
Steps to reproduce:
- Launch a scheduler instance and have it register with a known-bad FrameworkInfo. In this case, a role with a leading slash ('/test/role/slashes') was used.
- Observe that the scheduler continues in a TASK_RUNNING state despite the failed registration. From all appearances, the Scheduler implementation isn't invoked at all.
I'd guess that because this failure happens before framework registration, there's some error handling that isn't fully initialized at this point.