Details
- New Feature
- Status: Resolved
- Major
- Resolution: Fixed
- 1.0.0, 2.0.0
- None
Description
We have been trying to roll out CGROUP enforcement and are running into a number of race conditions in the supervisor. With CGROUPS enabled, the timing of some operations is different, which exposes issues that we would not see otherwise.
In order to make progress with testing and deploying CGROUP and RAS, we are going to try to refactor the supervisor to have a simpler threading model, though likely with more threads. We will base the code on the Java code currently in master and may replace that code in the 2.0 release, but we plan on having it be a part of 1.x too, if it truly is more stable.
I will try to keep this JIRA up to date with what we are doing and the architecture to keep the community informed. We need to move quickly to meet some of our company goals but will not just shove this in. We welcome any feedback on the design and code before it goes into the community.
Attachments
Issue Links
- breaks
  - STORM-2336 LocalCluster doesn't terminate properly after cluster shutdown (needs Ctrl+C to terminate)
  - Resolved
- is related to
  - STORM-2690 resurrect invocation of ISupervisor.assigned() & make Supervisor.launchDaemon() accessible
  - Resolved
- supersedes
  - STORM-2071 nimbus-test test-leadership failing with Exception
  - Resolved
- links to