Apache Storm / STORM-3024

Allow scheduling for RAS to happen in the background

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: storm-server

    Description

      We have recently run into issues where a scheduling strategy on a very large cluster occasionally takes an extremely long time to finish.  This slowness cascades into other problems, such as topologies that cannot be killed because the timer thread is still busy running the scheduler.

      The plan is to make scheduling happen in a thread pool.  The main thread will wait up to a configurable amount of time for the topology to be scheduled.  If scheduling does not complete in that time, it will be left running in the background thread in the hope that the topology will be scheduled on a later pass.
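
      The bounded wait described above could look roughly like the following Java sketch. It is not taken from storm-server: the class `BoundedWaitScheduler`, the `maxWaitMs` setting, and the use of a plain `String` as the assignment are all hypothetical stand-ins chosen for illustration.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; none of these names come from storm-server.
class BoundedWaitScheduler {
    private final ExecutorService pool;
    private final long maxWaitMs; // stands in for the configurable wait
    // Attempts that are still running, keyed by topology id.
    private final Map<String, Future<String>> pending = new ConcurrentHashMap<>();

    BoundedWaitScheduler(int threads, long maxWaitMs) {
        this.maxWaitMs = maxWaitMs;
        // Daemon threads so a slow strategy cannot keep the JVM alive.
        this.pool = Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r, "background-scheduler");
            t.setDaemon(true);
            return t;
        });
    }

    // Returns the assignment if the strategy finished within maxWaitMs.
    // An empty result means the attempt is still running in the pool and
    // the next call for the same topology will wait on it again.
    Optional<String> schedule(String topologyId, Callable<String> strategy) {
        Future<String> attempt =
            pending.computeIfAbsent(topologyId, id -> pool.submit(strategy));
        try {
            String assignment = attempt.get(maxWaitMs, TimeUnit.MILLISECONDS);
            pending.remove(topologyId);
            return Optional.of(assignment);
        } catch (TimeoutException e) {
            return Optional.empty(); // caller is free to handle kills etc.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.empty();
        } catch (ExecutionException e) {
            pending.remove(topologyId);
            throw new IllegalStateException("strategy failed for " + topologyId, e);
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

      In this sketch the calling thread is blocked for at most `maxWaitMs` per topology, and a slow strategy is submitted only once; later calls reuse the same in-flight attempt instead of starting a new one.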

      If the state of the cluster changes while scheduling is happening in the background, we will cancel the scheduling, as any assignment it produces may no longer fit on the cluster.  The next time the scheduler runs it will restart the scheduling, which should let the cluster reach a steady state even if it takes a while, without blocking kills and other critical operations.
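
      One way to picture the cancel-and-restart behaviour is the Java sketch below. It assumes cluster changes can be tracked with a simple version counter; `RestartOnClusterChange`, `clusterChanged`, and `runRound` are hypothetical names, not Storm APIs.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; names are illustrative, not Storm APIs.
class RestartOnClusterChange {
    private final ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "background-scheduler");
        t.setDaemon(true);
        return t;
    });
    // Assumption: every relevant cluster change bumps this counter.
    private final AtomicLong clusterVersion = new AtomicLong();
    private Future<String> pending;  // current background attempt, if any
    private long pendingVersion;     // cluster version the attempt started from

    void clusterChanged() {
        clusterVersion.incrementAndGet();
    }

    // One scheduler pass: cancel an attempt that was computed against an
    // older cluster state, then start a new one if nothing is in flight.
    synchronized Future<String> runRound(Callable<String> strategy) {
        long current = clusterVersion.get();
        if (pending != null && pendingVersion != current) {
            pending.cancel(true); // interrupts the strategy if it is running
            pending = null;
        }
        if (pending == null) {
            pending = pool.submit(strategy);
            pendingVersion = current;
        }
        return pending;
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

      Cancellation here relies on thread interruption, so a real strategy would need to check for interrupts at reasonable intervals for the cancel to take effect promptly.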

      Note that we are also working on optimizing scheduling so that these issues don't happen in the first place.

            People

              Assignee: Robert Joseph Evans (revans2)
              Reporter: Robert Joseph Evans (revans2)
              Votes: 0
              Watchers: 1

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 20m