  1. Apache Storm
  2. STORM-3093

Cache the storm id to executors mapping on master to avoid repeat computation

    Details

      Description

      Currently, in every scheduling round, nimbus collects the conf, topology-ser, and storm-base of all topologies in order to compute assignments, which is very heavy work. Scheduling still takes minutes even now that we have changed to RPC heartbeats and assignment distribution.

      So I decided to redesign the scheduler so that it schedules only the topologies that need it: those that have dead workers or too few workers.
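
      The selection rule described above can be sketched roughly as follows. This is only an illustration, not the actual nimbus code: the class DeltaSchedulingSketch, the WorkerStatus type, and its fields are hypothetical names, and the sketch assumes the requested, assigned, and dead worker counts per topology are already available.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the Storm code base.
final class DeltaSchedulingSketch {

    /** Assumed per-topology worker counts (illustrative type). */
    static final class WorkerStatus {
        final int requestedWorkers;
        final int assignedWorkers;
        final int deadWorkers;

        WorkerStatus(int requestedWorkers, int assignedWorkers, int deadWorkers) {
            this.requestedWorkers = requestedWorkers;
            this.assignedWorkers = assignedWorkers;
            this.deadWorkers = deadWorkers;
        }
    }

    /** A topology needs scheduling if it has dead workers or too few workers. */
    static boolean needsScheduling(WorkerStatus status) {
        return status.deadWorkers > 0
                || status.assignedWorkers < status.requestedWorkers;
    }

    /** Returns the ids of the topologies to consider in this round, in map order. */
    static List<String> topologiesToSchedule(Map<String, WorkerStatus> statusById) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, WorkerStatus> entry : statusById.entrySet()) {
            if (needsScheduling(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

      With a filter like this, healthy topologies are skipped entirely, so the cost of a round depends on the number of topologies that changed rather than on the total number in the cluster.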

      While checking the code, I found that the id->executors mapping is recomputed every time for every topology. This is a heavy and largely unnecessary computation, because the mapping is fixed for a topology unless we rebalance or kill it.
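
      A minimal sketch of the caching idea, assuming the expensive computation can be wrapped as a function from storm id to its executors. The class ExecutorMappingCache and its methods are hypothetical and do not reflect the actual patch; the executor type is left generic because the sketch does not depend on it.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: illustrative names, not the classes used in the patch.
final class ExecutorMappingCache<E> {

    // storm id -> executors, filled lazily on first lookup
    private final Map<String, List<E>> cache = new ConcurrentHashMap<>();

    // The expensive computation (assumed to read conf, topology, and storm-base).
    private final Function<String, List<E>> computeExecutors;

    ExecutorMappingCache(Function<String, List<E>> computeExecutors) {
        this.computeExecutors = computeExecutors;
    }

    /** Returns the cached mapping, computing it only on the first request. */
    List<E> executorsFor(String stormId) {
        return cache.computeIfAbsent(stormId, computeExecutors);
    }

    /** Call when a topology is rebalanced or killed, since the mapping may change. */
    void invalidate(String stormId) {
        cache.remove(stormId);
    }
}
```

      The mapping is computed once per topology and reused across scheduling rounds. Only the two events named above, rebalance and kill, need to call invalidate.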

      So I refactored the code a little here. This will be more powerful once the scheduler is redesigned for delta-scheduling (which is very lightweight even when there are thousands of topologies on one cluster).

      For now this is enough for us.

              People

              • Assignee:
                danny0405 Danny Chan
                Reporter:
                danny0405 Danny Chan
              • Votes:
                0
                Watchers:
                2

                  Time Tracking

                  Estimated:
                  Not Specified
                  Remaining:
                  0h
                  Logged:
                  1h 10m