  1. Apache Storm
  2. STORM-2043

Nimbus should not reassign workers repeatedly when Pacemaker is down


Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 1.0.0, 1.0.1, 1.0.2, 1.1.0
    • None
    • storm-core
    • CentOS 6.5
    • Important

    Description

      When Pacemaker goes down, all worker heartbeats are lost. Even if Pacemaker comes back up immediately, the heartbeats take a long time to recover when the heartbeat data amounts to dozens of GB of memory. While the worker heartbeats are incomplete, Nimbus considers the workers dead (heartbeat timeout) and reassigns them over and over. The workers are actually healthy, so the reassignment keeps cycling until the heartbeats in Pacemaker recover. During this time, the throughput of all topologies drops. We should avoid this, because Pacemaker has no HA.
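
One possible guard, sketched below as a standalone Python simulation rather than as Storm code: if most expected heartbeats go missing at the same moment, treat that as a problem with the heartbeat store and defer reassignment. Every name here (`HeartbeatView`, `workers_to_reassign`, `timeout_secs`, `min_coverage`) and both default values are hypothetical and chosen only for illustration; none of them is an existing Nimbus or Pacemaker API.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatView:
    """Snapshot of what a scheduler sees. Hypothetical type, not a Storm class."""
    expected_workers: set  # worker ids listed in the current assignments
    last_beat: dict        # worker id -> time of last heartbeat seen, in seconds


def workers_to_reassign(view, now, timeout_secs=30.0, min_coverage=0.5):
    """Return the workers considered dead.

    Returns an empty set when so many heartbeats are missing at once that
    the heartbeat store itself is the more likely culprit.
    """
    if not view.expected_workers:
        return set()

    # A worker with no recorded heartbeat counts as timed out.
    timed_out = {
        w for w in view.expected_workers
        if now - view.last_beat.get(w, float("-inf")) > timeout_secs
    }

    # Fraction of expected workers that still have a fresh heartbeat.
    coverage = 1.0 - len(timed_out) / len(view.expected_workers)

    # Mass heartbeat loss suggests a store restart, not dead workers:
    # skip reassignment for this round instead of moving healthy workers.
    if coverage < min_coverage:
        return set()
    return timed_out
```

With four expected workers, a single stale heartbeat still leads to that one worker being reassigned, while an empty heartbeat map (as right after a Pacemaker restart) yields no reassignments at all. A real fix would also need a bound on how long reassignment may be deferred, which this sketch leaves out.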


          People

            Assignee: Unassigned
            Reporter: Danny Chen (danny0405)
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:

              Time Tracking

                Estimated: 672h
                Remaining: 672h
                Logged: Not Specified