Qpid / QPID-33

Introduce clustering for high availability & fault tolerance

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.18
    • Component/s: Broker-J
    • Labels: None

    Description

      This task has been created as an initial placeholder from which many tasks are expected to derive.

      We currently have a clustering implementation which provides scalability but not high availability. If a broker in a cluster fails, its clients can fail over to another broker in the same cluster, but we do not have the ability to restart the failed broker on another node at its last state before failure using the saved state (from shared storage).

      The other brokers in a cluster will know about each other's queues, etc. (via broadcasting), but not about any action the failed broker was processing. We could therefore suffer message loss and state disconnect. Also note that membership of a cluster does not currently imply any automatic failover behaviour.

      We know that there are users who require HA/fault tolerant clustering with 99.999% availability.
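
      To put the 99.999% figure in concrete terms, the sketch below converts an availability percentage into a yearly downtime budget. It is illustrative arithmetic only, not part of Qpid: the class and method names are hypothetical, and a 365-day year is assumed.

```java
import java.util.Locale;

/** Illustrative only: converts an availability target into a yearly downtime budget. */
public final class AvailabilityBudget {

    // Assumption: a non-leap year of 365 days (525,600 minutes).
    private static final double MINUTES_PER_YEAR = 365.0 * 24.0 * 60.0;

    private AvailabilityBudget() {
    }

    /**
     * Returns the maximum downtime, in minutes per year, permitted by an
     * availability target expressed as a percentage (for example 99.999).
     */
    public static double downtimeMinutesPerYear(double availabilityPercent) {
        if (availabilityPercent < 0.0 || availabilityPercent > 100.0) {
            throw new IllegalArgumentException("availability must be between 0 and 100");
        }
        // The unavailable fraction of the year is (100 - availability) / 100.
        return MINUTES_PER_YEAR * (100.0 - availabilityPercent) / 100.0;
    }

    public static void main(String[] args) {
        double target = 99.999;
        System.out.println(String.format(Locale.ROOT,
                "%.3f%% availability allows about %.2f minutes of downtime per year",
                target, downtimeMinutesPerYear(target)));
    }
}
```

      At 99.999% the budget is a little over five minutes per year, so any restart or failover time counts directly against a very small allowance.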

      A holding page for clustering & HA notes, including use case content, exists here: http://cwiki.apache.org/confluence/display/qpid/ClusteringHA

      The analysis for this task will involve expanding the design documentation and inviting review before work starts on the implementation. It also requires a thorough understanding of the protocol.

          People

            kwall Keith Wall
            marnie Marnie McCormack
            Votes: 0
            Watchers: 1