Apache Helix / HELIX-134

HelixManager zk session expiry/gc handling

Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Sprint: Sprint #4 10/2 - 10/16

    Description

      The current HelixManager does not handle ZooKeeper session expiry (caused by long GC pauses) reliably, especially when long GC pauses occur frequently.

      Sub-Tasks

        1. separate HelixManager implementation for participant, controller, and distributed controller (Sub-task, Resolved, Zhen Zhang)
        2. Take care of consecutive handleNewSession() and session expiry during handleNewSession() (Sub-task, Resolved, Zhen Zhang)
        3. HelixManager#isLeader() should compare both instanceName and sessionId (Sub-task, Resolved, Zhen Zhang)
        4. handleNewSession() should wait on all left-over tasks to be cancelled successfully before start new session (Sub-task, Resolved, Zhen Zhang)
        5. Proper handling exception thrown by handleNewSessionAsParticipant() when an instance already exists (Sub-task, Open, Zhen Zhang)
        6. Flapping detection (Sub-task, Resolved, Zhen Zhang)
        7. Design and implement new api's for new HelixManager implementation (Sub-task, Resolved, Zhen Zhang)
        8. Add stress test by simulating gc and network partition for the new HelixManager implementation (Sub-task, Open, Zhen Zhang)
        9. Need to double check the logic to prevent 2 controllers to control the same cluster (Sub-task, Resolved, Zhen Zhang)
        10. Race condition between FINALIZE callbacks and Zk Callbacks (Sub-task, Resolved, Zhen Zhang)
        11. race condition in ZkHelixManager.handleNewSession() (Sub-task, Resolved, Zhen Zhang)
        12. ZkHelixManager.handleNewSession() can happen when a liveinstance already exists (Sub-task, Resolved, Unassigned)
        13. StateModel stateTransitionMethod() and rest() are not synchronized on the partition (Sub-task, Open, Zhen Zhang)
        14. ZkHelixManager.handleNewSession() and ZkHelixManager.disconnect() need to be sync'ed (Sub-task, Open, Zhen Zhang)
        15. Possible race condition in ZkHelixManager.disconnect() during zk session expiry (Sub-task, Open, Zhen Zhang)
        16. ZkHelixManager.isLeader() should check session id in addition to instance name (Sub-task, Resolved, Zhen Zhang)
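
        Sub-tasks 3 and 16 ask that isLeader() compare the session id as well as the instance name. The sketch below illustrates why that matters after a session expiry. It is not the Helix implementation: LeaderRecord, LeaderCheck, and the string values are hypothetical names introduced only for this example, and the leader record is assumed to carry the name and session id of the instance that acquired leadership.

```java
/** Hypothetical stand-in for the leader record kept in ZooKeeper; not a Helix class. */
final class LeaderRecord {
    final String instanceName;
    final String sessionId;

    LeaderRecord(String instanceName, String sessionId) {
        this.instanceName = instanceName;
        this.sessionId = sessionId;
    }
}

/** Hypothetical helper showing a leadership check on both name and session id. */
final class LeaderCheck {
    private LeaderCheck() {}

    /**
     * Returns true only when the recorded leader matches both the local
     * instance name and the local, currently active session id.
     */
    static boolean isLeader(LeaderRecord leader, String localInstance, String localSessionId) {
        if (leader == null || localInstance == null || localSessionId == null) {
            return false; // no leader record, or no usable local identity
        }
        return localInstance.equals(leader.instanceName)
            && localSessionId.equals(leader.sessionId);
    }

    public static void main(String[] args) {
        // Record written under the old session "session-A".
        LeaderRecord record = new LeaderRecord("controller_1", "session-A");

        // After an expiry the same instance reconnects with a new session id.
        // A name-only comparison would still report leadership here.
        System.out.println(isLeader(record, "controller_1", "session-B")); // false

        // Name and session id both match: the instance is the leader.
        System.out.println(isLeader(record, "controller_1", "session-A")); // true
    }
}
```

        With a name-only comparison, an instance that lost its session during a long GC pause could keep acting as leader on the strength of a record written under the expired session, which is the situation sub-task 9 is concerned with.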

          People

            Assignee: Zhen Zhang
            Reporter: Zhen Zhang
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: