  1. Apache Helix
  2. HELIX-134 HelixManager zk session expiry/gc handling
  3. HELIX-124

race condition in ZkHelixManager.handleNewSession()


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.6.2-incubating
    • Component/s: None
    • Labels: None
    • Sprint: Sprint #4 10/2 - 10/16

    Description

      ZkHelixManager.handleNewSession() is an async callback. There is a race condition when multiple session expiries happen back to back. For example:

      1) sessionExpiry_0 happens, and newSessionId is sessionId_1
      2) sessionExpiry_1 happens, and newSessionId is sessionId_2
      3) handleNewSession caused by sessionExpiry_0 is invoked
      4) handleNewSession caused by sessionExpiry_1 is invoked

      Since 3) and 4) both run after 2), each callback sees the latest session id, so handleNewSession is invoked twice with the same session id (sessionId_2).
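
      The ordering above can be reproduced with a small single-threaded simulation. This is only a sketch: SessionRaceDemo, currentSessionId, and run() are hypothetical names, and the queued Runnables stand in for the deferred handleNewSession callbacks. It does not use the real ZkHelixManager or ZooKeeper client.

```java
import java.util.ArrayList;
import java.util.List;

public class SessionRaceDemo {
    // Hypothetical stand-in for the session id held by the ZooKeeper connection.
    static String currentSessionId = "sessionId_0";

    // Replays steps 1) to 4) and returns the session id each callback observed.
    static List<String> run() {
        List<Runnable> pendingCallbacks = new ArrayList<>();
        List<String> observed = new ArrayList<>();

        // 1) sessionExpiry_0: a new session is established, the callback is only queued.
        currentSessionId = "sessionId_1";
        pendingCallbacks.add(() -> observed.add(currentSessionId));

        // 2) sessionExpiry_1: another new session before any callback has run.
        currentSessionId = "sessionId_2";
        pendingCallbacks.add(() -> observed.add(currentSessionId));

        // 3) and 4): both callbacks run late and read the latest session id.
        for (Runnable callback : pendingCallbacks) {
            callback.run();
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [sessionId_2, sessionId_2]
    }
}
```

      Both callbacks read the session id when they run, not when the expiry happened, which is why sessionId_1 is never observed.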

      This is problematic:
      If the manager is a PARTICIPANT, the second handleNewSession() fails to create the live-instance and resets all listeners. This leaves the PARTICIPANT in a state where the live-instance exists but no listener is registered.

      If the manager is a CONTROLLER, the listeners are added twice.
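
      One way to make a duplicate callback harmless is to remember the last session id that was handled and skip the setup when the same id arrives again. The sketch below illustrates that idea only. SessionGuardSketch and its members are hypothetical, and this is not necessarily the change that resolved this issue in Helix.

```java
import java.util.ArrayList;
import java.util.List;

public class SessionGuardSketch {
    // Last session id for which setup was performed (hypothetical field).
    private String lastHandledSessionId = null;

    // Records each session that actually went through setup, for illustration.
    final List<String> initializedSessions = new ArrayList<>();

    // Returns true if setup ran, false if this session id was already handled.
    public synchronized boolean handleNewSession(String sessionId) {
        if (sessionId.equals(lastHandledSessionId)) {
            // Duplicate callback for the same session: leave existing state untouched.
            return false;
        }
        lastHandledSessionId = sessionId;
        // Placeholder for the real work (create live-instance, register listeners).
        initializedSessions.add(sessionId);
        return true;
    }

    public static void main(String[] args) {
        SessionGuardSketch manager = new SessionGuardSketch();
        System.out.println(manager.handleNewSession("sessionId_2")); // true
        System.out.println(manager.handleNewSession("sessionId_2")); // false
        System.out.println(manager.initializedSessions);             // [sessionId_2]
    }
}
```

      The method is synchronized so that two callbacks arriving on different threads cannot both pass the check for the same session id.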

          People

            dafu Zhen Zhang
            dafu Zhen Zhang