Geode / GEODE-2534

concurrently started locators fail to create a unified system

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0
    • Component/s: locator
    • Labels:
      None

      Description

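      During startup, a locator responded to a "find coordinator" request before it knew its own identity. Once it learned its identity, it answered subsequent requests differently while the other locators were still starting concurrently. As a result, it created its own distributed system, while the locator that received the initial response created a different one. In the log excerpt from locator-default-0 below, the first response names locator-default-2 as coordinator with senderId=null; the second names locator-default-0 itself.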

      [fine 2017/02/23 15:32:02.031 UTC locator-default-0 <main> tid=0x1] LogWriter is created.
      
      [fine 2017/02/23 15:32:02.031 UTC locator-default-0 <main> tid=0x1] Responding to a property change event. Property name is config.
      
      [info 2017/02/23 15:32:02.886 UTC locator-default-0 <main> tid=0x1] Peer locator is connecting to local membership services
      
      [fine 2017/02/23 15:32:02.887 UTC locator-default-0 <locator request thread[1]> tid=0x14] Peer locator: coordinator from registrations is 10.85.100.166(locator-default-2:8706:locator)<ec>:49152
      
      [fine 2017/02/23 15:32:02.887 UTC locator-default-0 <locator request thread[1]> tid=0x14] Peer locator returning FindCoordinatorResponse(coordinator=10.85.100.166(locator-default-2:8706:locator)<ec>:49152, fromView=false, viewId=nul, registrants=1, senderId=null, network partition detection enabled=true, locators preferred as coordinators=true)
      
      [info 2017/02/23 15:32:02.891 UTC locator-default-0 <main> tid=0x1] Starting membership services
      
      [fine 2017/02/23 15:32:02.891 UTC locator-default-0 <main> tid=0x1] starting Authenticator
      
      [fine 2017/02/23 15:32:02.891 UTC locator-default-0 <main> tid=0x1] starting Messenger
      
      ...
      
      [fine 2017/02/23 15:32:03.369 UTC locator-default-0 <main> tid=0x1] All membership services have been started
      
      [fine 2017/02/23 15:32:03.369 UTC locator-default-0 <main> tid=0x1] join timeout is set to 24000
      
      [fine 2017/02/23 15:32:03.370 UTC locator-default-0 <main> tid=0x1] searching for the membership coordinator
      
      [fine 2017/02/23 15:32:03.370 UTC locator-default-0 <main> tid=0x1] sending FindCoordinatorRequest(memberID=10.85.100.165(locator-default-0:8873:locator)<ec>:49152, rejected=[], lastViewId=-1) to [/10.85.100.165:55221, /10.85.100.166:55221, /10.85.100.167:55221]
      
      ...
      
      [fine 2017/02/23 15:32:03.376 UTC locator-default-0 <locator request thread[1]> tid=0x14] Peer locator: coordinator from registrations is 10.85.100.165(locator-default-0:8873:locator)<ec>:49152
      
      [fine 2017/02/23 15:32:03.376 UTC locator-default-0 <locator request thread[1]> tid=0x14] Peer locator returning FindCoordinatorResponse(coordinator=10.85.100.165(locator-default-0:8873:locator)<ec>:49152, fromView=false, viewId=nul, registrants=2, senderId=10.85.100.165(locator-default-0:8873:locator)<ec>:49152, network partition detection enabled=true, locators preferred as coordinators=true)
      

      The locator should not respond to requests to find the coordinator before it knows its own identity.
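
      The requirement above can be sketched as a guard on the request handler. This is a minimal illustration, not Geode's actual locator code or the fix that was committed: the class IdentityGatedLocator, its methods, and the "lowest member ID wins" election rule are all hypothetical stand-ins. It only shows the idea that a locator records registrants but withholds a coordinator answer until its own member ID is known.

```java
import java.util.Optional;
import java.util.TreeSet;

/** Hypothetical sketch; names and election rule are assumptions, not Geode APIs. */
final class IdentityGatedLocator {
    // Member IDs seen so far, kept sorted so first() is a stable stand-in election rule.
    private final TreeSet<String> registrants = new TreeSet<>();

    // This locator's own member ID; null until membership services have started.
    private String localId;

    /** Called by the startup thread once the local member ID is known. */
    synchronized void setLocalIdentity(String memberId) {
        localId = memberId;
        registrants.add(memberId);
    }

    /**
     * Handles a find-coordinator request. The requester is always registered,
     * but no coordinator is returned while the local identity is unknown, so
     * an early answer can never contradict a later one.
     */
    synchronized Optional<String> findCoordinator(String requesterId) {
        registrants.add(requesterId);
        if (localId == null) {
            return Optional.empty(); // caller is expected to retry later
        }
        return Optional.of(registrants.first());
    }
}
```

      Under these assumptions, a request that arrives before setLocalIdentity is called gets an empty answer instead of a coordinator chosen without the locator's own ID among the candidates, which is the situation the first FindCoordinatorResponse in the log (senderId=null) reflects.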

            People

            • Assignee:
              Unassigned
            • Reporter:
              Bruce J Schuchardt (bschuchardt)