Geode / GEODE-6522

Server hangs during shutdown after becoming membership coordinator


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.9.0
    • Component/s: membership
    • Labels: None

    Description

      Recent changes to the processing of "leave" requests can cause a member to become the membership coordinator when it receives a leave request from one member and all other potential coordinators have failed availability checks.  This is causing a hang during shutdown.  The new coordinator sends out a new view, but that view still contains the members that failed availability checks.  This causes the new View Creator thread to be stopped.  Another View Creator thread isn't started until additional suspect processing is performed, and that doesn't always happen.  As a result, shutdown can hang while other threads try to contact servers that are no longer there.
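
      The view-preparation step described above can be sketched as follows. This is only an illustration under stated assumptions, not the GMSJoinLeave code and not the fix that was applied: ViewSketch, ViewPreparationSketch, prepareView and the member names are all hypothetical. It shows the behavior the description says was missing, namely dropping members that failed availability checks (along with members that asked to leave) before the new coordinator sends its first view.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified membership view (not Geode's real view class). */
final class ViewSketch {
    final int viewId;
    final List<String> members;   // members still considered part of the cluster
    final Set<String> shutdown;   // members that sent a "leave" request
    final Set<String> crashed;    // members that failed availability checks

    ViewSketch(int viewId, List<String> members, Set<String> shutdown, Set<String> crashed) {
        this.viewId = viewId;
        this.members = members;
        this.shutdown = shutdown;
        this.crashed = crashed;
    }
}

final class ViewPreparationSketch {
    /**
     * Builds the view a new coordinator would send. Both the leaving members
     * and the members that failed availability checks are removed, so the
     * view does not list members that can no longer respond.
     */
    static ViewSketch prepareView(int viewId, List<String> currentMembers,
                                  Set<String> leaving, Set<String> failedChecks) {
        List<String> members = new ArrayList<>(currentMembers);
        members.removeAll(leaving);
        members.removeAll(failedChecks); // the step the description reports as missing
        return new ViewSketch(viewId, members,
                new LinkedHashSet<>(leaving), new LinkedHashSet<>(failedChecks));
    }

    public static void main(String[] args) {
        // Illustrative member names only; they are not taken from the log.
        ViewSketch view = prepareView(
                15,
                List.of("locator-a", "locator-b", "server-1", "server-2"),
                Set.of("locator-a"),   // requested to leave
                Set.of("locator-b"));  // failed its availability check
        System.out.println("members: " + view.members);
        System.out.println("shutdown: " + view.shutdown);
        System.out.println("crashed: " + view.crashed);
    }
}
```

      In the log below, view 15 lists three locators as shutdown while failure detection still reports a suspect among the members, which is the situation this sketch tries to avoid.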

       

      [info 2019/03/13 08:38:24.959 PDT <Geode Membership View Creator> tid=0x147] finished waiting for responses to view preparation
      
      [info 2019/03/13 08:38:24.959 PDT <Geode Membership View Creator> tid=0x147] received new view: View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|15] members: [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019, turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown: [turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000, turtle(locatorgemfire_2_4_host1_17715:17715:locator)<ec><v1>:41003, turtle(locatorgemfire_2_2_host1_17676:17676:locator)<ec><v1>:41005]
      old view is: View[turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000|3] members: [turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000, turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, turtle(locatorgemfire_2_4_host1_17715:17715:locator)<ec><v1>:41003, turtle(locatorgemfire_2_2_host1_17676:17676:locator)<ec><v1>:41005, turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019, turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023]
      
      [info 2019/03/13 08:38:24.974 PDT <Geode Membership View Creator> tid=0x147] Failure detection is now watching turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015; suspects are {turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002=View[turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000|3] members: [turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000, turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, turtle(locatorgemfire_2_4_host1_17715:17715:locator)<ec><v1>:41003, turtle(locatorgemfire_2_2_host1_17676:17676:locator)<ec><v1>:41005, turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019, turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023]}
      
      [info 2019/03/13 08:38:24.981 PDT <Geode Membership View Creator> tid=0x147] sending new view View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|15] members: [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019, turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown: [turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000, turtle(locatorgemfire_2_4_host1_17715:17715:locator)<ec><v1>:41003, turtle(locatorgemfire_2_2_host1_17676:17676:locator)<ec><v1>:41005]
      
      [info 2019/03/13 08:38:24.981 PDT <Geode Membership View Creator> tid=0x147] BRUCE: setting shutdown flag in view creator
      java.lang.Exception: stack trace
      at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.setShutdownFlag(GMSJoinLeave.java:2247)
      at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.prepareAndSendView(GMSJoinLeave.java:2713)
      at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.sendInitialView(GMSJoinLeave.java:2220)
      at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.run(GMSJoinLeave.java:2299)
      
      [info 2019/03/13 08:38:24.982 PDT <Geode Membership View Creator> tid=0x147] View Creator thread is exiting
      
      [info 2019/03/13 08:38:24.982 PDT <Geode Membership View Creator> tid=0x147] BRUCE: setting shutdown flag in view creator
      java.lang.Exception: stack trace
      at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.setShutdownFlag(GMSJoinLeave.java:2247)
      at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.run(GMSJoinLeave.java:2379)
      
      [info 2019/03/13 08:38:26.416 PDT <vm_4_thr_8_bridge_2_1_host1_17023> tid=0x150] GemFireCache[id = 66144348; isClosing = true; isShutDownAll = false; created = Wed Mar 13 08:34:41 PDT 2019; server = false; copyOnRead = false; lockLease = 120; lockTimeout = 60]: Now closing.
      
      
      
      ...
      
      [info 2019/03/13 08:38:27.495 PDT <Geode Membership View Creator> tid=0x161] View Creator thread is starting
      
      [info 2019/03/13 08:38:27.507 PDT <Geode Membership View Creator> tid=0x161] preparing new view View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|21] members: [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown: [turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019]
      
      [info 2019/03/13 08:38:27.508 PDT <Geode Membership View Creator> tid=0x161] finished waiting for responses to view preparation
      
      [info 2019/03/13 08:38:27.508 PDT <Geode Membership View Creator> tid=0x161] received new view: View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|21] members: [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown: [turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019]
      old view is: View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|15] members: [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019, turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown: [turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000, turtle(locatorgemfire_2_4_host1_17715:17715:locator)<ec><v1>:41003, turtle(locatorgemfire_2_2_host1_17676:17676:locator)<ec><v1>:41005]
      
      [info 2019/03/13 08:38:27.566 PDT <Geode Membership View Creator> tid=0x161] sending new view View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|21] members: [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown: [turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019]
      
      [info 2019/03/13 08:38:27.567 PDT <Geode Membership View Creator> tid=0x161] BRUCE: setting shutdown flag in view creator
      java.lang.Exception: stack trace
      at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.setShutdownFlag(GMSJoinLeave.java:2247)
      at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.prepareAndSendView(GMSJoinLeave.java:2713)
      at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.sendInitialView(GMSJoinLeave.java:2220)
      at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.run(GMSJoinLeave.java:2299)
      
      [info 2019/03/13 08:38:27.567 PDT <Geode Membership View Creator> tid=0x161] View Creator thread is exiting
      
      

      etc.

       


          People

            bschuchardt Bruce J Schuchardt
            bschuchardt Bruce J Schuchardt

              Time Tracking

              Estimated: Not Specified
              Remaining: 0h
              Logged: 1h