  Ignite / IGNITE-3212

Servers get stuck with the warning "Failed to wait for initial partition map exchange" during failover test

Details

    • Type: Bug
    • Status: Reopened
    • Priority: Critical
    • Resolution: Cannot Reproduce
    • Affects Version/s: 1.6
    • Fix Version/s: 3.0
    • Component/s: None
    • Labels: None

    Description

      Servers that are restarted during the failover test eventually get stuck with the warning "Failed to wait for initial partition map exchange".

      [08:44:41,303][INFO ][disco-event-worker-#80%null%][GridDiscoveryManager] Added new node to topology: TcpDiscoveryNode [id=db557f04-43b7-4e28-ae0d-d4dcf4139c89, addrs=[10.20.0.222, 127.0.0.1], sockAddrs=[fosters-222/10.20.0.222:47503, /10.20.0.222:47503, /127.0.0.1:47503], discPort=47503, order=44, intOrder=32, lastExchangeTime=1464363880917, loc=false, ver=1.6.0#20160525-sha1:48321a40, isClient=false]
      [08:44:41,304][INFO ][disco-event-worker-#80%null%][GridDiscoveryManager] Topology snapshot [ver=44, servers=19, clients=1, CPUs=64, heap=160.0GB]
      [08:45:11,455][INFO ][disco-event-worker-#80%null%][GridDiscoveryManager] Added new node to topology: TcpDiscoveryNode [id=6fae61a7-c1c1-40e5-8ad0-8bf5d6c86eb7, addrs=[10.20.0.223, 127.0.0.1], sockAddrs=[fosters-223/10.20.0.223:47503, /10.20.0.223:47503, /127.0.0.1:47503], discPort=47503, order=45, intOrder=33, lastExchangeTime=1464363910999, loc=false, ver=1.6.0#20160525-sha1:48321a40, isClient=false]
      [08:45:11,455][INFO ][disco-event-worker-#80%null%][GridDiscoveryManager] Topology snapshot [ver=45, servers=20, clients=1, CPUs=64, heap=170.0GB]
      [08:45:19,942][INFO ][ignite-update-notifier-timer][GridUpdateNotifier] Update status is not available.
      [08:46:20,370][WARN ][main][GridCachePartitionExchangeManager] Failed to wait for initial partition map exchange. Possible reasons are:
        ^-- Transactions in deadlock.
        ^-- Long running transactions (ignore if this is the case).
        ^-- Unreleased explicit locks.
      [08:48:30,375][WARN ][main][GridCachePartitionExchangeManager] Still waiting for initial partition map exchange ...
      

      Other nodes log "Failed to wait for partition release future" warnings:

      [08:09:45,822][WARN ][exchange-worker-#82%null%][GridDhtPartitionsExchangeFuture] Failed to wait for partition release future [topVer=AffinityTopologyVersion [topVer=29, minorTopVer=0], node=cab5d0e0-7365-4774-8f99-d9f131c5d896]. Dumping pending objects that might be the cause:
      [08:09:45,822][WARN ][exchange-worker-#82%null%][GridCachePartitionExchangeManager] Ready affinity version: AffinityTopologyVersion [topVer=28, minorTopVer=1]
      [08:09:45,826][WARN ][exchange-worker-#82%null%][GridCachePartitionExchangeManager] Last exchange future: GridDhtPartitionsExchangeFuture ...
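
      When triaging logs from many nodes, it can help to pull out just the warnings quoted in this report. The following is a minimal sketch, not part of Ignite: the function name find_stuck_warnings and the regular expression are assumptions based only on the log layout visible in the excerpts above.

```python
import re

# Assumed layout, taken from the excerpts in this report:
# [HH:MM:SS,mmm][LEVEL][thread][logger] message
LOG_LINE = re.compile(
    r"^\[(?P<time>\d{2}:\d{2}:\d{2},\d{3})\]"
    r"\[(?P<level>[A-Z]+)\s*\]"
    r"\[(?P<thread>[^\]]*)\]"
    r"\[(?P<logger>[^\]]*)\]\s*"
    r"(?P<message>.*)$"
)

# Warning texts quoted in this report.
STUCK_MARKERS = (
    "Failed to wait for initial partition map exchange",
    "Still waiting for initial partition map exchange",
    "Failed to wait for partition release future",
)


def find_stuck_warnings(lines):
    """Return a list of (time, logger, marker) for WARN lines containing a known marker."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        # Skip continuation lines (e.g. "^-- ...") and anything that is not a WARN.
        if match is None or match.group("level") != "WARN":
            continue
        for marker in STUCK_MARKERS:
            if marker in match.group("message"):
                hits.append((match.group("time"), match.group("logger"), marker))
                break
    return hits
```

      Running it over each node's log file (for example, find_stuck_warnings(open(path))) gives a quick view of which nodes report which of the three warnings, and when.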
      

      Load config:

      • 1 client, 20 servers (5 servers per host)
      • warmup 60
      • duration 66h
      • preload 5M
      • key range 10M
      • operations: PUT PUT_ALL GET GET_ALL INVOKE INVOKE_ALL REMOVE REMOVE_ALL PUT_IF_ABSENT REPLACE
      • backups count 3
      • 3 servers are restarted every 15 min with a 30 sec step; the pause between stop and start is 5 min
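
      The restart item in the list above can be read as a timeline. The sketch below is one possible interpretation, not the actual test harness: it assumes that in each 15-minute cycle the three servers are stopped 30 seconds apart and that each one is started 5 minutes after its own stop. The function name restart_schedule and its parameters are hypothetical, and offsets are relative to the first cycle (how that aligns with the warmup is not stated in the report).

```python
from datetime import timedelta


def restart_schedule(cycles, servers_per_cycle=3, period_min=15, step_sec=30, pause_min=5):
    """Return a list of (cycle, slot, stop_offset, start_offset) tuples.

    Assumed reading of the load config: servers within a cycle are stopped
    step_sec apart, and each is started pause_min after its own stop.
    """
    events = []
    for cycle in range(cycles):
        cycle_start = timedelta(minutes=period_min * cycle)
        for slot in range(servers_per_cycle):
            stop_at = cycle_start + timedelta(seconds=step_sec * slot)
            start_at = stop_at + timedelta(minutes=pause_min)
            events.append((cycle, slot, stop_at, start_at))
    return events


if __name__ == "__main__":
    for cycle, slot, stop_at, start_at in restart_schedule(cycles=2):
        print(f"cycle {cycle} server slot {slot}: stop at +{stop_at}, start at +{start_at}")
```

      Under these assumptions, all three restarted servers are down at the same time for about 4 minutes in each cycle (from 1:00 to 5:00 after the cycle begins), and each rejoin triggers a new topology version and partition map exchange.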

          People

            Assignee: Unassigned
            Reporter: Ksenia Rybakova (krybakova)
            Votes: 0
            Watchers: 5
