Ignite / IGNITE-10589

Multiple server node failures after a client node stops

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8
    • Component/s: None
    • Labels: None

    Description

      After stopping a client node, we see the topology change and PME
      (partition map exchange) finish on the coordinator.
      Soon afterwards, the other nodes still do not see the new topology, but hit a
      critical error that causes the nodes to fail.
      Coordinator (crd) log:

      2018-12-06 15:55:23.660 [WARN ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager] Node FAILED: ZookeeperClusterNode [id=979f03db-f858-44f6-8646-12034dfd5c93, addrs=[10.116.206.1], order=129, loc=false, client=true]
      2018-12-06 15:55:23.660 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager] Topology snapshot [ver=162, servers=128, clients=0, CPUs=7168, offheap=140000.0GB, heap=4000.0GB]
      2018-12-06 15:55:23.660 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- Node [id=44D27930-80E5-4EB7-B377-8B07C02C2033, clusterState=ACTIVE]
      2018-12-06 15:55:23.660 [INFO ][zk-DPL_GRID%DplGridNodeName-EventThread][o.a.i.s.d.z.i.ZookeeperDiscoveryImpl] Process alive nodes change [alives=128]
      2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- Baseline [id=0, size=128, online=128, offline=0]
      2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager] Data Regions Configured:
      2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- dpl_mem_plc [initSize=256.0 MiB, maxSize=556.6 GiB, persistenceEnabled=true]
      2018-12-06 15:55:23.661 [INFO ][disco-event-worker-#159%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]   ^-- not-persisted [initSize=256.0 MiB, maxSize=556.6 GiB, persistenceEnabled=false]
      2018-12-06 15:55:23.670 [DEBUG][sys-#564%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.l.ExchangeLatchManager] Process node left 979f03db-f858-44f6-8646-12034dfd5c93
      2018-12-06 15:55:23.670 [INFO ][exchange-worker-#160%DPL_GRID%DplGridNodeName%][o.a.ignite.internal.exchange.time] Started exchange init [topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], crd=true, evt=NODE_FAILED, evtNode=979f03db-f858-44f6-8646-12034dfd5c93, customEvt=null, allowMerge=true]
      2018-12-06 15:55:23.712 [INFO ][exchange-worker-#160%DPL_GRID%DplGridNodeName%][o.a.ignite.internal.exchange.time] Finished exchange init [topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], crd=true]
      2018-12-06 15:55:23.699 [INFO ][exchange-worker-#160%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], err=null]
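
To compare how far each node got, it may help to pull the topology version out of log lines like the ones above. The following is a minimal sketch, assuming a hypothetical helper class `TopVerExtractor` (not part of Ignite) and only the two message shapes that appear in this report ("Topology snapshot [ver=...]" and "AffinityTopologyVersion [topVer=...]"):

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Ignite: extracts the major topology version from a log line. */
final class TopVerExtractor {
    // Matches "Topology snapshot [ver=162" and "AffinityTopologyVersion [topVer=162".
    private static final Pattern TOP_VER = Pattern.compile(
        "(?:Topology snapshot \\[ver|AffinityTopologyVersion \\[topVer)=(-?\\d+)");

    /** Returns the first topology version found in the line, or empty if there is none. */
    static OptionalLong firstTopVer(String logLine) {
        Matcher m = TOP_VER.matcher(logLine);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1))) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Shortened copies of two coordinator log lines from this report.
        String snapshot = "Topology snapshot [ver=162, servers=128, clients=0]";
        String exchange = "Started exchange init [topVer=AffinityTopologyVersion "
            + "[topVer=162, minorTopVer=0], crd=true, evt=NODE_FAILED]";

        System.out.println(firstTopVer(snapshot));
        System.out.println(firstTopVer(exchange));
    }
}
```

Running the same extraction over the logs of the failed nodes should show whether they had reached version 162 at the time the error was logged.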
      

      On one node, Node(1), we have Critical error (1):

      2018-12-06 15:55:23.727 [ERROR][utility-#432%DPL_GRID%DplGridNodeName%][o.a.i.i.p.cache.GridCacheIoManager] Failed processing message [senderId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, msg=GridDhtTxPrepareRequest [nearNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033,
       futId=1d225238761-05eea259-5c25-4a4b-8469-9dd8980e218c, miniId=105, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], invalidateNearEntries={}, nearWrites=null, owned=null, nearXidVer=GridCacheVersion [topVer=155571374, order=1545423626166, nodeOrd
      er=1], subjId=44d27930-80e5-4eb7-b377-8b07c02c2033, taskNameHash=0, preloadKeys=null, skipCompletedVers=false, super=GridDistributedTxPrepareRequest [threadId=1281, concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, writeVer=GridCacheVersion [topVer=15557137
      4, order=1545423626614, nodeOrder=96], timeout=0, reads=null, writes=ArrayList [IgniteTxEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601, txKey=IgniteTxKey [key=KeyCa
      cheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601], val=CacheObjectImpl [val=GridServiceAssignments [nodeId=426a4a51-1af3-4019-9769-4a58d8ece426, topVer=162, cfg=LazyServiceConfigurat
      ion [srvcClsName=com.sbt.dpl.gridgain.thread.DPLThreadManager, svcCls=, nodeFilterCls=IgniteAllNodesPredicate], assigns=HashMap {74979bc7-e4c3-424a-8347-2f3a589aca3e=1, 32afc0a1-156d-4998-9b64-2336b86fb1c2=1, 6b34e59e-924c-404b-9451-ddf4b8b935b5=1, c2af8947-1
      a8d-48d5-93ca-165f97399519=1, 12f327e6-0f5f-44e2-be2c-f7bf99d22eec=1, 5638cb8c-4abb-49e3-8eef-edb3d5ccad77=1, 20349aca-ecf9-4a5f-bce9-2640e54fbbb2=1, e83f5de6-deaf-4f60-af5c-5a13d4f251a7=1, 2b4261be-feb5-4d59-be2a-6c9fcbe2fa4a=1, c3f0bff0-08fc-4601-b9f9-33192
      9065c0a=1, bac0156c-a56b-4ab8-aa7c-7d9878151e9f=1, ccc2d442-8df4-402e-8589-d8ec3c6ec243=1, e34256a2-1bb9-4a17-86d1-21532833dded=1, 50bae4c5-16a5-48bf-a3f6-aa1b123074af=1, c5ecce59-cf6f-4be3-9861-e5c2622480a5=1, d95ad91e-abd6-4c59-bc02-6298278f84c5=1, 035787c0
      -4497-4682-9488-9be55e875175=1, e17bd18f-4a71-47f5-bdd6-64199a9bfb3a=1, e3565de2-04a3-4107-95d9-01cdd790838b=1, dd372a51-8239-4f8f-8eef-5d6f206e971e=1, c7ff660d-a003-493b-9aac-a6f73ad46561=1, 426a4a51-1af3-4019-9769-4a58d8ece426=1, 107810a2-c04d-452c-b4b5-d61
      abf16272c=1, 99df5f9f-5bc9-4f6b-a538-25c097124f38=1, 93baebe8-c8fc-4e25-8b6a-17f925e67dce=1, 44f063cf-9ad5-4095-96a3-54e554ab9ca3=1, d4f7a539-cc66-4d76-afd6-2d41f533c44e=1, 0be09f47-0a70-4589-a78c-6c9fb7393d43=1, 6094f19a-6754-4ce9-8892-a540a52cf775=1, 17ad38
      b2-1ec5-4531-9fca-397acbfa4a98=1, 23f220b1-cddf-4ac8-8987-05ec33569855=1, e7b59e5d-3102-4dda-9d84-95783d80940d=1, cd9fc4bb-b488-40f5-8a72-24ecf633e3b0=1, 76fd7993-7c53-4a59-b199-ece9ce6f1b32=1, 8aef872d-83cd-42f1-9641-612206f1d026=1, 158087d4-2ca2-4ee2-83be-7
      a61e52c9aac=1, 59b4bfeb-7690-4844-b3b9-d7939d72c098=1, 16bd8778-33a1-4a06-a904-8792f9991921=1, 5198403c-ffb1-4b64-b523-202cc76aee59=1, fd636af8-4dea-4df1-b6b1-099bd14ec8aa=1, aa2471f8-f5cd-445f-8b79-fa58e08783a7=1, 22b3ebec-8a3e-4320-8ccc-a0c968362222=1, b72f
      a1b9-7547-452c-99d2-5b61b6f8ebec=1, c56babe1-1071-4d8e-9499-3fc211d375a0=1, 43ff2a2f-db9c-4f5e-b721-ef350451ca0a=1, 20a6ff76-53f6-4508-8e4c-c87359e625a8=1, 07377a77-24da-4de3-8993-595b6f77e199=1, 0923e9e8-b412-40ae-bbb6-c75e1923cbd8=1, 416e7265-7507-4d05-b444
      -0118999146a1=1, cd12c8c4-d209-4e5b-b34e-9cf25523ee7d=1, 32552ec6-5a88-4ae6-a069-0d86594b7031=1, 89a4c1a0-33a1-40b2-a443-f15533bd13d7=1, b6545e2e-6ee5-4c0f-8fd8-3448bc2ab546=1, ab1dedb8-919b-4373-af45-250f12c7a8af=1, 944d0054-4b92-4eae-80dc-b6a45fe415c8=1, b7
      147304-51c3-4c2d-9fb7-45853b27b79f=1, 6e334bfa-05d7-4909-94fb-e812d0fd4c76=1, a91104a0-e902-46e4-8725-47438b48b102=1, 5e5fb902-1c22-488a-bb25-87d10c8ccc0e=1, 297d27cc-4cd1-486f-a922-0a27808d8304=1, 5e069562-4ac7-48c6-8ce2-e6feb6be3d44=1, 7db65bfd-5478-4b3d-8d
      2f-7f35696fe6fe=1, 66dbd7b0-c7b2-421e-bc11-edc3919e4a0c=1, c83a8121-38e8-4409-8cdb-d4929bf4d0cf=1, f9b0d868-0a8e-41a5-9c45-73ae688ffc1e=1, be79ee19-8cc4-467c-bb15-2b066c27d667=1, 9e54bf88-b439-4e2f-9f14-4dee1bb66a0d=1, 9eee3d1d-702e-4e02-916f-f21b4b5dc27a=1,
      e347eecb-3489-4e0d-a7dc-420854b8b3e9=1, 9cdf5210-6624-4f2f-9314-3e6bf3b23587=1, 1e17c56a-5213-4a1b-b94b-4575a95a2c81=1, 1988168d-839c-471b-9a83-3c48f0a7447f=1, 835ed2f5-6fd4-43e4-84bd-504f5df0e301=1, 8852b77b-17ee-47a7-af2f-4d63babe970e=1, d7b33e74-5683-40c4-
      983d-6e2661183022=1, 16997ebc-723f-4b26-a14e-c98c40191646=1, 00dd20b3-12b4-46fe-800c-3ec405d11d98=1, b08008de-2ad0-4777-8365-2c7db6b470f3=1, 2bba7299-41f5-4635-8896-c6995425796e=1, 9d9234fe-3a77-4623-b83f-4297193ddf04=1, fbd450b0-b545-467c-acd6-2d4acf53f239=1
      , c36dc0f8-7e04-43c0-b1af-8883c42128f2=1, 8da84ab8-1d64-41c1-b904-03348fec36c6=1, fce7c69c-60c7-4006-8d2e-7363309c05d8=1, d7412fac-3f80-4c92-806d-cf10ae545a63=1, 7bafe6ad-d5fa-45d0-8c86-cef3d3d1bd3a=1, ce614913-d061-4e6b-ab6d-58df42fdab80=1, d1b9e739-b957-43e
      9-80d2-c2a984c48639=1, af32cdf5-01f7-4196-aea6-e07ed36de5aa=1, 6e64c0d8-2bcb-428c-b64b-bd3391763d4e=1, 9b980efd-edee-462b-adad-c667e4b4ee65=1, 205dddb8-defe-4717-93a3-05953ceb406d=1, dabfb14f-5b83-469b-9798-ab89b440f379=1, b05053e8-dc96-42b8-99bd-a4918c950aed
      =1, 2645af19-64fe-4b9d-bbba-a3d9202105be=1, 3bf0cb89-bdb4-47ff-81e7-ce67377d750b=1, 667bd106-cb20-429c-adec-07293f794db9=1, 4e7d90fb-b536-4ac0-83ca-036e151bf707=1, a0d5cf3b-7c1e-4ae6-a39c-b1c98a04ea8f=1, 625eccc2-d955-4bf9-a293-f53b945b0f09=1... and 28 more}]
      , hasValBytes=true][op=UPDATE, val=], prevVal=[op=NOOP, val=null], oldVal=CacheObjectImpl [val=null, hasValBytes=true][op=UPDATE, val=], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=CacheEntr
      yPredicate[] [], filtersPassed=false, filtersSet=false, entry=GridCacheMapEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], val=CacheObjectImpl [val=null, hasValBytes=true], startVer=1545
      426660336, ver=GridCacheVersion [topVer=155571373, order=1544175423890, nodeOrder=96], hash=1900127065, extras=GridCacheMvccEntryExtras [mvcc=GridCacheMvcc [locs=null, rmts=LinkedList [GridCacheMvccCandidate [nodeId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, ver=G
      ridCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], threadId=1942, id=219868, topVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0], reentry=null, otherNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033, otherVer=null, mappedDhtNodes=null, map
      pedNearNodes=null, ownerVer=null, serOrder=null, key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], masks=local=0|owner=0|ready=0|reentry=0|used=0|tx=1|single_implicit=0|dht_local=0|near_local=0|
      removed=0|read=0, prevVer=null, nextVer=null]]]], flags=2]GridDistributedCacheEntry [super=]GridDhtCacheEntry [rdrs=ReaderId[] [], part=65, super=], prepared=1, locked=false, nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, part
      UpdateCntr=0, serReadVer=null, xidVer=null]], dhtVers=null, txSize=0, plc=5, txState=null, flags=last|sys, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], committedVers=null, rolledbackVers=null, c
      nt=0, super=GridCacheIdMessage [cacheId=0]]]]]
      org.apache.ignite.IgniteException: Failed to resolve nodes topology [cacheGrp=N/A, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], history=[AffinityTopologyVersion [topVer=161, minorTopVer=0]], snap=Snapshot [topVer=AffinityTopologyVersion [topVer
      =161, minorTopVer=0]], locNode=ZookeeperClusterNode [id=51dc74ab-c989-4268-b850-ed69a24cca30, addrs=[10.116.206.28], order=161, loc=true, client=false]]
              at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.resolveDiscoCache(GridDiscoveryManager.java:2111)
              at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.consistentId(GridDiscoveryManager.java:1950)
              at org.apache.ignite.internal.managers.discovery.ConsistentIdMapper.mapToCompactIds(ConsistentIdMapper.java:104)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:1142)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:993)
              at org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxRemoteAdapter.prepareRemoteTx(GridDistributedTxRemoteAdapter.java:407)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.startRemoteTx(IgniteTxHandler.java:1759)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processDhtTxPrepareRequest(IgniteTxHandler.java:1121)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$400(IgniteTxHandler.java:101)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:205)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:203)
              at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1061)
              at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:586)
              at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:385)
              at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:311)
              at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
              at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:300)
              at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
              at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
              at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125)
              at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:745)
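
The message of Critical error (1) shows a request tagged with topVer=162 arriving at a node whose history only contains topVer=161. The sketch below models that situation in isolation. It is a simplified, hypothetical model (`TopologyHistoryModel` is an invented name), not Ignite's actual `GridDiscoveryManager` logic, and it assumes a lookup that fails for any version missing from the local history:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;
import java.util.UUID;

/** Simplified, hypothetical model of a per-node topology history; not Ignite code. */
final class TopologyHistoryModel {
    private final NavigableMap<Long, Set<UUID>> history = new TreeMap<>();

    /** Records the set of alive nodes for a topology version the local node has processed. */
    void onTopologyChange(long topVer, Set<UUID> aliveNodes) {
        history.put(topVer, Collections.unmodifiableSet(new HashSet<>(aliveNodes)));
    }

    /** Resolves the node set for a version; fails if the local node has not seen it yet. */
    Set<UUID> resolve(long topVer) {
        Set<UUID> nodes = history.get(topVer);
        if (nodes == null)
            throw new IllegalStateException("Failed to resolve nodes topology [topVer="
                + topVer + ", history=" + history.keySet() + "]");
        return nodes;
    }

    public static void main(String[] args) {
        TopologyHistoryModel node1 = new TopologyHistoryModel();
        UUID self = UUID.fromString("51dc74ab-c989-4268-b850-ed69a24cca30");

        // Node(1) has only processed version 161 so far.
        node1.onTopologyChange(161L, Collections.singleton(self));

        try {
            // A prepare request stamped with version 162 arrives before the local update.
            node1.resolve(162L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the failure disappears as soon as `onTopologyChange(162L, ...)` runs before the lookup, which is the ordering the coordinator log suggests did not hold on Node(1).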
      

      On many other nodes we have Critical error (2): they are unable to find the consistent ID for nodeId=51dc74ab-c989-4268-b850-ed69a24cca30, which is Node(1), and the nearNodeId in the request is the coordinator (CRD):

      2018-12-06 15:55:23.730 [ERROR][utility-#386%DPL_GRID%DplGridNodeName%][o.a.i.i.p.cache.GridCacheIoManager] Failed processing message [senderId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, msg=GridDhtTxPrepareRequest [nearNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033,
       futId=1d225238761-05eea259-5c25-4a4b-8469-9dd8980e218c, miniId=79, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0], invalidateNearEntries={}, nearWrites=null, owned=null, nearXidVer=GridCacheVersion [topVer=155571374, order=1545423626166, nodeOrde
      r=1], subjId=44d27930-80e5-4eb7-b377-8b07c02c2033, taskNameHash=0, preloadKeys=null, skipCompletedVers=false, super=GridDistributedTxPrepareRequest [threadId=1281, concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, writeVer=GridCacheVersion [topVer=155571374
      , order=1545423626614, nodeOrder=96], timeout=0, reads=null, writes=ArrayList [IgniteTxEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601, txKey=IgniteTxKey [key=KeyCac
      heObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], cacheId=-2100569601], val=CacheObjectImpl [val=GridServiceAssignments [nodeId=426a4a51-1af3-4019-9769-4a58d8ece426, topVer=162, cfg=LazyServiceConfigurati
      on [srvcClsName=com.sbt.dpl.gridgain.thread.DPLThreadManager, svcCls=, nodeFilterCls=IgniteAllNodesPredicate], assigns=HashMap {74979bc7-e4c3-424a-8347-2f3a589aca3e=1, 32afc0a1-156d-4998-9b64-2336b86fb1c2=1, 6b34e59e-924c-404b-9451-ddf4b8b935b5=1, c2af8947-1a
      8d-48d5-93ca-165f97399519=1, 12f327e6-0f5f-44e2-be2c-f7bf99d22eec=1, 5638cb8c-4abb-49e3-8eef-edb3d5ccad77=1, 20349aca-ecf9-4a5f-bce9-2640e54fbbb2=1, e83f5de6-deaf-4f60-af5c-5a13d4f251a7=1, 2b4261be-feb5-4d59-be2a-6c9fcbe2fa4a=1, c3f0bff0-08fc-4601-b9f9-331929
      065c0a=1, bac0156c-a56b-4ab8-aa7c-7d9878151e9f=1, ccc2d442-8df4-402e-8589-d8ec3c6ec243=1, e34256a2-1bb9-4a17-86d1-21532833dded=1, 50bae4c5-16a5-48bf-a3f6-aa1b123074af=1, c5ecce59-cf6f-4be3-9861-e5c2622480a5=1, d95ad91e-abd6-4c59-bc02-6298278f84c5=1, 035787c0-
      4497-4682-9488-9be55e875175=1, e17bd18f-4a71-47f5-bdd6-64199a9bfb3a=1, e3565de2-04a3-4107-95d9-01cdd790838b=1, dd372a51-8239-4f8f-8eef-5d6f206e971e=1, c7ff660d-a003-493b-9aac-a6f73ad46561=1, 426a4a51-1af3-4019-9769-4a58d8ece426=1, 107810a2-c04d-452c-b4b5-d61a
      bf16272c=1, 99df5f9f-5bc9-4f6b-a538-25c097124f38=1, 93baebe8-c8fc-4e25-8b6a-17f925e67dce=1, 44f063cf-9ad5-4095-96a3-54e554ab9ca3=1, d4f7a539-cc66-4d76-afd6-2d41f533c44e=1, 0be09f47-0a70-4589-a78c-6c9fb7393d43=1, 6094f19a-6754-4ce9-8892-a540a52cf775=1, 17ad38b
      2-1ec5-4531-9fca-397acbfa4a98=1, 23f220b1-cddf-4ac8-8987-05ec33569855=1, e7b59e5d-3102-4dda-9d84-95783d80940d=1, cd9fc4bb-b488-40f5-8a72-24ecf633e3b0=1, 76fd7993-7c53-4a59-b199-ece9ce6f1b32=1, 8aef872d-83cd-42f1-9641-612206f1d026=1, 158087d4-2ca2-4ee2-83be-7a
      61e52c9aac=1, 59b4bfeb-7690-4844-b3b9-d7939d72c098=1, 16bd8778-33a1-4a06-a904-8792f9991921=1, 5198403c-ffb1-4b64-b523-202cc76aee59=1, fd636af8-4dea-4df1-b6b1-099bd14ec8aa=1, aa2471f8-f5cd-445f-8b79-fa58e08783a7=1, 22b3ebec-8a3e-4320-8ccc-a0c968362222=1, b72fa
      1b9-7547-452c-99d2-5b61b6f8ebec=1, c56babe1-1071-4d8e-9499-3fc211d375a0=1, 43ff2a2f-db9c-4f5e-b721-ef350451ca0a=1, 20a6ff76-53f6-4508-8e4c-c87359e625a8=1, 07377a77-24da-4de3-8993-595b6f77e199=1, 0923e9e8-b412-40ae-bbb6-c75e1923cbd8=1, 416e7265-7507-4d05-b444-
      0118999146a1=1, cd12c8c4-d209-4e5b-b34e-9cf25523ee7d=1, 32552ec6-5a88-4ae6-a069-0d86594b7031=1, 89a4c1a0-33a1-40b2-a443-f15533bd13d7=1, b6545e2e-6ee5-4c0f-8fd8-3448bc2ab546=1, ab1dedb8-919b-4373-af45-250f12c7a8af=1, 944d0054-4b92-4eae-80dc-b6a45fe415c8=1, b71
      47304-51c3-4c2d-9fb7-45853b27b79f=1, 6e334bfa-05d7-4909-94fb-e812d0fd4c76=1, a91104a0-e902-46e4-8725-47438b48b102=1, 5e5fb902-1c22-488a-bb25-87d10c8ccc0e=1, 297d27cc-4cd1-486f-a922-0a27808d8304=1, 5e069562-4ac7-48c6-8ce2-e6feb6be3d44=1, 7db65bfd-5478-4b3d-8d2
      f-7f35696fe6fe=1, 66dbd7b0-c7b2-421e-bc11-edc3919e4a0c=1, c83a8121-38e8-4409-8cdb-d4929bf4d0cf=1, f9b0d868-0a8e-41a5-9c45-73ae688ffc1e=1, be79ee19-8cc4-467c-bb15-2b066c27d667=1, 9e54bf88-b439-4e2f-9f14-4dee1bb66a0d=1, 9eee3d1d-702e-4e02-916f-f21b4b5dc27a=1, e
      347eecb-3489-4e0d-a7dc-420854b8b3e9=1, 9cdf5210-6624-4f2f-9314-3e6bf3b23587=1, 1e17c56a-5213-4a1b-b94b-4575a95a2c81=1, 1988168d-839c-471b-9a83-3c48f0a7447f=1, 835ed2f5-6fd4-43e4-84bd-504f5df0e301=1, 8852b77b-17ee-47a7-af2f-4d63babe970e=1, d7b33e74-5683-40c4-9
      83d-6e2661183022=1, 16997ebc-723f-4b26-a14e-c98c40191646=1, 00dd20b3-12b4-46fe-800c-3ec405d11d98=1, b08008de-2ad0-4777-8365-2c7db6b470f3=1, 2bba7299-41f5-4635-8896-c6995425796e=1, 9d9234fe-3a77-4623-b83f-4297193ddf04=1, fbd450b0-b545-467c-acd6-2d4acf53f239=1,
       c36dc0f8-7e04-43c0-b1af-8883c42128f2=1, 8da84ab8-1d64-41c1-b904-03348fec36c6=1, fce7c69c-60c7-4006-8d2e-7363309c05d8=1, d7412fac-3f80-4c92-806d-cf10ae545a63=1, 7bafe6ad-d5fa-45d0-8c86-cef3d3d1bd3a=1, ce614913-d061-4e6b-ab6d-58df42fdab80=1, d1b9e739-b957-43e9
      -80d2-c2a984c48639=1, af32cdf5-01f7-4196-aea6-e07ed36de5aa=1, 6e64c0d8-2bcb-428c-b64b-bd3391763d4e=1, 9b980efd-edee-462b-adad-c667e4b4ee65=1, 205dddb8-defe-4717-93a3-05953ceb406d=1, dabfb14f-5b83-469b-9798-ab89b440f379=1, b05053e8-dc96-42b8-99bd-a4918c950aed=
      1, 2645af19-64fe-4b9d-bbba-a3d9202105be=1, 3bf0cb89-bdb4-47ff-81e7-ce67377d750b=1, 667bd106-cb20-429c-adec-07293f794db9=1, 4e7d90fb-b536-4ac0-83ca-036e151bf707=1, a0d5cf3b-7c1e-4ae6-a39c-b1c98a04ea8f=1, 625eccc2-d955-4bf9-a293-f53b945b0f09=1... and 28 more}],
       hasValBytes=true][op=UPDATE, val=], prevVal=[op=NOOP, val=null], oldVal=CacheObjectImpl [val=null, hasValBytes=true][op=UPDATE, val=], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=CacheEntry
      Predicate[] [], filtersPassed=false, filtersSet=false, entry=GridCacheMapEntry [key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], val=CacheObjectImpl [val=null, hasValBytes=true], startVer=15454
      44833545, ver=GridCacheVersion [topVer=155571373, order=1544175423890, nodeOrder=96], hash=1900127065, extras=GridCacheMvccEntryExtras [mvcc=GridCacheMvcc [locs=null, rmts=LinkedList [GridCacheMvccCandidate [nodeId=1e17c56a-5213-4a1b-b94b-4575a95a2c81, ver=Gr
      idCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], threadId=1937, id=210410, topVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0], reentry=null, otherNodeId=44d27930-80e5-4eb7-b377-8b07c02c2033, otherVer=null, mappedDhtNodes=null, mapp
      edNearNodes=null, ownerVer=null, serOrder=null, key=KeyCacheObjectImpl [part=65, val=GridServiceAssignmentsKey [name=DPLThreadManager_service], hasValBytes=true], masks=local=0|owner=0|ready=0|reentry=0|used=0|tx=1|single_implicit=0|dht_local=0|near_local=0|r
      emoved=0|read=0, prevVer=null, nextVer=null]]]], flags=2]GridDistributedCacheEntry [super=]GridDhtCacheEntry [rdrs=ReaderId[] [], part=65, super=], prepared=1, locked=false, nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, partU
      pdateCntr=0, serReadVer=null, xidVer=null]], dhtVers=null, txSize=0, plc=5, txState=null, flags=last|sys, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=155571374, order=1545423626612, nodeOrder=96], committedVers=null, rolledbackVers=null, cn
      t=0, super=GridCacheIdMessage [cacheId=0]]]]]
      java.lang.IllegalStateException: Unable to find consistentId by UUID [nodeId=51dc74ab-c989-4268-b850-ed69a24cca30, topVer=AffinityTopologyVersion [topVer=162, minorTopVer=0]]
              at org.apache.ignite.internal.managers.discovery.ConsistentIdMapper.mapToCompactId(ConsistentIdMapper.java:62)
              at org.apache.ignite.internal.managers.discovery.ConsistentIdMapper.mapToCompactIds(ConsistentIdMapper.java:123)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:1142)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:993)
              at org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxRemoteAdapter.prepareRemoteTx(GridDistributedTxRemoteAdapter.java:407)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.startRemoteTx(IgniteTxHandler.java:1759)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processDhtTxPrepareRequest(IgniteTxHandler.java:1121)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$400(IgniteTxHandler.java:101)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:205)
              at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$5.apply(IgniteTxHandler.java:203)
              at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1061)
              at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:586)
              at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:385)
              at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:311)
              at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
              at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:300)
              at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
              at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
              at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125)
              at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:745)
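
The node identities referred to above can be cross-checked mechanically. The coordinator log prints its node ID in upper case while the request prints IDs in lower case, so the sketch below compares them as `java.util.UUID` values rather than as strings. The class name `NodeIdCheck` is hypothetical, and it assumes the "^-- Node [id=...]" line in the crd log identifies the coordinator's local node:

```java
import java.util.UUID;

/** Hypothetical helper for cross-checking node IDs copied from the logs in this report. */
final class NodeIdCheck {
    /** Compares two textual UUIDs by value, ignoring letter case. */
    static boolean sameNode(String a, String b) {
        return UUID.fromString(a).equals(UUID.fromString(b));
    }

    public static void main(String[] args) {
        // "^-- Node [id=...]" from the crd log (assumed to be the coordinator's own ID).
        String crdId = "44D27930-80E5-4EB7-B377-8B07C02C2033";
        // nearNodeId and senderId from the GridDhtTxPrepareRequest in both errors.
        String nearNodeId = "44d27930-80e5-4eb7-b377-8b07c02c2033";
        String senderId = "1e17c56a-5213-4a1b-b94b-4575a95a2c81";
        // locNode in Critical error (1) and nodeId in Critical error (2).
        String node1Id = "51dc74ab-c989-4268-b850-ed69a24cca30";
        String missingId = "51dc74ab-c989-4268-b850-ed69a24cca30";

        System.out.println("nearNodeId is the coordinator: " + sameNode(crdId, nearNodeId));
        System.out.println("sender is the coordinator: " + sameNode(crdId, senderId));
        System.out.println("missing node is Node(1): " + sameNode(node1Id, missingId));
    }
}
```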
      

      Some details I noticed:
      there are no diagnostic "Metrics for local node" messages in the logs of
      Node(1), although the grid-timeout-worker thread is present in the thread dump:
      Thread name="grid-timeout-worker-#119%DPL_GRID%DplGridNodeName%", id=366, state=TIMED_WAITING, blockCnt=2, waitCnt=247178
      Lock [object=java.lang.Object@682fdbd8, ownerName=null, ownerId=-1]
      at java.lang.Object.wait(Native Method)
      at o.a.i.i.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:258)
      at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:110)
      at java.lang.Thread.run(Thread.java:745)

      Ignite on Node(1) appears to still be in the process of starting:

      
      Thread [name="Thread-20", id=61, state=WAITING, blockCnt=117, waitCnt=1116823]
              at sun.misc.Unsafe.park(Native Method)
              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
              at o.a.i.i.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
              at o.a.i.i.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
              at o.a.i.i.processors.cache.GridCacheProcessor.onKernalStart(GridCacheProcessor.java:938)
              at o.a.i.i.IgniteKernal.start(IgniteKernal.java:1118)
              at o.a.i.i.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2020)
              at o.a.i.i.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1725)
              - locked o.a.i.i.IgnitionEx$IgniteNamedInstance@1b40e2ca
              at o.a.i.i.IgnitionEx.start0(IgnitionEx.java:1153)
              at o.a.i.i.IgnitionEx.start(IgnitionEx.java:673)
              at o.a.i.i.IgnitionEx.start(IgnitionEx.java:598)
              at o.a.i.Ignition.start(Ignition.java:323)
              at com.sbt.dpl.gridgain.ignite.IgniteWrapper.initIgniteIfNeed(IgniteWrapper.java:90)
              - locked java.lang.Object@63457adc
              at com.sbt.dpl.gridgain.container.IgniteFactory.getOrStartIgnite(IgniteFactory.java:83)
              - locked java.lang.Object@63457adc
              at com.sbt.dpl.gridgain.container.DPLManagerLifecycleManager.getDplFactory(DPLManagerLifecycleManager.java:68)
              at com.sbt.dpl.gridgain.container.DPLManagerLifecycleManager.getDplFactory(DPLManagerLifecycleManager.java:40)
              at com.sbt.dpl.gridgain.container.ContainerDPLFactory.<init>(ContainerDPLFactory.java:84)
              at com.sbt.dpl.gridgain.springsupport.SpringDPLFactory.init(SpringDPLFactory.java:74)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleElement.invoke(InitDestroyAnnotationBeanPostProcessor.java:366)
              at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMetadata.invokeInitMethods(InitDestroyAnnotationBeanPostProcessor.java:309)
              at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:136)
              at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsBeforeInitialization(AbstractAutowireCapableBeanFactory.java:416)
              at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1691)
              at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:573)
              at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:495)
              at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:317)
              at org.springframework.beans.factory.support.AbstractBeanFactory$$Lambda$34/1141613979.getObject(Unknown Source)
              at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222)
              - locked java.util.concurrent.ConcurrentHashMap@6b2be83d
      

      Attachments

        1. 16_02.tar
          17.84 MB
          Sergey Kosarev

            People

              Assignee: Ilya Lantukh (ilantukh)
              Reporter: Sergey Kosarev (macrergate)