Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
The following assertion error triggers the failure handler and crashes the node. It can potentially crash the whole cluster.
2020-02-18 14:34:09.775[ERROR][query-#146129%DPL_GRID%DplGridNodeName%][o.a.i.i.p.cache.GridCacheIoManager] Failed to process message [senderId=727757ed-4ad4-4779-bda9-081525725cce, msg=GridCacheQueryRequest [id=178, cacheName=com.sbt.tokenization.data.entity.KEKEntity_DPL_union-module, type=SCAN, fields=false, clause=null, clsName=null, keyValFilter=null, rdc=null, trans=null, pageSize=1024, incBackups=false, cancel=false, incMeta=false, all=false, keepBinary=true, subjId=727757ed-4ad4-4779-bda9-081525725cce, taskHash=0, part=-1, topVer=AffinityTopologyVersion [topVer=97, minorTopVer=0], sendTimestamp=-1, receiveTimestamp=-1, super=GridCacheIdMessage [cacheId=-1129073400, super=GridCacheMessage [msgId=179, depInfo=GridDeploymentInfoBean [clsLdrId=c32670e3071-d30ee64b-0833-45d4-abbe-fb6282669caa, depMode=SHARED, userVer=0, locDepOwner=false, participants=null], lastAffChangedTopVer=AffinityTopologyVersion [topVer=8, minorTopVer=6], err=null, skipPrepare=false]]]]
java.lang.AssertionError: null
    at org.apache.ignite.internal.processors.cache.GridCacheDeploymentManager$CachedDeploymentInfo.<init>(GridCacheDeploymentManager.java:918)
    at org.apache.ignite.internal.processors.cache.GridCacheDeploymentManager$CachedDeploymentInfo.<init>(GridCacheDeploymentManager.java:889)
    at org.apache.ignite.internal.processors.cache.GridCacheDeploymentManager.p2pContext(GridCacheDeploymentManager.java:422)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.unmarshall(GridCacheIoManager.java:1576)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:584)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:386)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:312)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:102)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:301)
    at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1565)
    at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1189)
    at org.apache.ignite.internal.managers.communication.GridIoManager.access$4300(GridIoManager.java:130)
    at org.apache.ignite.internal.managers.communication.GridIoManager$8.run(GridIoManager.java:1092)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
There is no reliable reproducer yet, but it seems we should guard against this situation in general, as follows:
1) Check the correctness of the message before it is sent, inside GridCacheDeploymentManager#prepare. If the corresponding class loader exists on the local node, we can try to fix the message by replacing the wrong class loader with the local one.
2) Log suspicious deployments received from GridDeploymentManager#deploy; the caches may hold obsolete deployments.
3) Possibly remove this assertion. We should have this class on the sender node and use that node as the class loader id; if we don't, we will get an exception in finishUnmarshall (Failed to peer load class) and can try to handle the situation in GridCacheIoManager#processFailedMessage.
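Option 1 can be illustrated with a small stand-alone sketch. It is not Ignite code: the class DeploymentCheckSketch, its DeploymentInfo type, and the methods isConsistent and prepare are hypothetical names introduced here. The sketch assumes that the failing assertion requires either the class loader's global id to equal the sender node id or the participants map to be non-null, and that clsLdrId in the log has the form "local id, dash, global UUID". Both assumptions match the logged values (senderId 727757ed-..., loader global id d30ee64b-..., participants=null) but are not confirmed by this ticket.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

/** Hypothetical stand-alone sketch; none of these types are Ignite classes. */
final class DeploymentCheckSketch {
    /** Simplified stand-in for the deployment info attached to a cache message. */
    static final class DeploymentInfo {
        final UUID loaderGlobalId;            // node that owns the class loader
        final Map<UUID, String> participants; // may be null, as in the log above

        DeploymentInfo(UUID loaderGlobalId, Map<UUID, String> participants) {
            this.loaderGlobalId = loaderGlobalId;
            this.participants = participants;
        }
    }

    private DeploymentCheckSketch() {
        // Static helpers only.
    }

    /** Assumed invariant: the loader belongs to the sender, or participants are listed. */
    static boolean isConsistent(UUID senderId, DeploymentInfo info) {
        return senderId.equals(info.loaderGlobalId) || info.participants != null;
    }

    /**
     * Sender-side check before a message leaves the node (option 1).
     * Returns the info to send, or an empty Optional if it cannot be repaired.
     */
    static Optional<DeploymentInfo> prepare(UUID localNodeId, DeploymentInfo info,
                                            boolean localLoaderAvailable) {
        if (isConsistent(localNodeId, info))
            return Optional.of(info);

        // Inconsistent: substitute the local class loader when one exists.
        if (localLoaderAvailable)
            return Optional.of(new DeploymentInfo(localNodeId, info.participants));

        // Nothing to substitute; the caller could log the deployment as suspicious (option 2).
        return Optional.empty();
    }
}
```

With the ids from the log, isConsistent returns false, which is the condition the receiver currently turns into an AssertionError. Running the same check on the sender keeps the failure local to the node that built the message, instead of triggering the failure handler on the receiving node.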
Issue Links
- causes: IGNITE-14308 IgnitePeerToPeerClassLoadingException: Could not use deployment to prepare deployable, because local node id does not correspond with class loader id (Resolved)
- fixes: IGNITE-13420 Add assertion message to assert in CachedDeploymentInfo private constructor (Resolved)
- is duplicated by: IGNITE-12330 Assertion error in CachedDeploymentInfo (Resolved)
- links to