Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- None
- None
- Release Notes Required
Description
There are some diagnostic problems:
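- Assertions inside PagesList can surface as CorruptedTreeException, which is misleading: in the example below the root cause is an AssertionError thrown from PagesList, not a B+Tree failure. Example: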
2020-11-30 20:17:27.170[ERROR]sys-stripe-29-#30%DPL_GRID%DplGridNodeName%[org.apache.ignite.Ignite] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is corrupted [pages(groupId, pageId)=[IgniteBiTuple [val1=-782612924, val2=72372732968376779]], groupName=CACHEGROUP_PARTICLE_union-module_com.sbt.processing.data.partition.dpl.PartitionKey, msg=Runtime failure on search row: SearchRow [key=KeyCacheObject [hasValBytes=true], hash=513719283, cacheId=-295471981]]]]
org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is corrupted [pages(groupId, pageId)=[IgniteBiTuple [val1=-782612924, val2=72372732968376779]], groupName=CACHEGROUP_PARTICLE_union-module_com.sbt.processing.data.partition.dpl.PartitionKey, msg=Runtime failure on search row: SearchRow [key=KeyCacheObject [hasValBytes=true], hash=513719283, cacheId=-295471981]]
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.corruptedTreeException(BPlusTree.java:6117)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1937)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke0(IgniteCacheOffheapManagerImpl.java:1670)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1653)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:2519)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:436)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:4312)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:4289)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerSet(GridCacheMapEntry.java:1555)
    at org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.userCommit(IgniteTxLocalAdapter.java:756)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocalAdapter.localFinish(GridDhtTxLocalAdapter.java:794)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.localFinish(GridDhtTxLocal.java:605)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.finishTx(GridDhtTxLocal.java:477)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.commitDhtLocalAsync(GridDhtTxLocal.java:534)
    at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:1092)
    at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:968)
    at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:923)
    at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$200(IgniteTxHandler.java:132)
    at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:229)
    at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:227)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1142)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:591)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:392)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:318)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:109)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:308)
    at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1722)
    at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1329)
    at org.apache.ignite.internal.managers.communication.GridIoManager.access$4600(GridIoManager.java:158)
    at org.apache.ignite.internal.managers.communication.GridIoManager$8.execute(GridIoManager.java:1214)
    at org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:54)
    at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:559)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.AssertionError: Incorrectly recycled pageId in reuse bucket: ff011e9e000012f7
    at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.takeEmptyPage(PagesList.java:1358)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.insertDataRow(AbstractFreeList.java:517)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.CacheFreeList.insertDataRow(CacheFreeList.java:74)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.CacheFreeList.insertDataRow(CacheFreeList.java:35)
    at org.apache.ignite.internal.processors.cache.persistence.RowStore.addRow(RowStore.java:112)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.createRow(IgniteCacheOffheapManagerImpl.java:1720)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.createRow(GridCacheOffheapManager.java:2494)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$UpdateClosure.call(GridCacheMapEntry.java:5876)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$UpdateClosure.call(GridCacheMapEntry.java:5813)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.invokeClosure(BPlusTree.java:4000)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.access$5700(BPlusTree.java:3894)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:2020)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:1997)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:1997)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1904)
- Corruption of partition meta also produces a mismatched exception type in the pages list (here a bare AssertionError), e.g.:
2021-01-29 05:48:41.644[ERROR][db-checkpoint-thread-#307%DPL_GRID%DplGridNodeName%][org.apache.ignite.Ignite] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.AssertionError: Missing tails [bucket=250, tails=null, metaPage=000120ca00002798]]]
java.lang.AssertionError: Missing tails [bucket=250, tails=null, metaPage=000120ca00002798]
    at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.updateTail(PagesList.java:624)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.mergeNoNext(PagesList.java:1628)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.removeDataPage(PagesList.java:1577)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList$RemoveRowHandler.run(AbstractFreeList.java:318)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList$RemoveRowHandler.run(AbstractFreeList.java:273)
    at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writePage(PageHandler.java:292)
    at org.apache.ignite.internal.processors.cache.persistence.DataStructure.write(DataStructure.java:273)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.removeDataRowByLink(AbstractFreeList.java:633)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.saveStoreMetadata(GridCacheOffheapManager.java:367)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.lambda$syncMetadata$2(GridCacheOffheapManager.java:288)
    at org.apache.ignite.internal.util.IgniteUtils.lambda$wrapIgniteFuture$3(IgniteUtils.java:11665)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
All such exceptions should be passed to DiagnosticProcessor and should contain the IDs of the pages that are possibly corrupted, so that those pages can be analyzed in the PDS.
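The request above could be sketched roughly as follows. This is only an illustration of the idea, not the actual fix: `CorruptedPagesListException`, `DiagnosticSink`, and `guarded` are hypothetical names introduced here and are not Ignite classes or methods. The sketch assumes the caller knows the group ID and the page ID being processed, wraps any assertion or runtime failure in a dedicated exception type that carries those IDs, and hands it to a diagnostic callback before rethrowing.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only: none of these types exist in Ignite. */
class PagesListDiagnosticsSketch {
    /** Hypothetical exception that records which pages are suspected to be corrupted. */
    static class CorruptedPagesListException extends RuntimeException {
        private final int grpId;
        private final long[] pageIds;

        CorruptedPagesListException(String msg, Throwable cause, int grpId, long... pageIds) {
            super(msg + " [grpId=" + grpId + ", pageIds=" + toHex(pageIds) + ']', cause);
            this.grpId = grpId;
            this.pageIds = pageIds.clone();
        }

        int groupId() {
            return grpId;
        }

        long[] pageIds() {
            return pageIds.clone();
        }
    }

    /** Hypothetical stand-in for the component that collects diagnostics. */
    interface DiagnosticSink {
        void onCorruption(CorruptedPagesListException e);
    }

    /** Formats page IDs as 16-digit hex strings, the form used in the log messages. */
    static List<String> toHex(long[] pageIds) {
        List<String> res = new ArrayList<>();

        for (long pageId : pageIds)
            res.add(String.format("%016x", pageId));

        return res;
    }

    /**
     * Runs a pages list operation. Any AssertionError or RuntimeException is wrapped
     * into the typed exception, reported to the sink, and then rethrown.
     */
    static void guarded(int grpId, long pageId, DiagnosticSink sink, Runnable op) {
        try {
            op.run();
        }
        catch (AssertionError | RuntimeException e) {
            CorruptedPagesListException ex =
                new CorruptedPagesListException("Pages list operation failed", e, grpId, pageId);

            sink.onCorruption(ex);

            throw ex;
        }
    }
}
```

With a wrapper of this kind, the failure handler would see a pages-list-specific exception type instead of CorruptedTreeException or a bare AssertionError, and the page IDs would be available programmatically rather than only inside the message text.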