Details
- Type: Test
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
Description
Reproduced by the Activate/Deactivate suite, in almost any test of the IgniteChangeGlobalStateTest class, for example IgniteChangeGlobalStateTest#testStopPrimaryAndActivateFromClientNode.
Failed to reinitialize local partitions (preloading will be stopped): GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=6, minorTopVer=1], discoEvt=DiscoveryCustomEvent [customMsg=ChangeGlobalStateMessage [id=9093c48a461-165cdacd-8a3b-4072-9f48-e80e1b63fda9, reqId=07393ea5-1c6a-4581-b016-9eb88d6bd978, initiatingNodeId=8dced5ba-725d-494b-8e8e-ffc76453fecd, activate=true, baselineTopology=BaselineTopology [id=0, branchingHash=314980173, branchingType='Cluster activation', baselineNodes=[node2, node0, node1]], forceChangeBaselineTopology=false, timestamp=1531832492029], affTopVer=AffinityTopologyVersion [topVer=6, minorTopVer=1], super=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=8dced5ba-725d-494b-8e8e-ffc76453fecd, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.25.4.132], sockAddrs=[/172.25.4.132:47504, /0:0:0:0:0:0:0:1%lo:47504, /127.0.0.1:47504], discPort=47504, order=2, intOrder=2, lastExchangeTime=1531832486546, loc=false, ver=2.7.0#19700101-sha1:00000000, isClient=false], topVer=6, nodeId8=9960f6b9, msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1531832492035]], nodeId=8dced5ba, evt=DISCOVERY_CUSTOM_EVT]
java.lang.AssertionError: calculatedOffset=3072, allocated=2048, headerSize=1024
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:358)
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:400)
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:384)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:783)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:627)
    at org.apache.ignite.internal.processors.cache.persistence.DataStructure.acquirePage(DataStructure.java:144)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.init(PagesList.java:169)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.<init>(AbstractFreeList.java:371)
    at org.apache.ignite.internal.processors.cache.persistence.metastorage.MetaStorage$FreeListImpl.<init>(MetaStorage.java:484)
    at org.apache.ignite.internal.processors.cache.persistence.metastorage.MetaStorage.init(MetaStorage.java:143)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readCheckpointAndRestoreMemory(GridCacheDatabaseSharedManager.java:852)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onClusterStateChangeRequest(GridDhtPartitionsExchangeFuture.java:954)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:661)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2484)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2364)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
    at java.lang.Thread.run(Thread.java:748)
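The numbers in the assertion message can be read with a little arithmetic. The sketch below is not Ignite code: it assumes a file layout of one header followed by fixed-size pages, so that a page's offset is `pageIdx * pageSize + headerSize`, and it assumes a page size of 1024 bytes, which the message does not state. The class and method names are hypothetical.

```java
public class PageOffsetCheck {
    /** Byte offset of a page, assuming a header followed by fixed-size pages. */
    public static long pageOffset(int pageIdx, int pageSize, int headerSize) {
        return (long) pageIdx * pageSize + headerSize;
    }

    /** Inverse of pageOffset: the page index that a byte offset refers to. */
    public static long pageIndex(long offset, int pageSize, int headerSize) {
        return (offset - headerSize) / pageSize;
    }

    /** Number of whole pages covered by an allocated byte count that includes the header. */
    public static long allocatedPages(long allocated, int pageSize, int headerSize) {
        return (allocated - headerSize) / pageSize;
    }

    public static void main(String[] args) {
        int pageSize = 1024;    // ASSUMPTION: not stated in the report
        int headerSize = 1024;  // from the assertion message
        long allocated = 2048;  // from the assertion message
        long offset = 3072;     // calculatedOffset from the assertion message

        System.out.println("requested page index = " + pageIndex(offset, pageSize, headerSize));
        System.out.println("pages known to store = " + allocatedPages(allocated, pageSize, headerSize));
    }
}
```

Under these assumptions the store believes it holds a single page, while the read targets page index 2, which is consistent with the "more than one page" explanation in the scenario.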
Given:
- node1-1 in grid1 is activated.
- MetaStorage on node1-1 is in off-heap memory.
- MetaStorage has no storage on disk yet.
When:
- A checkpoint on node1-1 starts. The checkpoint start marker has been written.
- node2-1 in grid2 starts (grid1 and grid2 share the same persistence).
Then:
- node2-1 finds an unexpected checkpoint marker ("Found unexpected checkpoint marker") and initializes the FilePageStore for MetaStorage with an empty page.
- node1-1 finishes the checkpoint and writes MetaStorage to disk.
- After grid1 is stopped and grid2 is activated, node2-1 fails because it tries to read more than one page.
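The sequence above can be modelled with a toy page store that caches its allocated size at initialization time. This is a simplified sketch, not the real FilePageStore: `SharedFile`, `ToyPageStore`, the 1024-byte page size, and the bounds check `offset + pageSize <= allocated` are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

public class StaleAllocationDemo {
    static final int HEADER = 1024; // headerSize from the assertion message
    static final int PAGE = 1024;   // ASSUMPTION: page size is not stated in the report

    /** Stand-in for the on-disk file shared by both grids; only its length matters here. */
    public static class SharedFile {
        final AtomicLong length = new AtomicLong(0);
    }

    /** Hypothetical page store that remembers the file size it saw when it was initialized. */
    public static class ToyPageStore {
        private final SharedFile file;
        private long allocated;

        public ToyPageStore(SharedFile file) {
            this.file = file;
        }

        /** If the file is empty, create it as header plus one empty page, then cache its size. */
        public void initWithEmptyPage() {
            file.length.compareAndSet(0, HEADER + PAGE);
            allocated = file.length.get();
        }

        /** Checkpoint-style write: the file grows to hold the given number of pages. */
        public void writePages(int pageCount) {
            allocated = HEADER + (long) pageCount * PAGE;
            file.length.set(allocated);
        }

        /** Toy bounds check against the cached size, not the real file length. */
        public boolean canRead(int pageIdx) {
            long offset = (long) pageIdx * PAGE + HEADER;
            return offset + PAGE <= allocated;
        }
    }

    /** Replays the scenario; returns whether node2-1 can read page index 2 at the end. */
    public static boolean reproduce() {
        SharedFile metaStorageFile = new SharedFile();
        ToyPageStore node1 = new ToyPageStore(metaStorageFile);
        ToyPageStore node2 = new ToyPageStore(metaStorageFile);

        node2.initWithEmptyPage(); // node2-1 starts while the checkpoint is in progress
        node1.writePages(3);       // node1-1 finishes the checkpoint and writes MetaStorage
        return node2.canRead(2);   // node2-1 still uses its stale cached size
    }

    public static void main(String[] args) {
        System.out.println("node2-1 can read page 2: " + reproduce());
    }
}
```

In this model the file on disk holds three pages, but node2-1 still believes it has one, so its own bounds check rejects the read.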
Possible solutions:
- Skip initializing the FilePageStore for MetaStorage with an empty page during start.
- Take a lock on MetaStorage so that only one node can read or write a given MetaStorage at a time.
- Reinitialize the FilePageStore from disk when the cluster is activated.
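The third option, re-reading the store's state from disk on activation, might look roughly like the sketch below. It is not the fix that was applied in Ignite (the report does not say which option was chosen), and `CachedSizeStore` and `reinitFromDisk` are hypothetical names. It only shows that a cached size goes stale when another writer grows the file, and that refreshing it from the file system restores agreement.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReinitOnActivation {
    /** Hypothetical store that caches the file size and can refresh it on demand. */
    public static class CachedSizeStore {
        private final Path file;
        private long allocated;

        public CachedSizeStore(Path file) {
            this.file = file;
            reinitFromDisk();
        }

        /** Re-reads the allocated size from the file system, e.g. on cluster activation. */
        public final void reinitFromDisk() {
            try {
                allocated = Files.size(file);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        public long allocated() {
            return allocated;
        }
    }

    /** Returns {size seen before reinit, size seen after reinit}. */
    public static long[] demo() {
        try {
            Path file = Files.createTempFile("metastorage-sketch", ".bin");
            try {
                Files.write(file, new byte[2048]);   // header plus one empty page
                CachedSizeStore store = new CachedSizeStore(file);

                Files.write(file, new byte[4096]);   // another node rewrites the file with more pages
                long stale = store.allocated();      // the cached value has not changed

                store.reinitFromDisk();              // refresh on "activation"
                return new long[] {stale, store.allocated()};
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = demo();
        System.out.println("before reinit: " + sizes[0] + ", after reinit: " + sizes[1]);
    }
}
```

A real implementation would also need to reload the header and any other cached state, and would have to consider concurrent writers; the sketch covers only the size counter.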
Issue Links
- is duplicated by
  - IGNITE-6538 Ignite Activate/Deactivate Cluster suite: After tests validation improvements test became flaky on TC (Resolved)
  - IGNITE-7651 Assertion error on cache start while reading meta pages (Resolved)