Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
2.1
-
None
Description
If cache proxy is in synchronous mode, user thread may be interrupted during read from file page store file. This will cause closing of partition file with ClosedByInterruptException.
Example stacktrace:
class org.apache.ignite.IgniteCheckedException: Runtime failure on lookup row: org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$SearchRow@717729d at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findOne(BPlusTree.java:1070) at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.find(IgniteCacheOffheapManagerImpl.java:1476) at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.find(GridCacheOffheapManager.java:1276) at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.read(IgniteCacheOffheapManagerImpl.java:394) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.unswap(GridCacheMapEntry.java:371) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.onTtlExpired(GridCacheMapEntry.java:2952) at org.apache.ignite.internal.processors.cache.GridCacheTtlManager$1.applyx(GridCacheTtlManager.java:61) at org.apache.ignite.internal.processors.cache.GridCacheTtlManager$1.applyx(GridCacheTtlManager.java:52) at org.apache.ignite.internal.util.lang.IgniteInClosure2X.apply(IgniteInClosure2X.java:38) at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.expire(IgniteCacheOffheapManagerImpl.java:1012) at org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:198) at org.apache.ignite.internal.processors.cache.GridCacheUtils.unwindEvicts(GridCacheUtils.java:868) at org.apache.ignite.internal.processors.cache.GridCacheGateway.leaveNoLock(GridCacheGateway.java:240) at org.apache.ignite.internal.processors.cache.GridCacheGateway.leave(GridCacheGateway.java:225) at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.onLeave(GatewayProtectedCacheProxy.java:1680) at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:875) at org.apache.ignite.internal.processors.cache.persistence.db.RestartGridTest$TestService.execute(RestartGridTest.java:160) at org.apache.ignite.internal.processors.service.GridServiceProcessor$2.run(GridServiceProcessor.java:1160) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: class org.apache.ignite.IgniteCheckedException: Read error at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:356) at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:287) at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:272) at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:570) at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:488) at org.apache.ignite.internal.processors.cache.persistence.DataStructure.acquirePage(DataStructure.java:129) at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.treeMeta(BPlusTree.java:822) at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.access$7700(BPlusTree.java:81) at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Get.init(BPlusTree.java:2392) at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doFind(BPlusTree.java:1099) at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findOne(BPlusTree.java:1065) ... 20 more Caused by: java.nio.channels.ClosedByInterruptException at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:746) at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:724) at org.apache.ignite.internal.processors.cache.persistence.file.RandomAccessFileIO.read(RandomAccessFileIO.java:67) at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:319) ... 30 more
Any subsequent file operation will throw exception. Furthermore, this potentially may break LFS crash recovery.
We should either handle ClosedByInterruptException or delegate file I/O operations to internal thread.
Reproducer test is attached.
S2R:
1) Comment all lines in GridAbstractTest#beforeTestsStarted
2) Run test until failure (usually, 10-20 runs are enough).