Ignite / IGNITE-6228

Avoid closing page store file with ClosedByInterruptException when user thread is interrupted

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1
    • Fix Version/s: 2.3
    • Component/s: persistence
    • Labels: None

    Description

      If the cache proxy is in synchronous mode, a user thread may be interrupted while it is reading from a page store file. The interrupted read closes the partition file's channel with ClosedByInterruptException.
      Example stacktrace:

      class org.apache.ignite.IgniteCheckedException: Runtime failure on lookup row: org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$SearchRow@717729d
      	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findOne(BPlusTree.java:1070)
      	at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.find(IgniteCacheOffheapManagerImpl.java:1476)
      	at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.find(GridCacheOffheapManager.java:1276)
      	at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.read(IgniteCacheOffheapManagerImpl.java:394)
      	at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.unswap(GridCacheMapEntry.java:371)
      	at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.onTtlExpired(GridCacheMapEntry.java:2952)
      	at org.apache.ignite.internal.processors.cache.GridCacheTtlManager$1.applyx(GridCacheTtlManager.java:61)
      	at org.apache.ignite.internal.processors.cache.GridCacheTtlManager$1.applyx(GridCacheTtlManager.java:52)
      	at org.apache.ignite.internal.util.lang.IgniteInClosure2X.apply(IgniteInClosure2X.java:38)
      	at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.expire(IgniteCacheOffheapManagerImpl.java:1012)
      	at org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:198)
      	at org.apache.ignite.internal.processors.cache.GridCacheUtils.unwindEvicts(GridCacheUtils.java:868)
      	at org.apache.ignite.internal.processors.cache.GridCacheGateway.leaveNoLock(GridCacheGateway.java:240)
      	at org.apache.ignite.internal.processors.cache.GridCacheGateway.leave(GridCacheGateway.java:225)
      	at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.onLeave(GatewayProtectedCacheProxy.java:1680)
      	at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:875)
      	at org.apache.ignite.internal.processors.cache.persistence.db.RestartGridTest$TestService.execute(RestartGridTest.java:160)
      	at org.apache.ignite.internal.processors.service.GridServiceProcessor$2.run(GridServiceProcessor.java:1160)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: class org.apache.ignite.IgniteCheckedException: Read error
      	at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:356)
      	at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:287)
      	at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:272)
      	at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:570)
      	at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:488)
      	at org.apache.ignite.internal.processors.cache.persistence.DataStructure.acquirePage(DataStructure.java:129)
      	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.treeMeta(BPlusTree.java:822)
      	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.access$7700(BPlusTree.java:81)
      	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Get.init(BPlusTree.java:2392)
      	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doFind(BPlusTree.java:1099)
      	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findOne(BPlusTree.java:1065)
      	... 20 more
      Caused by: java.nio.channels.ClosedByInterruptException
      	at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
      	at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:746)
      	at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:724)
      	at org.apache.ignite.internal.processors.cache.persistence.file.RandomAccessFileIO.read(RandomAccessFileIO.java:67)
      	at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:319)
      	... 30 more
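
The failure at the bottom of this trace can be reproduced without Ignite. The following is a minimal sketch using only the JDK; the class name `InterruptedReadDemo` and the temp file are hypothetical and stand in for the page store file. It shows that a read on an interrupted thread closes the `FileChannel`, and that later reads on the same channel fail even after the interrupt flag is cleared.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

/** Hypothetical demo class; not part of Ignite. */
final class InterruptedReadDemo {
    /**
     * Returns a three-element report:
     * [0] the read on the interrupted thread failed with ClosedByInterruptException,
     * [1] the channel is closed afterwards,
     * [2] a later read, with the interrupt flag cleared, fails with ClosedChannelException.
     */
    static boolean[] run() {
        boolean[] report = new boolean[3];
        try {
            Path file = Files.createTempFile("page-store-demo", ".bin");
            try {
                Files.write(file, new byte[4096]);
                try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                    // Simulate a user thread that is interrupted around the time of the read.
                    Thread.currentThread().interrupt();
                    try {
                        ch.read(ByteBuffer.allocate(4096), 0);
                    } catch (ClosedByInterruptException e) {
                        report[0] = true;
                    }
                    // Clearing the flag does not bring the channel back.
                    Thread.interrupted();
                    report[1] = !ch.isOpen();
                    try {
                        ch.read(ByteBuffer.allocate(4096), 0);
                    } catch (ClosedChannelException e) {
                        report[2] = true;
                    }
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return report;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```

Because the channel stays closed, every thread that shares it is affected, not only the one that was interrupted.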
      

      Any subsequent operation on that file will throw an exception. Furthermore, this may break LFS crash recovery.
      We should either handle ClosedByInterruptException or delegate file I/O operations to an internal thread.
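
As an illustration of the first option, here is a minimal sketch of a reader that catches `ClosedByInterruptException`, reopens the channel, retries the read, and then restores the caller's interrupt flag. The class `ReopeningPageReader` and its `demo()` self-check are hypothetical names invented for this example; this is not the change that resolved the issue and not an Ignite API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper for illustration; not an Ignite class. */
final class ReopeningPageReader implements AutoCloseable {
    private final Path file;
    private FileChannel ch;

    ReopeningPageReader(Path file) throws IOException {
        this.file = file;
        this.ch = FileChannel.open(file, StandardOpenOption.READ);
    }

    /** Reads into dst at the given file position, surviving interruption of the caller. */
    synchronized int read(ByteBuffer dst, long pos) throws IOException {
        int start = dst.position();
        boolean interrupted = false;
        try {
            while (true) {
                try {
                    if (!ch.isOpen())
                        ch = FileChannel.open(file, StandardOpenOption.READ);
                    return ch.read(dst, pos);
                } catch (ClosedByInterruptException e) {
                    // Clear the flag so the retry is not aborted too, and drop any partial data.
                    interrupted = true;
                    Thread.interrupted();
                    dst.position(start);
                }
            }
        } finally {
            // Restore the flag so the caller still observes the interrupt.
            if (interrupted)
                Thread.currentThread().interrupt();
        }
    }

    @Override public synchronized void close() throws IOException {
        ch.close();
    }

    /** Self-check: true if a read on an interrupted thread succeeds and the flag is preserved. */
    static boolean demo() {
        try {
            Path file = Files.createTempFile("page-store-demo", ".bin");
            byte[] page = new byte[4096];
            page[0] = 7;
            Files.write(file, page);
            try (ReopeningPageReader reader = new ReopeningPageReader(file)) {
                ByteBuffer buf = ByteBuffer.allocate(4096);
                Thread.currentThread().interrupt();
                int n = reader.read(buf, 0);
                boolean flagKept = Thread.interrupted(); // also clears the flag
                return n > 0 && buf.get(0) == 7 && flagKept;
            } finally {
                Thread.interrupted();
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The second option, delegating I/O to an internal thread, avoids the problem differently: the user thread only waits for the result, so interrupting it never touches the channel. The sketch above retries indefinitely if the caller keeps being interrupted; a real implementation would likely bound the retries.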

      Reproducer test is attached.
      Steps to reproduce:
      1) Comment out all lines in GridAbstractTest#beforeTestsStarted.
      2) Run the test until it fails (10-20 runs are usually enough).

      Attachments

        1. RestartGridTest.java
          6 kB
          Ivan Rakov

          People

            Assignee: ascherbakov Alexey Scherbakov
            Reporter: ivan.glukos Ivan Rakov