GEODE-6652

Change in checks for next dir in disk store


Details

    Description

      Hi,

      In summary: when a disk store is made up of directories of different sizes, the available space in the next directory to be used does not appear to be checked correctly when oplog files roll over.

       

      I ran a test with a persistent region and a disk store composed of three directories of different sizes (20, 10, and 5 MB), with a maximum oplog size of 1 MB:

      gfsh> start locator --name=locator --bind-address=127.0.0.1
      gfsh> start server --name=server1 --locators=127.0.0.1[10334] --server-port=0 --J=-Dgemfire.statistic-sampling-enabled=true --statistic-archive-file=stats.gfs --enable-time-statistics=true
      gfsh> create disk-store --dir=./store1/dir1#20,./store1/dir2#10,./store1/dir3#5 --name=store1 --max-oplog-size=1
      gfsh> create region --name=store1-region --type=REPLICATE_PERSISTENT --disk-store=store1
       

      Then I started populating the region. I could see the oplog files rotating across the directories (BACKUPstore1_1 in dir1, BACKUPstore1_2 in dir2, BACKUPstore1_3 in dir3, BACKUPstore1_4 in dir1, and so on), but at some point the smallest directory can no longer fit more files. This condition is not detected when the files are created, and the server crashes.

      org.apache.geode.cache.client.ServerOperationException: remote server on bovis-z1020-172-17-0-1(23072:loner):49804:c1233620: : While performing a remote put
              at org.apache.geode.cache.client.internal.PutOp$PutOpImpl.processAck(PutOp.java:384)
              at org.apache.geode.cache.client.internal.PutOp$PutOpImpl.processResponse(PutOp.java:308)
              at org.apache.geode.cache.client.internal.PutOp$PutOpImpl.attemptReadResponse(PutOp.java:449)
              at org.apache.geode.cache.client.internal.AbstractOp.attempt(AbstractOp.java:386)
              at org.apache.geode.cache.client.internal.ConnectionImpl.execute(ConnectionImpl.java:274)
              at org.apache.geode.cache.client.internal.pooling.PooledConnection.execute(PooledConnection.java:325)
              at org.apache.geode.cache.client.internal.OpExecutorImpl.executeWithPossibleReAuthentication(OpExecutorImpl.java:892)
              at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:171)
              at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:128)
              at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:758)
              at org.apache.geode.cache.client.internal.PutOp.execute(PutOp.java:89)
              at org.apache.geode.cache.client.internal.ServerRegionProxy.put(ServerRegionProxy.java:152)
              at org.apache.geode.internal.cache.LocalRegion.serverPut(LocalRegion.java:3032)
              at org.apache.geode.internal.cache.LocalRegion.cacheWriteBeforePut(LocalRegion.java:3144)
              at org.apache.geode.internal.cache.ProxyRegionMap.basicPut(ProxyRegionMap.java:238)
              at org.apache.geode.internal.cache.LocalRegion.virtualPut(LocalRegion.java:5664)
              at org.apache.geode.internal.cache.LocalRegionDataView.putEntry(LocalRegionDataView.java:152)
              at org.apache.geode.internal.cache.LocalRegion.basicPut(LocalRegion.java:5090)
              at org.apache.geode.internal.cache.LocalRegion.validatedPut(LocalRegion.java:1635)
              at org.apache.geode.internal.cache.LocalRegion.put(LocalRegion.java:1622)
              at org.apache.geode.internal.cache.AbstractRegion.put(AbstractRegion.java:419)
              at org.apache.geode_examples.statistics.Example.insertValues(Example.java:81)
              at org.apache.geode_examples.statistics.Example.main(Example.java:49)
      Caused by: org.apache.geode.cache.DiskAccessException: For DiskStore: store1: Could not pre-allocate file /home/alb3rtobr/git/geode-examples/statistics/server1/./store1/dir3/BACKUPstore1_18.crf with size=943718, caused by java.io.IOException: not enough space left to pre-blow, available=523879, required=943718
              at org.apache.geode.internal.cache.Oplog.preblow(Oplog.java:1044)
              at org.apache.geode.internal.cache.Oplog.createCrf(Oplog.java:1072)
              at org.apache.geode.internal.cache.Oplog.<init>(Oplog.java:645)
              at org.apache.geode.internal.cache.Oplog.switchOpLog(Oplog.java:3721)
              at org.apache.geode.internal.cache.Oplog.basicCreate(Oplog.java:3506)
              at org.apache.geode.internal.cache.Oplog.create(Oplog.java:3440)
              at org.apache.geode.internal.cache.PersistentOplogSet.create(PersistentOplogSet.java:181)
              at org.apache.geode.internal.cache.DiskStoreImpl.put(DiskStoreImpl.java:719)
              at org.apache.geode.internal.cache.DiskRegion.put(DiskRegion.java:338)
              at org.apache.geode.internal.cache.entries.DiskEntry$Helper.writeBytesToDisk(DiskEntry.java:826)
              at org.apache.geode.internal.cache.entries.DiskEntry$Helper.basicUpdate(DiskEntry.java:948)
              at org.apache.geode.internal.cache.entries.DiskEntry$Helper.update(DiskEntry.java:860)
              at org.apache.geode.internal.cache.entries.AbstractDiskRegionEntry.setValue(AbstractDiskRegionEntry.java:40)
              at org.apache.geode.internal.cache.entries.AbstractRegionEntry.setValueWithTombstoneCheck(AbstractRegionEntry.java:306)
              at org.apache.geode.internal.cache.EntryEventImpl.setNewValueInRegion(EntryEventImpl.java:1710)
              at org.apache.geode.internal.cache.EntryEventImpl.putNewEntry(EntryEventImpl.java:1614)
              at org.apache.geode.internal.cache.map.RegionMapPut.createEntry(RegionMapPut.java:420)
              at org.apache.geode.internal.cache.map.RegionMapPut.createOrUpdateEntry(RegionMapPut.java:244)
              at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPutAndDeliverEvent(AbstractRegionMapPut.java:297)
              at org.apache.geode.internal.cache.map.AbstractRegionMapPut.runWithIndexUpdatingInProgress(AbstractRegionMapPut.java:305)
              at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPutIfPreconditionsSatisified(AbstractRegionMapPut.java:293)
              at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPutOnSynchronizedRegionEntry(AbstractRegionMapPut.java:279)
              at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPutOnRegionEntryInMap(AbstractRegionMapPut.java:270)
              at org.apache.geode.internal.cache.map.AbstractRegionMapPut.addRegionEntryToMapAndDoPut(AbstractRegionMapPut.java:248)
              at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPutRetryingIfNeeded(AbstractRegionMapPut.java:213)
              at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doWithIndexInUpdateMode(AbstractRegionMapPut.java:195)
              at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPut(AbstractRegionMapPut.java:177)
              at org.apache.geode.internal.cache.map.AbstractRegionMapPut.runWhileLockedForCacheModification(AbstractRegionMapPut.java:119)
              at org.apache.geode.internal.cache.map.RegionMapPut.runWhileLockedForCacheModification(RegionMapPut.java:150)
              at org.apache.geode.internal.cache.map.AbstractRegionMapPut.put(AbstractRegionMapPut.java:167)
              at org.apache.geode.internal.cache.AbstractRegionMap.basicPut(AbstractRegionMap.java:2100)
              at org.apache.geode.internal.cache.LocalRegion.virtualPut(LocalRegion.java:5664)
              at org.apache.geode.internal.cache.DistributedRegion.virtualPut(DistributedRegion.java:370)
              at org.apache.geode.internal.cache.LocalRegionDataView.putEntry(LocalRegionDataView.java:152)
              at org.apache.geode.internal.cache.LocalRegion.basicUpdate(LocalRegion.java:5644)
              at org.apache.geode.internal.cache.LocalRegion.basicBridgePut(LocalRegion.java:5281)
      Number of entries to insert (type 0 for exit):
              at org.apache.geode.internal.cache.tier.sockets.command.Put65.cmdExecute(Put65.java:388)
              at org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:178)
              at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:844)
              at org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:74)
              at org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1214)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:594)
              at org.apache.geode.internal.logging.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:121)
              at java.lang.Thread.run(Thread.java:748)
      Caused by: java.io.IOException: not enough space left to pre-blow, available=523879, required=943718
              at org.apache.geode.internal.cache.Oplog.preblow(Oplog.java:1040)
              ... 45 more
      [warn 2019/04/15 10:58:40.572 CEST <poolTimer-DEFAULT-31> tid=0x3e] Pool unexpected closed socket on server connection=Pooled Connection to bovis-z1020-172-17-0-1.extern.sw.ericsson.se:34749: Connection[DESTROYED]). Server unreachable: could not connect after 1 attempts
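
      The failing file in the trace above (BACKUPstore1_18 in dir3) is consistent with simple round-robin arithmetic. The sketch below is only a simplified model, not Geode code: it assumes every oplog set consumes the full 1 MB maximum, that nothing is compacted or deleted, and that directories are used in strict rotation with no free-space check. The class and method names are hypothetical. The real accounting clearly differs in detail (the trace reports 523879 bytes still available in dir3), so treat this as an illustration of the rotation only.

```java
public class OplogRotationSketch {

    /**
     * Simulates strict round-robin placement of oplog sets with no free-space check.
     * Returns the 1-based number of the first oplog that does not fit in its
     * directory, or -1 if every oplog up to maxOplogs fits.
     */
    public static int firstOplogThatDoesNotFit(long[] dirCapacityMb, long oplogSizeMb, int maxOplogs) {
        long[] usedMb = new long[dirCapacityMb.length];
        for (int oplog = 1; oplog <= maxOplogs; oplog++) {
            // Oplog 1 goes to the first directory, oplog 2 to the second, and so on.
            int dir = (oplog - 1) % dirCapacityMb.length;
            if (usedMb[dir] + oplogSizeMb > dirCapacityMb[dir]) {
                return oplog;
            }
            usedMb[dir] += oplogSizeMb;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Directory sizes from the create disk-store command: dir1#20, dir2#10, dir3#5.
        long[] capacitiesMb = {20, 10, 5};
        int failing = firstOplogThatDoesNotFit(capacitiesMb, 1, 100);
        int dirNumber = (failing - 1) % capacitiesMb.length + 1;
        System.out.println("first oplog that does not fit: " + failing + " (dir" + dirNumber + ")");
    }
}
```

      Under these assumptions dir3 accepts oplogs 3, 6, 9, 12 and 15, and the next one assigned to it is the first that cannot be placed.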
      

       
      I think Geode should skip the next directory in the disk store if it does not have enough space to store a new set of oplog files.
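
      The proposed behaviour could look roughly like the following. This is a minimal sketch of the idea, not Geode's actual directory-selection code: the class, the method, and the representation of directories as an array of available bytes are all hypothetical. It walks the directories in rotation order starting after the last one used and returns the first with enough room for the new oplog files.

```java
import java.util.OptionalInt;

public class NextDirSketch {

    /**
     * Picks the next directory in round-robin order that can hold requiredBytes.
     * availableBytes[i] is the free space left in directory i; lastUsedDir is the
     * index of the directory that received the previous oplog. Returns an empty
     * result when no directory has enough room.
     */
    public static OptionalInt chooseNextDir(long[] availableBytes, int lastUsedDir, long requiredBytes) {
        int n = availableBytes.length;
        for (int step = 1; step <= n; step++) {
            int candidate = (lastUsedDir + step) % n;
            if (availableBytes[candidate] >= requiredBytes) {
                return OptionalInt.of(candidate);
            }
            // Not enough room here: skip this directory and try the next one.
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // dir3 values come from the trace (available=523879, required=943718);
        // the free space for dir1 and dir2 is made up for illustration.
        long[] available = {10_000_000L, 4_000_000L, 523_879L};
        OptionalInt next = chooseNextDir(available, 1, 943_718L);
        System.out.println(next.isPresent()
                ? "use directory index " + next.getAsInt()
                : "no directory has enough space");
    }
}
```

      With dir2 as the last directory used, the sketch skips dir3 and falls through to dir1. When nothing fits, the caller can still fail, but with an error that reflects the whole disk store being full rather than one directory.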

       

       


            People

              alberto.bustamante.reyes Alberto Bustamante Reyes


                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2h 40m