Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- None
Description
Hi,
In summary: when a disk store uses directories of different sizes, the available space in the next directory does not seem to be checked correctly when oplog files roll over to it.
I ran a test with a persistent region and a disk store composed of three directories of different sizes.
gfsh> start locator --name=locator --bind-address=127.0.0.1
gfsh> start server --name=server1 --locators=127.0.0.1[10334] --server-port=0 --J=-Dgemfire.statistic-sampling-enabled=true --statistic-archive-file=stats.gfs --enable-time-statistics=true
gfsh> create disk-store --dir=./store1/dir1#20,./store1/dir2#10,./store1/dir3#5 --name=store1 --max-oplog-size=1
gfsh> create region --name=store1-region --type=REPLICATE_PERSISTENT --disk-store=store1
Then I started populating the region. I could see the oplog files rotating across the directories (BACKUPstore1_1 in dir1, BACKUPstore1_2 in dir2, BACKUPstore1_3 in dir3, BACKUPstore1_4 in dir1, etc.), but eventually the smallest directory can no longer fit another file. This is not detected when the new files are created, and the server crashes:
org.apache.geode.cache.client.ServerOperationException: remote server on bovis-z1020-172-17-0-1(23072:loner):49804:c1233620: : While performing a remote put
    at org.apache.geode.cache.client.internal.PutOp$PutOpImpl.processAck(PutOp.java:384)
    at org.apache.geode.cache.client.internal.PutOp$PutOpImpl.processResponse(PutOp.java:308)
    at org.apache.geode.cache.client.internal.PutOp$PutOpImpl.attemptReadResponse(PutOp.java:449)
    at org.apache.geode.cache.client.internal.AbstractOp.attempt(AbstractOp.java:386)
    at org.apache.geode.cache.client.internal.ConnectionImpl.execute(ConnectionImpl.java:274)
    at org.apache.geode.cache.client.internal.pooling.PooledConnection.execute(PooledConnection.java:325)
    at org.apache.geode.cache.client.internal.OpExecutorImpl.executeWithPossibleReAuthentication(OpExecutorImpl.java:892)
    at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:171)
    at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:128)
    at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:758)
    at org.apache.geode.cache.client.internal.PutOp.execute(PutOp.java:89)
    at org.apache.geode.cache.client.internal.ServerRegionProxy.put(ServerRegionProxy.java:152)
    at org.apache.geode.internal.cache.LocalRegion.serverPut(LocalRegion.java:3032)
    at org.apache.geode.internal.cache.LocalRegion.cacheWriteBeforePut(LocalRegion.java:3144)
    at org.apache.geode.internal.cache.ProxyRegionMap.basicPut(ProxyRegionMap.java:238)
    at org.apache.geode.internal.cache.LocalRegion.virtualPut(LocalRegion.java:5664)
    at org.apache.geode.internal.cache.LocalRegionDataView.putEntry(LocalRegionDataView.java:152)
    at org.apache.geode.internal.cache.LocalRegion.basicPut(LocalRegion.java:5090)
    at org.apache.geode.internal.cache.LocalRegion.validatedPut(LocalRegion.java:1635)
    at org.apache.geode.internal.cache.LocalRegion.put(LocalRegion.java:1622)
    at org.apache.geode.internal.cache.AbstractRegion.put(AbstractRegion.java:419)
    at org.apache.geode_examples.statistics.Example.insertValues(Example.java:81)
    at org.apache.geode_examples.statistics.Example.main(Example.java:49)
Caused by: org.apache.geode.cache.DiskAccessException: For DiskStore: store1: Could not pre-allocate file /home/alb3rtobr/git/geode-examples/statistics/server1/./store1/dir3/BACKUPstore1_18.crf with size=943718, caused by java.io.IOException: not enough space left to pre-blow, available=523879, required=943718
    at org.apache.geode.internal.cache.Oplog.preblow(Oplog.java:1044)
    at org.apache.geode.internal.cache.Oplog.createCrf(Oplog.java:1072)
    at org.apache.geode.internal.cache.Oplog.<init>(Oplog.java:645)
    at org.apache.geode.internal.cache.Oplog.switchOpLog(Oplog.java:3721)
    at org.apache.geode.internal.cache.Oplog.basicCreate(Oplog.java:3506)
    at org.apache.geode.internal.cache.Oplog.create(Oplog.java:3440)
    at org.apache.geode.internal.cache.PersistentOplogSet.create(PersistentOplogSet.java:181)
    at org.apache.geode.internal.cache.DiskStoreImpl.put(DiskStoreImpl.java:719)
    at org.apache.geode.internal.cache.DiskRegion.put(DiskRegion.java:338)
    at org.apache.geode.internal.cache.entries.DiskEntry$Helper.writeBytesToDisk(DiskEntry.java:826)
    at org.apache.geode.internal.cache.entries.DiskEntry$Helper.basicUpdate(DiskEntry.java:948)
    at org.apache.geode.internal.cache.entries.DiskEntry$Helper.update(DiskEntry.java:860)
    at org.apache.geode.internal.cache.entries.AbstractDiskRegionEntry.setValue(AbstractDiskRegionEntry.java:40)
    at org.apache.geode.internal.cache.entries.AbstractRegionEntry.setValueWithTombstoneCheck(AbstractRegionEntry.java:306)
    at org.apache.geode.internal.cache.EntryEventImpl.setNewValueInRegion(EntryEventImpl.java:1710)
    at org.apache.geode.internal.cache.EntryEventImpl.putNewEntry(EntryEventImpl.java:1614)
    at org.apache.geode.internal.cache.map.RegionMapPut.createEntry(RegionMapPut.java:420)
    at org.apache.geode.internal.cache.map.RegionMapPut.createOrUpdateEntry(RegionMapPut.java:244)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPutAndDeliverEvent(AbstractRegionMapPut.java:297)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.runWithIndexUpdatingInProgress(AbstractRegionMapPut.java:305)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPutIfPreconditionsSatisified(AbstractRegionMapPut.java:293)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPutOnSynchronizedRegionEntry(AbstractRegionMapPut.java:279)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPutOnRegionEntryInMap(AbstractRegionMapPut.java:270)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.addRegionEntryToMapAndDoPut(AbstractRegionMapPut.java:248)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPutRetryingIfNeeded(AbstractRegionMapPut.java:213)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doWithIndexInUpdateMode(AbstractRegionMapPut.java:195)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.doPut(AbstractRegionMapPut.java:177)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.runWhileLockedForCacheModification(AbstractRegionMapPut.java:119)
    at org.apache.geode.internal.cache.map.RegionMapPut.runWhileLockedForCacheModification(RegionMapPut.java:150)
    at org.apache.geode.internal.cache.map.AbstractRegionMapPut.put(AbstractRegionMapPut.java:167)
    at org.apache.geode.internal.cache.AbstractRegionMap.basicPut(AbstractRegionMap.java:2100)
    at org.apache.geode.internal.cache.LocalRegion.virtualPut(LocalRegion.java:5664)
    at org.apache.geode.internal.cache.DistributedRegion.virtualPut(DistributedRegion.java:370)
    at org.apache.geode.internal.cache.LocalRegionDataView.putEntry(LocalRegionDataView.java:152)
    at org.apache.geode.internal.cache.LocalRegion.basicUpdate(LocalRegion.java:5644)
    at org.apache.geode.internal.cache.LocalRegion.basicBridgePut(LocalRegion.java:5281)
Number of entries to insert (type 0 for exit):
    at org.apache.geode.internal.cache.tier.sockets.command.Put65.cmdExecute(Put65.java:388)
    at org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:178)
    at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:844)
    at org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:74)
    at org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:594)
    at org.apache.geode.internal.logging.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:121)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: not enough space left to pre-blow, available=523879, required=943718
    at org.apache.geode.internal.cache.Oplog.preblow(Oplog.java:1040)
    ... 45 more
[warn 2019/04/15 10:58:40.572 CEST <poolTimer-DEFAULT-31> tid=0x3e] Pool unexpected closed socket on server connection=Pooled Connection to bovis-z1020-172-17-0-1.extern.sw.ericsson.se:34749: Connection[DESTROYED]). Server unreachable: could not connect after 1 attempts
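The oplog number in the trace above is consistent with a simple model of the rotation. The sketch below is only an illustration, not Geode code. It assumes strict round-robin over the three directories, that every oplog occupies the full --max-oplog-size (1 MB), and that nothing is ever compacted or deleted. The class and method names are made up for this example.

```java
class RoundRobinOplogSimulation {

    /**
     * Simplified model (an assumption, not Geode's implementation): oplog N is
     * always placed in directory (N - 1) % dirCount, with no free-space check
     * that would redirect it elsewhere.
     *
     * Returns the id of the first oplog that does not fit in its assigned
     * directory, or -1 if all maxOplogs oplogs fit.
     */
    static int firstFailingOplog(long[] dirCapacityMb, long oplogSizeMb, int maxOplogs) {
        long[] usedMb = new long[dirCapacityMb.length];
        for (int oplogId = 1; oplogId <= maxOplogs; oplogId++) {
            int dir = (oplogId - 1) % dirCapacityMb.length;
            if (usedMb[dir] + oplogSizeMb > dirCapacityMb[dir]) {
                return oplogId; // the assigned directory is full
            }
            usedMb[dir] += oplogSizeMb;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Sizes from the create disk-store command: dir1#20, dir2#10, dir3#5
        long[] capacitiesMb = {20, 10, 5};
        int failing = firstFailingOplog(capacitiesMb, 1, 100);
        // prints: First oplog that does not fit: 18
        System.out.println("First oplog that does not fit: " + failing);
    }
}
```

Under those assumptions dir3 (5 MB) holds oplogs 3, 6, 9, 12 and 15, and the next one assigned to it is number 18, the same file number that fails in the trace. The real failure reports 523879 bytes still available, so the model is only approximate.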
I think Geode should skip the next directory in the disk store if it does not have enough space to store a new set of oplog files.
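A minimal sketch of the proposed behaviour, to make the idea concrete. This is not how Geode selects directories internally and it is not the actual fix. The class name, method name and the free-space figures for dir1 and dir2 are hypothetical. Only the dir3 value (523879 bytes available) and the required size (943718 bytes) come from the trace.

```java
import java.util.OptionalInt;

class DirectorySelector {

    /**
     * Hypothetical selection rule: starting from the directory after the one
     * used last, return the index of the first directory with at least
     * requiredBytes available. Directories that are too full are skipped.
     * Returns an empty OptionalInt if no directory can hold the new oplog.
     */
    static OptionalInt nextDirectoryWithSpace(long[] availableBytes, int lastUsed, long requiredBytes) {
        int dirCount = availableBytes.length;
        for (int step = 1; step <= dirCount; step++) {
            int candidate = (lastUsed + step) % dirCount;
            if (availableBytes[candidate] >= requiredBytes) {
                return OptionalInt.of(candidate);
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Index 0 = dir1, 1 = dir2, 2 = dir3.
        // dir1 and dir2 values are made up; the dir3 value is from the trace.
        long[] availableBytes = {10_000_000L, 5_000_000L, 523_879L};
        long requiredBytes = 943_718L; // size of the .crf file in the trace

        // The last oplog went to dir2 (index 1), so plain rotation would pick dir3.
        OptionalInt next = nextDirectoryWithSpace(availableBytes, 1, requiredBytes);
        // prints: Next directory index: 0
        System.out.println("Next directory index: "
            + (next.isPresent() ? String.valueOf(next.getAsInt()) : "none"));
    }
}
```

With these numbers dir3 is skipped and the new oplog goes to dir1 instead of failing. A real implementation would presumably also need to account for every file in the oplog set, not only the .crf, and decide what to do when no directory has room.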