Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Below are the hard limit and related configs:
ozone getconf -confKey ozone.om.lease.hard.limit
8m
ozone getconf -confKey ozone.om.open.key.cleanup.service.interval
5m
ozone getconf -confKey ozone.om.open.key.expire.threshold
6m
Created a file /hsyncvol/hsyncbuck/hsync/File_0.txt, wrote some data to it, called hsync, and then kept it open. The final modification was at 2024-04-04T16:12:39.
{
  "volumeName" : "hsyncvol",
  "bucketName" : "hsyncbuck",
  "name" : "hsync/File_0.txt",
  "dataSize" : 26214400,
  "creationTime" : "2024-04-04T16:12:38.263Z",
  "modificationTime" : "2024-04-04T16:12:39.660Z",
  "replicationConfig" : {
    "replicationFactor" : "THREE",
    "requiredNodes" : 3,
    "replicationType" : "RATIS"
  },
  "metadata" : {
    "hsyncClientId" : "112213829764055054"
  },
  "ozoneKeyLocations" : [ {
    "containerID" : 11,
    "localID" : 113750153625603015,
    "length" : 26214400,
    "offset" : 0,
    "keyOffset" : 0
  } ],
  "file" : true
}
More than an hour has passed and the file is still in the OpenKeyTable:
> date
Thu Apr 4 17:22:06 UTC 2024
> ozone admin om lof --service-id=ozone1712158888 --prefix=/hsyncvol/hsyncbuck/
0 total open files (est.). Showing 1 open files (limit 100) under path prefix:
/hsyncvol/hsyncbuck/
Client ID Creation time Hsync'ed Open File Path
112213829764055054 1712247158263 Yes /hsyncvol/hsyncbuck/-9223372036851973887/File_0.txt
Reached the end of the list.
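To put numbers on "more than an hour", here is a rough Python sketch that compares the timestamps above with the configured limits. It assumes that the cleanup service is expected to recover an hsync'ed open key once it has been idle longer than ozone.om.lease.hard.limit; the helper parse_minutes and all variable names are made up for illustration and are not part of Ozone.

```python
from datetime import datetime, timedelta, timezone


def parse_minutes(value: str) -> timedelta:
    """Parse a duration such as '8m' (only the minute suffix seen in this report)."""
    if not value.endswith("m"):
        raise ValueError(f"unsupported duration: {value!r}")
    return timedelta(minutes=int(value[:-1]))


# Values reported by `ozone getconf` above.
hard_limit = parse_minutes("8m")        # ozone.om.lease.hard.limit
cleanup_interval = parse_minutes("5m")  # ozone.om.open.key.cleanup.service.interval

# modificationTime from the key info and the `date` output before the lof call.
last_modified = datetime(2024, 4, 4, 16, 12, 39, 660000, tzinfo=timezone.utc)
checked_at = datetime(2024, 4, 4, 17, 22, 6, tzinfo=timezone.utc)

# The lof "Creation time" column is epoch milliseconds; convert with integer
# arithmetic so the millisecond part is exact.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
created = epoch + timedelta(milliseconds=1712247158263)

elapsed = checked_at - last_modified
overdue = elapsed - hard_limit
# Whole cleanup intervals that fit into the overdue period (assumption: each
# one was a chance for the service to recover the key).
missed_runs = overdue // cleanup_interval

print("created     :", created.isoformat())
print("elapsed     :", elapsed)
print("overdue     :", overdue)
print("missed runs :", missed_runs)
```

With the values from this report, the lof creation time matches creationTime in the key info, and the file has been idle for about 69 minutes, roughly 61 minutes past the 8m hard limit.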
The OM leader logs show the following error every 5 minutes:
2024-04-04 17:18:17,437 ERROR [om74-OMStateMachineApplyTransactionThread - 0]-org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequest: Key committed failed. Volume:hsyncvol, Bucket:hsyncbuck, Key:File_0.txt. Exception:{}
KEY_NOT_FOUND org.apache.hadoop.ozone.om.exceptions.OMException: Failed to commit key, as /-9223372036851974912/-9223372036851974400/-9223372036851974400/File_0.txt/112213829764055054 entry is not found in the OpenKey table
    at org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequestWithFSO.validateAndUpdateCache(OMKeyCommitRequestWithFSO.java:163)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.lambda$0(OzoneManagerRequestHandler.java:406)
    at org.apache.hadoop.util.MetricUtil.captureLatencyNs(MetricUtil.java:45)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequestImpl(OzoneManagerRequestHandler.java:404)
    at org.apache.hadoop.ozone.protocolPB.RequestHandler.handleWriteRequest(RequestHandler.java:63)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:525)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:343)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
. . .
2024-04-04 17:23:17,436 ERROR [om74-OMStateMachineApplyTransactionThread - 0]-org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequest: Key committed failed. Volume:hsyncvol, Bucket:hsyncbuck, Key:File_0.txt. Exception:{}
KEY_NOT_FOUND org.apache.hadoop.ozone.om.exceptions.OMException: Failed to commit key, as /-9223372036851974912/-9223372036851974400/-9223372036851974400/File_0.txt/112213829764055054 entry is not found in the OpenKey table
    at org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequestWithFSO.validateAndUpdateCache(OMKeyCommitRequestWithFSO.java:163)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.lambda$0(OzoneManagerRequestHandler.java:406)
    at org.apache.hadoop.util.MetricUtil.captureLatencyNs(MetricUtil.java:45)
    at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequestImpl(OzoneManagerRequestHandler.java:404)
    at org.apache.hadoop.ozone.protocolPB.RequestHandler.handleWriteRequest(RequestHandler.java:63)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:525)
    at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:343)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
. . . .
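Two things can be read off the log mechanically, sketched below in Python: the gap between the two errors, and how the key in the exception compares with the path listed by lof. The sketch assumes the open-key entry is laid out as /volumeId/bucketId/parentId/fileName/clientId and that the numeric component in the lof path is the parent directory ID; both are assumptions about the FSO layout, and the variable names are illustrative only. It only surfaces the mismatch and does not establish the root cause.

```python
from datetime import datetime, timedelta

# Timestamps of the two ERROR lines above (log4j-style, comma before millis).
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
first = datetime.strptime("2024-04-04 17:18:17,437", LOG_TS_FORMAT)
second = datetime.strptime("2024-04-04 17:23:17,436", LOG_TS_FORMAT)
gap = second - first

# Key that the commit request could not find in the OpenKey table.
missing_key = (
    "/-9223372036851974912/-9223372036851974400/-9223372036851974400"
    "/File_0.txt/112213829764055054"
)
# Assumed layout: /volumeId/bucketId/parentId/fileName/clientId
volume_id, bucket_id, commit_parent, file_name, client_id = (
    missing_key.strip("/").split("/")
)

# Path reported by `ozone admin om lof` for the same client ID.
lof_path = "/hsyncvol/hsyncbuck/-9223372036851973887/File_0.txt"
# Assumed layout: /volume/bucket/parentId/fileName
_, _, lof_parent, lof_file_name = lof_path.strip("/").split("/")

print("gap between errors      :", gap)
print("parent ID in commit key :", commit_parent)
print("parent ID in lof output :", lof_parent)
print("commit parent == bucket :", commit_parent == bucket_id)
print("parent IDs match        :", commit_parent == lof_parent)
```

The gap is one millisecond short of 5 minutes, in line with the 5m cleanup service interval. The commit request uses the bucket ID as the parent and the key name File_0.txt, while the open file was created as hsync/File_0.txt and is listed under a different parent ID, so the lookup misses the existing entry.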