Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- Impala 4.1.0
- None
Description
I keep seeing this error during a local perf test on my desktop machine:
E0410 07:04:10.691095 430 ShortCircuitCache.java:232] ShortCircuitCache(0x6e76c6a7): failed to release short-circuit shared memory slot Slot(slotIdx=0, shm=DfsClientShm(1effcf56a590fbc371938a368987f4e9)) by sending ReleaseShortCircuitAccessRequestProto to /var/lib/hadoop-hdfs/socket.31001. Closing shared memory segment. Java exception follows:
java.io.IOException: ERROR_INVALID: there is no shared memory segment registered with shmId 1effcf56a590fbc371938a368987f4e9
    at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache$SlotReleaser.run(ShortCircuitCache.java:214)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
I can also find it in our Jenkins jobs, but there it only happens in the data-loading phase. So I suspect it only happens under heavy workloads.
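One way to check whether the error really clusters around heavy load is to count occurrences per minute across the logs. The script below is only a sketch: the regex is derived from the single log line quoted above, so other Hadoop or Impala versions may format the message differently, and the function name `count_slot_release_failures` is made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the one log line quoted in this report (assumption:
# the glog prefix and the message text stay the same across versions).
SLOT_RELEASE_FAILURE = re.compile(
    r"^E(?P<date>\d{4}) (?P<time>\d{2}:\d{2}):\d{2}\.\d+ .*"
    r"failed to release short-circuit shared memory slot .*"
    r"shm=DfsClientShm\((?P<shm>[0-9a-f]+)\)"
)


def count_slot_release_failures(lines):
    """Count failures per (MMDD, HH:MM) bucket and collect the shm ids seen.

    `lines` is any iterable of log lines, e.g. an open log file.
    """
    per_minute = Counter()
    shm_ids = set()
    for line in lines:
        m = SLOT_RELEASE_FAILURE.match(line)
        if m:
            per_minute[(m.group("date"), m.group("time"))] += 1
            shm_ids.add(m.group("shm"))
    return per_minute, shm_ids
```

Feeding it an open log file (for example `count_slot_release_failures(open(path))`, with `path` pointing at whichever impalad log is being inspected) gives a per-minute histogram that can be compared against the timing of the data-loading phase.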
HDFS-14701 mentioned that this happens when the DataNode is stopped/restarted. But I didn't restart my HDFS cluster and I'm still able to see this error log.
It is worth investigating whether we are doing something wrong in the short-circuit read code.
Attachments
Issue Links
- is caused by: HDFS-13639 SlotReleaser is not fast enough (Resolved)
- requires: HDFS-16535 SlotReleaser should reuse the domain socket based on socket paths (Resolved)