Details
-
Bug
-
Status: Resolved
-
Minor
-
Resolution: Fixed
-
2.3.6
-
None
-
None
-
Reviewed
Description
While writing a custom RegionObserver and using InternalScan.checkOnlyMemStore() I stumbled upon an issue. The number of files opened by region servers would grow steadily and eventually region servers would crash with
2021-11-15 00:54:34,290 ERROR [MemStoreFlusher.1] regionserver.HStore: Failed to commit store file hdfs://<...removed...>:8020/hbase/data/default/<...removed...>/743071139057c819d7e6f7b59f065152/.tmp/f/394ba71102ec401d8779aa5f45819f84 java.io.IOException: Failed on local exception: java.io.IOException: Too many open files; Host Details : local host is: "<...removed...>/<...removed...>"; destination host is: "<...removed...>":8020; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:805) at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1544) at org.apache.hadoop.ipc.Client.call(Client.java:1486) at org.apache.hadoop.ipc.Client.call(Client.java:1385) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) at com.sun.proxy.$Proxy27.getFileInfo(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:800)
Another symptom is the following messages in region server logs:
2021-11-18 13:41:29,539 INFO [RS_COMPACTED_FILES_DISCHARGER-regionserver/<...removed...>:16020-7] regionserver.HStore: Can't archive compacted file hdfs://<...removed...>:8020/hbase/data/default/<...removed...>/ce6f08fdd82967df94d1c83e289d3142/f/9b01b8bdea324c22a92f1ba6b386e050.6261a24c6a689d5406ae1ea87dc9bb9f because of either isCompactedAway=true or file has reference, isReferencedInReads=true, refCount=7, skipping for now.
The culprit is KeyValueScanner not being closed in StoreScanner.selectScannersFrom() before `continue`
Attachments
Attachments
Issue Links
- links to