Details
- Sub-task
- Status: Resolved
- Major
- Resolution: Fixed
- None
- None
- Reviewed
Description
The logs for later tests are cluttered with error messages such as:
2019-04-09 09:41:11,717 WARN [LeaseRenewer:jenkins.hfs.12@localhost:41108] hdfs.LeaseRenewer(468): Failed to renew lease for [DFSClient_NONMAPREDUCE_400481390_21] for 55 seconds. Will retry shortly ...
java.net.ConnectException: Call From asf918.gq1.ygridcore.net/67.195.81.138 to localhost:41108 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.GeneratedConstructorAccessor79.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
    at org.apache.hadoop.ipc.Client.call(Client.java:1480)
    at org.apache.hadoop.ipc.Client.call(Client.java:1413)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy30.renewLease(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:595)
    at sun.reflect.GeneratedMethodAccessor154.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy33.renewLease(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor154.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:372)
    at com.sun.proxy.$Proxy34.renewLease(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor154.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:372)
    at com.sun.proxy.$Proxy34.renewLease(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:901)
    at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:423)
    at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:448)
    at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
    at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:304)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:713)
    at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:376)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529)
    at org.apache.hadoop.ipc.Client.call(Client.java:1452)
    ... 26 more
2019-04-09 09:41:11,949 WARN [RS_OPEN_REGION-regionserver/asf918:33671-1] regionserver.HStore(1062): Failed flushing store file, retrying num=8
java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:817)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2114)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:428)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1425)
    at org.apache.hadoop.hbase.regionserver.StoreFileWriter$Builder.build(StoreFileWriter.java:528)
    at org.apache.hadoop.hbase.regionserver.HStore.createWriterInTmp(HStore.java:1144)
    at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:64)
    at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:1045)
    at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2325)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2811)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2545)
    at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4655)
    at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:971)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:917)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7349)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7306)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7278)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7236)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7187)
    at org.apache.hadoop.hbase.regionserver.handler.AssignRegionHandler.process(AssignRegionHandler.java:133)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
This makes it really hard to find the actual problem.
Since we already restart the mini cluster for every test, splitting the test into several UTs will not increase the total execution time.
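The cost argument can be illustrated with a small counting sketch. It is not HBase code: FakeMiniCluster, SplitCostSketch, and their methods are hypothetical stand-ins, and the only assumption taken from the description is that the cluster is restarted once per test method. Under that assumption, the number of cluster start-ups depends only on the number of test methods, not on how they are grouped into classes.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a mini cluster; it only counts how often it is started.
final class FakeMiniCluster {
    static int starts = 0;
    void start() { starts++; }
    void stop() { }
}

final class SplitCostSketch {
    // A test body that does nothing; only the setup cost matters here.
    private static final Consumer<FakeMiniCluster> NOOP = cluster -> { };

    // Mimics a test class whose setup/teardown restarts the cluster for every test method.
    // Returns the number of cluster start-ups caused by this class.
    static int runTestClass(List<Consumer<FakeMiniCluster>> tests) {
        int before = FakeMiniCluster.starts;
        for (Consumer<FakeMiniCluster> test : tests) {
            FakeMiniCluster cluster = new FakeMiniCluster();
            cluster.start();
            try {
                test.accept(cluster);
            } finally {
                cluster.stop();
            }
        }
        return FakeMiniCluster.starts - before;
    }

    // All test methods kept in one large class.
    static int startsInOneClass(int testCount) {
        return runTestClass(Collections.nCopies(testCount, NOOP));
    }

    // The same test methods split into classes of at most perClass methods each.
    static int startsWhenSplit(int testCount, int perClass) {
        int total = 0;
        for (int remaining = testCount; remaining > 0; remaining -= perClass) {
            int size = Math.min(perClass, remaining);
            total += runTestClass(Collections.nCopies(size, NOOP));
        }
        return total;
    }
}
```

Both startsInOneClass(6) and startsWhenSplit(6, 2) count six start-ups, so the split changes which log file each test writes to without adding cluster restarts.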