Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0-alpha-1, 2.2.0, 2.3.0
    • Component/s: test
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      The logs for later tests are cluttered with error messages, like:

      2019-04-09 09:41:11,717 WARN  [LeaseRenewer:jenkins.hfs.12@localhost:41108] hdfs.LeaseRenewer(468): Failed to renew lease for [DFSClient_NONMAPREDUCE_400481390_21] for 55 seconds.  Will retry shortly ...
      java.net.ConnectException: Call From asf918.gq1.ygridcore.net/67.195.81.138 to localhost:41108 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
      	at sun.reflect.GeneratedConstructorAccessor79.newInstance(Unknown Source)
      	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
      	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
      	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
      	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
      	at org.apache.hadoop.ipc.Client.call(Client.java:1480)
      	at org.apache.hadoop.ipc.Client.call(Client.java:1413)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
      	at com.sun.proxy.$Proxy30.renewLease(Unknown Source)
      	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:595)
      	at sun.reflect.GeneratedMethodAccessor154.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
      	at com.sun.proxy.$Proxy33.renewLease(Unknown Source)
      	at sun.reflect.GeneratedMethodAccessor154.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:372)
      	at com.sun.proxy.$Proxy34.renewLease(Unknown Source)
      	at sun.reflect.GeneratedMethodAccessor154.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:372)
      	at com.sun.proxy.$Proxy34.renewLease(Unknown Source)
      	at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:901)
      	at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:423)
      	at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:448)
      	at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
      	at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:304)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: java.net.ConnectException: Connection refused
      	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
      	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
      	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
      	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
      	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
      	at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615)
      	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:713)
      	at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:376)
      	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529)
      	at org.apache.hadoop.ipc.Client.call(Client.java:1452)
      	... 26 more
      
      2019-04-09 09:41:11,949 WARN  [RS_OPEN_REGION-regionserver/asf918:33671-1] regionserver.HStore(1062): Failed flushing store file, retrying num=8
      java.io.IOException: Filesystem closed
      	at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:817)
      	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2114)
      	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
      	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
      	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
      	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
      	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:428)
      	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1425)
      	at org.apache.hadoop.hbase.regionserver.StoreFileWriter$Builder.build(StoreFileWriter.java:528)
      	at org.apache.hadoop.hbase.regionserver.HStore.createWriterInTmp(HStore.java:1144)
      	at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:64)
      	at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:1045)
      	at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2325)
      	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2811)
      	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2545)
      	at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4655)
      	at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:971)
      	at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:917)
      	at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7349)
      	at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7306)
      	at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7278)
      	at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7236)
      	at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7187)
      	at org.apache.hadoop.hbase.regionserver.handler.AssignRegionHandler.process(AssignRegionHandler.java:133)
      	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      

      This makes it really hard to find the actual problem.

      Since we always restart the mini cluster for every test, splitting this test into several UTs will not increase the total time it takes to execute these tests.
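      The proposed split can be sketched as follows. The class and test method names below are hypothetical, but the per-class mini cluster lifecycle uses the standard HBaseTestingUtility pattern found throughout the HBase test suite:

      ```java
      // Illustrative sketch only: one of the smaller test classes produced by
      // the split. Class and method names are hypothetical placeholders;
      // HBaseTestingUtility with startMiniCluster()/shutdownMiniCluster() is
      // the usual HBase test harness.
      import org.apache.hadoop.hbase.HBaseTestingUtility;
      import org.junit.AfterClass;
      import org.junit.BeforeClass;
      import org.junit.Test;

      public class TestSplitPartOne {

        private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();

        @BeforeClass
        public static void setUp() throws Exception {
          // Each split-out class gets its own fresh mini cluster, so a shutdown
          // in one class can no longer pollute the logs of later tests.
          TEST_UTIL.startMiniCluster();
        }

        @AfterClass
        public static void tearDown() throws Exception {
          TEST_UTIL.shutdownMiniCluster();
        }

        @Test
        public void testFirstScenario() throws Exception {
          // ... the first of the scenarios that previously shared one class
        }
      }
      ```

      Because the original test already restarted the cluster between test methods, moving each scenario into its own class keeps the total runtime roughly the same while keeping each class's log self-contained.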

People

    • Assignee: Duo Zhang (zhangduo)
    • Reporter: Duo Zhang (zhangduo)
    • Votes: 0
    • Watchers: 2
