Uploaded image for project: 'Phoenix'
  1. Phoenix
  2. PHOENIX-4216

Figure out why tests randomly fail with master not able to initialize in 200 seconds

    XMLWordPrintableJSON

Details

    Description

      Sample failure:
      https://builds.apache.org/job/PreCommit-PHOENIX-Build/1450//testReport/

      apurtell - Looking at the thread dump in the above link, do you see why master startup failed? I couldn't see any obvious deadlocks

       

      Exception stacktrace:

      org.apache.hadoop.hbase.regionserver.HRegionServer(2414): Master rejected startup because clock is out of syncorg.apache.hadoop.hbase.regionserver.HRegionServer(2414): Master rejected startup because clock is out of syncorg.apache.hadoop.hbase.ClockOutOfSyncException: org.apache.hadoop.hbase.ClockOutOfSyncException: Server 2a3b1691db3a,42899,1590685404919 has been rejected; Reported time is too far out of sync with master.  Time difference of 1590685396313ms > max allowed of 30000ms at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:411) at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:277) at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:368) at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2417) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:186) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:166)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:95) at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:85) at org.apache.hadoop.hbase.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:372) at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:331) at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2412) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:960) at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.runRegionServer(MiniHBaseCluster.java:158) at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$000(MiniHBaseCluster.java:110) at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$1.run(MiniHBaseCluster.java:142) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:360) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1744) at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:334) at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.run(MiniHBaseCluster.java:139) at java.lang.Thread.run(Thread.java:748)Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.ClockOutOfSyncException): org.apache.hadoop.hbase.ClockOutOfSyncException: Server 2a3b1691db3a,42899,1590685404919 has been rejected; Reported time is too far out of sync with master.  Time difference of 1590685396313ms > max allowed of 30000ms at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:411) at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:277) at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:368) at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2417) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:186) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:166)
      at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1291) at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:231) at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:340) at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:8982) at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2410) ... 10 more

      Attachments

        1. Precommit-3849.log
          200 kB
          Neha Gupta

        Activity

          People

            Unassigned Unassigned
            samarthjain Samarth Jain
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: