  1. HBase
  2. HBASE-17919 HBase 2.x over hadoop 3.x umbrella
  3. HBASE-18458

Refactor TestRegionServerHostname to make it robust (Port HBASE-17922 to branch-1)


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.4.0
    • Component/s: hadoop3
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      TestRegionServerHostname is passing in branch-1; however, it always fails when I run it locally. Each test passes when run individually. The region server fails to start only for certain combinations of tests in the same run, which points to a resource leak between tests.

      Running org.apache.hadoop.hbase.regionserver.TestRegionServerHostname
      Tests run: 4, Failures: 0, Errors: 1, Skipped: 1, Time elapsed: 46.042 sec <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestRegionServerHostname
      testRegionServerHostnameReportedToMaster(org.apache.hadoop.hbase.regionserver.TestRegionServerHostname)  Time elapsed: 30.095 sec  <<< ERROR!
      org.junit.runners.model.TestTimedOutException: test timed out after 30000 milliseconds
      	at java.lang.Thread.sleep(Native Method)
      	at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:221)
      	at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:445)
      	at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:225)
      	at org.apache.hadoop.hbase.MiniHBaseCluster.<init>(MiniHBaseCluster.java:94)
      	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1072)
      	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:1028)
      	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:900)
      	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:894)
      	at org.apache.hadoop.hbase.regionserver.TestRegionServerHostname.testRegionServerHostnameReportedToMaster(TestRegionServerHostname.java:158)
      

      When testRegionServerHostnameReportedToMaster runs alone, or together with another newly added test, it passes without problems.
      When it runs together with testInvalidRegionServerHostnameAbortsServer in the same test suite (TestRegionServerHostname), the region server fails to start:

      2017-07-25 15:34:24,132 FATAL [RS:0;192.168.1.7:64317] regionserver.HRegionServer(2182): ABORTING region server 192.168.1.7,64317,1501022063917: Unhandled: Failed suppression of fs shutdown hook: org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@668e0f60
      java.lang.RuntimeException: Failed suppression of fs shutdown hook: org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@668e0f60
      	at org.apache.hadoop.hbase.regionserver.ShutdownHook.suppressHdfsShutdownHook(ShutdownHook.java:204)
      	at org.apache.hadoop.hbase.regionserver.ShutdownHook.install(ShutdownHook.java:84)
      	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:940)
      	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.runRegionServer(MiniHBaseCluster.java:156)
      	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$000(MiniHBaseCluster.java:108)
      	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$1.run(MiniHBaseCluster.java:140)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:360)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1846)
      	at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:307)
      	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.run(MiniHBaseCluster.java:138)
      	at java.lang.Thread.run(Thread.java:745)
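
The abort above is raised while the region server tries to suppress a filesystem shutdown hook. The following is a minimal, JDK-only sketch of one plausible way such a suppression can fail when several servers are started in the same JVM: a process-wide hook can be deregistered only once, so a second attempt reports failure. This is an assumption for illustration, not a diagnosis of this failure and not HBase's ShutdownHook code; the class and method names (ShutdownHookSuppressionSketch, suppress, trySuppress) are hypothetical.

```java
public class ShutdownHookSuppressionSketch {

    /**
     * Hypothetical stand-in for a "suppress the hook" step: deregister the
     * hook from the JVM and fail loudly if it was not registered.
     */
    static void suppress(Thread hook) {
        // Runtime.removeShutdownHook returns false when the given hook
        // is not currently registered with the JVM.
        if (!Runtime.getRuntime().removeShutdownHook(hook)) {
            throw new RuntimeException(
                "Failed suppression of shutdown hook: " + hook.getName());
        }
    }

    /** Returns true if suppression succeeded, false if it threw. */
    static boolean trySuppress(Thread hook) {
        try {
            suppress(hook);
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // One process-wide hook, shared by every "server" started in this JVM.
        Thread hook = new Thread(() -> { }, "shared-hook");
        Runtime.getRuntime().addShutdownHook(hook);

        // The first server removes the hook; the second finds nothing to remove.
        System.out.println("first server:  " + trySuppress(hook));  // true
        System.out.println("second server: " + trySuppress(hook));  // false
    }
}
```

If state like this survives from one test to the next, the outcome depends on which tests ran earlier in the same JVM, which would be consistent with the tests passing individually and failing in combination.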
      

      HBASE-17922 addressed a similar issue on Hadoop 3. I think that change is more robust than what is in branch-1 right now, so porting it to branch-1 (with small modifications to account for code differences between branch-1 and branch-2) is a good idea.

      Attachments

        1. HBASE-17922.v1-branch-1.patch
          5 kB
          Stephen Yuan Jiang


          People

            Assignee: Stephen Yuan Jiang (syuanjiang)
            Reporter: Stephen Yuan Jiang (syuanjiang)
            Votes: 0
            Watchers: 3
