HBASE-4773

HBaseAdmin may leak ZooKeeper connections

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.90.4
    • Fix Version/s: 0.90.6, 0.92.0
    • Component/s: Client
    • Labels:
      None

      Description

      When the master crashes, HBaseAdmin leaks ZooKeeper connections.
      I think we should close the ZooKeeper connection when MasterNotRunningException is thrown:

      public HBaseAdmin(Configuration c)
          throws MasterNotRunningException, ZooKeeperConnectionException {
        this.conf = HBaseConfiguration.create(c);
        this.connection = HConnectionManager.getConnection(this.conf);
        this.pause = this.conf.getLong("hbase.client.pause", 1000);
        this.numRetries = this.conf.getInt("hbase.client.retries.number", 10);
        this.retryLongerMultiplier = this.conf.getInt("hbase.client.retries.longer.multiplier", 10);

        // We should add this code to close the ZooKeeper connection
        try {
          this.connection.getMaster();
        } catch (MasterNotRunningException e) {
          HConnectionManager.deleteConnection(conf, false);
          throw e;
        }
      }
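
The pattern proposed in the description (release the connection if the constructor fails after acquiring it) can be sketched in isolation. This is a minimal illustration only: `FakeConnection`, `FakeAdmin`, and the open-connection counter are hypothetical stand-ins, not HBase classes, and `IllegalStateException` stands in for MasterNotRunningException.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a ZooKeeper-backed connection; not an HBase class. */
class FakeConnection {
  /** Number of connections currently open, used to observe leaks. */
  static final AtomicInteger OPEN = new AtomicInteger();
  private final boolean masterUp;

  FakeConnection(boolean masterUp) {
    this.masterUp = masterUp;
    OPEN.incrementAndGet(); // opening the connection holds a socket
  }

  /** Fails when the simulated master is down. */
  void getMaster() {
    if (!masterUp) {
      throw new IllegalStateException("master not running");
    }
  }

  void close() {
    OPEN.decrementAndGet();
  }
}

/** Hypothetical admin client whose constructor may fail after connecting. */
class FakeAdmin {
  private final FakeConnection connection;

  FakeAdmin(boolean masterUp) {
    FakeConnection c = new FakeConnection(masterUp);
    try {
      c.getMaster();
    } catch (IllegalStateException e) {
      c.close(); // release the connection before propagating the failure
      throw e;
    }
    this.connection = c;
  }

  void close() {
    connection.close();
  }
}

class ConstructorCleanupDemo {
  public static void main(String[] args) {
    for (int i = 0; i < 5; i++) {
      try {
        new FakeAdmin(false);
      } catch (IllegalStateException expected) {
        // The caller never receives a reference, so it cannot close anything itself.
      }
    }
    System.out.println("open connections: " + FakeConnection.OPEN.get());
  }
}
```

The cleanup has to live in the constructor because a caller whose `new` expression throws has no object on which to call `close()`.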

      1. trunk_4773_patch.patch
        0.9 kB
        xufeng
      2. branches_4773.patch
        2 kB
        xufeng
      3. 4773.patch
        0.8 kB
        xufeng

        Activity

        ramkrishna.s.vasudevan added a comment -

        Committed some time back.

        Hudson added a comment -

        Integrated in HBase-0.92 #163 (See https://builds.apache.org/job/HBase-0.92/163/)
        HBASE-4773 HBaseAdmin may leak ZooKeeper connections (Xufeng)

        tedyu :
        Files :

        • /hbase/branches/0.92/CHANGES.txt
        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java

        xufeng added a comment -

        Thanks everyone.
        It is a milestone for me.

        Hudson added a comment -

        Integrated in HBase-TRUNK-security #13 (See https://builds.apache.org/job/HBase-TRUNK-security/13/)
        HBASE-4773 HBaseAdmin may leak ZooKeeper connections (Xufeng)

        tedyu :
        Files :

        • /hbase/trunk/CHANGES.txt
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java

        Hudson added a comment -

        Integrated in HBase-0.92-security #22 (See https://builds.apache.org/job/HBase-0.92-security/22/)
        HBASE-4773 HBaseAdmin may leak ZooKeeper connections (Xufeng)

        tedyu :
        Files :

        • /hbase/branches/0.92/CHANGES.txt
        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java

        Hudson added a comment -

        Integrated in HBase-TRUNK #2492 (See https://builds.apache.org/job/HBase-TRUNK/2492/)
        HBASE-4773 HBaseAdmin may leak ZooKeeper connections (Xufeng)

        tedyu :
        Files :

        • /hbase/trunk/CHANGES.txt
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java

        Ted Yu added a comment -

        Integrated to 0.90, 0.92 and TRUNK.

        Thanks for the patch Xufeng.

        Thanks for the review Jinchao and Ramkrishna.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12505296/trunk_4773_patch.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        -1 javadoc. The javadoc tool appears to have generated -162 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 67 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.TestFullLogReconstruction
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapreduce.TestTableMapReduce

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/392//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/392//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/392//console

        This message is automatically generated.

        ramkrishna.s.vasudevan added a comment -

        +1

        xufeng added a comment -

        @Ted
        Yes, I have run the TRUNK patch through the unit test suite in my environment.

        Ted Yu added a comment -

        @Xufeng:
        HadoopQA isn't functional at the moment.
        Please clarify whether you have run the TRUNK patch through the unit test suite.

        xufeng added a comment -

        Submitted the branch and trunk patches.
        Unit tests: OK.
        Cluster test: OK.

        gaojinchao added a comment -

        In TRUNK, before throwing the exception, we should call deleteStaleConnection to clean up the stale connection data.

        Ted Yu added a comment -

        In TRUNK, we retry connecting to master several times:

              } catch (MasterNotRunningException mnre) {
                HConnectionManager.deleteStaleConnection(this.connection);
                this.connection = HConnectionManager.getConnection(this.conf);
        

        @Xufeng:
        Can you implement a similar retry loop for 0.90?

        Thanks
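
A retry loop of the kind requested here might look roughly like the sketch below. It is an illustration under stated assumptions, not the committed patch: `MasterConnection`, `RetryingConnector`, and the factory parameter are hypothetical stand-ins for the HBase connection classes, and the retry count and pause mirror the `numRetries` and `pause` fields from the description only loosely.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a cached connection; not the HBase HConnection API. */
interface MasterConnection {
  void getMaster() throws Exception; // throws when the master is unreachable
  void close();
}

class RetryingConnector {
  /**
   * Tries up to numRetries times to reach the master. After each failed
   * attempt the stale connection is closed, and a fresh one is obtained
   * only if another attempt will follow.
   */
  static MasterConnection connect(Supplier<MasterConnection> factory,
                                  int numRetries, long pauseMs) throws Exception {
    if (numRetries < 1) {
      throw new IllegalArgumentException("numRetries must be at least 1");
    }
    MasterConnection connection = factory.get();
    Exception last = null;
    for (int tries = 0; tries < numRetries; tries++) {
      try {
        connection.getMaster();
        return connection;
      } catch (Exception e) {
        last = e;
        connection.close(); // drop the stale connection so its sockets are released
        if (tries + 1 < numRetries) {
          Thread.sleep(pauseMs);
          connection = factory.get();
        }
      }
    }
    throw last; // every attempt failed and every connection was closed
  }
}

class RetryDemo {
  public static void main(String[] args) {
    // Factory whose connections always fail, simulating a master that is down.
    Supplier<MasterConnection> masterDown = () -> new MasterConnection() {
      @Override public void getMaster() throws Exception {
        throw new Exception("master not running");
      }
      @Override public void close() {
        System.out.println("closed stale connection");
      }
    };
    try {
      RetryingConnector.connect(masterDown, 3, 10L);
    } catch (Exception e) {
      System.out.println("gave up: " + e.getMessage());
    }
  }
}
```

The point of closing inside the `catch` is that every failed attempt releases what it acquired, so the number of open connections does not grow with the number of retries.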

        ramkrishna.s.vasudevan added a comment -

        +1 on patch

        Ted Yu added a comment -

        +1 on patch.

        Can you make a patch for trunk?

        xufeng added a comment -

        Yes, I have tested it in my cluster.

        Here is my client test code:

        .....
          static void initHBase() throws ZooKeeperConnectionException
          {
            HBaseAdmin hbaseAdmin = null;
            Configuration config = HBaseConfiguration.create();
            config.set("hbase.zookeeper.quorum", "158.1.130.31,158.1.130.32,158.1.130.33");
            config.set("hbase.zookeeper.property.clientPort", "2181");
            
            try {
              hbaseAdmin = new HBaseAdmin(config);
              System.out.println("init success!");
            } catch (MasterNotRunningException e) {
              e.printStackTrace();
              initHBase();
              
            } catch (ZooKeeperConnectionException e) {
              e.printStackTrace();
              initHBase();
            }
          }
        }
        .....
        

        I did not start the HBase processes in my cluster.

        Running the test, the output of the lsof command is:

        java      16735       root   72w      REG              253,3   890569     524379 /opt/xf/hadoop.log
        java      16735       root   73w      REG              253,3   274338     524376 /opt/xf/HA_hadoop.log
        java      16735       root   74r     FIFO                0,8      0t0  110645029 pipe
        java      16735       root   75w     FIFO                0,8      0t0  110645029 pipe
        java      16735       root   76u     0000                0,9        0         21 anon_inode
        java      16735       root   77u     IPv6          110645030      0t0        TCP C3S31:35186->C3S33:eforward (ESTABLISHED)
        java      16735       root   78u     unix 0xffff8800cba90380      0t0  110645035 socket
        java      16735       root   79u     sock                0,6      0t0  110645032 can't identify protocol
        java      16735       root   80r     FIFO                0,8      0t0  110645037 pipe
        java      16735       root   81w     FIFO                0,8      0t0  110645037 pipe
        java      16735       root   82u     0000                0,9        0         21 anon_inode
        java      16735       root   83u     IPv6          110645038      0t0        TCP C3S31:53727->C3S31:eforward (ESTABLISHED)
        java      16735       root   84r     FIFO                0,8      0t0  110645043 pipe
        java      16735       root   85w     FIFO                0,8      0t0  110645043 pipe
        java      16735       root   86u     0000                0,9        0         21 anon_inode
        java      16735       root   87u     IPv6          110645044      0t0        TCP C3S31:53728->C3S31:eforward (ESTABLISHED)
        java      16735       root   88r     FIFO                0,8      0t0  110645047 pipe
        java      16735       root   89w     FIFO                0,8      0t0  110645047 pipe
        java      16735       root   90u     0000                0,9        0         21 anon_inode
        java      16735       root   91u     IPv6          110645048      0t0        TCP C3S31:47183->C3S32:eforward (ESTABLISHED)
        java      16735       root   92r     FIFO                0,8      0t0  110645050 pipe
        java      16735       root   93w     FIFO                0,8      0t0  110645050 pipe
        java      16735       root   94u     0000                0,9        0         21 anon_inode
        java      16735       root   95u     IPv6          110645051      0t0        TCP C3S31:53730->C3S31:eforward (ESTABLISHED)
        java      16735       root   96r     FIFO                0,8      0t0  110645135 pipe
        java      16735       root   97w     FIFO                0,8      0t0  110645135 pipe
        java      16735       root   98u     0000                0,9        0         21 anon_inode
        java      16735       root   99u     IPv6          110645136      0t0        TCP C3S31:49799->C3S31:eforward (ESTABLISHED)
        java      16735       root  100r     FIFO                0,8      0t0  110645143 pipe
        java      16735       root  101w     FIFO                0,8      0t0  110645143 pipe
        java      16735       root  102u     0000                0,9        0         21 anon_inode
        java      16735       root  103u     IPv6          110645144      0t0        TCP C3S31:38931->C3S32:eforward (ESTABLISHED)
        java      16735       root  104r     FIFO                0,8      0t0  110645148 pipe
        java      16735       root  105w     FIFO                0,8      0t0  110645148 pipe
        java      16735       root  106u     0000                0,9        0         21 anon_inode
        java      16735       root  107u     IPv6          110645149      0t0        TCP C3S31:59939->C3S33:eforward (ESTABLISHED)
        java      16735       root  108r     FIFO                0,8      0t0  110645507 pipe
        java      16735       root  109w     FIFO                0,8      0t0  110645507 pipe
        java      16735       root  110u     0000                0,9        0         21 anon_inode
        java      16735       root  111u     IPv6          110645508      0t0        TCP C3S31:59940->C3S33:eforward (ESTABLISHED)
        

        [eforward] is the ZooKeeper port (lsof shows port 2181 by its service name).

        The connections leak because the connection between the client and ZooKeeper is not deleted when MasterNotRunningException is thrown.

        I also tested my patch; the result is:

        java      16652       root   71r      REG              253,3   936397     524302 /opt/xf/lib/guava-r06.jar
        java      16652       root   72w      REG              253,3   786418     524379 /opt/xf/hadoop.log
        java      16652       root   73w      REG              253,3   262352     524376 /opt/xf/HA_hadoop.log
        java      16652       root   74r     FIFO                0,8      0t0  110644817 pipe
        java      16652       root   75w     FIFO                0,8      0t0  110644817 pipe
        java      16652       root   76u     0000                0,9        0         21 anon_inode
        java      16652       root   77u     IPv6          110644818      0t0        TCP C3S31:53993->C3S33:eforward (ESTABLISHED)
        java      16652       root   78u     unix 0xffff8800cbb1d9c0      0t0  110644491 socket
        java      16652       root   79u     sock                0,6      0t0  110644488 can't identify protocol
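
To compare the two listings without counting by eye, the established sockets to the ZooKeeper port can be tallied programmatically. The sketch below is a hypothetical helper (`LsofZkCounter` is not part of the patch) that assumes lsof prints the remote port by its service name, as it does above. Applied to the excerpts above, it would count nine such sockets without the patch and one with it.

```java
import java.util.Arrays;
import java.util.List;

class LsofZkCounter {
  /**
   * Counts lsof lines describing an established TCP socket whose remote
   * port is displayed as the given service name (for example "eforward").
   */
  static long countEstablished(List<String> lsofLines, String service) {
    String suffix = ":" + service + " (ESTABLISHED)";
    return lsofLines.stream()
        .filter(line -> line.contains(" TCP "))
        .filter(line -> line.trim().endsWith(suffix))
        .count();
  }

  public static void main(String[] args) {
    // Two lines taken from the patched run above, whitespace collapsed.
    List<String> sample = Arrays.asList(
        "java 16652 root 76u 0000 0,9 0 21 anon_inode",
        "java 16652 root 77u IPv6 110644818 0t0 TCP C3S31:53993->C3S33:eforward (ESTABLISHED)");
    System.out.println(countEstablished(sample, "eforward"));
  }
}
```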
        
        ramkrishna.s.vasudevan added a comment -

        @Xufeng
        Have you tested the patch in a real cluster?

        xufeng added a comment -

        Created the branch patch; here are the test results:

        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 27.675 sec
        Running org.apache.hadoop.hbase.TestScanMultipleVersions
        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 24.155 sec
        Running org.apache.hadoop.hbase.rest.model.TestStorageClusterVersionModel
        Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.026 sec
        Running org.apache.hadoop.hbase.client.TestHCM
        Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 43.686 sec

        Results :

        Tests run: 694, Failures: 0, Errors: 0, Skipped: 9


          People

          • Assignee:
            xufeng
            Reporter:
            gaojinchao
          • Votes:
            0
            Watchers:
            2
