HBase / HBASE-4568

Make the zk dump JSP respond more quickly

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.92.0, 0.94.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      1) Every zk dump currently creates a new ZK client instance.
      This is quite slow when any machine in the quorum is dead, because the new client connects to each machine in the ZK quorum again.

      <code>
      HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
      Configuration conf = master.getConfiguration();
      HBaseAdmin hbadmin = new HBaseAdmin(conf);
      HConnection connection = hbadmin.getConnection();
      ZooKeeperWatcher watcher = connection.getZooKeeperWatcher();
      </code>

      We can simplify this by reusing the master's existing watcher:
      <code>
      HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
      ZooKeeperWatcher watcher = master.getZooKeeperWatcher();
      </code>

      2) Also, when HBase calls getServerStats() for each machine in the ZK quorum, the timeout is hard-coded to a default of 1 minute.
      It would be nice to make this configurable and to set it to a low value.

      When HBase connects to each machine in the ZK quorum, it creates the socket first, then sets the socket timeout, and then reads with that timeout.
      The timeout therefore covers only the reads. The socket is created and connected to the ZK server with a timeout of 0, and a timeout of zero is interpreted as an infinite timeout.
      The connection then blocks until it is established or an error occurs, which can take a long time when a machine is dead.
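
The connect-timeout problem described above can be sketched in plain Java. This is a minimal illustration, not the code from the patch: the class name `FourLetterWordClient`, the method `send`, and the `timeoutMs` parameter are hypothetical. It assumes the server answers a four-letter command such as "stat" and then closes the connection. The point is that the socket is created unconnected, so the same positive timeout can bound the connect as well as the reads.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only (not an HBase or ZooKeeper class).
class FourLetterWordClient {

    /**
     * Sends a four-letter command (for example "stat") and returns the reply.
     * timeoutMs should be positive: it bounds the connect and every blocking read.
     */
    static String send(String host, int port, String cmd, int timeoutMs) throws IOException {
        // Create the socket unconnected. new Socket(host, port) would connect
        // right away, before any timeout could be applied.
        try (Socket sock = new Socket()) {
            // Bounded connect: fails with SocketTimeoutException instead of blocking.
            sock.connect(new InetSocketAddress(host, port), timeoutMs);
            // Bounded reads: applies to each blocking read on the input stream.
            sock.setSoTimeout(timeoutMs);

            OutputStream out = sock.getOutputStream();
            out.write(cmd.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            BufferedReader in = new BufferedReader(
                    new InputStreamReader(sock.getInputStream(), StandardCharsets.US_ASCII));
            StringBuilder reply = new StringBuilder();
            String line;
            // Read until the server closes the connection.
            while ((line = in.readLine()) != null) {
                reply.append(line).append('\n');
            }
            return reply.toString();
        }
    }
}
```

With this shape, a dead quorum member costs at most roughly `timeoutMs` for the connect attempt, so the timeout value could come from configuration rather than being hard-coded.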

      3) The recoverable ZooKeeper should use true exponential backoff when it hits a connection loss exception, which gives HBase a much longer time window to recover from ZK machine failures.
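
Exponential backoff as requested in item 3 can be sketched as follows. This is an assumption-laden illustration, not the RecoverableZooKeeper implementation: the class `BackoffRetry`, its method names, the cap `maxMs`, and the use of `IOException` as a stand-in for a connection loss exception are all hypothetical. The wait before retry number `attempt` is `baseMs * 2^attempt`, capped at `maxMs`.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical sketch of retry with exponential backoff (not HBase code).
class BackoffRetry {

    /** Wait before retry number attempt (0-based): baseMs * 2^attempt, capped at maxMs. */
    static long sleepMillis(long baseMs, int attempt, long maxMs) {
        // Clamp the exponent so the shift stays within a long for realistic base values.
        int shift = Math.min(attempt, 30);
        return Math.min(maxMs, baseMs << shift);
    }

    /**
     * Runs op, retrying up to maxRetries times when it throws IOException
     * (a stand-in here for a connection loss exception).
     */
    static <T> T callWithRetries(Callable<T> op, int maxRetries, long baseMs, long maxMs)
            throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                if (attempt >= maxRetries) {
                    throw e; // retries exhausted, surface the last failure
                }
                // The wait doubles on every retry until it reaches the cap.
                Thread.sleep(sleepMillis(baseMs, attempt, maxMs));
            }
        }
    }
}
```

Because the waits double, a handful of retries spans a far longer recovery window than the same number of fixed-interval retries: with a 1 second base, five retries wait 1 + 2 + 4 + 8 + 16 = 31 seconds in total instead of 5.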

      1. HBASE-4568.patch
        11 kB
        Nicolas Spiegelberg

        Activity

        Liyin Tang created issue -
        Liyin Tang made changes -
        Description: added the paragraph explaining that the socket connects with a timeout of 0 (infinite) before the socket timeout is set.
        Liyin Tang made changes -
        Description: numbered the items 1) to 3), wrapped the snippets in <code> tags, and added item 3) on exponential backoff in the recoverable ZooKeeper.
        Nicolas Spiegelberg made changes -
        Attachment: added HBASE-4568.patch [ 12498960 ]
        Nicolas Spiegelberg made changes -
        Status: Open [ 1 ] -> Resolved [ 5 ]
        Hadoop Flags: added Reviewed [ 10343 ]
        Fix Version/s: added 0.92.0 [ 12314223 ]
        Fix Version/s: added 0.94.0 [ 12316419 ]
        Resolution: set to Fixed [ 1 ]
        Lars Hofhansl made changes -
        Status: Resolved [ 5 ] -> Closed [ 6 ]

          People

          • Assignee:
            Liyin Tang
            Reporter:
            Liyin Tang
          • Votes:
            0
            Watchers:
            0
