Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-4460

Support running an embedded ThriftServer within a RegionServer

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.94.0
    • Component/s: regionserver, Thrift
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Run a ThriftServer embedded within a RegionServer process by setting "hbase.regionserver.export.thrift" to true.

      Description

      Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops all together.

      This JIRA is just about the embedded ThriftServer. Will open others for the rest.

        Issue Links

          Activity

          Hide
          hudson Hudson added a comment -

          Integrated in HBase-TRUNK #2344 (See https://builds.apache.org/job/HBase-TRUNK/2344/)
          HBASE-4460 Support running an embedded ThriftServer within a RegionServer (jgray)

          jgray :
          Files :

          • /hbase/trunk/CHANGES.txt
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java
          Show
          hudson Hudson added a comment - Integrated in HBase-TRUNK #2344 (See https://builds.apache.org/job/HBase-TRUNK/2344/ ) HBASE-4460 Support running an embedded ThriftServer within a RegionServer (jgray) jgray : Files : /hbase/trunk/CHANGES.txt /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java
          Hide
          streamy Jonathan Gray added a comment -

          New feature, only going to trunk.

          Show
          streamy Jonathan Gray added a comment - New feature, only going to trunk.
          Hide
          streamy Jonathan Gray added a comment -

          Committed to trunk. Thanks for the reviews LarsH and KarthikR.

          Opened HBASE-4631 to implement some of your suggestions.

          Show
          streamy Jonathan Gray added a comment - Committed to trunk. Thanks for the reviews LarsH and KarthikR. Opened HBASE-4631 to implement some of your suggestions.
          Hide
          jiraposter@reviews.apache.org jiraposter@reviews.apache.org added a comment -

          On 2011-10-20 00:34:31, Karthik Ranganathan wrote:

          > Looks good to me, one minor nit... +1

          Thanks for reviews guys. I'm going to file a follow-up JIRA to deal with your comments (cleanup and optimize).

          • Jonathan

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2410/#review2689
          -----------------------------------------------------------

          On 2011-10-17 22:37:43, Jonathan Gray wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2410/

          -----------------------------------------------------------

          (Updated 2011-10-17 22:37:43)

          Review request for hbase, Dhruba Borthakur, Gary Helmling, Michael Stack, and Andrew Purtell.

          Summary

          -------

          Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops all together.

          This addresses bug HBASE-4460.

          https://issues.apache.org/jira/browse/HBASE-4460

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 1174376

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1174376

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java PRE-CREATION

          Diff: https://reviews.apache.org/r/2410/diff

          Testing

          -------

          Running this already on our hbase-92-based branch and running test site.

          Thanks,

          Jonathan

          Show
          jiraposter@reviews.apache.org jiraposter@reviews.apache.org added a comment - On 2011-10-20 00:34:31, Karthik Ranganathan wrote: > Looks good to me, one minor nit... +1 Thanks for reviews guys. I'm going to file a follow-up JIRA to deal with your comments (cleanup and optimize). Jonathan ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2410/#review2689 ----------------------------------------------------------- On 2011-10-17 22:37:43, Jonathan Gray wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2410/ ----------------------------------------------------------- (Updated 2011-10-17 22:37:43) Review request for hbase, Dhruba Borthakur, Gary Helmling, Michael Stack, and Andrew Purtell. Summary ------- Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops all together. This addresses bug HBASE-4460 . https://issues.apache.org/jira/browse/HBASE-4460 Diffs ----- /src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 1174376 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1174376 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java PRE-CREATION Diff: https://reviews.apache.org/r/2410/diff Testing ------- Running this already on our hbase-92-based branch and running test site. Thanks, Jonathan
          Hide
          jiraposter@reviews.apache.org jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2410/#review2689
          -----------------------------------------------------------

          Looks good to me, one minor nit... +1

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java
          <https://reviews.apache.org/r/2410/#comment6066>

          Just a code-readability comment - could we structure as:

          Get get = new Get(row);
          get.setTimeRange(Long.MIN_VALUE, timestamp);

          if (columns != null)

          { // ... }

          Result result = rs.get(regionName, get);
          return ThriftUtilities.rowResultFromHBase(result);

          • Karthik

          On 2011-10-17 22:37:43, Jonathan Gray wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2410/

          -----------------------------------------------------------

          (Updated 2011-10-17 22:37:43)

          Review request for hbase, Dhruba Borthakur, Gary Helmling, Michael Stack, and Andrew Purtell.

          Summary

          -------

          Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops all together.

          This addresses bug HBASE-4460.

          https://issues.apache.org/jira/browse/HBASE-4460

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 1174376

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1174376

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java PRE-CREATION

          Diff: https://reviews.apache.org/r/2410/diff

          Testing

          -------

          Running this already on our hbase-92-based branch and running test site.

          Thanks,

          Jonathan

          Show
          jiraposter@reviews.apache.org jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2410/#review2689 ----------------------------------------------------------- Looks good to me, one minor nit... +1 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java < https://reviews.apache.org/r/2410/#comment6066 > Just a code-readability comment - could we structure as: Get get = new Get(row); get.setTimeRange(Long.MIN_VALUE, timestamp); if (columns != null) { // ... } Result result = rs.get(regionName, get); return ThriftUtilities.rowResultFromHBase(result); Karthik On 2011-10-17 22:37:43, Jonathan Gray wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2410/ ----------------------------------------------------------- (Updated 2011-10-17 22:37:43) Review request for hbase, Dhruba Borthakur, Gary Helmling, Michael Stack, and Andrew Purtell. Summary ------- Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops all together. This addresses bug HBASE-4460 . https://issues.apache.org/jira/browse/HBASE-4460 Diffs ----- /src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 1174376 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1174376 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java PRE-CREATION Diff: https://reviews.apache.org/r/2410/diff Testing ------- Running this already on our hbase-92-based branch and running test site. Thanks, Jonathan
          Hide
          yuzhihong@gmail.com Ted Yu added a comment -

          w.r.t. Lars' comment, how about using the following code to obtain HRegionLocation ?

              this.connection = HConnectionManager.getConnection(this.conf);
          ...
              HRegionLocation location = connection.locateRegion(tableName.array(),
                HRegionInfo.createRegionName(tableName.array(), null, HConstants.NINES, false));
          
          Show
          yuzhihong@gmail.com Ted Yu added a comment - w.r.t. Lars' comment, how about using the following code to obtain HRegionLocation ? this .connection = HConnectionManager.getConnection( this .conf); ... HRegionLocation location = connection.locateRegion(tableName.array(), HRegionInfo.createRegionName(tableName.array(), null , HConstants.NINES, false ));
          Hide
          jiraposter@reviews.apache.org jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2410/#review2640
          -----------------------------------------------------------

          This makes much more sense than having to run a separate thrift server.

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java
          <https://reviews.apache.org/r/2410/#comment5945>

          I think we're no longer supposed to put the copyright notice with year in, but just the license grant.

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java
          <https://reviews.apache.org/r/2410/#comment5948>

          I wish there'd be a more lightweight way to do this. I know these are cached in BaseHandler, but each HTable will have it's ThreadPoolExecutor, etc, and having these "inside" RegionServer just to get the regionname seems wasteful.
          In fact this will establish another request to the same regionserver again, if it wasn't in the cache.

          On the other hand I don't see an easy way to avoid this...
          Maybe the RegionServer itself could provide a facility to find a region it manages by table/row, but that different from how HBae operates normally... (for another jira)

          • Lars

          On 2011-10-17 22:37:43, Jonathan Gray wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2410/

          -----------------------------------------------------------

          (Updated 2011-10-17 22:37:43)

          Review request for hbase, Dhruba Borthakur, Gary Helmling, Michael Stack, and Andrew Purtell.

          Summary

          -------

          Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops all together.

          This addresses bug HBASE-4460.

          https://issues.apache.org/jira/browse/HBASE-4460

          Diffs

          -----

          /src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 1174376

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1174376

          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java PRE-CREATION

          Diff: https://reviews.apache.org/r/2410/diff

          Testing

          -------

          Running this already on our hbase-92-based branch and running test site.

          Thanks,

          Jonathan

          Show
          jiraposter@reviews.apache.org jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2410/#review2640 ----------------------------------------------------------- This makes much more sense than having to run a separate thrift server. /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java < https://reviews.apache.org/r/2410/#comment5945 > I think we're no longer supposed to put the copyright notice with year in, but just the license grant. /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java < https://reviews.apache.org/r/2410/#comment5948 > I wish there'd be a more lightweight way to do this. I know these are cached in BaseHandler, but each HTable will have it's ThreadPoolExecutor, etc, and having these "inside" RegionServer just to get the regionname seems wasteful. In fact this will establish another request to the same regionserver again, if it wasn't in the cache. On the other hand I don't see an easy way to avoid this... Maybe the RegionServer itself could provide a facility to find a region it manages by table/row, but that different from how HBae operates normally... (for another jira) Lars On 2011-10-17 22:37:43, Jonathan Gray wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2410/ ----------------------------------------------------------- (Updated 2011-10-17 22:37:43) Review request for hbase, Dhruba Borthakur, Gary Helmling, Michael Stack, and Andrew Purtell. Summary ------- Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops all together. This addresses bug HBASE-4460 . https://issues.apache.org/jira/browse/HBASE-4460 Diffs ----- /src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 1174376 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1174376 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java PRE-CREATION Diff: https://reviews.apache.org/r/2410/diff Testing ------- Running this already on our hbase-92-based branch and running test site. Thanks, Jonathan
          Hide
          jiraposter@reviews.apache.org jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2410/
          -----------------------------------------------------------

          Review request for hbase, Dhruba Borthakur, Gary Helmling, Michael Stack, and Andrew Purtell.

          Summary
          -------

          Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops all together.

          This addresses bug HBASE-4460.
          https://issues.apache.org/jira/browse/HBASE-4460

          Diffs


          /src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 1174376
          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1174376
          /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java PRE-CREATION

          Diff: https://reviews.apache.org/r/2410/diff

          Testing
          -------

          Running this already on our hbase-92-based branch and running test site.

          Thanks,

          Jonathan

          Show
          jiraposter@reviews.apache.org jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2410/ ----------------------------------------------------------- Review request for hbase, Dhruba Borthakur, Gary Helmling, Michael Stack, and Andrew Purtell. Summary ------- Rather than a separate process, it can be advantageous in some situations for each RegionServer to embed their own ThriftServer. This allows each embedded ThriftServer to short-circuit any queries that should be executed on the local RS and skip the extra hop. This then enables the building of fat Thrift clients that cache region locations and avoid extra hops all together. This addresses bug HBASE-4460 . https://issues.apache.org/jira/browse/HBASE-4460 Diffs /src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 1174376 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1174376 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java PRE-CREATION Diff: https://reviews.apache.org/r/2410/diff Testing ------- Running this already on our hbase-92-based branch and running test site. Thanks, Jonathan
          Hide
          streamy Jonathan Gray added a comment -

          Since security stuff can be dealt with in a separate JIRA, what do people think of the patch I have up? Shall I submit to rb?

          Show
          streamy Jonathan Gray added a comment - Since security stuff can be dealt with in a separate JIRA, what do people think of the patch I have up? Shall I submit to rb?
          Hide
          stack stack added a comment -

          @Gary I wonder if thrift doesn't have this already?

          Show
          stack stack added a comment - @Gary I wonder if thrift doesn't have this already?
          Hide
          ghelmling Gary Helmling added a comment -

          Also just to second what Andy said for longer-term, we'll still want to somehow provide SASL auth for thrift clients, with the thrift server acting as a proxy on their behalf, but that seems a much bigger project.

          Show
          ghelmling Gary Helmling added a comment - Also just to second what Andy said for longer-term, we'll still want to somehow provide SASL auth for thrift clients, with the thrift server acting as a proxy on their behalf, but that seems a much bigger project.
          Hide
          streamy Jonathan Gray added a comment -

          Gary, want to open another JIRA and link it here?

          Show
          streamy Jonathan Gray added a comment - Gary, want to open another JIRA and link it here?
          Hide
          ghelmling Gary Helmling added a comment -

          .bq Open to your feedback on what I can do to better integrate with security stuff but not sure what I can do at this point.

          For the current patch on HBASE-4099, I think not much other than make sure we have a way of flagging that the ThriftServer is embedded so we skip the login. Though in that case I can't picture wanting to do embedded thrift + security at the same time, since all thrift clients would have effective access as the region server process user (circumventing security).

          The embedded thrift server + login + security might all work together if we:

          • add a User.loginAndReturnUser() variant that delegates to UserGroupInformation.loginUserFromKeytabAndReturnUGI(), then returns a wrapping User instance
          • call this method on startup for the embedded thrift server to get the thrift user instance
          • use User.runAs() to execute the body of HRegionThriftServer.run() as the logged in thrift user

          In any case, all of that seems like it should go in a separate JIRA.

          Show
          ghelmling Gary Helmling added a comment - .bq Open to your feedback on what I can do to better integrate with security stuff but not sure what I can do at this point. For the current patch on HBASE-4099 , I think not much other than make sure we have a way of flagging that the ThriftServer is embedded so we skip the login. Though in that case I can't picture wanting to do embedded thrift + security at the same time, since all thrift clients would have effective access as the region server process user (circumventing security). The embedded thrift server + login + security might all work together if we: add a User.loginAndReturnUser() variant that delegates to UserGroupInformation.loginUserFromKeytabAndReturnUGI(), then returns a wrapping User instance call this method on startup for the embedded thrift server to get the thrift user instance use User.runAs() to execute the body of HRegionThriftServer.run() as the logged in thrift user In any case, all of that seems like it should go in a separate JIRA.
          Hide
          streamy Jonathan Gray added a comment -

          Replacing HRPC is another story but I think many of us are in agreement that we'd like to do that eventually. The scope here is much smaller and I'm working on a set of changes to allow fat Thrift-based clients, not necessarily replacing normal HRPC.

          Open to your feedback on what I can do to better integrate with security stuff but not sure what I can do at this point.

          Show
          streamy Jonathan Gray added a comment - Replacing HRPC is another story but I think many of us are in agreement that we'd like to do that eventually. The scope here is much smaller and I'm working on a set of changes to allow fat Thrift-based clients, not necessarily replacing normal HRPC. Open to your feedback on what I can do to better integrate with security stuff but not sure what I can do at this point.
          Hide
          ghelmling Gary Helmling added a comment -

          Linking to ThriftServer authentication issue. If running embedded, the ThriftServer probably shouldn't do a separate login from the RegionServer.

          Show
          ghelmling Gary Helmling added a comment - Linking to ThriftServer authentication issue. If running embedded, the ThriftServer probably shouldn't do a separate login from the RegionServer.
          Hide
          apurtell Andrew Purtell added a comment -

          This is possibly a first step toward replacing HRPC with Thrift. Your thoughts there, please.

          If so, we should consider bringing the capability to wrap sockets with SASL into the ThriftServer.

          Show
          apurtell Andrew Purtell added a comment - This is possibly a first step toward replacing HRPC with Thrift. Your thoughts there, please. If so, we should consider bringing the capability to wrap sockets with SASL into the ThriftServer.
          Hide
          streamy Jonathan Gray added a comment -

          Adds HRegionThriftServer, a RegionServer hosted ThriftServer. Default is off, can be turned on with "hbase.regionserver.export.thrift" set to true.

          Show
          streamy Jonathan Gray added a comment - Adds HRegionThriftServer , a RegionServer hosted ThriftServer. Default is off, can be turned on with "hbase.regionserver.export.thrift" set to true.

            People

            • Assignee:
              streamy Jonathan Gray
              Reporter:
              streamy Jonathan Gray
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development