HBase / HBASE-11464 Abstract hbase-client from ZooKeeper / HBASE-11467

New impl of Registry interface not using ZK + new RPCs on master protocol

    Details

    • Type: Sub-task
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.0.0
    • Component/s: Client, Consensus, Zookeeper
    • Labels: None

      Description

      Currently there's only one implementation of the Registry interface, and it uses ZK to get information about meta. We need to create an implementation that instead uses RPC calls to the master the client is connected to.

      Review of an early version of the patch is here: https://reviews.apache.org/r/24296/
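
      A minimal sketch of what the RPC-backed Registry could look like, based on the methods mentioned in the discussion below (getMetaRegionLocation, isTableOnlineState, getCurrentNrHRS); names and signatures here are illustrative, not necessarily those in the attached patch:

      import java.io.IOException;
      import org.apache.hadoop.hbase.RegionLocations;
      import org.apache.hadoop.hbase.TableName;

      // Abstraction the client uses to bootstrap a connection: where meta is,
      // what the cluster id is, whether a table is enabled, how many region servers exist.
      interface Registry {
        RegionLocations getMetaRegionLocation() throws IOException;
        String getClusterId() throws IOException;
        boolean isTableOnlineState(TableName tableName, boolean enabled) throws IOException;
        int getCurrentNrHRS() throws IOException;
      }

      // Today a ZK-based implementation reads these from znodes; this issue adds an
      // implementation that instead issues the new master-protocol RPCs (e.g.
      // GetMetaRegionLocation, GetTableState) against the master the client talks to.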

      1. HBASE-11467.patch
        251 kB
        Mikhail Antonov
      2. HBASE-11467.patch
        303 kB
        Mikhail Antonov
      3. HBASE-11467.patch
        308 kB
        Mikhail Antonov

          Activity

          Mikhail Antonov added a comment -

          pinging stack - any thoughts on the latest patch up on RB?

          Mikhail Antonov added a comment -

          Updated the diff on RB with latest changes.

          stack Right, I think we need to have a dedicated discussion around master changes/topology. On the question about

          If we partition the master – a far-out possibility that hopefully we never have to do where master 1. does tables A-D and master 2. does tables E-H, etc. – then how would this play out? The cluster string would have to list all masters ?

          I think the more robust solution would be forwarding from one master to another (each master should know the full cluster topology anyway, but clients shouldn't have to; they should be able to ask any master they know about, i.e. a client may only know about some subset of masters, and still get the location of the master they actually need). More brainstorming is needed here, for sure.

          stack added a comment -

          ... so master-based cluster isn't what's coming with this patch.

          But master-character changes now in that for a client to connect, master must be up (currently this is the case, but only out of laziness.... we need to make it so a client can connect w/o having to check that master is up). This goes against a long-time, though soft-spoken, tenet that needs to be made more pronounced: that master can go away for periods of time and the cluster keeps running. If we partition the master – a far-out possibility that hopefully we never have to do, where master 1 does tables A-D and master 2 does tables E-H, etc. – then how would this play out? The cluster string would have to list all masters?

          Do you suggest to put this patch on hold, or modify it so it's more like "second option to connect to cluster"?

          The patch is great. Nice cleanup. Would be good to get some of it in, the least controversial parts. Post new rb and will review w/ that in mind.

          We need to chat on master topology on dev list and maybe stakeholders should meet and powwow...

          Mikhail Antonov added a comment -

          How would we migrate a zk-based cluster to a master-based one?

          You mean, to a cluster which has multiple masters and doesn't have ZK connectivity on the client side, or to a cluster which doesn't have ZK at all? For the former case, migration would require updating client jars to pull in the new code + modifying config files to use the masters connection instead of connecting to ZK.

          Mikhail Antonov added a comment -

          stack not the latest one, in the sense that I'm going to publish a new version on RB tonight or early this week, maybe tomorrow (mostly it's fixing the tests with the changed invariants/expectations of master being up/down). In the new version, this setRegistry() will go away. Found a way to get rid of it.

          At the high level - I totally agree that this change is a radical one, and I targeted it for 2.0 for now. I'll start the discussion on the mailing list, too. On a side note - this patch itself doesn't change the way the servers themselves (HMs or HRSs) talk to each other, so a master-based cluster isn't what's coming with this patch.

          Do you suggest to put this patch on hold, or modify it so it's more like "second option to connect to cluster"?

          stack added a comment -

          Mikhail Antonov is latest patch up on rb?

          Why a setRegistry in ClusterConnection? Why not just read from config or on construction?

          On the higher-level question, this patch makes a radical change – disruptive – in folks' expectations, undoing zk as the first port-of-call when connecting to a cluster and instead putting in place a cluster of masters as the cluster id string. Needs surfacing on dev list for discussion. 2.0 hbase at least. How would we migrate a zk-based cluster to a master-based one?

          Could we take the bulk of this patch and keep the zk string as the minimum to connect to a cluster until we put a stake in the ground for topology in 'scaling 1M regions', and while ruminating on the implications of going from zk to a master quorum?

          Mikhail Antonov added a comment -

          stack maybe we should bring the conversation started on RB back here in jira..

          So, in the context of the discussion happening in the 'scaling to 1M regions' jira - how would you suggest to proceed with this patch? I addressed most of the comments from the review board (thanks for the reviews!), incorporated the feedback from ryan rawson above, and am working on fixing and reworking tests (some failures aren't related, but some are caused by certain changes in the minicluster infrastructure). Do you think it may be 'committable' by itself, or should it be on hold until further steps in the cluster topology discussion? Would appreciate your thoughts on that.

          Mikhail Antonov added a comment -

          Thanks for the review ryan rawson! I'll post an updated patch soon with those fixed + modified handling of registries on the server side.

          ryan rawson added a comment -

          So I checked this patch out, a few things really stood out to me:

          • Let's not have init() in the MasterAddressesProvider interface. The interface should describe what an object does, not how it does it. I talked to Mikhail and this is going to get moved to the constructor. This'll make it easier to reuse this object in other places as well.
          • Let's also not have it in Registry if possible. Consider this a stretch goal.
          • Having an extra REST RPC to find the cluster ID is an acceptable solution - it keeps the number of RPC round trips the same as the current implementation. It'd be nice if we didn't have to have it, but I think that shouldn't hold this issue up.

          There are a few things, eg:
          TestClusterIdResource.java:
          assertThat(JSON.parse(new String(response.getBody())).toString(),
          is(TEST_UTIL.getHBaseCluster().getClusterStatus().getClusterId()));

          This could be made easier to read if we refactored elements of the functionality into static methods. For example, we could have a JSON matcher so we have something like:

          assertThat(response, isJSONString(CLUSTER_ID));

          Note we refactored out the parsing, and the 'train wreck' call to get getClusterId (shouldn't this be a property of an HBaseCluster anyways?) and it's substantially cleaner to read. This is only one possibility here.

          We can also use hamcrest matchers to make things easier to read, eg:
          assertThat(registry.isTableOnlineState(TableName.META_TABLE_NAME, true), is(true));
          assertThat(registry.isTableOnlineState(TableName.META_TABLE_NAME, false), is(false));

          assertThat(registry, tableOnline(META_TABLE));

          Right now the boolean flags don't mean much to me just from looking at that snippet (unless I have also, for some reason, memorized the registry API).

          Also, I am seeing a few Thread.sleep() -> if we can get an async notice that the thing we are waiting for has happened, that'd be pretty sweet.

          Other than those things, this is shaping up nicely.
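
          For illustration, a minimal sketch of the isJSONString matcher suggested above, written as a hamcrest TypeSafeMatcher; the Response and JSON types are assumed to be the same ones used in the assertThat snippet quoted earlier, so treat the exact parsing and imports as assumptions rather than part of the patch:

          import org.hamcrest.Description;
          import org.hamcrest.Matcher;
          import org.hamcrest.TypeSafeMatcher;
          // plus imports for the Response and JSON classes used in the original test

          public final class ResponseMatchers {
            /** Matches an HTTP Response whose body parses as JSON equal to the expected string. */
            public static Matcher<Response> isJSONString(final String expected) {
              return new TypeSafeMatcher<Response>() {
                @Override
                protected boolean matchesSafely(Response response) {
                  // Same parsing as the original assertion, hidden behind the matcher.
                  return expected.equals(JSON.parse(new String(response.getBody())).toString());
                }
                @Override
                public void describeTo(Description description) {
                  description.appendText("a JSON body equal to ").appendValue(expected);
                }
              };
            }
          }

          The test then reads as suggested: assertThat(response, isJSONString(CLUSTER_ID)); a tableOnline(...) matcher for Registry could be built the same way.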

          Mikhail Antonov added a comment -

          Rebased patch which applies against current master (apparently still a draft, as the matter with ClusterId is being discussed).

          Mikhail Antonov added a comment -

          Gary Helmling thanks for clarification!

          One thing which comes to my mind - imagine we have the ZK string, or the masters quorum string, as a cluster id. It's in the format host1:port1,host2:port2,...,hostN:portN.
          Presumably it doesn't change that often. And it's used as a selector, or key, to look up the token associated with this entity (cluster). It doesn't really have to be canonical, perhaps.

          On the very first connection, if there's no token, we fall back to GSSAPI, and when we have a token, we use DIGEST-MD5. So in the scenario of a masters connection string, or a zk quorum string, we use it as a token selector. If at some point the masters membership changes, and the client needs to update its local quorum string - then the consequence will be that the cached token may no longer be looked up using the new connection string, so a new GSSAPI pass will have to happen. Is that bad?

          ryan rawson added a comment -

          Having an extra RPC to figure out who you're talking to isn't horrifyingly bad.

          Thinking broader, how does SSH do this? Could we emulate that? I believe that in SSH you can tell it via config that this key should be used for this host(s), and absent that, it walks thru the set of keys it has at its disposal.

          What is the simplest approach for the user? Magic strings tend not to be liked much.

          Andrey Stepachev added a comment -

          Even http://some.load.balancer:60010/clusterId will work.

          Andrey Stepachev added a comment -

          Mikhail Antonov, great work!

          How about using some ClusterIdProvider which would have a couple of implementations? Even so, it is not bad to have zk for configuration purposes, if it already exists in the infrastructure. So as a result we can make a bunch of such providers and address them by some human-readable cluster name.

          As a sketch, the uri would look like this:
          zk://zk.host1:port,zk.host2:port,zk.host3:port/mycluster
          If such a uri is configured on both server and client, the server will be able to write all the needed data into zk.

          Alternatively, a uri conf://mycluster can be provided, and in that case all configuration can be read from the config (it could use some property-naming scheme to find out the masters and such).
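
          A minimal sketch of parsing such locator uris into their parts; the small value class below is purely illustrative, not an existing HBase type:

          // Parses "zk://host:port,.../name" or "conf://name" locator strings.
          final class ClusterLocator {
            final String scheme;       // "zk" or "conf"
            final String quorum;       // "zk.host1:port,zk.host2:port,zk.host3:port" for zk, else null
            final String clusterName;  // "mycluster"

            private ClusterLocator(String scheme, String quorum, String clusterName) {
              this.scheme = scheme;
              this.quorum = quorum;
              this.clusterName = clusterName;
            }

            static ClusterLocator parse(String locator) {
              if (locator.startsWith("zk://")) {
                String rest = locator.substring("zk://".length());
                int slash = rest.indexOf('/');
                return new ClusterLocator("zk", rest.substring(0, slash), rest.substring(slash + 1));
              }
              if (locator.startsWith("conf://")) {
                // Masters and other settings would then be looked up in the local config,
                // keyed by the cluster name under some property-naming scheme.
                return new ClusterLocator("conf", null, locator.substring("conf://".length()));
              }
              throw new IllegalArgumentException("Unsupported cluster locator: " + locator);
            }
          }

          A ClusterIdProvider implementation would then be chosen based on the parsed scheme.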

          Gary Helmling added a comment -

          stack The client needs to know the ClusterId in order to select the token to use for token authentication from the bundle of credentials that it has access to. I believe when we first implemented this we discussed using the zk quorum string instead, but we settled on using a globally unique cluster ID as a cleaner solution, due to difficulties in canonicalizing the quorum string (short vs. fully qualified hostnames, ordering, whether or not client port is present...), and of course the zk quorum could change.

          It might be possible to have the server provide the cluster ID as part of the initial SASL negotiation, but I'm not sure how difficult this would be. Currently, if a token is found, we negotiate DIGEST-MD5 with SASL, otherwise pass through to GSSAPI. So having the server return a cluster ID in negotiation would require a step before that in order to determine whether we have a token or not.

          I agree that forcing the client to add the cluster ID in configuration somewhere is not a great option, but I haven't looked at the patch yet to see what it is doing. I will take a look at the details.
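
          A sketch of the token lookup described above, following the Hadoop TokenSelector pattern; this is simplified and illustrative, not the actual AuthenticationTokenSelector in the codebase:

          import java.util.Collection;
          import org.apache.hadoop.io.Text;
          import org.apache.hadoop.security.token.Token;
          import org.apache.hadoop.security.token.TokenIdentifier;

          final class ClusterIdTokenLookup {
            /** Picks the HBase auth token whose service field matches the cluster id, if present. */
            static Token<? extends TokenIdentifier> selectToken(Text clusterId,
                Collection<Token<? extends TokenIdentifier>> candidates) {
              for (Token<? extends TokenIdentifier> t : candidates) {
                if ("HBASE_AUTH_TOKEN".equals(t.getKind().toString())
                    && clusterId.equals(t.getService())) {
                  return t;   // token found: DIGEST-MD5 negotiation is possible
                }
              }
              return null;    // no token: the client falls back to GSSAPI (Kerberos)
            }
          }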

          stack added a comment -

          Gary Helmling Clusterid question above sir. Please put it straight. Thanks boss.

          stack added a comment -

          Why does the client need to provide the clusterid when authenticating? The cluster knows its clusterid. The client is specifying the cluster it wants to connect to when it chooses a zk ensemble. I suppose the client could provide it as part of its connection string, but it seems like a pain for users to have to specify a cryptic string in local config for the client to volunteer on connection setup.

          Mikhail Antonov added a comment -

          As this discussion around using/not using clusterId in the token is about security, pinging Andrew Purtell and Devaraj Das.

          Mikhail Antonov added a comment -

          pinging stack - sorry to be insistent, just would appreciate very much if you could take a quick look at the RB. Thank you!

          Mikhail Antonov added a comment -

          Fixing broken Large tests.

          Posted on RB - https://reviews.apache.org/r/24296/

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12659460/HBASE-11467.patch
          against trunk revision .
          ATTACHMENT ID: 12659460

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 18 new or modified tests.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 lineLengths. The patch introduces the following lines longer than 100:
          + public MasterProtos.GetMetaLocationResponse getMetaRegionLocation(RpcController controller, MasterProtos.GetMetaLocationRequest request) throws ServiceException {
          + public MasterProtos.GetTableStateResponse getTableState(RpcController controller, MasterProtos.GetTableStateRequest request) throws ServiceException {
          + * @return initialized instance of {@link org.apache.hadoop.hbase.coordination.MasterAddressesProvider},
          + * <code>rpc GetMetaRegionLocation(.GetMetaLocationRequest) returns (.GetMetaLocationResponse);</code>
          + * <code>rpc GetMetaRegionLocation(.GetMetaLocationRequest) returns (.GetMetaLocationResponse);</code>

          +1 site. The mvn site goal succeeds with this patch.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.util.TestMergeTool
          org.apache.hadoop.hbase.master.TestMasterShutdown
          org.apache.hadoop.hbase.TestMetaTableAccessorNoCluster
          org.apache.hadoop.hbase.replication.TestReplicationKillMasterRS
          org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
          org.apache.hadoop.hbase.util.TestHBaseFsck
          org.apache.hadoop.hbase.replication.TestReplicationDisableInactivePeer
          org.apache.hadoop.hbase.replication.TestReplicationKillSlaveRS
          org.apache.hadoop.hbase.TestZooKeeper
          org.apache.hadoop.hbase.master.TestAssignmentManagerOnCluster
          org.apache.hadoop.hbase.regionserver.TestRegionReplicas
          org.apache.hadoop.hbase.replication.regionserver.TestReplicationSourceManager
          org.apache.hadoop.hbase.client.TestHCM
          org.apache.hadoop.hbase.client.TestReplicaWithCluster
          org.apache.hadoop.hbase.backup.example.TestZooKeeperTableArchiveClient
          org.apache.hadoop.hbase.replication.TestReplicationSyncUpTool
          org.apache.hadoop.hbase.client.TestReplicasClient
          org.apache.hadoop.hbase.master.TestDistributedLogSplitting
          org.apache.hadoop.hbase.replication.TestReplicationWithTags
          org.apache.hadoop.hbase.master.TestRestartCluster
          org.apache.hadoop.hbase.regionserver.TestSplitLogWorker
          org.apache.hadoop.hbase.replication.TestReplicationEndpoint
          org.apache.hadoop.hbase.replication.TestReplicationKillMasterRSCompressed
          org.apache.hadoop.hbase.replication.TestReplicationSmallTests
          org.apache.hadoop.hbase.replication.TestReplicationChangingPeerRegionservers
          org.apache.hadoop.hbase.regionserver.TestRSKilledWhenInitializing
          org.apache.hadoop.hbase.TestIOFencing
          org.apache.hadoop.hbase.client.replication.TestReplicationAdmin
          org.apache.hadoop.hbase.client.TestAdmin
          org.apache.hadoop.hbase.master.TestZKLessAMOnCluster

          -1 core zombie tests. There are 3 zombie test(s): at org.apache.hadoop.hbase.client.TestFromClientSide.testKeyOnlyFilter(TestFromClientSide.java:684)
          at org.apache.hadoop.hbase.TestClusterBootOrder.testBootRegionServerFirst(TestClusterBootOrder.java:104)
          at org.apache.hadoop.hbase.client.TestFromClientSide.testKeyOnlyFilter(TestFromClientSide.java:684)

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10276//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10276//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10276//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10276//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10276//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10276//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10276//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10276//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10276//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10276//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10276//console

          This message is automatically generated.

          Mikhail Antonov added a comment -

          submitting to get hadoop-qa run

          Mikhail Antonov added a comment -

          Also I would put it up on the review board, if you guys think that it'd be better reviewed in one big patch, otherwise I can probably break it into some smaller ones.

          Mikhail Antonov added a comment -

          Second (still draft) version of patch

          • Mostly to assist mini cluster testing, changed hbase.masters.quorum property to contain hostname:port pairs.
          • Some methods from ConnectionUtils were moved to other locations. createShortCirctuitConnection() has gone to
            HRegionServer (the only place it was used in conjunction with ConnectionAdapter), and setServerSideHConnectionRetriesConfig() went to RsRpcServies. So in that sense I believe the ConnectionUtils class is truly a client-side-only class, and it's in the hbase-client module.
          • It seems like it makes sense to have 2 types of Registry, at least at this stage. So I added 2 methods to the ClusterConnection interface (it's private) to get and set the Registry. Now the client-side HConnections are using RpcRegistry, and server-side connections are using ServerSideZkRegistry. That also helps to avoid the chicken-and-egg problem of trying to read the ClusterId from the web endpoint before the web server is brought up.
          • Moved ConnectionAdapter from hbase-client to hbase-module, as it seems the right thing to do (please advise if I'm missing some consideration here?).

          Looking for initial feedback. stack, Gary Helmling, ..?

          Another thing which seems right to do is to reconcile the notions of CoprocessorHostConnection and shortCircuitConnection (running via ConnectionAdapter), as they seem to be conceptually the same or similar things?

          Mikhail Antonov added a comment -

          Attaching a draft version of the patch (I'm still working on it) to get initial feedback from those who may be interested. Short overview:

          • I didn't yet figure out what to do with ClusterId, which is used as a text in the secured setup for the handshake. So for now the actual code inside the connection manager is using the default cluster id, but as part of this change I did make an attempt to expose ClusterId at the REST endpoint (for that purpose I moved several classes from hbase-server to hbase-client, namely Client, Cluster and Response, and created a REST resource and model in the hbase-server module. All this code isn't being used now).
          • The MasterAddressesProvider interface is assumed to provide callers with the address of the next master they should attempt to contact, as well as return the full list of all known possible masters. It does not guarantee that the next master returned to the caller will be up and running at the time the caller decides to connect to it, so the caller is still expected to validate it itself (that's what's happening in HConnectionImplementation). Also, the caller shall not assume that the call to get the next active master costs nothing, as the implementation might be doing a network-based lookup.
          • The ConfigBasedMasterAddresses provider does simple round-robin across all masters it knows of from the config file (see the sketch at the end of this comment). Assumptions are that 1) caching of retrieved locations is handled at a higher level, in HConnectionImplementation, and this class doesn't care 2) all masters are using the same RPC port, which is retrieved from hbase-site.xml, and it's not part of the masters quorum property 3) this class doesn't record information about failed servers; handling of failed servers is done separately in RpcClient.FailedServers. This part actually needs to be refined.
          • HConnectionKey was modified to now include the masters quorum string and not ZK-related stuff.
          • A few new RPC calls (and message types) were added to MasterProtos.
          • Unfortunately, I had to make quite a few changes in the scaffolding code around the mini HBase cluster, because for what I'm doing here I need everyone to know the actual master's RPC port, and currently in the mini HBase cluster it binds to an ephemeral port.

          What else.. The basic tests (including a few humble tests I wrote) are passing, but some LargeTests, including master failover tests, are broken, so I'm working on them now. Some more notes on testing:
          Now there are 2 working tests added (one for the config-based master addresses provider and one for RpcRegistry itself), plus 1 test for the ClusterIdResource REST endpoint, but the latter isn't being used.
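
          As referenced above, a minimal sketch of the round-robin selection in the ConfigBasedMasterAddresses provider; class, method, and configuration-key names are simplified from the description, so treat it as illustrative rather than the attached patch:

          import java.util.List;
          import java.util.concurrent.atomic.AtomicInteger;

          // Hands out "host:port" entries from hbase.masters.quorum in round-robin order.
          // It does not verify liveness (callers do that) and does not track failed
          // servers (RpcClient.FailedServers handles that separately).
          class ConfigBasedMasterAddressesProvider {
            private final List<String> masters;
            private final AtomicInteger next = new AtomicInteger(0);

            ConfigBasedMasterAddressesProvider(List<String> masters) {
              this.masters = masters;
            }

            /** Next master to try; it may or may not be up when the caller actually connects. */
            String nextMasterAddress() {
              int i = Math.floorMod(next.getAndIncrement(), masters.size());
              return masters.get(i);
            }

            /** Full list of known, configured masters. */
            List<String> allMasterAddresses() {
              return masters;
            }
          }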

          Mikhail Antonov added a comment -

          Also, as I'm going through this patch, it seems like we may need 2 different implementations of Registry for clients and for nodes in the cluster. For example, when HMaster starts up and initializes, in multiple places it needs to initialize the connection and request the meta location. So we would have to do a kind of loopback call thru the short-circuit connection, which won't work because the master isn't yet initialized.

          Mikhail Antonov added a comment -

          (StartupManager -> ServerManager).

          Mikhail Antonov added a comment -

          Looks like getting the clusterId during HConnection initialization using a REST call to the master isn't that straightforward a solution, as the master itself creates an hconnection (for example when it creates StartupManager), and that happens well before the Jetty container is initialized for the master. Even if we change the sequence of steps for master initialization, making a REST call from inside master code to the web endpoint seems a bit strange.

          So I guess the question is whether we can make the token selector use information which can be obtained from local configuration instead? Like storing the clusterId in the client configuration (how often is it updated?), or maybe using, for example, masters.quorum.addresses as the cluster identifier?

          ryan rawson added a comment -

          Maybe someone can explain to me the purpose of the cluster id in the authentication phase? If we're just reading it from the master, then providing it to the same machine via a different RPC, why not just drop the cluster-id from the authentication handshake? Or if you can't remove it, use a constant cluster id in that part?

          Mikhail Antonov added a comment -

          On a related topic about doing those RPC calls to the master. Now on the client side we retrieve the ServerName stored in ZK by calling MasterAddressTracker.getMasterAddress(), which returns the full servername including the startcode. When we connect to masters using just the client config, we know the address and port, but can't know the startcode upfront, so we have to use a 0L startcode. Looking into the implications..

          Mikhail Antonov added a comment -

          Gary Helmling Thanks! Provided that each client has a list of masters it round-robins across, that's probably the best option so far. I think I'm going to stick with your suggestion.

          As I look at the registry interface, this one call is the only one which is hard to route thru RPC; the others (getMetaRegionLocation, isTableOnlineState, getCurrentNrHRS) would go thru RPC once the connection is established.

          Gary Helmling added a comment -

          Maybe we could make ClusterId available through an HTTP endpoint on the master? That would allow the client to easily poll the endpoint on configured masters to obtain it.

          Obtaining ClusterId through normal RPC without authentication would be difficult, I think. Authentication happens at connection time, so allowing getting ClusterId to happen without auth would require something like setting up a separate RPC server just to handle that, or else some nasty hacks to allow that particular call to go through.
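
          For illustration, a minimal sketch of the client-side polling this would enable, assuming a plain-text clusterId page on each master's info server; the /clusterId path and response format are assumptions, not an existing API:

          import java.io.BufferedReader;
          import java.io.IOException;
          import java.io.InputStreamReader;
          import java.net.HttpURLConnection;
          import java.net.URL;
          import java.nio.charset.StandardCharsets;

          final class ClusterIdFetcher {
            /** Tries each configured master's info endpoint until one returns a cluster id. */
            static String fetchClusterId(Iterable<String> masterInfoHostPorts) throws IOException {
              IOException last = null;
              for (String hostPort : masterInfoHostPorts) {
                try {
                  HttpURLConnection conn =
                      (HttpURLConnection) new URL("http://" + hostPort + "/clusterId").openConnection();
                  conn.setConnectTimeout(5000);
                  conn.setReadTimeout(5000);
                  try (BufferedReader in = new BufferedReader(
                      new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                    String line = in.readLine();
                    if (line != null) {
                      return line.trim();
                    }
                  } finally {
                    conn.disconnect();
                  }
                } catch (IOException e) {
                  last = e;   // this master may be down or not yet serving; try the next one
                }
              }
              throw last != null ? last : new IOException("no cluster id obtained from any configured master");
            }
          }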

          Alex Newman added a comment -

          Stack I totally agree. Gary Helmling any ideas? How bad is it to get the clusterid without authentication?

          Mikhail Antonov added a comment -

          stack that token management (which involves clusterId access) happens not at the individual protocol level (i.e. MasterProtos), but at the rpc channel level (BlockingRpcChannelImplementation, and RpcClient). So to do that, we would need to write a new implementation of BlockingRpcChannel?

          stack added a comment -

          ...and make it not require authentication on the master (which will be tricky in itself since authentication happens on the connection level).

          How bad would it be to have a special, open RPC which always bypasses auth to read 'public' info like the clusterid? Which in essence is what is going on when we read from ZK? A new 'Interface'?

          Mikhail Antonov added a comment -

          Thanks for the comment Gary Helmling!

          I see several possible options here.

          • ClusterId is used in RpcClient code, in TokenSelector.selectToken(..). Since the cluster id is publicly available information, we could use something else for the token here, like the hbase.masters.quorum property, containing the list of active masters?
          • ClusterId is a tag used to identify the cluster. Does it make sense to also have it on the client side in hbase-conf.xml? If it doesn't match the id stored in the cluster, then the client can't connect?
          • ClusterId is also stored in HDFS in hdfs://<namenode>:<port>/hbase/hbase.id. Clients who have a connection to HDFS could read it from there. Several drawbacks: clients may not have access to HDFS at all, and this is an implementation detail which we should not disclose to clients.

          Gary Helmling added a comment -

          If I understand the intent correctly, moving ClusterId discovery to a master RPC operation creates a sort of chicken-and-egg problem for token authentication. The RPC client needs to know the ClusterId of the cluster it is connecting to in order to select the correct authentication token to use. This was possible with ZK, as the ClusterId was stored in a public znode.

          If we move retrieval of the cluster ID to an RPC call on master, the client will not be able to authenticate, since, without the ClusterId, it does not know which token to select. I believe this will make token authentication unusable, or else we would have to special case that specific operation and make it not require authentication on the master (which will be tricky in itself since authentication happens on the connection level).


            People

            • Assignee: Mikhail Antonov
            • Reporter: Mikhail Antonov
            • Votes: 0
            • Watchers: 12
