HBASE-3777

Redefine Identity Of HBase Configuration

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.90.2
    • Fix Version/s: 0.92.0
    • Component/s: Client, IPC/RPC
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the one-to-one mapping between a configuration and a connection instance works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, you end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

      Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".

      Note that "sharing connections makes clean up of HConnection instances a little awkward", unless of course, you apply the change described in HBASE-3766.
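The difference between instance-based and value-based identity described above can be illustrated with a minimal sketch. It does not use the HBase API: java.util.Properties stands in for a Configuration, a String stands in for an HConnection, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in: Properties plays the role of a Configuration,
// and a String plays the role of an HConnection.
class ConfigIdentityDemo {

    // Looks up (or creates) a "connection" for the given settings in the given cache.
    static String getConnection(Map<Properties, String> cache, Properties conf) {
        String existing = cache.get(conf);
        if (existing == null) {
            existing = "connection-" + (cache.size() + 1);
            cache.put(conf, existing);
        }
        return existing;
    }

    // Returns how many connections the cache holds after an original and its copy are used.
    static int connectionsCreated(Map<Properties, String> cache) {
        Properties original = new Properties();
        original.setProperty("hbase.zookeeper.quorum", "zk1,zk2,zk3");

        Properties copy = new Properties();
        copy.putAll(original); // same settings, different object

        getConnection(cache, original);
        getConnection(cache, copy);
        return cache.size();
    }

    public static void main(String[] args) {
        // Instance-based identity: the copy is a new key, so a second connection appears.
        System.out.println(connectionsCreated(new IdentityHashMap<Properties, String>())); // prints 2
        // Value-based identity: the copy compares equal, so the connection is shared.
        System.out.println(connectionsCreated(new HashMap<Properties, String>())); // prints 1
    }
}
```

With the identity-keyed cache every clone costs a new connection; with the value-keyed cache clones that carry the same settings share one.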

      1. 3777-TOF.patch
        0.9 kB
        Ted Yu
      2. HBASE-3777.patch
        16 kB
        Karthick Sankarachary
      3. HBASE-3777-V2.patch
        26 kB
        Karthick Sankarachary
      4. HBASE-3777-V3.patch
        38 kB
        Karthick Sankarachary
      5. HBASE-3777-V4.patch
        40 kB
        Karthick Sankarachary
      6. HBASE-3777-V6.patch
        54 kB
        Karthick Sankarachary
      7. HBASE-3777-V8.0.90.4.backport.patch
        76 kB
        Bright Fulton

        Issue Links

          Activity

          Karthick Sankarachary created issue -
          Karthick Sankarachary made changes -
          Field Original Value New Value
          Attachment HBASE-3777.patch [ 12476281 ]
          Karthick Sankarachary made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Karthick Sankarachary made changes -
          Link This issue is related to HBASE-3766 [ HBASE-3766 ]
          Ted Yu added a comment -

          I think this JIRA and HBASE-3766 combined can be expressed by my comment on HBASE-3734 at 05/Apr/11 05:20

          Karthick Sankarachary added a comment -

          Ted,

          I saw your comment on HBASE-3734. It:

          a) Proposes a neater way of comparing Configuration instances, for the purposes of HConnection lookup. In fact, the thought of comparing just the cluster-specific properties in HBaseConfiguration did cross my mind. However, at times, you may want the ability to have multiple connections per cluster, which would not be possible using your approach.

          b) Validates the need for having a reference count on the connection. Instead of using a (refcount, connection) tuple as the value in HBASE_INSTANCES though, HBASE-3766 puts the refcount in the connection itself. Do you see a specific advantage to separating out the refcount from the connection?

          Regards,
          Karthick

          Ted Yu added a comment -

          For a), I like the idea of adding a uniquifier to HBaseConfiguration. This can be standardized through a well-known configuration parameter, such as "hbase.zookeeper.uniquifier" (a secondary key really).

          For b), I don't have a strong opinion about a particular implementation. What I have yet to propose is that we can implement an (optional) timeout mechanism for connections to address the issue under the thread "hbase -0.90.x upgrade - zookeeper exception in mapreduce job" on the user mailing list.
          Maybe it's easier to enforce a timeout policy in HCM, hence the centralized reference counting.

          Dave Latham made changes -
          Link This issue relates to HBASE-2925 [ HBASE-2925 ]
          Dave Latham added a comment -

          +1

          The identity of HBaseConfiguration instances used to be value-based, rather than instance-based, before HBASE-2925.

          I do think we need a solution that doesn't depend on the object identity of a Configuration, since they are used for many things, such as MR jobs, and have a pattern of being cloned and changed.

          Ted Yu added a comment -

          J-D informed me that my initial proposal mirrors what used to be done in 0.89.
          The current design is meant to bypass certain issues encountered in 0.89.

          Shall we do the following?
          Step 1, agree upon a mechanism for determining the identity of HBaseConfiguration instances and for reference counting. Enumerate the possible errors based on experience from 0.89 development.
          Step 2, implement the new mechanism in trunk.
          Step 3, thoroughly test (YCSB, etc) before publishing.

          Karthick Sankarachary added a comment -

          That sounds like a plan. Are there any threads that talk about the error cases we ran into in 0.89?

          Jean-Daniel Cryans added a comment -

          This is one of the most important ones, which also removed both hashCode and equals from HBaseConfiguration: HBASE-2925.

          Karthick Sankarachary added a comment -

          I see. In that case, using a combination of conf.get("hbase.zookeeper.quorum") and conf.get("hbase.client.uniqueid") as the key, like Ted suggested, may be the way to go.

          Ted Yu added a comment -

          Allow me to add step 2.5:
          apply the implementation from step 2 on existing (and new) unit tests for validation.

          Karthick Sankarachary made changes -
          Attachment HBASE-3777.patch [ 12476367 ]
          Karthick Sankarachary added a comment -

          About HBASE-2925, I'm not convinced that the memory leak was caused by the way HBaseConfiguration#equals was implemented. Just because the hash code of a Configuration instance changes when a new property is added doesn't mean that the entry will drop out of the LRU chain. Judging by the LinkedHashMap#addEntry method shown below, if a key is not accessed frequently enough, then it will get evicted no matter what.

              void addEntry(int hash, K key, V value, int bucketIndex) {
                  createEntry(hash, key, value, bucketIndex);
          
                  // Remove eldest entry if instructed, else grow capacity if appropriate
                  Entry<K,V> eldest = header.after;
                  if (removeEldestEntry(eldest)) {
                      removeEntryForKey(eldest.key);
                  } else {
                      if (size >= threshold)
                          resize(2 * table.length);
                  }
              }
          

          I suspect that the memory leak may have been caused by not deleting an HConnection that gets evicted, for which I suggest the following workaround:

              protected boolean removeEldestEntry(
                  Map.Entry<HConnectionKey, HConnectionImplementation> eldest) {
                boolean remove = size() > MAX_CACHED_HBASE_INSTANCES;
                if (remove) {
                  deleteConnection(eldest.getKey().conf, false);
                }
                return remove;
              }
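The eviction workaround above can be tried in isolation with a small sketch. It is not the HConnectionManager code: PooledResource is a hypothetical stand-in for a connection that must be shut down explicitly, and the evicted list exists only so the effect can be observed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a connection that must be released explicitly.
class PooledResource {
    final String name;
    boolean closed;

    PooledResource(String name) {
        this.name = name;
    }

    void close() {
        closed = true;
    }
}

// Hypothetical bounded cache: a value is closed at the moment its entry is evicted.
class ClosingLruCache<K> extends LinkedHashMap<K, PooledResource> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;
    final List<String> evicted = new ArrayList<String>(); // names of closed values, in eviction order

    ClosingLruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true keeps least-recently-used entries first
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, PooledResource> eldest) {
        boolean remove = size() > maxEntries;
        if (remove) {
            eldest.getValue().close(); // release before the map drops its reference
            evicted.add(eldest.getValue().name);
        }
        return remove;
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts and closes "b", because "b" is then the least recently used entry.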
          

          To address the Configuration identity issue (crisis?), I introduced the notion of an HConnectionKey which considers the connection-specific properties for the sake of checking equality and hash code, and rewrote HConnectionManager#HBASE_INSTANCES in terms of that. Further, there's a new property called HConstants.HBASE_CLIENT_INSTANCE_ID, which, if non-null, can be used to uniquely identify its connection key.

          Can you please take a look at the latest patch to see if we're on the right track? Note that I've deliberately kept the reference count changes out of this patch since it's not absolutely required here. I still feel that adding the reference count to the HConnection interface makes more sense, since the HConnectionManager has no idea when to change it, only its consumers (i.e. HTable) do. Given that, is it okay if we talk about reference counting in HBASE-3766?
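As a rough illustration of the key idea described in this comment, a value-based key over a subset of properties might look like the sketch below. This is not the class from the patch: the class name, the use of a plain Map in place of a Configuration, and the list of property names are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a connection key; not the HConnectionKey from the patch.
final class ConnectionKeySketch {

    // Assumed set of connection-specific property names, for illustration only.
    static final String[] CONNECTION_PROPERTIES = {
        "hbase.zookeeper.quorum",
        "zookeeper.znode.parent",
        "hbase.client.instance.id"
    };

    private final Map<String, String> properties = new HashMap<String, String>();

    // Copies only the connection-specific settings out of the given configuration.
    ConnectionKeySketch(Map<String, String> conf) {
        for (String name : CONNECTION_PROPERTIES) {
            String value = conf.get(name);
            if (value != null) {
                properties.put(name, value);
            }
        }
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ConnectionKeySketch)) {
            return false;
        }
        return properties.equals(((ConnectionKeySketch) other).properties);
    }

    @Override
    public int hashCode() {
        return properties.hashCode();
    }
}
```

Under these assumptions, two configurations that differ only in unrelated settings (a job name, say) produce equal keys, while setting a distinct instance id on one of them produces a different key and therefore a separate connection.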

          Karthick Sankarachary added a comment -

          Note that even with the HConnectionKey model, it is possible to change the identity of the Configuration by, say, modifying its HConstants.ZOOKEEPER_QUORUM. That said, I believe all of the connection-specific properties defined in HConnectionKey should ideally be treated as being "final" by the developer.

          Ted Yu added a comment -

          I think the following check in equals() can be relaxed a little:

          +          if (thisValue == null || thatValue == null
          +              || !thisValue.equals(thatValue)) {
          

          the clause (thatValue == null) can be omitted.
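The reason the clause is redundant is that equals(null) returns false for any non-null String. A small self-contained check, with hypothetical class and method names, compares the two forms on every combination of null and non-null inputs:

```java
// Hypothetical helper class used only to compare the two forms of the check.
class NullCheckDemo {

    // Form quoted from the patch: both operands are null-checked explicitly.
    static boolean differsVerbose(String thisValue, String thatValue) {
        return thisValue == null || thatValue == null || !thisValue.equals(thatValue);
    }

    // Relaxed form: equals(null) is false for a non-null receiver,
    // so the explicit check on thatValue adds nothing.
    static boolean differsRelaxed(String thisValue, String thatValue) {
        return thisValue == null || !thisValue.equals(thatValue);
    }

    // Returns true if both forms give the same answer for all sample pairs.
    static boolean formsAgree() {
        String[] samples = { null, "zk1", "zk2" };
        for (String a : samples) {
            for (String b : samples) {
                if (differsVerbose(a, b) != differsRelaxed(a, b)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

Note that both forms report a difference when thisValue is null, even if thatValue is null as well; the relaxation does not change that behavior.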

          I agree that reference counting can be handled in HBASE-3766.

          In the future, please use https://reviews.apache.org/ for review.

          Running the new test, I got:

          -------------------------------------------------------------------------------
          Test set: org.apache.hadoop.hbase.client.TestConnectionManager
          -------------------------------------------------------------------------------
          Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 28.505 sec <<< FAILURE!
          testConnectionSameness(org.apache.hadoop.hbase.client.TestConnectionManager)  Time elapsed: 16.789 sec  <<< ERROR!
          org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately. This could be a sign that the server has too many connections (30 is the default). Consider inspecting your ZK server logs for that error and then make sure you are reusing HBaseConfiguration as often as you can. See HTable's javadoc for more information.
                  at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:159)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1076)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:372)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:363)
                  at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:156)
                  at org.apache.hadoop.hbase.client.TestConnectionManager.testConnectionSameness(TestConnectionManager.java:76)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                  at java.lang.reflect.Method.invoke(Method.java:597)
          ...
          Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
                  at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
                  at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
                  at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809)
                  at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:837)
                  at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:902)
                  at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:137)
                  ... 31 more
          
          Ted Yu added a comment -

          I am not sure if the following is related.
          TestMultipleTimestamps hangs:

          "main" prio=5 tid=103000800 nid=0x100601000 waiting on condition [1005ff000]
             java.lang.Thread.State: TIMED_WAITING (sleeping)
                  at java.lang.Thread.sleep(Native Method)
                  at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:196)
                  at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:420)
                  at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:280)
                  at org.apache.hadoop.hbase.MiniHBaseCluster.<init>(MiniHBaseCluster.java:79)
                  at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:387)
                  at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:367)
                  at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:323)
                  at org.apache.hadoop.hbase.client.TestMultipleTimestamps.setUpBeforeClass(TestMultipleTimestamps.java:54)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          

          I will remove changes from HBASE-1364 on my computer and try again.

          Karthick Sankarachary made changes -
          Attachment HBASE-3777.patch [ 12476371 ]
          Ted Yu added a comment -

          TestConnectionManager passed.
          But TestMultipleTimestamps still hangs.

          Karthick Sankarachary added a comment -

          "the clause (thatValue == null) can be omitted."

          For some reason, I can't log into https://reviews.apache.org/, so for now I've updated the patch per your comment above here.

          "Running the new test, I got: ...FAILURES..."

          My bad. When I ran that test, I was pointing to a real server, when I should've built upon the test framework. I rewrote the test case in terms of the TEST_UTIL framework, and that seems to work as well.

          Karthick Sankarachary made changes -
          Attachment HBASE-3777.patch [ 12476375 ]
          Karthick Sankarachary made changes -
          Attachment HBASE-3777.patch [ 12476281 ]
          Karthick Sankarachary made changes -
          Attachment HBASE-3777.patch [ 12476367 ]
          Karthick Sankarachary made changes -
          Attachment HBASE-3777.patch [ 12476371 ]
          Ted Yu added a comment -

          It turns out HBASE-3708 broke the build.
          After getting over that bug, I got:

          -------------------------------------------------------------------------------
          Test set: org.apache.hadoop.hbase.client.TestHCM
          -------------------------------------------------------------------------------
          Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 38.137 sec <<< FAILURE!
          testManyNewConnectionsDoesnotOOME(org.apache.hadoop.hbase.client.TestHCM)  Time elapsed: 7.349 sec  <<< FAILURE!
          java.lang.AssertionError: expected:<31> but was:<1>
                  at org.junit.Assert.fail(Assert.java:91)
                  at org.junit.Assert.failNotEquals(Assert.java:645)
                  at org.junit.Assert.assertEquals(Assert.java:126)
                  at org.junit.Assert.assertEquals(Assert.java:470)
                  at org.junit.Assert.assertEquals(Assert.java:454)
                  at org.apache.hadoop.hbase.client.TestHCM.createNewConfigurations(TestHCM.java:109)
                  at org.apache.hadoop.hbase.client.TestHCM.testManyNewConnectionsDoesnotOOME(TestHCM.java:78)
          
          Karthick Sankarachary added a comment -

          I believe that the above TestHCM#testManyNewConnectionsDoesnotOOME failure is to be expected, given that the patch maps Configuration to HConnection based on the connection-specific properties. In other words, no matter how many times you try to get a connection using Configuration instances that differ only in their non-connection-specific properties, you should see but one HConnection instance in the HCM's cache. Shall I change that assertion to check for <1> instead of <31>?

          Ted Yu added a comment -

          I think the assertion should be changed.
          Please run through TestHCM and see if other assertion(s) need to be changed.

          Ted Yu added a comment -

          For immutability (see comment @ 14/Apr/11 20:26), I think we can utilize the following from Guava to represent the conf field in HConnectionKey:

                ImmutableMap.Builder<String, String> builder = ImmutableMap.builder();
                builder.put("a", "b");
          
          Karthick Sankarachary made changes -
          Attachment HBASE-3777.patch [ 12476391 ]
          Karthick Sankarachary added a comment -

          I changed the assertion in TestHCM#testManyNewConnectionsDoesnotOOME, moved my test cases (viz., testConnectionSameness and testConnectionUniqueness) there, and verified that they pass.

          About making the HConnectionKey#conf immutable, the problem is that the user (client) will still have a reference to the "mutable" instance of the Configuration, so there's really no way to enforce its immutability, short of marking the connection properties as "final" in the hbase-default.xml (but then again, the user can choose to not make it final). In any case, it's not the end of the world if a connection property were to change after the fact, because the initial entry for that Configuration will eventually get evicted, but not before deleting the HConnection, so we should be okay.

          Ted Yu added a comment -

          I think the reason outlined above makes the case for at least duplicating the Configuration in the HConnectionKey constructor.
          Suppose client A comes with config and places an entry in HBASE_INSTANCES. Then client B comes with config, the previous entry would be returned to client B.
          Now client A modifies one of the connection-specific properties - resulting in a change of the HConnectionKey for client B.

          Since we expect the connections to be reused a lot (evidenced by TestHCM#testManyNewConnectionsDoesnotOOME), the cost of duplicating/making Configuration immutable is low.
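The client A / client B scenario can be reproduced with a small sketch. It does not use the HBase classes: QuorumKey is a hypothetical key that compares only the ZooKeeper quorum, a plain Map stands in for a Configuration, and the snapshot flag switches between keeping the caller's map and copying it in the constructor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key type used only for this illustration.
final class QuorumKey {
    private final Map<String, String> conf;

    // When snapshot is true the key copies the settings; otherwise it keeps the caller's map.
    QuorumKey(Map<String, String> conf, boolean snapshot) {
        this.conf = snapshot ? new HashMap<String, String>(conf) : conf;
    }

    private String quorum() {
        return String.valueOf(conf.get("hbase.zookeeper.quorum"));
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof QuorumKey && quorum().equals(((QuorumKey) other).quorum());
    }

    @Override
    public int hashCode() {
        return quorum().hashCode();
    }
}

class SharedKeyScenario {

    // Returns true if client B can still find the cached entry after client A edits its config.
    static boolean clientBStillFindsConnection(boolean snapshot) {
        Map<QuorumKey, String> cache = new HashMap<QuorumKey, String>();

        // Client A registers a connection under its configuration.
        Map<String, String> confA = new HashMap<String, String>();
        confA.put("hbase.zookeeper.quorum", "zk1");
        cache.put(new QuorumKey(confA, snapshot), "shared-connection");

        // Client B holds an equivalent configuration of its own.
        Map<String, String> confB = new HashMap<String, String>(confA);

        // Client A later changes a connection-specific property.
        confA.put("hbase.zookeeper.quorum", "zk9");

        return cache.containsKey(new QuorumKey(confB, true));
    }
}
```

In this sketch, clientBStillFindsConnection(false) returns false because the cached key silently changed along with client A's map, while clientBStillFindsConnection(true) returns true because the key froze the settings it was created with.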

          Karthick Sankarachary added a comment -

          Suppose client A comes with config and places an entry in HBASE_INSTANCES. Then client B comes with config, the previous entry would be returned to client B.
          Now client A modifies one of the connection-specific properties - resulting in a change of the HConnectionKey for client B.

          Ah, I see. By freezing the config in the key, we ensure that subsequent changes to the config will only affect the client that is making that change. I will update the patch to do that momentarily.
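The freezing idea discussed above can be sketched as follows. This is only an illustration under stated assumptions, not the code from the patch: a plain `Map<String, String>` stands in for the Hadoop `Configuration`, the class names are hypothetical, and the list of connection-specific properties is an assumed subset.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ConnectionKeySketch {
    // Assumed subset of connection-specific property names, for illustration only.
    static final List<String> CONNECTION_PROPERTIES = Arrays.asList(
        "hbase.zookeeper.quorum",
        "hbase.zookeeper.property.clientPort");

    /** Hypothetical key that copies the relevant properties at construction time. */
    static final class ConnectionKey {
        private final Map<String, String> snapshot;

        ConnectionKey(Map<String, String> conf) {
            // Copy the values so later changes to the caller's map cannot alter this key.
            Map<String, String> copy = new TreeMap<>();
            for (String name : CONNECTION_PROPERTIES) {
                String value = conf.get(name);
                if (value != null) {
                    copy.put(name, value);
                }
            }
            this.snapshot = Collections.unmodifiableMap(copy);
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof ConnectionKey
                && snapshot.equals(((ConnectionKey) other).snapshot);
        }

        @Override
        public int hashCode() {
            return snapshot.hashCode();
        }
    }

    public static void main(String[] args) {
        Map<String, String> confA = new HashMap<>();
        confA.put("hbase.zookeeper.quorum", "zk1.example.com");
        Map<String, String> confB = new HashMap<>(confA);

        ConnectionKey keyA = new ConnectionKey(confA);
        ConnectionKey keyB = new ConnectionKey(confB);
        System.out.println("same key before change: " + keyA.equals(keyB));

        // Client A mutates its own config after the key was created.
        confA.put("hbase.zookeeper.quorum", "zk2.example.com");
        System.out.println("same key after change: " + keyA.equals(keyB));
    }
}
```

Because the key holds its own copy, a later change by client A produces a different key only for lookups made with the changed config; the entry that client B shares is left untouched.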

          Karthick Sankarachary made changes -
          Attachment HBASE-3777.patch [ 12476398 ]
          Karthick Sankarachary made changes -
          Attachment HBASE-3777.patch [ 12476391 ]
          Karthick Sankarachary made changes -
          Attachment HBASE-3777.patch [ 12476375 ]
          Ted Yu added a comment -

          Latest version of patch looks good.
          Next we need to pass all unit tests.

          Dave Latham made changes -
          Link This issue relates to HBASE-3792 [ HBASE-3792 ]
          Ted Yu added a comment -

          hbase.zookeeper.property.clientPort should also be one of the connection-specific properties

          Karthick Sankarachary made changes -
          Attachment HBASE-3777-V2.patch [ 12476567 ]
          Karthick Sankarachary added a comment -

          Please review the updated version (V2) of the patch, which:

          1. Adds the following properties to the connection key:
            • The zookeeper client port, which is pulled in by ZKConfig#makeZKProps.
            • The recoverable zookeeper wait time, which is pulled in by the ZooKeeperWatcher.
          2. Closes the HConnection only if there are no strong references to it. This is necessitated by the fact that they can now potentially be shared by multiple clients and configurations. Note that it relies on the garbage collector to clean up the connection, which I feel is a safer approach. Alternatively, we can have the HCM implement a reference counting mechanism, but that would call for a strict clean up strategy.
          3. As far as testing is concerned, all but five tests passed. FWIW, those failures occur even without the patch, so in that sense, no regressions were found.
          Ted Yu added a comment -

          I am running tests based on HBASE-3777-V2.patch
          Can you disclose which five tests failed ?

          Good job.

          Ted Yu added a comment -

          I see TestHFileOutputFormat timeout:

           
          "main" prio=5 tid=101801000 nid=0x100601000 waiting on condition [1005ff000]
             java.lang.Thread.State: TIMED_WAITING (sleeping)
                  at java.lang.Thread.sleep(Native Method)
                  at org.apache.hadoop.hbase.client.HBaseAdmin.disableTable(HBaseAdmin.java:543)
                  at org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.doIncrementalLoadTest(TestHFileOutputFormat.java:352)
                  at org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.testMRIncrementalLoadWithSplit(TestHFileOutputFormat.java:163)
          

          Another test failure was:

          -------------------------------------------------------------------------------
          Test set: org.apache.hadoop.hbase.replication.TestReplication
          -------------------------------------------------------------------------------
          Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 110.1 sec <<< FAILURE!
          testVerifyRepJob(org.apache.hadoop.hbase.replication.TestReplication)  Time elapsed: 8.15 sec  <<< FAILURE!
          java.lang.AssertionError: expected:<0> but was:<100>
                  at org.junit.Assert.fail(Assert.java:91)
                  at org.junit.Assert.failNotEquals(Assert.java:645)
                  at org.junit.Assert.assertEquals(Assert.java:126)
                  at org.junit.Assert.assertEquals(Assert.java:470)
                  at org.junit.Assert.assertEquals(Assert.java:454)
                  at org.apache.hadoop.hbase.replication.TestReplication.testVerifyRepJob(TestReplication.java:510)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          
          Ted Yu added a comment -

          TestHFileOutputFormat passed on a Linux machine.
          testVerifyRepJob still failed.

          Ted Yu added a comment -

          See Effective Java, 2nd edition, page 31 about using finalizer.

          Karthick Sankarachary added a comment -

          Can you disclose which five tests failed ?

          The failures, mostly confined to the HDFS layer, include:

          • org.apache.hadoop.hbase.master.TestHMasterRPCException#testRPCException(TestHMasterRPCException.java:57) - "100 millis timeout while waiting for channel to be ready for read"
          • org.apache.hadoop.hbase.replication.TestReplication#setUp(TestReplication.java:174) - "Waited too much time for truncate"
          • org.apache.hadoop.hbase.TestHBaseTestingUtility#testMiniZooKeeper(TestHBaseTestingUtility.java:142) - "Unable to create data directory"
          • org.apache.hadoop.hbase.coprocessor.TestRegionObserverStacking#testRegionObserverStacking(TestRegionObserverStacking.java:112) - "Cannot get log writer"
          • org.apache.hadoop.hbase.client.TestGetRowVersions#testGetRowMultipleVersions(TestGetRowVersions.java:67) - "CRC check failed"
          Ted Yu added a comment -

          Please consider running the tests on a Linux box, such as:
          Linux x-grid07.ciq.com 2.6.18-194.8.1.el5 #1 SMP Thu Jul 1 19:04:48 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

          None of the tests above failed on the above machine. This was the only credible failure I saw:

          Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 229.043 sec <<< FAILURE!
          testMasterFailoverWithMockedRIT(org.apache.hadoop.hbase.master.TestMasterFailover)  Time elapsed: 180.039 sec  <<< ERROR!
          java.lang.Exception: test timed out after 180000 milliseconds
                  at java.lang.Thread.sleep(Native Method)
                  at org.apache.hadoop.hbase.MiniHBaseCluster.waitForActiveAndReadyMaster(MiniHBaseCluster.java:478)
                  at org.apache.hadoop.hbase.master.TestMasterFailover.testMasterFailoverWithMockedRIT(TestMasterFailover.java:429)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          
          Karthick Sankarachary added a comment -

          See Effective Java, 2nd edition, page 31 about using finalizer.

          Point well taken. The only reason I went with the finalizer approach, which may not be ideal, is that it is safe, in the sense that the connection will eventually get closed, if there are truly no references to it.

          Alternatively, we can take the approach you described (not unlike the one in HBASE-3766), where the reference count is maintained by the connection manager. Specifically, the count could be incremented in HConnectionManager#getConnection and decremented in HConnectionManager#deleteConnection. As long as there are references to a connection, it will stay in the cache. When the count drops to zero, we go ahead and close it. The catch with this approach is that every connection acquired must be explicitly deleted; otherwise we run the risk of never being able to close it. (Currently, there are 33 calls to HConnectionManager#getConnection, but only 9 calls to HConnectionManager#deleteConnection.) Note that the finalizer approach does not have this problem, since the JVM keeps track of the connection's reference count for us.

          In any case, this patch requires some sort of a reference count mechanism, regardless of how it is implemented.
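The manager-side reference count described above could look roughly like this. It is a minimal sketch under assumptions, not the patch itself: the class and its `Connection` type are hypothetical stand-ins, and a `String` plays the role of the connection key.

```java
import java.util.HashMap;
import java.util.Map;

public class RefCountedCacheSketch {
    /** Hypothetical stand-in for a shared connection. */
    static final class Connection {
        private int refCount;
        private boolean closed;

        boolean isClosed() {
            return closed;
        }
    }

    private final Map<String, Connection> cache = new HashMap<>();

    /** Returns the cached connection for the key, creating it if needed, and counts the caller. */
    synchronized Connection getConnection(String key) {
        Connection connection = cache.get(key);
        if (connection == null) {
            connection = new Connection();
            cache.put(key, connection);
        }
        connection.refCount++;
        return connection;
    }

    /** Releases one reference; closes and evicts the connection when none remain. */
    synchronized void deleteConnection(String key) {
        Connection connection = cache.get(key);
        if (connection == null) {
            return;
        }
        connection.refCount--;
        if (connection.refCount <= 0) {
            cache.remove(key);
            connection.closed = true;
        }
    }

    synchronized int size() {
        return cache.size();
    }
}
```

The sketch makes the catch visible: a caller that never calls `deleteConnection` keeps the count above zero, so the connection is never closed.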

          Ted Yu added a comment -

          Speaking of matching HConnectionManager#getConnection with HConnectionManager#deleteConnection, we now know why TableOutputFormat calls HConnectionManager.deleteAllConnections(true): it's the easiest answer to connection leaks.
          I did hear someone complain about this call on IRC though.

          Karthick Sankarachary added a comment -

          Ah, speaking of HConnectionManager#deleteAllConnections, notice that we previously did not remove the connection from the cache after closing it. That's not good, because someone might end up getting a connection that has already been closed. The V2 version of the patch fixes that by clearing the cache.
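To illustrate the stale-entry problem, here is a small sketch in which closing all connections also clears the cache. The names are hypothetical and the cache is simplified to a map keyed by `String`; it is not the HConnectionManager code.

```java
import java.util.HashMap;
import java.util.Map;

public class DeleteAllSketch {
    /** Hypothetical stand-in for a connection that can be closed. */
    static final class Connection {
        private boolean closed;

        void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    private final Map<String, Connection> cache = new HashMap<>();

    synchronized Connection getConnection(String key) {
        Connection connection = cache.get(key);
        if (connection == null) {
            connection = new Connection();
            cache.put(key, connection);
        }
        return connection;
    }

    /** Closes every cached connection, then clears the cache so no closed entry can be handed out. */
    synchronized void deleteAllConnections() {
        for (Connection connection : cache.values()) {
            connection.close();
        }
        cache.clear();
    }
}
```

Without the `cache.clear()` call, a later `getConnection` with the same key would return the closed instance.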

          Ted Yu added a comment -

          I wasn't sure about this claim about finalizer for Java 1.6 and beyond (http://forums.whirlpool.net.au/archive/754353):
          In fact, it is perfectly permissible for a Java VM to never call it.

          y.s.ramakrishna@oracle.com answered:

          Yes; indeed, the spec is deliberately loose because it
          is difficult in practice to implement any hard promptness
          guarantees in general.

          Karthick Sankarachary added a comment -

          I took a stab at implementing the reference count in the HCM, but left the finalize in there as a last line of defense. In order for this to work, every connection acquired through HCM#getConnection must be closed when no longer needed. For more details, please take a look at the V3 version of the patch. This time around, I ran the tests on a Linux box, and saw a couple of test cases fail in TestReplication and TestHBaseTestingUtility, but they were caused by some flaky file access issues.

          Karthick Sankarachary made changes -
          Attachment HBASE-3777-V3.patch [ 12476816 ]
          Ted Yu added a comment -

          There was one little conflict in HConnection.java where J-D recently put in:

            public int getCurrentNrHRS() throws IOException;
          

          I will run tests on Linux.

          Ted Yu added a comment -

          We can break out of the loop in HConnectionManager.putConnection() if the reference count reaches 0, right ?

                    if (entry.getValue().decRef() > 0) {
                      connectionKey = null;
                    } else break;
          
          Ted Yu added a comment -

          I don't find the following method in HConnectionManager called elsewhere:

            public static void putConnection(Configuration conf) {
              deleteConnection(conf, false);
            }
          
          Ted Yu added a comment -

          The following method is called only by finalizer:

              void close(boolean stopProxy) {
          

          I think we should call it when reference count reaches 0.

          Also, in deleteConnection(), the code starting line 221 should be enclosed in synchronized block. All the other accesses to HBASE_INSTANCES are protected.

          Ted Yu added a comment -

          Should MAX_CACHED_HBASE_INSTANCES be increased ?

          Ted Yu added a comment -

          HTable.close() throws IOException, so this.connection.close() should be enclosed in finally block:

                try {
                  flushCommits();
                  this.pool.shutdown();
                } finally {
                  this.connection.close();
                }
          
          Ted Yu added a comment -

          HTablePool.closeTablePool() no longer calls this:

          -    HConnectionManager.deleteConnection(this.config, true);
          

          I think it should be kept because HTablePool.closeTablePool() is "a 'shutdown' of the given table pool" (according to javadoc).

          Ted Yu added a comment -

          I think the following test failure is a regression (on Linux):

          -------------------------------------------------------------------------------
          Test set: org.apache.hadoop.hbase.TestHBaseTestingUtility
          -------------------------------------------------------------------------------
          Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 191.418 sec <<< FAILURE!
          multiClusters(org.apache.hadoop.hbase.TestHBaseTestingUtility)  Time elapsed: 180.095 sec  <<< ERROR!
          java.lang.Exception: test timed out after 180000 milliseconds
                  at java.lang.Object.wait(Native Method)
                  at java.lang.Thread.join(Thread.java:1186)
                  at java.lang.Thread.join(Thread.java:1239)
                  at org.apache.hadoop.hbase.LocalHBaseCluster.join(LocalHBaseCluster.java:407)
                  at org.apache.hadoop.hbase.MiniHBaseCluster.join(MiniHBaseCluster.java:501)
                  at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:457)
                  at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:431)
                  at org.apache.hadoop.hbase.TestHBaseTestingUtility.multiClusters(TestHBaseTestingUtility.java:126)
          
          Karthick Sankarachary made changes -
          Attachment HBASE-3777-V4.patch [ 12476948 ]
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/
          -----------------------------------------------------------

          Review request for hbase and Ted Yu.

          Summary
          -------

          Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

          Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".

          This addresses bug HBASE-3777.
          https://issues.apache.org/jira/browse/HBASE-3777

          Diffs


          src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639
          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a
          src/main/java/org/apache/hadoop/hbase/client/HConnection.java 2bb4725
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a
          src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1
          src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333
          src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa
          src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4
          src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2

          Diff: https://reviews.apache.org/r/643/diff

          Testing
          -------

          mvn test

          Thanks,

          Karthick

          Karthick Sankarachary added a comment -

          Ted,

          First off, thanks for keeping me honest. To answer your comments:

          There was one little conflict in HConnection.java where J-D recently put in:

          Resolved.

          We can break out of the loop in HConnectionManager.putConnection() if the reference count reaches 0, right ?

          Yes, we can break out of that loop if we find the connection we're looking for.

          I don't find the following method in HConnectionManager (putConnection) called elsewhere

          That's correct, I'll take it out.

          The following method (i.e. close()) is called only by finalizer. I think we should call it when reference count reaches 0.

          Ideally, the reference count should already be zero by the time HConnectionImplementation#finalize is called. Now, the fact that that method was invoked implies that all references to that connection have already gone out of scope, so it's not necessary to check our reference count. In fact, on the off chance that someone forgets to close the connection, that reference count will not be zero, and so if we were to check for that, we would not release the connection's resources, even though the JVM says that it is no longer in use.

          Also, in deleteConnection(), the code starting line 221 should be enclosed in synchronized block. All the other accesses to HBASE_INSTANCES are protected.

          I believe you are referring to the getConnection method. I fixed it so that all accesses to HBASE_INSTANCES are protected.

          Should MAX_CACHED_HBASE_INSTANCES be increased ?

          In theory, the number of connections from a given client to ZooKeeper can be changed using the "hbase.zookeeper.property.maxClientCnxns" property. So it's not clear to me why MAX_CACHED_HBASE_INSTANCES is a constant to begin with. I think this topic deserves its own separate issue.

          HTablePool.closeTablePool() no longer calls deleteConnection:

          I put it back - given the shutdown semantics of HTablePool

          I think the following test failure is a regression (on Linux) (TestHBaseTestingUtility)

          Yes, that's one of the two test cases that failed for me - the log leads me to believe it is unable to delete a certain file, but I'll take a closer look.

          Please review the revised patch (version V4), which has also been uploaded to the review board.

          Regards,
          Karthick

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review511
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/643/#comment1033>

          I agree that the value of the "hbase.zookeeper.property.maxClientCnxns" property should be used here.

          It's okay to do that in another JIRA. It's your call.

          • Ted


          Ted Yu added a comment -

          Since the close method below doesn't show up in the diff on the review board, I want to comment here:

              void close(boolean stopProxy) {
          

          What I meant was that the original call in deleteConnection(Configuration conf, boolean stopProxy):

                if (t != null) {
                  t.close(stopProxy);
                }
          

          can be used when reference count reaches zero. So we would have:

            public static void deleteConnection(Configuration conf, boolean stopProxy) {
              synchronized (HBASE_INSTANCES) {
                HConnectionKey connectionKey = new HConnectionKey(conf);
                HConnectionImplementation connection = HBASE_INSTANCES
                    .get(connectionKey);
                if (connection != null) {
                  if (connection.decRef() == 0) {
                    HBASE_INSTANCES.remove(connectionKey);
                    connection.close(stopProxy);
                  } else if (stopProxy) {
                    connection.stopProxyOnClose(stopProxy);
                  }
                }
              }
            }
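The flow proposed above can be tried out in isolation. What follows is only a minimal sketch, not the HBase code: `RefCountedCache`, `Resource`, `acquire` and `release` are hypothetical names standing in for HConnectionManager, HConnectionImplementation, getConnection and deleteConnection, and a plain String stands in for HConnectionKey. The stopProxy handling is left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for HConnectionManager; not the HBase implementation. */
final class RefCountedCache {

  /** Hypothetical stand-in for HConnectionImplementation. */
  static final class Resource {
    private int refCount;
    private boolean closed;

    int incrementAndGetRefCount() { return ++refCount; }
    int decrementAndGetRefCount() { return --refCount; }
    void close() { closed = true; }
    boolean isClosed() { return closed; }
  }

  // Every access to this map happens while holding its monitor.
  private final Map<String, Resource> instances = new HashMap<>();

  /** Returns the shared resource for the key, creating it on first use. */
  Resource acquire(String key) {
    synchronized (instances) {
      Resource r = instances.get(key);
      if (r == null) {
        r = new Resource();
        instances.put(key, r);
      }
      r.incrementAndGetRefCount();
      return r;
    }
  }

  /** Drops one reference; evicts and closes the resource when none remain. */
  void release(String key) {
    synchronized (instances) {
      Resource r = instances.get(key);
      if (r != null && r.decrementAndGetRefCount() == 0) {
        instances.remove(key);
        r.close();
      }
    }
  }

  int size() {
    synchronized (instances) {
      return instances.size();
    }
  }

  public static void main(String[] args) {
    RefCountedCache cache = new RefCountedCache();
    Resource a = cache.acquire("cluster-1");
    Resource b = cache.acquire("cluster-1");
    System.out.println(a == b);       // true: both callers share one instance
    cache.release("cluster-1");
    System.out.println(a.isClosed()); // false: one reference is still held
    cache.release("cluster-1");
    System.out.println(a.isClosed()); // true: the last reference was released
  }
}
```

The point of the sketch is that the close happens exactly when the count reaches zero, inside the same synchronized block that removes the entry, so no other caller can pick up an instance that is about to be closed.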
          
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review512
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/HConstants.java
          <https://reviews.apache.org/r/643/#comment1034>

          specify that, even if the instance ids are the same, it could result in non-shared Connections if some of the other parameters differ. Right?

          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
          <https://reviews.apache.org/r/643/#comment1035>

          why not final?

          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
          <https://reviews.apache.org/r/643/#comment1043>

          assert !stopped, or even Preconditions.checkState(!stopped)

          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
          <https://reviews.apache.org/r/643/#comment1036>

          I don't follow this. It seems like we're releasing a resource we didn't necessarily take ourselves. Spaghetti warning sign.

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
          <https://reviews.apache.org/r/643/#comment1037>

          why not final anymore?

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/643/#comment1038>

          this construction can go outside synch block

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/643/#comment1039>

          this should be renamed to decrementAndGetRefCount() to be clear that it returns the post-decrement value.

          Similar for incCount above.

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/643/#comment1040>

          when is this method ever safe to use? I think it can be removed

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/643/#comment1041>

          this logic seems wrong, because the finalizer will get called even if the thing is already closed, then ref count will get decremented below 0.

          src/main/java/org/apache/hadoop/hbase/client/HTable.java
          <https://reviews.apache.org/r/643/#comment1042>

          all of these finally blocks should instead use something like IOUtils.cleanUp, and HConnection should implement Closeable. This way, if the close throws an exception, it doesn't mask a prior exception thrown inside the actual try {...}
          • Todd
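To illustrate the last point about finally blocks, here is a small self-contained sketch. It deliberately avoids the Hadoop utility class: `closeQuietly` and `FlakyConnection` are hypothetical names invented for this example, and the only assumption is that the connection type implements `java.io.Closeable`, as suggested above.

```java
import java.io.Closeable;
import java.io.IOException;

final class QuietClose {

  /** Closes the resource and logs, rather than rethrows, any IOException. */
  static void closeQuietly(Closeable c) {
    if (c == null) {
      return;
    }
    try {
      c.close();
    } catch (IOException e) {
      System.err.println("Ignoring failure while closing: " + e.getMessage());
    }
  }

  /** Hypothetical resource whose work and close() both fail. */
  static final class FlakyConnection implements Closeable {
    void doWork() throws IOException {
      throw new IOException("work failed");
    }

    @Override
    public void close() throws IOException {
      throw new IOException("close failed");
    }
  }

  /** Returns the message of the exception that reaches the caller. */
  static String run() {
    FlakyConnection conn = new FlakyConnection();
    try {
      try {
        conn.doWork();
      } finally {
        // A bare conn.close() here would replace "work failed" with "close failed".
        closeQuietly(conn);
      }
      return "no exception";
    } catch (IOException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(run()); // prints "work failed"
  }
}
```

With a plain `conn.close()` in the finally block, the caller would see only "close failed" and the original cause would be lost.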


          Karthick Sankarachary added a comment -

          I concur on the above comment. I'll update the patch after you're done with the review of this version.

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review513
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/643/#comment1044>

          I think close(boolean stopProxy) should check this.closed at the beginning.

          • Ted
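A guard of this kind might look like the sketch below. It is not the HConnectionImplementation code: `IdempotentClose` and `cleanupRuns` are hypothetical, the stopProxy parameter is omitted, and an `AtomicBoolean` is swapped in for a plain `closed` field so that concurrent callers cannot both run the cleanup. It also covers the earlier concern about a finalizer running after an explicit close.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class IdempotentClose {
  private final AtomicBoolean closed = new AtomicBoolean(false);
  private int cleanupRuns; // counts how often the real cleanup executed

  /** Releases resources at most once, however many times it is called. */
  void close() {
    if (!closed.compareAndSet(false, true)) {
      return; // already closed, e.g. an explicit close followed by a finalizer
    }
    cleanupRuns++; // placeholder for the actual cleanup work
  }

  boolean isClosed() {
    return closed.get();
  }

  int cleanupRuns() {
    return cleanupRuns;
  }

  public static void main(String[] args) {
    IdempotentClose conn = new IdempotentClose();
    conn.close();
    conn.close(); // second call is a no-op
    System.out.println(conn.cleanupRuns()); // prints 1
  }
}
```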


          Ted Yu added a comment -

          TestHBaseTestingUtility.multiClusters() uses the following as uniquifier:

              htu1.getConfiguration().set(HConstants.ZOOKEEPER_ZNODE_PARENT, "/1");
          

          Once I added HConstants.ZOOKEEPER_ZNODE_PARENT to ONNECTION_PROPERTIES for HConnectionKey, TestHBaseTestingUtility passed.

          Ted Yu added a comment -

          I meant CONNECTION_PROPERTIES for HConnectionKey.

          Ted Yu added a comment -

          TestHBaseTestingUtility.multiClusters() should be improved with a catch block.
          Previously we saw a timeout exception when the actual cause was a TableExistsException.
          I propose adding the following before the finally block:

              } catch (Exception e) {
                LOG.error("multiClusters failed: ", e);
              }
          

          BTW, TestHBaseTestingUtility and TestReplication both passed with the above addition of HConstants.ZOOKEEPER_ZNODE_PARENT

          We should document HConnectionKey.CONNECTION_PROPERTIES so that developers know where to add new uniquifiers.
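To illustrate the idea behind a key built from CONNECTION_PROPERTIES, here is a minimal sketch, not the code from the patch. It assumes a plain Map<String, String> as a stand-in for the Configuration class, and the class name ConnectionKeySketch and the two property names listed are hypothetical choices for illustration. Two configurations that agree on the listed properties produce equal keys, so they would map to the same connection; adding a property such as the znode parent to the list makes it act as a uniquifier.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical stand-in for HConnectionKey: identity is derived only from a
 * fixed list of connection-relevant properties, not from the configuration
 * object itself.
 */
final class ConnectionKeySketch {
  // Assumed property names for illustration; the real list is whatever
  // HConnectionKey.CONNECTION_PROPERTIES contains in the patch.
  static final List<String> CONNECTION_PROPERTIES =
      Collections.unmodifiableList(Arrays.asList(
          "hbase.zookeeper.quorum",
          "zookeeper.znode.parent"));

  private final Map<String, String> properties = new HashMap<String, String>();

  /** Copies only the listed properties out of the given configuration map. */
  ConnectionKeySketch(Map<String, String> conf) {
    for (String name : CONNECTION_PROPERTIES) {
      String value = conf.get(name);
      if (value != null) {
        properties.put(name, value);
      }
    }
  }

  /** Keys are equal when every listed property has the same value. */
  @Override
  public boolean equals(Object other) {
    return other instanceof ConnectionKeySketch
        && properties.equals(((ConnectionKeySketch) other).properties);
  }

  @Override
  public int hashCode() {
    return properties.hashCode();
  }
}
```

With this shape, a cloned configuration yields an equal key, while changing the znode parent from "/1" to "/2" yields a different one.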

          jiraposter@reviews.apache.org added a comment -

          On 2011-04-21 00:36:11, Todd Lipcon wrote:

          > src/main/java/org/apache/hadoop/hbase/HConstants.java, line 417

          > <https://reviews.apache.org/r/643/diff/1/?file=16721#file16721line417>

          >

          > specify that, even if the instance ids are the same, it could result in non-shared Connections if some of the other parameters differ. Right?

          I made a note of that (verbatim) in that property's comment.

          On 2011-04-21 00:36:11, Todd Lipcon wrote:

          > src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java, line 61

          > <https://reviews.apache.org/r/643/diff/1/?file=16722#file16722line61>

          >

          > why not final?

          It's back to being final now. In one of the earlier avatars of the patch, I was relying just on Object#finalize to release the connection, and wanted to set the connection to null in the hopes that the GC would get to it quicker, but we don't need to do that anymore.

          On 2011-04-21 00:36:11, Todd Lipcon wrote:

          > src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java, lines 147-148

          > <https://reviews.apache.org/r/643/diff/1/?file=16722#file16722line147>

          >

          > assert !stopped, or even Preconditions.checkState(!stopped)

          That might be a little harsher than is warranted. A safer approach would be to make the stop "destructor" method idempotent, which can be accomplished by not doing anything if the object is already "stopped", and that is what I did here for now. That's the way the destructors for most of the other objects (e.g., HTable) are implemented. Please let me know if we should do the assert anyway.
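The idempotent stop described above can be sketched as follows. This is an illustration under stated assumptions, not the CatalogTracker code: the class name StoppableSketch is hypothetical, and the release counter exists only so that the effect of a second call can be observed.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical resource holder whose stop() may be called any number of times. */
final class StoppableSketch {
  private final AtomicBoolean stopped = new AtomicBoolean(false);
  // Counts how often the underlying resource was actually released.
  private final AtomicInteger releaseCount = new AtomicInteger(0);

  /** Releases the resource on the first call; later calls do nothing. */
  void stop() {
    // compareAndSet succeeds for exactly one caller, even under concurrency.
    if (!stopped.compareAndSet(false, true)) {
      return;
    }
    releaseCount.incrementAndGet(); // stands in for releasing the connection
  }

  boolean isStopped() {
    return stopped.get();
  }

  int getReleaseCount() {
    return releaseCount.get();
  }
}
```

Compared with an assert or Preconditions.checkState, this trades early detection of a double stop for tolerance of it, which is the trade-off the reply argues for.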

          On 2011-04-21 00:36:11, Todd Lipcon wrote:

          > src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java, line 150

          > <https://reviews.apache.org/r/643/diff/1/?file=16722#file16722line150>

          >

          > I don't follow this. It seems like we're releasing a resource we didn't necessarily take ourselves. Spaghetti warning sign.

          Ah, this is a good question. If you look at all of the references to the CatalogTracker, you'll notice that the Connection instance passed to it is never used outside the context of the CatalogTracker. In other words, the caller creates the Connection from the Configuration instance on behalf of the CatalogTracker. For the sake of consistency, I rewrote the CatalogTracker so that it takes a Configuration instead of a Connection instance. That way, we can rest assured that the CatalogTracker will only release resources that it itself takes.

          The CatalogTracker was a big reason why I had to make so many changes to the HConnectionManager. After rewriting the former, the change to the latter is now relatively minimal.

          On 2011-04-21 00:36:11, Todd Lipcon wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java, line 66

          > <https://reviews.apache.org/r/643/diff/1/?file=16723#file16723line66>

          >

          > why not final anymore?

          I made it final again.

          On 2011-04-21 00:36:11, Todd Lipcon wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 176

          > <https://reviews.apache.org/r/643/diff/1/?file=16725#file16725line176>

          >

          > this construction can go outside synch block

          Agreed. Also, I rewrote the two delete connection methods so that they reuse the logic shared by them.

          On 2011-04-21 00:36:11, Todd Lipcon wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, lines 1633-1638

          > <https://reviews.apache.org/r/643/diff/1/?file=16725#file16725line1633>

          >

          > this should be renamed to decrementAndGetRefCount() to be clear that it returns the post-decrement value.

          >

          > Similar for incCount above.

          The reference count methods were plagiarised from the HBaseClient class. Not sure why I added the return statement in the increment/decrement methods, but now they're gone.

          On 2011-04-21 00:36:11, Todd Lipcon wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 1645

          > <https://reviews.apache.org/r/643/diff/1/?file=16725#file16725line1645>

          >

          > when is this method ever safe to use? I think it can be removed

          Given the change above, this is the only way now for us to tell if a connection can be released.
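A minimal sketch of the reference counting discussed in the last two replies, assuming void increment and decrement methods plus a separate zero check. Only incCount is named in the review; decCount, isZeroReference, and the class name RefCountSketch are hypothetical names used for illustration.

```java
/** Hypothetical reference counter guarding a shared connection. */
final class RefCountSketch {
  private int refCount;

  /** Records one more user of the shared connection. */
  synchronized void incCount() {
    refCount++;
  }

  /** Records that one user is done; never drops below zero. */
  synchronized void decCount() {
    if (refCount > 0) {
      refCount--;
    }
  }

  /** True when no user holds the connection, so it may be released. */
  synchronized boolean isZeroReference() {
    return refCount == 0;
  }
}
```

Because the mutators return nothing, the zero check is the only way to ask whether the connection can be released, which matches the reasoning given for keeping that method.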

          On 2011-04-21 00:36:11, Todd Lipcon wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 1652

          > <https://reviews.apache.org/r/643/diff/1/?file=16725#file16725line1652>

          >

          > this logic seems wrong, because the finalizer will get called even if the thing is already closed, then ref count will get decremented below 0.

          Actually, the finalize method does not check the reference count, nor does it need to, before it tries to close the connection. The assumption I made before was that the close(stopProxy) method was idempotent, but that wasn't completely true (it is possible that we might try to stop the region servers twice). To address that, I check if the connection is already closed or not before trying to release it, as Ted suggests in his comment below.

          On 2011-04-21 00:36:11, Todd Lipcon wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1315

          > <https://reviews.apache.org/r/643/diff/1/?file=16726#file16726line1315>

          >

          > all of these finally blocks should instead use something like IOUtils.cleanUp – and HConnection should implement Closeable. This way if there's some exception, it doesn't mask a prior exception inside the actual try {...}

          Good point. I replaced connection.close() with HConnectionManager.deleteConnection (which is kind of like the IOUtils.cleanUp method you wanted). As a result, we don't need HConnection to implement Closeable anymore.
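The concern about masking a prior exception can be illustrated with a small sketch. It is neither Hadoop's IOUtils nor HConnectionManager.deleteConnection; the class QuietCleanupSketch and its methods are hypothetical. The point is only that a cleanup helper which swallows its own IOException lets the exception from the try body reach the caller.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical cleanup helper that never throws from a finally block. */
final class QuietCleanupSketch {
  /** Closes each resource, reporting but not propagating close failures. */
  static void cleanUp(Closeable... resources) {
    for (Closeable resource : resources) {
      if (resource == null) {
        continue;
      }
      try {
        resource.close();
      } catch (IOException e) {
        System.err.println("Ignoring exception during cleanup: " + e);
      }
    }
  }

  /** Demo body: optionally fails, then always cleans up quietly. */
  static void runThenCleanUp(Closeable resource, boolean failInBody) throws IOException {
    try {
      if (failInBody) {
        throw new IOException("primary failure");
      }
    } finally {
      // A close failure here is swallowed, so "primary failure" is what
      // the caller sees.
      cleanUp(resource);
    }
  }
}
```

Had the finally block called resource.close() directly, a failing close would have replaced the original exception.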

          • Karthick

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review512
          -----------------------------------------------------------

          On 2011-04-20 23:56:13, Karthick Sankarachary wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/643/

          -----------------------------------------------------------

          (Updated 2011-04-20 23:56:13)

          Review request for hbase and Ted Yu.

          Summary

          -------

          Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

          Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".

          This addresses bug HBASE-3777.

          https://issues.apache.org/jira/browse/HBASE-3777

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639

          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a

          src/main/java/org/apache/hadoop/hbase/client/HConnection.java 2bb4725

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a

          src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56

          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8

          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1

          src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333

          src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa

          src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42

          src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34

          src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4

          src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2

          Diff: https://reviews.apache.org/r/643/diff

          Testing

          -------

          mvn test

          Thanks,

          Karthick

          jiraposter@reviews.apache.org added a comment -

          On 2011-04-21 00:47:22, Ted Yu wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 1652

          > <https://reviews.apache.org/r/643/diff/1/?file=16725#file16725line1652>

          >

          > I think close(boolean stopProxy) should check this.closed at the beginning.

          Yes, that's a good practice in general. In fact, the existing implementation wasn't completely idempotent, as explained above.

          • Karthick

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review513
          -----------------------------------------------------------

          jiraposter@reviews.apache.org added a comment -

          On 2011-04-21 00:12:09, Ted Yu wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 129

          > <https://reviews.apache.org/r/643/diff/1/?file=16725#file16725line129>

          >

          > I agree that value for "hbase.zookeeper.property.maxClientCnxns" property should be used here.

          >

          > It's Okay to do that in another JIRA. It's your call.

          Here's what I did:

              MAX_CACHED_HBASE_INSTANCES = HBaseConfiguration.create().getInt(
                  HConstants.ZOOKEEPER_MAX_CLIENT_CNXNS,
                  HConstants.DEFAULT_ZOOKEPER_MAX_CLIENT_CNXNS) + 1;

          The assumption here is that the value for "hbase.zookeeper.property.maxClientCnxns" will be the same across all of the configuration instances, which typically is the case.
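How a bound like MAX_CACHED_HBASE_INSTANCES might be enforced is not shown in this reply. One common approach, assumed here purely for illustration, is a LinkedHashMap in access order that evicts its least recently used entry once the limit is exceeded. The class name BoundedInstanceCacheSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache that holds at most maxEntries instances. */
final class BoundedInstanceCacheSketch<K, V> extends LinkedHashMap<K, V> {
  private static final long serialVersionUID = 1L;
  private final int maxEntries;

  BoundedInstanceCacheSketch(int maxEntries) {
    // The third argument selects access order, so get() refreshes an entry.
    super(16, 0.75f, true);
    this.maxEntries = maxEntries;
  }

  /** Called by LinkedHashMap after each put; true evicts the eldest entry. */
  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    return size() > maxEntries;
  }
}
```

In a real connection cache the evicted connection would also need to be closed, which this sketch leaves out.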

          • Karthick

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review511
          -----------------------------------------------------------

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/
          -----------------------------------------------------------

          (Updated 2011-04-21 07:23:09.807767)

          Review request for hbase and Ted Yu.

          Changes
          -------

          This patch incorporates Ted's and Tod's comments on the earlier version.

          Summary
          -------

          Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

          Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".

          This addresses bug HBASE-3777.
          https://issues.apache.org/jira/browse/HBASE-3777

          Diffs (updated)
          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639
          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a
          src/main/java/org/apache/hadoop/hbase/client/HConnection.java 2bb4725
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a
          src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1
          src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333
          src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa
          src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11
          src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4
          src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e
          src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3
          src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2
          src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb

          Diff: https://reviews.apache.org/r/643/diff

          Testing
          -------

          mvn test

          Thanks,

          Karthick

          Karthick Sankarachary added a comment -

          Ted,

          > BTW, TestHBaseTestingUtility and TestReplication both passed with the above addition of HConstants.ZOOKEEPER_ZNODE_PARENT

          Excellent catch. Added the missing property to the connection key.
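To illustrate the connection-key idea discussed here, below is a minimal sketch of a key that compares only a chosen subset of properties, so that two configurations differing in unrelated settings still map to the same cached connection. The class name `ConnectionKeySketch`, the two property names, and the use of a plain `Map` in place of Hadoop's `Configuration` are assumptions made for illustration; this is not the code from the patch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a cache key built from a subset of configuration properties.
class ConnectionKeySketch {
    // Assumed list of connection-relevant properties (illustrative only).
    static final List<String> KEY_PROPERTIES = Arrays.asList(
        "hbase.zookeeper.quorum",
        "zookeeper.znode.parent");

    private final Map<String, String> values = new HashMap<String, String>();

    // A plain Map stands in for Hadoop's Configuration here.
    ConnectionKeySketch(Map<String, String> conf) {
        for (String name : KEY_PROPERTIES) {
            String value = conf.get(name);
            if (value != null) {
                values.put(name, value);
            }
        }
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ConnectionKeySketch)) return false;
        return values.equals(((ConnectionKeySketch) other).values);
    }

    @Override
    public int hashCode() {
        return values.hashCode();
    }
}
```

Leaving a property such as the znode parent out of the list would make two different clusters on the same quorum look identical to the cache, which is the kind of omission the comment above refers to.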

          Ted Yu added a comment -

          For version 5, I got:

          -------------------------------------------------------------------------------
          Test set: org.apache.hadoop.hbase.TestHBaseTestingUtility
          -------------------------------------------------------------------------------
          Tests run: 6, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 50.514 sec <<< FAILURE!
          multiClusters(org.apache.hadoop.hbase.TestHBaseTestingUtility)  Time elapsed: 36.058 sec  <<< ERROR!
          java.util.ConcurrentModificationException
                  at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:373)
                  at java.util.LinkedHashMap$KeyIterator.next(LinkedHashMap.java:384)
                  at org.apache.hadoop.hbase.client.HConnectionManager.deleteAllConnections(HConnectionManager.java:219)
                  at org.apache.hadoop.hbase.MiniHBaseCluster.shutdown(MiniHBaseCluster.java:512)
                  at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:455)
                  at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:431)
                  at org.apache.hadoop.hbase.TestHBaseTestingUtility.multiClusters(TestHBaseTestingUtility.java:128)
          
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review521
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/643/#comment1069>

          deleteConnection() may remove connectionKey and lead to ConcurrentModificationException.

          After restoring deleteAllConnections() to that of version 4, TestHBaseTestingUtility passes.
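The failure mode described above can be reproduced outside HBase. The sketch below uses a `LinkedHashMap` of placeholder strings as a stand-in for the connection cache; `DeleteAllSketch`, `INSTANCES`, and both `deleteAll*` methods are hypothetical names, and the "safe" variant is just one common way to avoid the exception, not necessarily what version 4 of the patch did.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a connection cache keyed by some connection key.
class DeleteAllSketch {
    // Values are placeholders, not real HConnection objects.
    static final Map<String, String> INSTANCES = new LinkedHashMap<String, String>();

    // Stand-in for deleteConnection(): removes the entry from the shared map.
    static void deleteConnection(String key) {
        INSTANCES.remove(key);
    }

    // Unsafe: the map is modified while its live key set is being iterated.
    // With two or more entries this throws ConcurrentModificationException.
    static void deleteAllUnsafe() {
        for (String key : INSTANCES.keySet()) {
            deleteConnection(key);
        }
    }

    // Safe: iterate over a snapshot of the keys, so removals do not
    // invalidate the iterator.
    static void deleteAllSafe() {
        for (String key : new ArrayList<String>(INSTANCES.keySet())) {
            deleteConnection(key);
        }
    }
}
```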

          - Ted

          On 2011-04-21 07:23:09, Karthick Sankarachary wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/643/

          -----------------------------------------------------------

          (Updated 2011-04-21 07:23:09)

          Review request for hbase and Ted Yu.

          Summary

          -------

          Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

          Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".

          This addresses bug HBASE-3777.

          https://issues.apache.org/jira/browse/HBASE-3777

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639

          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a

          src/main/java/org/apache/hadoop/hbase/client/HConnection.java 2bb4725

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a

          src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56

          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8

          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1

          src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333

          src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa

          src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba

          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11

          src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42

          src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34

          src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4

          src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e

          src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3

          src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2

          src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb

          Diff: https://reviews.apache.org/r/643/diff

          Testing

          -------

          mvn test

          Thanks,

          Karthick

          Ted Yu added a comment -

          Got a new test failure:

          Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 194.698 sec <<< FAILURE!
          testMultiRegionTable(org.apache.hadoop.hbase.mapreduce.TestTableMapReduce)  Time elapsed: 176.014 sec  <<< ERROR!
          java.lang.RuntimeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server null for region , row 'hhh', but failed after 10 attempts.
          Exceptions:
          java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
          java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
          java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
          java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
          java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
          java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
          java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
          java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
          java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
          java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
                  at org.apache.hadoop.hbase.client.HTable$ClientScanner$1.hasNext(HTable.java:1228)
                  at org.apache.hadoop.hbase.mapreduce.TestTableMapReduce.verifyAttempt(TestTableMapReduce.java:189)
                  at org.apache.hadoop.hbase.mapreduce.TestTableMapReduce.verify(TestTableMapReduce.java:158)
                  at org.apache.hadoop.hbase.mapreduce.TestTableMapReduce.runTestOnTable(TestTableMapReduce.java:140)
                  at org.apache.hadoop.hbase.mapreduce.TestTableMapReduce.testMultiRegionTable(TestTableMapReduce.java:114)
          
          Ted Yu added a comment -

          This failure is more interesting:

          Tests run: 8, Failures: 0, Errors: 8, Skipped: 0, Time elapsed: 0.394 sec <<< FAILURE!
          testThatIfMETAMovesWeAreNotified(org.apache.hadoop.hbase.catalog.TestCatalogTracker)  Time elapsed: 0.182 sec  <<< ERROR!
          java.lang.NullPointerException
                  at java.lang.Class.forName0(Native Method)
                  at java.lang.Class.forName(Class.java:169)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:407)
                  at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:186)
                  at org.apache.hadoop.hbase.catalog.CatalogTracker.<init>(CatalogTracker.java:127)
                  at org.apache.hadoop.hbase.catalog.CatalogTracker.<init>(CatalogTracker.java:110)
                  at org.apache.hadoop.hbase.catalog.TestCatalogTracker.constructAndStartCatalogTracker(TestCatalogTracker.java:102)
                  at org.apache.hadoop.hbase.catalog.TestCatalogTracker.testThatIfMETAMovesWeAreNotified(TestCatalogTracker.java:115)
          
          Karthick Sankarachary added a comment -

          My bad - I didn't get a chance to test version 5 yet. Will apply your fixes above and keep you posted on my findings. Are we on the right track, in general?

          Ted Yu added a comment -

          The exception from TestCatalogTracker was due to the following mock:

              final CatalogTracker ct = constructAndStartCatalogTracker(Mockito
                  .mock(Configuration.class));
          

          which leads to serverClassName being null:

                String serverClassName = conf.get(HConstants.REGION_SERVER_CLASS,
                  HConstants.DEFAULT_REGION_SERVER_CLASS);
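
For readers unfamiliar with why a mocked configuration ends in a NullPointerException: an unstubbed mock returns null from `get(name, defaultValue)` instead of the default, and `Class.forName` rejects a null name. The sketch below imitates that with a hand-written stand-in rather than Mockito. `MockedConfSketch`, the `Conf` interface, and `loadServerClass` are hypothetical, and the null guard is only one possible defensive option, not the fix that was applied in this issue.

```java
// Hypothetical sketch of the null-default problem, without Hadoop or Mockito.
class MockedConfSketch {
    // Minimal stand-in for Configuration.get(name, defaultValue).
    interface Conf {
        String get(String name, String defaultValue);
    }

    // Behaves like an unstubbed mock: ignores the default and returns null.
    static final Conf UNSTUBBED = new Conf() {
        public String get(String name, String defaultValue) {
            return null;
        }
    };

    // Resolves a class by configured name, falling back to the default
    // when the lookup unexpectedly yields null.
    static Class<?> loadServerClass(Conf conf, String key, String defaultClass)
            throws ClassNotFoundException {
        String name = conf.get(key, defaultClass);
        if (name == null) {
            name = defaultClass; // guard: a mocked conf may not honor the default
        }
        return Class.forName(name);
    }
}
```

Without the guard, the call would pass null straight to `Class.forName`, which matches the top frames of the stack trace reported for TestCatalogTracker.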
          
          Ted Yu added a comment -

          I replaced the mocking of configuration with UTIL.getConfiguration().
          Other tests in TestCatalogTracker passed except for testNoTimeoutWaitForMeta which hung:

          2011-04-22 03:25:25,241 ERROR [main-EventThread] zookeeper.ClientCnxn$EventThread(532): Error while calling watcher 
          java.lang.IllegalArgumentException: Can't build a writable with empty bytes array
          	at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:123)
          	at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:102)
          	at org.apache.hadoop.hbase.executor.RegionTransitionData.fromBytes(RegionTransitionData.java:238)
          	at org.apache.hadoop.hbase.zookeeper.ZKUtil.logRetrievedMsg(ZKUtil.java:1124)
          	at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:550)
          	at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.nodeCreated(ZooKeeperNodeTracker.java:149)
          	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:279)
          	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
          	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
          

          I think the above is caused by ZKUtil.createAndFailSilent() passing empty data to zk.create() - not related to the changes of this JIRA.

          Ted Yu added a comment -

          TestZooKeeper failed when I ran the whole test suite.
          When I ran the test alone, it passed.

          Karthick Sankarachary made changes -
          Attachment HBASE-3777-V6.patch [ 12477166 ]
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/
          -----------------------------------------------------------

          (Updated 2011-04-22 21:16:59.765464)

          Review request for hbase and Ted Yu.

          Changes
          -------

          The V6 version of the patch fixes the test failures in V5 by:

          a) Adding a package-level CatalogTracker constructor so that TestCatalogTracker can continue to use its Connection mock object.
          b) Deleting the connection from HConnectionManager#HBASE_INSTANCES if its finalize method is called.
          c) Removing the HTable#finalize method which might cause a closed connection to be returned to the current thread.

          There were no test failures with the V6 version of the patch. Please let me know if we need to tweak this further.

          Summary
          -------

          Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

          Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
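
The "deep-compare" idea can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the class name ConnectionKey is hypothetical, a plain Map stands in for a Hadoop Configuration, the choice of which properties count toward identity is an assumption, and "example.client.instance.id" is a made-up property name standing in for whatever marker makes a configuration "uniquely identifiable".

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical lookup key: two configurations are equal if their identity properties match. */
final class ConnectionKey {
    // Assumed set of identity properties; the real patch may compare a different set.
    static final List<String> IDENTITY_PROPERTIES = Arrays.asList(
        "hbase.zookeeper.quorum",
        "zookeeper.znode.parent",
        "hbase.zookeeper.property.clientPort",
        "example.client.instance.id"); // hypothetical "uniquely identifiable" marker

    private final Map<String, String> values = new HashMap<>();

    /** Copies only the identity properties out of the given configuration. */
    ConnectionKey(Map<String, String> conf) {
        for (String name : IDENTITY_PROPERTIES) {
            String value = conf.get(name);
            if (value != null) {
                values.put(name, value);
            }
        }
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ConnectionKey
            && values.equals(((ConnectionKey) other).values);
    }

    @Override
    public int hashCode() {
        return values.hashCode();
    }
}
```

With a key like this, a configuration and its clone hash to the same map entry and would share one connection, while setting the marker property on one of them forces a separate entry.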

          This addresses bug HBASE-3777.
          https://issues.apache.org/jira/browse/HBASE-3777

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639
          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a
          src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1
          src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333
          src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa
          src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11
          src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4
          src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e
          src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3
          src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2
          src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8
          src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb

          Diff: https://reviews.apache.org/r/643/diff

          Testing
          -------

          mvn test

          Thanks,

          Karthick

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review531
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/643/#comment1107>

          Shall we break out of the loop here?

          • Ted

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review532
          -----------------------------------------------------------

          src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java
          <https://reviews.apache.org/r/643/#comment1108>

          Here we mix user code with test cluster management code.
          I think table.close() should be called first in the finally block.
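
The ordering suggested here could look something like the sketch below. It is only an illustration under assumptions: CleanupOrderExample and runJob are hypothetical names, and the Closeable and Runnable parameters stand in for the test's table and the mini-cluster shutdown call.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

final class CleanupOrderExample {
    /**
     * Runs the job, then releases the user-level resource (the table) before
     * tearing down the test cluster. The nested finally makes sure the cluster
     * is still shut down if closing the table fails.
     */
    static void runJob(Closeable table, Runnable clusterShutdown, Runnable job) {
        try {
            job.run();
        } finally {
            try {
                table.close();            // user code cleanup comes first
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } finally {
                clusterShutdown.run();    // test cluster management comes last
            }
        }
    }
}
```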

          • Ted

          Ted Yu added a comment -

          This patch changes the way TableOutputFormat closes its connection.
          A similar change would be applied to mapred.TableOutputFormat.

          Ted Yu made changes -
          Attachment 3777-TOF.patch [ 12477188 ]
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review543
          -----------------------------------------------------------

          Good work lads (Karthick and Ted reviewing). Small nitpicks below. Let's get this in if all tests pass.

          src/main/java/org/apache/hadoop/hbase/HConstants.java
          <https://reviews.apache.org/r/643/#comment1136>

          Copy/paste issue (minor)

          src/main/java/org/apache/hadoop/hbase/HConstants.java
          <https://reviews.apache.org/r/643/#comment1137>

          Thanks for moving these configs in here.

          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
          <https://reviews.apache.org/r/643/#comment1141>

          This looks like a good change.

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
          <https://reviews.apache.org/r/643/#comment1143>

          Implement Closeable now that you've added close()?

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
          <https://reviews.apache.org/r/643/#comment1142>

          Good

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/643/#comment1146>

          This is painful, but makes sense.

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/643/#comment1147>

          Not important, but if closed, just return immediately; that saves indenting the whole method. Just a style difference.
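
The early-return style being suggested might be sketched as below. GuardedConnection and its fields are hypothetical and are not the HConnectionManager code under review; the class also implements java.io.Closeable, in the spirit of the HBaseAdmin comment above.

```java
import java.io.Closeable;

/** Hypothetical connection whose close() is idempotent and uses a guard clause. */
final class GuardedConnection implements Closeable {
    private boolean closed;
    private int cleanupRuns; // counts how often the real cleanup executed

    @Override
    public void close() {
        if (closed) {
            return; // already closed: bail out early instead of nesting the body in an if-block
        }
        closed = true;
        cleanupRuns++; // stands in for releasing sockets, trackers, and so on
    }

    boolean isClosed() {
        return closed;
    }

    int cleanupRuns() {
        return cleanupRuns;
    }
}
```

The guard clause keeps the main cleanup path at one indentation level and makes repeated close() calls harmless.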

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/643/#comment1148>

          So, this is just insurance, as you say in the issue. That's fine, I'd say (I agree with Ted that we shouldn't rely on finalize).

          src/main/java/org/apache/hadoop/hbase/client/HTable.java
          <https://reviews.apache.org/r/643/#comment1149>

          Good.

          src/main/java/org/apache/hadoop/hbase/client/HTable.java
          <https://reviews.apache.org/r/643/#comment1150>

          Yeah, this is ugly... it's almost as though you should have a special method for it, one that does not up the counters?
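
Assuming the "counters" here are per-connection reference counts, a lookup that leaves them untouched could be sketched as follows. Everything in this block (SharedConnectionRegistry, acquire, peek, release, the String key) is hypothetical and only illustrates the idea; it is not the HConnectionManager API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry of shared, reference-counted connections. */
final class SharedConnectionRegistry {
    static final class Connection {
        int refCount;
        boolean closed;
    }

    private final Map<String, Connection> instances = new HashMap<>();

    /** Returns the shared connection for the key, creating it if needed, and bumps its count. */
    synchronized Connection acquire(String key) {
        Connection conn = instances.get(key);
        if (conn == null) {
            conn = new Connection();
            instances.put(key, conn);
        }
        conn.refCount++;
        return conn;
    }

    /** The "special method": looks up the connection without touching the count. */
    synchronized Connection peek(String key) {
        return instances.get(key);
    }

    /** Drops one reference; closes and forgets the connection when the last one goes away. */
    synchronized void release(String key) {
        Connection conn = instances.get(key);
        if (conn == null) {
            return;
        }
        if (--conn.refCount <= 0) {
            conn.closed = true;
            instances.remove(key);
        }
    }
}
```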

          src/main/java/org/apache/hadoop/hbase/client/HTable.java
          <https://reviews.apache.org/r/643/#comment1151>

          Just remove this.

          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java
          <https://reviews.apache.org/r/643/#comment1152>

          Just remove.

          src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          <https://reviews.apache.org/r/643/#comment1153>

          Interesting, but I go along with it. Looks like we only made this connection for CT? If so, bad design fixed by your CT change.

          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          <https://reviews.apache.org/r/643/#comment1154>

          ditto

          • Michael

          Show
          jiraposter@reviews.apache.org added a comment -

          On 2011-04-25 20:05:54, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/HConstants.java, line 116

          > <https://reviews.apache.org/r/643/diff/3/?file=16908#file16908line116>

          >

          > Copy/paste issue (minor)

          Will change it to "Default limit on concurrent client-side zookeeper connections".

          On 2011-04-25 20:05:54, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/HConstants.java, line 442

          > <https://reviews.apache.org/r/643/diff/3/?file=16908#file16908line442>

          >

          > Thanks for moving these configs. in here.

          Yeah, the HConnectionKey would not have looked pretty if we hadn't moved those configs to HConstants.

          I will remove the trailing space.
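
A minimal sketch of the "deep-compare" idea behind HConnectionKey, assuming that only a fixed list of connection-relevant properties takes part in the comparison. The class name `ConnectionKeySketch`, the choice of properties, and the use of a plain `Map<String, String>` in place of a Hadoop `Configuration` are assumptions made for illustration; the key class in the patch may well look different.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the patch's HConnectionKey: two keys are equal
// when the selected properties carry equal values, no matter which
// configuration object they were read from.
final class ConnectionKeySketch {
  // Assumed selection of connection-relevant properties (illustrative only).
  static final List<String> RELEVANT = Arrays.asList(
      "hbase.zookeeper.quorum",
      "zookeeper.znode.parent",
      "hbase.client.retries.number");

  private final Map<String, String> values = new HashMap<String, String>();

  ConnectionKeySketch(Map<String, String> conf) {
    // Copy only the relevant properties; everything else is ignored.
    for (String name : RELEVANT) {
      String value = conf.get(name);
      if (value != null) {
        values.put(name, value);
      }
    }
  }

  @Override
  public boolean equals(Object other) {
    return other instanceof ConnectionKeySketch
        && values.equals(((ConnectionKeySketch) other).values);
  }

  @Override
  public int hashCode() {
    return values.hashCode();
  }
}
```

With a key like this, a configuration and its clone hash to the same map entry, so both would resolve to one shared connection, while a configuration pointing at a different quorum would get its own.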

          On 2011-04-25 20:05:54, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java, line 177

          > <https://reviews.apache.org/r/643/diff/3/?file=16909#file16909line177>

          >

          > This looks like a good change.

          As a matter of fact, the CatalogTracker was the only class that was being handed a connection, which made cleanup tricky since it didn't really own that connection (as Todd rightly pointed out). Making it take a configuration seemed like the most pragmatic thing to do.

          On 2011-04-25 20:05:54, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java, line 63

          > <https://reviews.apache.org/r/643/diff/3/?file=16910#file16910line63>

          >

          > Implement Closeable now you've added close?

          Yes, we can. I'll make HConnection implement Closeable as well. If you want, we can make HTablePool implement Closeable by calling closeTablePool on all of its tables.
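
For illustration, here is roughly what implementing `java.io.Closeable` buys a caller. `AdminSketch` is a hypothetical stand-in, not the real HBaseAdmin, and the `closeQuietly` helper is an assumption added only to show that generic cleanup code can then treat any such client uniformly.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical stand-in for an admin-style client; not the real HBaseAdmin.
class AdminSketch implements Closeable {
  private boolean closed = false;

  boolean isClosed() {
    return closed;
  }

  @Override
  public void close() throws IOException {
    // Idempotent: calling close() a second time changes nothing.
    closed = true;
  }

  // Generic cleanup helper that works for anything Closeable.
  static void closeQuietly(Closeable c) {
    if (c == null) {
      return;
    }
    try {
      c.close();
    } catch (IOException e) {
      // Swallowed on purpose in this sketch; real code would at least log it.
    }
  }
}
```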

          On 2011-04-25 20:05:54, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 265

          > <https://reviews.apache.org/r/643/diff/3/?file=16911#file16911line265>

          >

          > This is painful, but makes sense.

          A small price to pay, in my opinion.

          On 2011-04-25 20:05:54, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 1207

          > <https://reviews.apache.org/r/643/diff/3/?file=16911#file16911line1207>

          >

          > Not important but if closed, just return immediately and then you can save indenting whole method. Not important. Just style diff.

          Will do.

          On 2011-04-25 20:05:54, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 1667

          > <https://reviews.apache.org/r/643/diff/3/?file=16911#file16911line1667>

          >

          > So, this is just insurance as you say in the issue. That's fine I'd say (I agree w/ Ted that we shouldn't rely on finalize)

          Exactly - it's just insurance, a fall-back in case some thread somewhere was unable to close the connection for whatever reason.
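
A small sketch of the two points above, under stated assumptions: `close()` returns immediately when the connection is already closed, and `finalize()` merely calls `close()` as a fallback. `ConnectionSketch` and its counter are hypothetical names used for illustration; they are not taken from the patch.

```java
// Hypothetical connection stand-in; names are illustrative, not from the patch.
class ConnectionSketch {
  private boolean closed = false;
  private int releaseCount = 0;

  void close() {
    if (closed) {
      return; // already closed: return early instead of nesting the whole body
    }
    closed = true;
    releaseCount++; // stands in for releasing sockets, ZooKeeper sessions, etc.
  }

  int getReleaseCount() {
    return releaseCount;
  }

  @Override
  protected void finalize() throws Throwable {
    try {
      close(); // insurance only; callers are still expected to call close()
    } finally {
      super.finalize();
    }
  }
}
```

Because the JVM gives no guarantee about when, or whether, `finalize()` runs, the explicit `close()` stays the primary cleanup path.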

          On 2011-04-25 20:05:54, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/master/HMaster.java, line 355

          > <https://reviews.apache.org/r/643/diff/3/?file=16917#file16917line355>

          >

          > Interesting but I go along w/ it. Looks like we only made this connection for CT? If so, bad design fixed by your CT change.

          Yes, for the most part, the connection that was being given to CT was not used for anything else. There was one exception though (TestCatalogTracker), which was doing all kinds of things on the connection outside of the CT, and to accommodate that, I left open a package-level constructor in CT that is visible only to that test case (it'd be too much trouble to change it).

          On 2011-04-25 20:05:54, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HTablePool.java, line 150

          > <https://reviews.apache.org/r/643/diff/3/?file=16913#file16913line150>

          >

          > Just remove.

          Ok, will remove all dead code.

          On 2011-04-25 20:05:54, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 259

          > <https://reviews.apache.org/r/643/diff/3/?file=16912#file16912line259>

          >

          > Yeah, this is ugly.... it's almost as though you should have a special method for it, one that does not up the counters?

          Just a thought - how about if we hide the ugliness in HCM, like so:

          public abstract class Connectable<T> {
            public Configuration conf;

            public Connectable(Configuration conf) {
              this.conf = conf;
            }

            public abstract T connect(HConnection connection);
          }

          // In HConnectionManager: get a connection, run the callback, then release it.
          public static <T> T execute(Connectable<T> connectable) {
            if (connectable == null || connectable.conf == null) {
              return null;
            }
            Configuration conf = connectable.conf;
            HConnection connection = HConnectionManager.getConnection(conf);
            try {
              return connectable.connect(connection);
            } finally {
              HConnectionManager.deleteConnection(conf, false);
            }
          }

          That way, the HTable call would look somewhat prettier:

          HConnectionManager.execute(new Connectable<Boolean>(conf) {
            public Boolean connect(HConnection connection) {
              return connection.isTableEnabled(tableName);
            }
          });
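
To make the counter behaviour of that proposal concrete, here is a self-contained sketch that swaps the HBase classes for stand-ins. `FakeConnection`, `ConnectableSketch`, and `ManagerSketch` are hypothetical, and a plain string replaces the `Configuration`. The reference-counting scheme is an assumption about how the manager balances acquire and release, not a copy of the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for HConnection; every name in this sketch is hypothetical.
class FakeConnection {
  int refCount = 0;

  boolean isTableEnabled(String tableName) {
    return !tableName.isEmpty();
  }
}

// Callback that runs against a managed connection.
abstract class ConnectableSketch<T> {
  final String confKey; // stands in for the Configuration

  ConnectableSketch(String confKey) {
    this.confKey = confKey;
  }

  abstract T connect(FakeConnection connection);
}

class ManagerSketch {
  static final Map<String, FakeConnection> CONNECTIONS =
      new HashMap<String, FakeConnection>();

  static synchronized FakeConnection getConnection(String confKey) {
    FakeConnection c = CONNECTIONS.get(confKey);
    if (c == null) {
      c = new FakeConnection();
      CONNECTIONS.put(confKey, c);
    }
    c.refCount++; // one more user of this shared connection
    return c;
  }

  static synchronized void releaseConnection(String confKey) {
    FakeConnection c = CONNECTIONS.get(confKey);
    if (c != null && --c.refCount <= 0) {
      CONNECTIONS.remove(confKey); // last user gone: drop the connection
    }
  }

  // Acquire, run the callback, and always release, even if the callback throws.
  static <T> T execute(ConnectableSketch<T> connectable) {
    if (connectable == null || connectable.confKey == null) {
      return null;
    }
    FakeConnection connection = getConnection(connectable.confKey);
    try {
      return connectable.connect(connection);
    } finally {
      releaseConnection(connectable.confKey);
    }
  }
}
```

The caller never touches the counters, so a one-off check like `isTableEnabled` cannot leave a connection pinned by mistake.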

          • Karthick

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review543
          -----------------------------------------------------------

          On 2011-04-22 21:16:59, Karthick Sankarachary wrote:

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/
          -----------------------------------------------------------

          (Updated 2011-04-22 21:16:59)

          Review request for hbase and Ted Yu.

          Summary
          -------

          Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

          Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".

          This addresses bug HBASE-3777.
          https://issues.apache.org/jira/browse/HBASE-3777

          Diffs
          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639
          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a
          src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1
          src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333
          src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa
          src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11
          src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4
          src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e
          src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3
          src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2
          src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8
          src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb

          Diff: https://reviews.apache.org/r/643/diff

          Testing
          -------

          mvn test

          Thanks,
          Karthick

          jiraposter@reviews.apache.org added a comment -

          On 2011-04-23 03:02:04, Ted Yu wrote:

          > src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java, line 147

          > <https://reviews.apache.org/r/643/diff/3/?file=16927#file16927line147>

          >

          > Here we mix user code with test cluster management code.

          > I think table.close() should be called first in the finally block.

          Closing the table before shutting down the cluster seems appropriate.
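
The ordering agreed on here can be sketched with nested `try`/`finally` blocks, so that the cluster is still shut down even if closing the table fails. `ShutdownOrderSketch` and the event strings are hypothetical; the sketch records the order of the steps and does not touch a real table or test cluster.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of cleanup ordering in a test's finally block.
class ShutdownOrderSketch {
  static List<String> run(boolean closeFails) {
    List<String> events = new ArrayList<String>();
    try {
      try {
        events.add("test body");
      } finally {
        try {
          events.add("table.close");
          if (closeFails) {
            throw new IllegalStateException("close failed");
          }
        } finally {
          // Runs whether or not closing the table succeeded.
          events.add("cluster.shutdown");
        }
      }
    } catch (IllegalStateException e) {
      events.add("caught: " + e.getMessage());
    }
    return events;
  }
}
```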

          • Karthick

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review532
          -----------------------------------------------------------

          jiraposter@reviews.apache.org added a comment -

          On 2011-04-23 02:14:11, Ted Yu wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 228

          > <https://reviews.apache.org/r/643/diff/3/?file=16911#file16911line228>

          >

          > Shall we break out of the loop here ?

          Will do.

          • Karthick

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review531
          -----------------------------------------------------------

          jiraposter@reviews.apache.org added a comment -

          On 2011-04-25 20:05:54, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 259

          > <https://reviews.apache.org/r/643/diff/3/?file=16912#file16912line259>

          >

          > Yeah, this is ugly... it's almost as though you should have a special method for it, one that does not up the counters?

          Karthick Sankarachary wrote:

          Just a thought - how about if we hide the ugliness in HCM, like so:

          public abstract class Connectable<T> {
            public Configuration conf;

            public Connectable(Configuration conf) {
              this.conf = conf;
            }

            public abstract T connect(HConnection connection);
          }

          public static <T> T execute(Connectable<T> connectable) {
            if (connectable == null || connectable.conf == null) {
              return null;
            }
            Configuration conf = connectable.conf;
            HConnection connection = HConnectionManager.getConnection(conf);
            try {
              return connectable.connect(connection);
            } finally {
              HConnectionManager.deleteConnection(conf, false);
            }
          }

          That way, the HTable call would look somewhat prettier:

          HConnectionManager.execute(new Connectable<Boolean>(conf) {
            public Boolean connect(HConnection connection) {
              return connection.isTableEnabled(tableName);
            }
          });

          BTW, if we bypass the reference counters in this situation, there's a chance, albeit small, that the connection might get closed by someone else while this guy is still trying to talk to it, which could result in a "connection is closed" type of error.

          • Karthick
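
          For readers following the thread, here is a rough, self-contained sketch of the idea discussed above: an execute-style helper that takes a reference on a shared connection for the duration of a callback, so that nobody else can close it mid-call. Everything in it (RefCountedConnections, Conn, acquire, release, the String key) is a hypothetical stand-in chosen for illustration, not the HBase API and not the code in the patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a connection manager; not the HBase API. */
final class RefCountedConnections {

    /** Hypothetical stand-in for a shared connection. */
    static final class Conn {
        private boolean closed;
        private int refCount;

        boolean isClosed() { return closed; }

        String ping() {
            if (closed) throw new IllegalStateException("connection is closed");
            return "pong";
        }
    }

    /** Callback that is handed a live connection, in the spirit of Connectable. */
    interface Connectable<T> {
        T connect(Conn connection);
    }

    private static final Map<String, Conn> CONNECTIONS = new HashMap<>();

    /** Returns the shared connection for the key and bumps its reference count. */
    static synchronized Conn acquire(String key) {
        Conn c = CONNECTIONS.computeIfAbsent(key, k -> new Conn());
        c.refCount++;
        return c;
    }

    /** Drops one reference; the connection is closed only when none remain. */
    static synchronized void release(String key) {
        Conn c = CONNECTIONS.get(key);
        if (c == null) return;
        if (--c.refCount <= 0) {
            c.closed = true;
            CONNECTIONS.remove(key);
        }
    }

    /** Runs the callback while holding a reference, then gives it back. */
    static <T> T execute(String key, Connectable<T> connectable) {
        Conn c = acquire(key);
        try {
            return connectable.connect(c);
        } finally {
            release(key);
        }
    }

    public static void main(String[] args) {
        Conn longLived = acquire("cluster-a");            // e.g. held by a table
        String reply = execute("cluster-a", Conn::ping);  // short-lived use
        System.out.println(reply + ", closed=" + longLived.isClosed());
        release("cluster-a");                             // last reference gone
        System.out.println("closed=" + longLived.isClosed());
    }
}
```

          Because execute takes its own reference, the short-lived call cannot observe a closed connection even if another holder releases in the meantime. Bypassing the counters would remove that guarantee, which is the risk noted above.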

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review543
          -----------------------------------------------------------

          jiraposter@reviews.apache.org added a comment -

          On 2011-04-25 20:05:54, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 259

          > <https://reviews.apache.org/r/643/diff/3/?file=16912#file16912line259>

          >

          > Yeah, this is ugly... it's almost as though you should have a special method for it, one that does not up the counters?

          Karthick Sankarachary wrote:

          Just a thought - how about if we hide the ugliness in HCM, like so:

          public abstract class Connectable<T> {
            public Configuration conf;

            public Connectable(Configuration conf) {
              this.conf = conf;
            }

            public abstract T connect(HConnection connection);
          }

          public static <T> T execute(Connectable<T> connectable) {
            if (connectable == null || connectable.conf == null) {
              return null;
            }
            Configuration conf = connectable.conf;
            HConnection connection = HConnectionManager.getConnection(conf);
            try {
              return connectable.connect(connection);
            } finally {
              HConnectionManager.deleteConnection(conf, false);
            }
          }

          That way, the HTable call would look somewhat prettier:

          HConnectionManager.execute(new Connectable<Boolean>(conf) {
            public Boolean connect(HConnection connection) {
              return connection.isTableEnabled(tableName);
            }
          });

          Karthick Sankarachary wrote:

          BTW, if we bypass the reference counters in this situation, there's a chance, albeit small, that the connection might get closed by someone else while this guy is still trying to talk to it, which could result in a "connection is closed" type of error.

          Your proposal is also ugly but I think less ugly than what we currently have so I would prefer it; it has the benefit of moving the ref counting back into HCM, not letting it out of the class (I'm fine w/ all your other comments Karthick)

          • Michael

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review543
          -----------------------------------------------------------

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review569
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          <https://reviews.apache.org/r/643/#comment1213>

          Same comment as in HRS, I think this is creating a second connection for the master.

          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          <https://reviews.apache.org/r/643/#comment1212>

          IIUC, we are creating an additional connection here since CT will do a getConnection with the passed conf instead of using a connection that the RS already has.

          • Jean-Daniel

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/
          -----------------------------------------------------------

          (Updated 2011-04-27 18:33:01.305553)

          Review request for hbase and Ted Yu.

          Summary (updated)
          -------

          Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

          Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
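
          To make the proposal concrete, here is a minimal sketch of value-based connection identity. It is not the code in the patch: the class ConnectionKeySketch, the choice of properties in CLUSTER_PROPERTIES, and the marker property example.client.instance.id are all assumptions made for illustration, and a plain Map stands in for a Configuration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class ConnectionKeySketch {
    // Assumption: these properties are enough to tell which cluster a client talks to.
    static final String[] CLUSTER_PROPERTIES = {
        "hbase.zookeeper.quorum",
        "hbase.zookeeper.property.clientPort",
        "zookeeper.znode.parent"
    };

    // Hypothetical marker property: setting it makes a configuration "uniquely identifiable".
    static final String INSTANCE_ID = "example.client.instance.id";

    /** Key compared by property values, not by the identity of the configuration object. */
    static final class ConnectionKey {
        private final Map<String, String> values = new TreeMap<>();

        ConnectionKey(Map<String, String> conf) {
            for (String name : CLUSTER_PROPERTIES) {
                copy(conf, name);
            }
            copy(conf, INSTANCE_ID);
        }

        private void copy(Map<String, String> conf, String name) {
            String value = conf.get(name);
            if (value != null) {
                values.put(name, value);
            }
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof ConnectionKey
                && values.equals(((ConnectionKey) other).values);
        }

        @Override
        public int hashCode() {
            return values.hashCode();
        }
    }

    // A plain Object stands in for an HConnection here.
    private static final Map<ConnectionKey, Object> CONNECTIONS = new HashMap<>();

    /** Returns the shared connection for any configuration with the same key. */
    static synchronized Object getConnection(Map<String, String> conf) {
        return CONNECTIONS.computeIfAbsent(new ConnectionKey(conf), key -> new Object());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hbase.zookeeper.quorum", "zk1.example.com");
        Map<String, String> copy = new HashMap<>(conf);

        // A configuration and its copy map to the same connection.
        System.out.println(getConnection(conf) == getConnection(copy));

        // Marking the copy as uniquely identifiable gives it its own connection.
        copy.put(INSTANCE_ID, "1");
        System.out.println(getConnection(conf) == getConnection(copy));
    }
}
```

          Comparing only a fixed subset of properties keeps unrelated settings from splitting connections, at the cost of having to decide which properties belong in that subset.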

          This addresses bug HBASE-3777.
          https://issues.apache.org/jira/browse/HBASE-3777

          Diffs
          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639
          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a
          src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1
          src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333
          src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa
          src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11
          src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4
          src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e
          src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3
          src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2
          src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8
          src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb

          Diff: https://reviews.apache.org/r/643/diff

          Testing
          -------

          mvn test

          Thanks,

          Karthick

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review586
          -----------------------------------------------------------

          Can you include TableOutputFormat.java in the patch please ?

          • Ted


          jiraposter@reviews.apache.org added a comment -

          On 2011-04-27 18:43:04, Ted Yu wrote:

          > Can you include TableOutputFormat.java in the patch please ?

          Will do.

          • Karthick

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review586
          -----------------------------------------------------------


          jiraposter@reviews.apache.org added a comment -

          On 2011-04-27 01:33:01, Jean-Daniel Cryans wrote:

          > src/main/java/org/apache/hadoop/hbase/master/HMaster.java, line 360

          > <https://reviews.apache.org/r/643/diff/3/?file=16917#file16917line360>

          >

          > Same comment as in HRS, I think this is creating a second connection for the master.

          Same comment as in HRS. Again, please correct me if I'm wrong.

          • Karthick

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review569
          -----------------------------------------------------------


          jiraposter@reviews.apache.org added a comment -

          On 2011-04-27 01:33:01, Jean-Daniel Cryans wrote:

          > src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java, line 513

          > <https://reviews.apache.org/r/643/diff/3/?file=16918#file16918line513>

          >

          > IIUC, we are creating an additional connection here since CT will do a getConnection with the passed conf instead of using a connection that the RS already has.

          Please correct me if I'm wrong, but the RS creates the connection (at least the HConnection kind) just for the sake of CT. As a matter of fact, I was able to safely remove the RS#connection field altogether. What I should also have done, but forgot to do, was remove the call to delete the connection at the end of RS' run method. In the upcoming patch, the RS will not try to delete the connection, since it doesn't acquire it, at least not directly, in the first place. Now, the CT takes over the ownership of the connection resource.
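
          The ownership hand-off described above can be sketched roughly as follows. This is an illustration only, not the patch: SharedConnectionSketch, Tracker, acquire, and release are hypothetical names, and the reference counting is an assumption about how a shared connection could be cleaned up safely when several owners hold it.

```java
import java.util.HashMap;
import java.util.Map;

class SharedConnectionSketch {
    /** Stand-in for a shared connection; it closes when its last owner releases it. */
    static final class Connection {
        private final String key;
        private int refCount;
        private boolean closed;

        Connection(String key) {
            this.key = key;
        }

        boolean isClosed() {
            return closed;
        }
    }

    private static final Map<String, Connection> CONNECTIONS = new HashMap<>();

    /** Returns the connection for the key, creating it on first use, and counts the new owner. */
    static synchronized Connection acquire(String key) {
        Connection connection = CONNECTIONS.computeIfAbsent(key, Connection::new);
        connection.refCount++;
        return connection;
    }

    /** Drops one owner; the connection is closed only when no owners remain. */
    static synchronized void release(Connection connection) {
        if (connection.refCount <= 0) {
            throw new IllegalStateException("connection released more often than acquired");
        }
        if (--connection.refCount == 0) {
            connection.closed = true;
            CONNECTIONS.remove(connection.key);
        }
    }

    /** Stand-in for the tracker: it acquires the connection itself, so it also releases it. */
    static final class Tracker {
        private final Connection connection;

        Tracker(String key) {
            this.connection = acquire(key);
        }

        Connection connection() {
            return connection;
        }

        void stop() {
            release(connection);
        }
    }

    public static void main(String[] args) {
        // The server only creates and stops the tracker; it never touches the connection.
        Tracker tracker = new Tracker("cluster-a");
        tracker.stop();
        System.out.println(tracker.connection().isClosed());
    }
}
```

          With this shape, the component that calls acquire is the only one that calls release, which is the rule the comment applies to the RS and the CT.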

          • Karthick

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review569
          -----------------------------------------------------------


          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review591
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/client/HTable.java
          <https://reviews.apache.org/r/643/#comment1230>

          We should place deleteConnection() call in finally block because isTableEnabled() may throw IOException:

          public boolean isTableEnabled(byte[] tableName) throws IOException;

          Looks like we should use Connectable that Karthick proposed.
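A minimal sketch of the try/finally point above, using hypothetical stand-in classes rather than the real HBase client. `FakeConnectionManager`, `TableChecks`, `lookup` and the string-based "connection" are assumptions made for illustration; only the shape (acquire, work that may throw `IOException`, release in `finally`) comes from the review comment.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a connection manager; not the real HConnectionManager.
class FakeConnectionManager {
    private static final Map<String, Integer> OPEN = new HashMap<String, Integer>();

    static synchronized String getConnection(String clusterKey) {
        Integer n = OPEN.get(clusterKey);
        OPEN.put(clusterKey, n == null ? 1 : n + 1);
        return clusterKey;
    }

    static synchronized void deleteConnection(String clusterKey) {
        Integer n = OPEN.get(clusterKey);
        if (n == null) {
            return;
        }
        if (n <= 1) {
            OPEN.remove(clusterKey);
        } else {
            OPEN.put(clusterKey, n - 1);
        }
    }

    static synchronized int openCount(String clusterKey) {
        Integer n = OPEN.get(clusterKey);
        return n == null ? 0 : n;
    }
}

class TableChecks {
    // Simulated remote check that fails for an unknown table.
    static boolean lookup(String connection, String tableName) throws IOException {
        if (tableName.startsWith("missing")) {
            throw new IOException("table not found: " + tableName);
        }
        return true;
    }

    // The release sits in finally, so it runs on both the normal and the exceptional path.
    static boolean isTableEnabled(String clusterKey, String tableName) throws IOException {
        String connection = FakeConnectionManager.getConnection(clusterKey);
        try {
            return lookup(connection, tableName);
        } finally {
            FakeConnectionManager.deleteConnection(clusterKey);
        }
    }
}
```

With the release outside `finally`, the failing lookup would leave the count for that cluster key at one; here it returns to zero either way.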

          • Ted


          jiraposter@reviews.apache.org added a comment -

          On 2011-04-27 23:03:49, Ted Yu wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 261

          > <https://reviews.apache.org/r/643/diff/3/?file=16912#file16912line261>

          >

          > We should place deleteConnection() call in finally block because isTableEnabled() may throw IOException:

          >

          > public boolean isTableEnabled(byte[] tableName) throws IOException;

          >

          > Looks like we should use Connectable that Karthick proposed.

          Yes, the delete now happens inside of a finally block in the current version, which was just rebased with the trunk, and is currently undergoing testing.

          • Karthick

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review591
          -----------------------------------------------------------


          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/
          -----------------------------------------------------------

          (Updated 2011-04-28 00:13:52.902673)

          Review request for hbase and Ted Yu.

          Changes
          -------

          The V7 version of the patch makes the following additional changes:

          a) Adds a HCM#execute method for executing blocks that require short-lived connections.
          b) Removes the HCM#deleteConnection from the HMaster and HRegionServer classes, as they no longer directly get connections.
          c) Adds a connection field in the ServerManager class, which is obtained in its constructor and deleted when it's stopped.
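Item (a) describes an execute-style helper for blocks that need a short-lived connection. The following is only a sketch of that general pattern under stated assumptions: `ConnectionTask`, `ShortLivedConnections` and the string-based "connection" are hypothetical names, and this is not the patch's actual HCM#execute signature.

```java
import java.io.IOException;

// Hypothetical callback type: the block of work that needs a connection.
interface ConnectionTask<T> {
    T run(String connection) throws IOException;
}

// Hypothetical helper, not the patch's HCM#execute: acquire, run, always release.
class ShortLivedConnections {
    private static int open = 0; // guarded by the class lock

    private static synchronized String acquire(String clusterKey) {
        open++;
        return "conn:" + clusterKey;
    }

    private static synchronized void release() {
        open--;
    }

    static synchronized int openCount() {
        return open;
    }

    static <T> T execute(String clusterKey, ConnectionTask<T> task) throws IOException {
        String connection = acquire(clusterKey);
        try {
            return task.run(connection);
        } finally {
            release(); // runs whether the task returns or throws
        }
    }
}
```

Centralising the acquire/release pair this way means callers cannot forget the `finally`; they only supply the body of work.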

          All but two tests (viz., TestSplitLogWorker and TestCatalogJanitor) passed. FWIW, those two failures happen without the patch as well, and only if the "hbase.master.distributed.log.splitting" is true.

          PS: Just heard from Ted Yu that Stack checked in a patch for HBASE-1502, which I'll rebase with and test tomorrow. In the meantime, please review the three critical changes described above.

          Summary
          -------

          Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

          Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".

          This addresses bug HBASE-3777.
          https://issues.apache.org/jira/browse/HBASE-3777

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639
          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 70affa0
          src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1
          src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333
          src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb
          src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 05d9395
          src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa
          src/main/java/org/apache/hadoop/hbase/master/HMaster.java 250a8cf
          src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 04befe9
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11
          src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4
          src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e
          src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3
          src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2
          src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8
          src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 915cdf6
          src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb

          Diff: https://reviews.apache.org/r/643/diff

          Testing
          -------

          mvn test

          Thanks,

          Karthick

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review593
          -----------------------------------------------------------

          Throughout the patch, there are many cases where returning a connection to the pool needs to happen in a finally {} clause, to avoid leaks when exceptions are thrown.

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/643/#comment1232>

          hashcode -> TableServers is not quite right. It's not a map of hashcode - that implies that two confs that happened to hash to the same code would share an HConnectionImpl, which isn't true.
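To illustrate the distinction, here is a minimal sketch of a map keyed by a value-compared object rather than by a bare hash code. It is not code from the patch: `ConnectionKeySketch`, `Key`, and `lruMap` are hypothetical names, a plain `Map<String, String>` stands in for a Configuration, and which properties belong in the key is an assumption left to the caller.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class ConnectionKeySketch {
    /** Hypothetical key: two keys are equal when the selected properties are equal. */
    static final class Key {
        private final Map<String, String> props;

        Key(Map<String, String> conf, String... names) {
            // Snapshot only the properties that identify a connection.
            Map<String, String> copy = new HashMap<String, String>();
            for (String name : names) {
                copy.put(name, conf.get(name));
            }
            this.props = copy;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && props.equals(((Key) o).props);
        }

        @Override
        public int hashCode() {
            // The hash only picks a bucket; equals() decides whether keys match.
            return props.hashCode();
        }
    }

    /** Access-ordered LinkedHashMap that evicts the least recently used entry. */
    static <V> Map<Key, V> lruMap(final int maxEntries) {
        return new LinkedHashMap<Key, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Key, V> eldest) {
                return size() > maxEntries;
            }
        };
    }
}
```

With this shape, two configurations that merely collide on hashCode() still get separate entries, while a configuration and its copy resolve to the same entry as long as the selected properties match.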

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/643/#comment1233>

          This increment has to be inside the synchronized (HBASE_INSTANCES) or else there's a race against deleteConnection.

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/643/#comment1234>

          This AtomicInteger is never used atomically (it's always accessed under a lock), so it could just be an int.

          src/main/java/org/apache/hadoop/hbase/client/HTable.java
          <https://reviews.apache.org/r/643/#comment1235>

          these types of functions should be in try...finally. Otherwise getRegionCachePrefetch("table that does not exist") would leak a connection.

          Ideally we would have an implementation of HConnection called HConnectionRef which implements Closeable, so it would be the standard "get an object, use it, close it" type pattern. Calling deleteConnection just feels wrong.
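A rough sketch of the "get an object, use it, close it" pattern, combined with the earlier points about the lock and the plain int counter. Nothing here is taken from the patch: `SharedConnections`, `Ref`, `acquire`, and `lookup` are hypothetical names, and a String stands in for whatever key identifies a connection.

```java
import java.io.Closeable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a registry of shared, reference-counted connections.
final class SharedConnections {
    // This lock guards both the map and every refCount field.
    private static final Map<String, Entry> INSTANCES = new HashMap<String, Entry>();

    private static final class Entry {
        int refCount; // plain int: only touched while holding the INSTANCES lock
    }

    /** Handle given to callers; closing it gives back exactly one reference. */
    static final class Ref implements Closeable {
        private final String key;
        private boolean closed;

        private Ref(String key) {
            this.key = key;
        }

        @Override
        public void close() {
            synchronized (INSTANCES) {
                if (closed) {
                    return; // a second close() is a no-op
                }
                closed = true;
                Entry e = INSTANCES.get(key);
                if (e != null && --e.refCount == 0) {
                    INSTANCES.remove(key);
                }
            }
        }
    }

    static Ref acquire(String key) {
        synchronized (INSTANCES) {
            Entry e = INSTANCES.get(key);
            if (e == null) {
                e = new Entry();
                INSTANCES.put(key, e);
            }
            e.refCount++; // incremented under the same lock that close() takes
            return new Ref(key);
        }
    }

    static int openCount() {
        synchronized (INSTANCES) {
            return INSTANCES.size();
        }
    }

    /** Caller pattern: release in finally so an exception cannot leak the reference. */
    static boolean lookup(String key, boolean fail) {
        Ref ref = acquire(key);
        try {
            if (fail) {
                throw new IllegalStateException("table does not exist");
            }
            return true;
        } finally {
            ref.close();
        }
    }
}
```

Because the increment in acquire and the decrement in close happen under one lock, a release cannot remove an entry between another caller's lookup and its increment.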

          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java
          <https://reviews.apache.org/r/643/#comment1236>

          again try..finally

          • Todd

          On 2011-04-28 00:13:52, Karthick Sankarachary wrote:

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/
          -----------------------------------------------------------

          (Updated 2011-04-28 00:13:52)

          Review request for hbase and Ted Yu.

          Summary
          -------

          Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

          Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".

          This addresses bug HBASE-3777.
          https://issues.apache.org/jira/browse/HBASE-3777

          Diffs
          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639
          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 70affa0
          src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1
          src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333
          src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb
          src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 05d9395
          src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa
          src/main/java/org/apache/hadoop/hbase/master/HMaster.java 250a8cf
          src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 04befe9
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11
          src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4
          src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e
          src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3
          src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2
          src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8
          src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 915cdf6
          src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb

          Diff: https://reviews.apache.org/r/643/diff

          Testing
          -------

          mvn test

          Thanks,

          Karthick

          jiraposter@reviews.apache.org added a comment -

          On 2011-04-28 00:21:00, Todd Lipcon wrote:
          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 130
          > <https://reviews.apache.org/r/643/diff/3/?file=16911#file16911line130>
          >
          > hashcode -> TableServers is not quite right. It's not a map of hashcode - that implies that two confs that happened to hash to the same code would share an HConnectionImpl, which isn't true.

          Changed the comment to read "A LRU Map of HConnectionKey -> HConnection (TableServer)".

          On 2011-04-28 00:21:00, Todd Lipcon wrote:
          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 179
          > <https://reviews.apache.org/r/643/diff/3/?file=16911#file16911line179>
          >
          > This increment has to be inside the synchronized (HBASE_INSTANCES) or else there's a race against deleteConnection.

          I believe the increment is inside the synchronized block.

          On 2011-04-28 00:21:00, Todd Lipcon wrote:
          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 411
          > <https://reviews.apache.org/r/643/diff/3/?file=16911#file16911line411>
          >
          > this atomicinteger isn't ever used atomically (it's always under a lock) so it could just be an int.

          Will make it an int.

          On 2011-04-28 00:21:00, Todd Lipcon wrote:
          > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1361
          > <https://reviews.apache.org/r/643/diff/3/?file=16912#file16912line1361>
          >
          > these types of functions should be in try...finally. Otherwise getRegionCachePrefetch("table that does not exist") would leak a connection.
          >
          > Ideally we would have an implementation of HConnection called HConnectionRef which implements Closeable, so it would be the standard "get an object, use it, close it" type pattern. Calling deleteConnection just feels wrong.

          I'll make sure the connection is cleaned up inside a finally block across the board.

          As far as the closeable is concerned, that was indeed implemented in an earlier version, but I went back to using deleteConnection based on your comment about having an IOUtils.cleanUp method, which I think I must've misunderstood. In any case, I can add the closeable back. Finally (no pun intended), I'll make sure that a failure in close does not mask an exception thrown in the try block.

          • Karthick

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review593
          -----------------------------------------------------------

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review596
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
          <https://reviews.apache.org/r/643/#comment1241>

          try-with-resources in JDK 7 would be useful in our case:
          http://hg.openjdk.java.net/jdk7/tl/jdk/rev/6e33b377aa6e
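For reference, a small sketch of what the JDK 7 construct looks like with a Closeable resource. `TryWithResourcesSketch` and `TrackedConnection` are hypothetical names used only for illustration; they are not HBase classes.

```java
import java.io.Closeable;
import java.io.IOException;

final class TryWithResourcesSketch {
    /** Hypothetical resource that records whether it has been closed. */
    static final class TrackedConnection implements Closeable {
        boolean closed;

        String locate(String table) throws IOException {
            if (table.isEmpty()) {
                throw new IOException("no such table");
            }
            return "region-of-" + table;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    static String locate(TrackedConnection conn, String table) throws IOException {
        // close() runs when the block exits, whether normally or by exception.
        try (TrackedConnection c = conn) {
            return c.locate(table);
        }
    }
}
```

If both the body and close() throw, the exception from the body propagates and the one from close() is attached to it as a suppressed exception, so a close failure does not mask the original error.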

          • Ted

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/
          -----------------------------------------------------------

          (Updated 2011-04-28 22:01:18.165256)

          Review request for hbase and Ted Yu.

          Changes
          -------

          The V8 version of the diff addresses Todd's concerns about leaks in the event of exceptions. In short, it wraps every (method-level) block that accesses the connection in a call to the HCM#execute method, which takes care of acquiring and closing the connection. Specifically, an exception thrown by close is swallowed if (and only if) the block itself throws one. There were no regressions, AFAIK, although the TestHRegionLocation and TestCatalogTracker tests did fail.
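
The acquire/run/close behavior described above can be sketched roughly as follows. This is only an illustration of the pattern, not the code in the patch: `ConnectionTask`, `ConnectionRunner`, and the `succeeded` flag are hypothetical names, and the real HCM#execute signature may well differ.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical callback type; the name and shape are assumptions, not the patch's API. */
interface ConnectionTask<C, T> {
    T run(C connection) throws IOException;
}

/** Hypothetical helper that runs a task and always closes the connection afterwards. */
final class ConnectionRunner {
    private ConnectionRunner() {}

    static <C extends Closeable, T> T execute(C connection, ConnectionTask<C, T> task)
            throws IOException {
        boolean succeeded = false;
        try {
            T result = task.run(connection);
            succeeded = true;
            return result;
        } finally {
            try {
                connection.close();
            } catch (IOException closeFailure) {
                if (succeeded) {
                    // The task itself worked, so the close failure is the only error to report.
                    throw closeFailure;
                }
                // The task already threw; log and swallow so the original exception propagates.
                System.err.println("Suppressing close() failure: " + closeFailure);
            }
        }
    }
}
```

With this shape, a caller never touches close directly, so an exception in the block cannot leak the connection, and a secondary failure in close cannot mask the exception that caused the problem.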

          Summary
          -------

          Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the one-to-one mapping between a configuration and a connection instance works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

          Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. If one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".

          This addresses bug HBASE-3777.
          https://issues.apache.org/jira/browse/HBASE-3777

          Diffs (updated)
          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java 0911375
          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java feed777
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java fa7448b
          src/main/java/org/apache/hadoop/hbase/client/HConnection.java 1beedaf
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ded51c8
          src/main/java/org/apache/hadoop/hbase/client/HTable.java 46bac9f
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 26d0b31
          src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333
          src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb
          src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 05d9395
          src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa
          src/main/java/org/apache/hadoop/hbase/master/HMaster.java 58c9153
          src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 64c14df
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d211b53
          src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java 133da33
          src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java fc71f03
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 1c1d94b
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java 39f3af2
          src/main/java/org/apache/hadoop/hbase/util/HMerge.java c447287
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java a4def94
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4
          src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java 75613b8
          src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java 126d9af
          src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 18e647e
          src/test/java/org/apache/hadoop/hbase/client/TestHCM.java 5d71d75
          src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8
          src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 752e12b
          src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb

          Diff: https://reviews.apache.org/r/643/diff

          Testing
          -------

          mvn test

          Thanks,

          Karthick

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review607
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/643/#comment1250>

          A log statement should be added for the case where connectSucceeded is false.

          - Ted


          jiraposter@reviews.apache.org added a comment -

          On 2011-04-28 22:14:13, Ted Yu wrote:
          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 338
          > <https://reviews.apache.org/r/643/diff/5/?file=17633#file17633line338>
          >
          > A log statement should be added for the case where connectSucceeded is false.

          Will do. Note that the only reason the close method declares an IOException is that Closeable says so. In practice, HConnectionImplementation#close() would not throw one, AFAIK.
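
As a side note on the Closeable point, Java lets an implementation drop the checked exception from its own close signature, in which case only callers that hold the object through the interface type must handle IOException. The sketch below uses hypothetical classes (`QuietConnection`, `CloseDemo`) purely for illustration and says nothing about how HConnectionImplementation actually declares close.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical connection whose close() cannot fail. */
final class QuietConnection implements Closeable {
    private boolean closed;

    // An overriding method may declare fewer checked exceptions than the interface method.
    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

final class CloseDemo {
    private CloseDemo() {}

    /** Through the concrete type, no try/catch or throws clause is needed. */
    static boolean closeViaConcreteType() {
        QuietConnection connection = new QuietConnection();
        connection.close();
        return connection.isClosed();
    }

    /** Through the Closeable type, the compiler still requires IOException handling. */
    static boolean closeViaInterface() throws IOException {
        QuietConnection connection = new QuietConnection();
        Closeable asCloseable = connection;
        asCloseable.close();
        return connection.isClosed();
    }
}
```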

          - Karthick


          M. C. Srivas added a comment -

          It seems error-prone to compare confs to identify clusters.

          The mapping should really be "cluster uuid" (if such a thing exists) to connection. Perhaps there's a hmaster md5 that can be used in lieu of cluster-uuid sitting in ZK that can be probed?

          An alternative is to go ahead and make the extra connection and use it to determine which cluster the client is going against. If it's a previously-seen cluster, close this newly-created connection and use the stashed one. Otherwise, this is a new cluster, so create a new mapping entry.
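
The probe-then-reuse alternative could look something like the sketch below. It assumes a cluster identifier can be read from a freshly opened connection, which the comment itself flags as uncertain; `ProbeConnection`, `clusterId()`, and `ClusterConnectionCache` are hypothetical names and not HBase APIs.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical view of a connection that can report which cluster it reached. */
interface ProbeConnection {
    String clusterId(); // assumption: some stable cluster identifier is available
    void close();
}

/** Keeps one stashed connection per cluster identifier. */
final class ClusterConnectionCache<C extends ProbeConnection> {
    private final Map<String, C> byCluster = new HashMap<>();

    /**
     * Takes a freshly opened connection and returns the one to use:
     * the stashed connection if the cluster was seen before, else the fresh one.
     */
    synchronized C resolve(C fresh) {
        C stashed = byCluster.get(fresh.clusterId());
        if (stashed != null) {
            fresh.close(); // previously-seen cluster: discard the probe connection
            return stashed;
        }
        byCluster.put(fresh.clusterId(), fresh); // new cluster: create a mapping entry
        return fresh;
    }
}
```

Note that this approach still pays for one extra connection per lookup, and it keys purely on the cluster, not on any client-side settings.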

          Karthick Sankarachary added a comment -

          > The mapping should really be "cluster uuid" (if such a thing exists) to connection. Perhaps there's a hmaster md5 that can be used in lieu of cluster-uuid sitting in ZK that can be probed?

          The thing is that an HConnection's behavior is determined not just by the server-side cluster it goes against, but also by its client-side properties, such as "hbase.client.retries.number", "hbase.client.prefetch.limit", and so on. Ergo, we really need a different connection for every unique set of connection-specific config properties, whether they are client- or server-specific.

          > Perhaps there's a hmaster md5 that can be used in lieu of cluster-uuid sitting in ZK that can be probed?

          As per the ZK/HBase use cases wiki, in theory we can have multiple masters registered with the ZK (to eliminate any SPOFs perhaps?). So, I'm not sure we can presuppose which hmaster we'll be going to at any given point in time.

          > An alternative is to go ahead and make the extra connection and use it to determine which cluster the client is going against. If it's a previously-seen cluster, close this newly-created connection and use the stashed one. Otherwise, this is a new cluster, so create a new mapping entry.

          The whole purpose of this patch was to reduce the number of connections by reusing them to the extent possible. At one point, the config's equals method was treated as the key to the connection, which promoted reuse to some extent, but started breaking down if the config was changed after the fact. Currently, the config's identity (object reference) is treated as the key, but that suffers from connection overload. Hopefully, the HConnectionKey defined in the HCM will serve as a happy medium between the two ends of the spectrum.
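
To make the "happy medium" concrete, here is a rough sketch of a key that snapshots only connection-relevant properties. It is not the HConnectionKey from the patch: `ConnectionKey` is a hypothetical class, a plain `Map` stands in for the Hadoop Configuration, and the property list is an illustrative subset chosen for this example.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical key: equal when the connection-relevant properties are equal. */
final class ConnectionKey {
    // Illustrative subset only; the set of properties that really matters is an assumption.
    static final String[] CONNECTION_PROPERTIES = {
        "hbase.zookeeper.quorum",
        "hbase.client.retries.number",
        "hbase.client.prefetch.limit",
    };

    // Values are copied at construction time, so later changes to the
    // configuration cannot alter a key that is already in use.
    private final Map<String, String> snapshot = new TreeMap<>();

    ConnectionKey(Map<String, String> conf) {
        for (String name : CONNECTION_PROPERTIES) {
            String value = conf.get(name);
            if (value != null) {
                snapshot.put(name, value);
            }
        }
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ConnectionKey
            && snapshot.equals(((ConnectionKey) other).snapshot);
    }

    @Override
    public int hashCode() {
        return snapshot.hashCode();
    }
}
```

Compared with keying on the configuration's equals method, the snapshot keeps the key stable when the config is changed after the fact; compared with keying on object identity, a cloned config with the same connection settings maps to the same key, so the connection can be shared.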

          M. C. Srivas added a comment -

          The thing is that a HConnection's behavior is determined not just by the server-side cluster it goes against, but also its client-side properties, such as "hbase.client.retries.number", "hbase.client.prefetch.limit", and so on. Ergo, we really need a different connection for every unique set of connection-specific config properties, whether it be client- or server-specific.

          I am beginning to understand the reasons behind taking this approach. Thanks for explaining.

          As per the ZK/HBase use cases wiki, in theory we can have multiple masters registered with the ZK (to eliminate any SPOFs perhaps?). So, I'm not sure we can presuppose what hmaster we'll be going to at any given point in time.

          Even in the presence of multiple hmasters, does it really matter if we connect back to the same hmaster? It probably is important for the hmasters themselves which hmaster they connect to (and perhaps for region-servers as well). But it should not matter for clients. Agree? (Of course, I am stating all this without knowing any details about HBase, so don't kill me for it.)

          The whole purpose of this patch was to reduce the number of connections by reusing them to the extent possible. At one point, the config's equals method was treated as the key to the connection, which promoted reuse to some extent, but started breaking down if the config was changed after the fact. Currently, the config's identity (object reference) is treated as the key, but that suffers from connection overload. Hopefully, the HConnectionKey defined in the HCM will serve as a happy medium between the two ends of the spectrum.

          Ted Yu pointed out the work being done here, so I started reading the JIRA. I am not familiar with where/how the HConnection instance gets used, and this JIRA was pretty long to understand with the code changes and all.

          I started to comment on this JIRA due to the problems we faced trying to scale up the YCSB benchmark. We tried to run about 500 threads in the YCSB HBase client, and ran out of connections to ZK. It was a complete, unexpected surprise that the HBase client needed to maintain multiple connections to ZK, and it seemed to be using one per thread (i.e., per HTable).

          We share the same goal: with this patch, we hope to be able to scale YCSB to 50 client machines, with 500 threads per client, and see how HBase holds up.

          Would you agree that, in the long run, the HBase client should use ZK only to find the hmaster and region-servers, but not keep the connection to ZK open? Otherwise ZK may go under as we try to scale the number of HBase clients.

          stack added a comment -

          Just FYI, we have cluster uuid as of HBASE-3677.

          stack added a comment -

          I took a look at the posted patch. I'm thinking we should commit it as is (Any objections? I can address Ted Yu's last comment on commit). Unfortunately, it won't do for 0.90.x since it's now polluted with TRUNKisms – i.e. ServerName – but that's probably ok since this is a big change. Let me try this patch out on a cluster in the meantime to make sure it basically works.

          Karthick Sankarachary added a comment -

          I took a look at the posted patch. I'm thinking we should commit it as is (Any objections? I can address Ted Yu's last comment on commit)

          Just an update: I ran the test today after rebasing it (yet again), and this time there were no failures, period. I'll update the patch on the review board, so you don't have to rebase it.

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/
          -----------------------------------------------------------

          (Updated 2011-05-02 20:59:23.844076)

          Review request for hbase and Ted Yu.

          Changes
          -------

          I ran the test today after rebasing it (yet again), and this time there were no failures, period.

          Summary
          -------

          Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

          Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".

          This addresses bug HBASE-3777.
          https://issues.apache.org/jira/browse/HBASE-3777

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/HConstants.java 0911375
          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java feed777
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java fa7448b
          src/main/java/org/apache/hadoop/hbase/client/HConnection.java 1beedaf
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ded51c8
          src/main/java/org/apache/hadoop/hbase/client/HTable.java 46bac9f
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 26d0b31
          src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333
          src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb
          src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 05d9395
          src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa
          src/main/java/org/apache/hadoop/hbase/master/HMaster.java f526411
          src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 834c456
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 421f275
          src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java 133da33
          src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java fc71f03
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 1c1d94b
          src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java 39f3af2
          src/main/java/org/apache/hadoop/hbase/util/HMerge.java c447287
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java a4def94
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4
          src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java 75613b8
          src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java ae333bb
          src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 18e647e
          src/test/java/org/apache/hadoop/hbase/client/TestHCM.java 5d71d75
          src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8
          src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 752e12b
          src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb

          Diff: https://reviews.apache.org/r/643/diff

          Testing
          -------

          mvn test

          Thanks,

          Karthick

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/
          -----------------------------------------------------------

          (Updated 2011-05-02 21:34:35.203784)

          Review request for hbase and Ted Yu.

          Changes
          -------

          As Ted suggested, added "a log statement for the case where connectSucceeded is false."

          Diff: https://reviews.apache.org/r/643/diff

          Testing
          -------

          mvn test

          Thanks,

          Karthick

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/643/#review632
          -----------------------------------------------------------

          Ship it!

          I think a patch for 0.90 should be produced separately.
          We have informed HBase users of this change. They would expect to benefit from it in 0.90.

          • Ted

          Show
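          The "deep-compare" idea from the issue description can be sketched in plain Java. This is a minimal illustration, not the committed patch: the class names (SharedConnections, ConfigKey, Connection), the choice of identity properties, and the marker property "example.client.instance.id" are all assumptions made for the example, and a plain Map stands in for the Configuration object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

class SharedConnections {
    // Assumption: these properties decide which cluster a client talks to.
    static final String[] IDENTITY_PROPERTIES = {
        "hbase.zookeeper.quorum",
        "hbase.zookeeper.property.clientPort",
        "zookeeper.znode.parent"
    };

    // Hypothetical marker property: setting it makes a configuration
    // "uniquely identifiable" so it gets a connection of its own.
    static final String INSTANCE_ID = "example.client.instance.id";

    /** Key that compares configurations by value rather than by reference. */
    static final class ConfigKey {
        private final Map<String, String> values = new TreeMap<>();

        ConfigKey(Map<String, String> conf) {
            for (String name : IDENTITY_PROPERTIES) {
                copy(conf, name);
            }
            copy(conf, INSTANCE_ID);
        }

        private void copy(Map<String, String> conf, String name) {
            String value = conf.get(name);
            if (value != null) {
                values.put(name, value);
            }
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof ConfigKey && values.equals(((ConfigKey) o).values);
        }

        @Override
        public int hashCode() {
            return values.hashCode();
        }
    }

    /** Stand-in for a real connection object. */
    static final class Connection { }

    private static final Map<ConfigKey, Connection> CONNECTIONS = new ConcurrentHashMap<>();

    /** Returns the shared connection for all configurations with an equal key. */
    static Connection getConnection(Map<String, String> conf) {
        return CONNECTIONS.computeIfAbsent(new ConfigKey(conf), key -> new Connection());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hbase.zookeeper.quorum", "zk1.example.com");
        Map<String, String> duplicate = new HashMap<>(conf);

        // A configuration and its copy map to the same connection.
        System.out.println(getConnection(conf) == getConnection(duplicate)); // true

        // Marking the copy as uniquely identifiable gives it its own connection.
        duplicate.put(INSTANCE_ID, "isolated-client");
        System.out.println(getConnection(conf) == getConnection(duplicate)); // false
    }
}
```

          Keying on a small, fixed set of properties (rather than on every property) means that copies which differ only in unrelated settings still share a connection; which properties belong in that set is a design decision this sketch does not settle.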
          stack added a comment -

          Committed to TRUNK after trying it with 500 YCSB clients (with the patch it comes up and runs, whereas pre-patch it fails).

          Thank you for your persistence Karthick (and to the reviewers).

          stack made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Resolution Fixed [ 1 ]
          Eugene Koontz made changes -
          Link This issue relates to HBASE-3861 [ HBASE-3861 ]
          Hudson added a comment -

          Integrated in HBase-TRUNK #1909 (See https://builds.apache.org/hudson/job/HBase-TRUNK/1909/)

          Hudson added a comment -

          Integrated in HBase-TRUNK #1951 (See https://builds.apache.org/hudson/job/HBase-TRUNK/1951/)
          HBASE-3592 Guava snuck back in as a dependency via hbase-3777

          Bright Fulton added a comment -

          Attached backport of fix to 0.90.4.

          Bright Fulton made changes -
          Attachment HBASE-3777-V8.0.90.4.backport.patch [ 12496883 ]
          Ted Yu added a comment -

          @Bright:
          Do all tests in 0.90 pass?

          I got the following when applying your patch:

          Hunk #11 succeeded at 1416 (offset 8 lines).
          1 out of 11 hunks FAILED -- saving rejects to file src/main/java/org/apache/hadoop/hbase/client/HTable.java.rej
          

          This is minor.

          Running test suite.

          Ted Yu added a comment -

          The test suite didn't get very far: TestLogRolling hangs.

          "main" prio=10 tid=0x0000000057197000 nid=0x3f12 waiting on condition [0x00000000406c8000]
             java.lang.Thread.State: TIMED_WAITING (sleeping)
                  at java.lang.Thread.sleep(Native Method)
                  at org.apache.hadoop.hbase.client.HBaseAdmin.deleteTable(HBaseAdmin.java:446)
                  at org.apache.hadoop.hbase.client.HBaseAdmin.deleteTable(HBaseAdmin.java:393)
                  at org.apache.hadoop.hbase.regionserver.wal.TestLogRolling.testLogRollOnPipelineRestart(TestLogRolling.java:415)
          
          Ted Yu made changes -
          Link This issue is depended upon by HBASE-4508 [ HBASE-4508 ]
          Transition                     Time In Source Status   Execution Times   Last Executer            Last Execution Date
          Open -> Patch Available        1m 34s                  1                 Karthick Sankarachary    13/Apr/11 21:40
          Patch Available -> Resolved    19d 8h 4m               1                 stack                    03/May/11 05:44

            People

            • Assignee:
              Karthick Sankarachary
              Reporter:
              Karthick Sankarachary
            • Votes:
              1
              Watchers:
              8
