HBase / HBASE-5573

Replace client ZooKeeper watchers by simple ZooKeeper reads

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.95.2
    • Fix Version/s: 0.95.0
    • Component/s: Client, Zookeeper
    • Labels:
      None

      Description

      Some code in the package needs to read data from ZK. This could be done with a simple read, but it is actually implemented with a watcher, which holds ZK resources.

      Fixing this could also be an opportunity to remove the need for the client to provide the master address and port.
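
      (For illustration: at the ZooKeeper API level, the difference is only whether the read registers a watch. A minimal sketch against the stock ZooKeeper client, assuming a connected ZooKeeper handle zk; the znode path is illustrative:)

          import org.apache.zookeeper.ZooKeeper;
          import org.apache.zookeeper.data.Stat;

          // Plain read: nothing stays registered on the server once the call returns.
          byte[] plain = zk.getData("/hbase/master", false, new Stat());

          // Watching read: leaves a one-shot watch that the ZK server must track
          // for the lifetime of the session.
          byte[] watched = zk.getData("/hbase/master", true, new Stat());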

      1. 5573.v1.patch
        37 kB
        Nicolas Liochon
      2. 5573.v2.patch
        46 kB
        Nicolas Liochon
      3. 5573.v4.patch
        86 kB
        Nicolas Liochon
      4. 5573.v6.patch
        15 kB
        Nicolas Liochon
      5. 5573.v7.patch
        16 kB
        Nicolas Liochon
      6. 5573.v8.patch
        16 kB
        Nicolas Liochon

        Issue Links

          Activity

          stack added a comment -

          Hurray!

          Nicolas Liochon added a comment -

          Patch to get a first feedback.

          Unfortunately, it's more a hack than anything else, because I'm trying to keep the existing code & interfaces rather than rewriting everything.

          Today HBase considers any ZK client as a client that will watch the values, and does not distinguish simple readers vs. watchers. To change this, I:

          • Split ZooKeeperWatcher into two classes: one ZooKeeperWatcher with the same responsibilities as today, and another, ZooKeeperHBaseNodes, that contains the HBase znode definitions. ZooKeeperWatcher extends ZooKeeperHBaseNodes.
          • In ZKUtils, changed the expected type from ZooKeeperWatcher to ZooKeeperHBaseNodes, depending on whether a watch is involved or not.

          That's not a hack yet. The issues are:

          • The client is supposed to wait if the root location znode is not yet created in ZK. I don't think the trunk implementation actually works, but it's done with a watcher. As we don't want a watcher, I changed it to a loop (see the sketch below).
          • As HConnectionImplementation now uses a simple connection and not a Watcher, the deprecated interface (that returns a ZooKeeperWatcher) cannot reuse the internal ZK connection, but must duplicate it.
          • In trunk, the current dependencies are:
          • RecoverableZooKeeper depends on (contains) ZooKeeper
          • ZooKeeper depends on (contains) ZooKeeperWatcher
          • ZooKeeperWatcher depends on (contains) RecoverableZooKeeper
          • ZKUtils depends on (uses) RecoverableZooKeeper and ZooKeeperWatcher

          That makes it difficult to reuse any part of the code without having a ZooKeeperWatcher. To make reuse possible, when a ZooKeeperHBaseNodes is used, the underlying ZooKeeperWatcher is actually null.
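
          (For reference, a watch-free wait for the root location can be a plain polling loop along these lines. A minimal sketch against the raw ZooKeeper API, with an illustrative path and timings; the actual patch differs:)

          byte[] data = null;
          long deadline = System.currentTimeMillis() + timeoutMs;
          while (data == null && System.currentTimeMillis() < deadline) {
            Stat stat = zk.exists("/hbase/root-region-server", false); // plain read, no watch
            if (stat != null) {
              data = zk.getData("/hbase/root-region-server", false, stat);
            } else {
              Thread.sleep(100); // back off and poll again instead of watching
            }
          }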

          I still have to do a lot of renaming if we go for this approach.

          I had some failures that could be unrelated, but I haven't looked at them yet:
          org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
          org.apache.hadoop.hbase.io.encoding.TestLoadAndSwitchEncodeOnDisk

          stack added a comment -

          ZooKeeperHBaseNodes => ReadOnlyZooKeeper?

          The client is supposed to wait if the root location znode is not yet created in ZK.

          Does it have to? We're talking of removing ROOT for 0.96? You say you converted it to a loop? In the loop it just reads zk looking for root to show up? That seems fine.

          As HConnectionImplementation now uses a simple connection and not a Watcher, the deprecated interface (that returns a ZooKeeperWatcher) cannot reuse the internal connection to ZK, but must be duplicated.

          This seems fine too. You mean that the simple connection and the watcher are distinct? That's ok. We only make the watcher if someone asks for it?

          The below is crazy:

          In trunk, the current dependencies are:
          RecoverableZooKeeper depends on (contains) ZooKeeper
          ZooKeeper depends on (contains) ZooKeeperWatcher
          ZooKeeperWatcher depends on (contains) RecoverableZooKeeper
          ZKUtils depends on (uses) RecoverableZooKeeper and ZooKeeperWatcher
          

          This is before your patch?

          ZooKeeper depends on (contains) ZooKeeperWatcher

          I don't understand? How does zk contain a zkw? That seems wonky.

          ZKUtils depends on (uses) RecoverableZooKeeper and ZooKeeperWatcher

          It should only use rzk? Not zkw.

          To make reuse possible, when a ZooKeeperHBaseNodes is used, the underlying ZooKeeperWatcher is actually null.

          Sorry. Why does ReadOnlyZooKeeper have to have a zkw? Because rzk has one? Can you fix that?

          Good stuff N.

          stack added a comment -

          Patch makes sense. No need for the copyright line above the license when you create the next version of the patch. ZKHBaseNodes needs javadoc.

          Should this be in ZKUtils and not in ROZK?

          +  // Used by ZKUtil:waitForZKConnectionIfAuthenticating to wait for SASL
          +  // negotiation to complete
          +  public CountDownLatch saslLatch = new CountDownLatch(1);
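
          (For context, the consumer side of such a latch is typically just a bounded await. A sketch only, since waitForZKConnectionIfAuthenticating's actual body isn't shown here:)

          // Hypothetical consumer of the latch; the timeout value is illustrative.
          if (!saslLatch.await(30, TimeUnit.SECONDS)) {
            throw new IOException("Timed out waiting for SASL negotiation to complete");
          }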
          

          I see. ROZK has a RZK. Can RZK have a ZooKeeper only?

          Looks good. Looks like hard work.

          Nicolas Liochon added a comment -

          For the ZK and ZKW dependency, prefixing the ZooKeeper API objects with zk and the HBase objects with hb:
          A zk.ZooKeeper can be created with a zk.Watcher as a parameter for the constructor. A hb.ZooKeeperWatcher implements the zk.Watcher interface. So we have:

          hb.ZooKeeperWatcher implements zk.ZooKeeper.
          hb.ZooKeeperWatcher contains hb.RecoverableZooKeeper
          hb.RecoverableZooKeeper contains zk.ZooKeeper
          zk.ZooKeeper contains (hb.ZooKeeperWatcher implements zk.ZooKeeper)

          loop done.

          It was like this before my patch. After my patch there are two cases:
          1) As above
          2) for ZooKeeperHBaseNodes, the watcher is actually "null", so it becomes:
          hb.ZooKeeperHBaseNodes contains hb.RecoverableZooKeeper
          hb.RecoverableZooKeeper contains zk.ZooKeeper
          zk.ZooKeeper contains null

          It allows sharing the code, but it makes things more complex.

          I see. ROZK has a RZK. Can RZK have a ZooKeeper only?

          Yes, but it would lead to some code duplication and we would lose the recoverable feature (or we would need to duplicate it as well).

          ZooKeeperHBaseNodes => ReadOnlyZooKeeper

          It's not really a read-only ZooKeeper: you can write with it.

          Looks good

          Ok, I am gonna finish it with this approach then.

          stack added a comment -

          A zk.ZooKeeper can be created with a zk.Watcher as a parameter for the constructor. A hb.ZooKeeperWatcher implements the zk.Watcher interface.

          Oh, so ZK takes a Watcher interface or instance, not necessarily a ZKW (but I suppose in essence it's the same thing). Is ZKW doing everything, not just watching? If so, would fixing this help?

          On

          hb.ZooKeeperWatcher implements zk.ZooKeeper.
          hb.ZooKeeperWatcher contains hb.RecoverableZooKeeper
          hb.RecoverableZooKeeper contains zk.ZooKeeper
          zk.ZooKeeper contains (hb.ZooKeeperWatcher implements zk.ZooKeeper)
          

          I see:

          ZKW implements ZK.Watcher (I don't see how it implements ZK)
          ZKW has a RZK

          This seems way broke Nicolas: "zk.ZooKeeper contains (hb.ZooKeeperWatcher implements zk.ZooKeeper)"

          Ok on ROZK not being a good name. HZK? That's kinda lame but generic enough for a base r/w zk'er? Or DumbZK or NoWatchZK. Or InAndOutZK (smile).

          Nicolas Liochon added a comment -

          This seems way broke Nicolas: "zk.ZooKeeper contains (hb.ZooKeeperWatcher implements zk.ZooKeeper)"

          I wanted to say that in the ZK API, you give a Watcher as a parameter to the ZooKeeper object. In HBase, this watcher is the ZooKeeperWatcher. And this ZooKeeperWatcher contains the RecoverableZooKeeper that contains the ZooKeeper object, so we have a loop.

          Is ZKW doing everything, not just Watching? If so, would fixing this help?

          Yes, it does everything. With the split there is now a new object for when we just want to read/write.

          For the name, let's go for NoWatchZK.

          I'm currently testing the patch. There is still an issue with stuff like ZKAssign.getData: it sets a watcher, but is it really needed?

          Nicolas Liochon added a comment -

          Oops, there is another big issue: you need a watcher if you want to be notified of session expiry... So isolating it is not possible. That breaks the patch, it seems.
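
          (For reference, session expiry is delivered as a watch event of type None, so some Watcher has to exist to observe it. A minimal sketch using the stock ZooKeeper API:)

          Watcher sessionOnly = new Watcher() {
            @Override
            public void process(WatchedEvent event) {
              if (event.getType() == Event.EventType.None
                  && event.getState() == Event.KeeperState.Expired) {
                // the session is gone; the ZooKeeper handle must be recreated
              }
            }
          };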

          Nicolas Liochon added a comment -

          v6. We keep the watcher structure, but there is no watch set on data during connection creation or during startup, plus some cleanup. Less ambitious, simpler, smaller.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12519310/5573.v6.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1242//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1242//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1242//console

          This message is automatically generated.

          Ted Yu added a comment -

          The failed test was due to OOME.
          Patch v6 looks good.

          +        }finally {
          +          zkw.close();
          

          Please insert a space between } and finally.

          Run 'arc lint' to see all formatting warnings.

          stack added a comment -

          This is radical in the ReplicationAdmin:

          +        System.exit(1);
          

          This is a client only? Maybe get the Abortable that this.connection is using? Would that make sense?

          Hmm... you do it in hbasefsck too.

          Why not add a create method to ZooKeeperWatcher that takes a name, conf, and Abortable? Or is that a ZKW Constructor altogether?

          Creating the ZKW each time is probably expensive, takes time? But it's ok in ReplicationAdmin and in HBaseFSCK I would say?

          In testing, do we want to rethrow what caused an abort? Perhaps rethrow as a RuntimeException?

          +        @Override public void abort(String why, Throwable e) {}
          

          N, can you explain more about what is going on here? How is it that we are not taking a Watcher when we are creating a ZKW? Because we don't call start? (If so, that'd be an 'elegant' solution)
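
          (To make the factory suggestion concrete, it might look like the sketch below. Purely hypothetical, assuming the ZooKeeperWatcher(conf, name, abortable) constructor and an Abortable that fails loudly instead of calling System.exit:)

          // Hypothetical helper, not in the codebase.
          public static ZooKeeperWatcher createClientZKW(Configuration conf, String name)
              throws IOException, ZooKeeperConnectionException {
            return new ZooKeeperWatcher(conf, name, new Abortable() {
              @Override
              public void abort(String why, Throwable e) {
                throw new RuntimeException(why, e); // rethrow rather than System.exit
              }
              @Override
              public boolean isAborted() {
                return false;
              }
            });
          }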

          Nicolas Liochon added a comment -

          System.exit(1);

          Actually it was already like that in hbasefsck; I replaced a tracker with a watcher that reads the data without watching, that's all.

          Why not add a create method to ZooKeeperWatcher that takes a name, conf, and Abortable? Or is that a ZKW Constructor altogether?

          Yes, the question is what to do when you're asked to abort. Here I reused the approach in hbasefsck, just exit.

          N, can you explain more about what is going on here. How is it that we are not taking a Watcher when we are creating a ZKW? Because we don't call start? (If so, that'd be 'elegant' solution)

          A ZKW is a watcher. When you create a ZKW, it creates a RecoverableZooKeeper with itself as a parameter. The pseudo code is:

          class RecoverableZooKeeper {
            ZooKeeper zk;
            RecoverableZooKeeper(Watcher w) { zk = new ZooKeeper(w); }
          }

          class ZooKeeperWatcher implements Watcher {
            RecoverableZooKeeper rz;
            // 'this' escapes the constructor here, closing the cycle
            ZooKeeperWatcher() { rz = new RecoverableZooKeeper(this); }
          }


          Using 'this' in a constructor is looking for problems, but it works in this case (remember, that's the existing code, not mine). Basically all these classes are very strongly coupled. When I tried to partially decouple them it exploded in my hands, because you need a watcher anyway to manage the session expiry stuff. I don't have a middle solution here: it's either a full rewrite, with a lot of fun keeping the existing interfaces for backward compatibility, or nothing.

          So in the final patch I've just done some cleanup: removed the last usage of getZooKeeperWatcher and the usage of any watcher. So there's no proof in the code, just that all the functions we use on the client side don't actually use a watcher. They still have a session on the ZK servers, so they are expensive. But thanks to #5399 the ZK session will be closed after 5 minutes. So if you have an architecture with clients coming and going, you will be able to increase the number of clients.

          Three last comments:

          • One of the design issues is that there are two APIs: you can directly use any of the ZKW, RZK, RK objects, or you can go through the static ZKUtils. Maybe the intermediate solutions lie around this area.
          • Even if the existing design should not be shown to innocent scholars, it's not that terrible, because it's small. I didn't really like my first patches because I was adding more classes and complexity without fixing the design.
          • In the long term, I think it actually makes sense to have a watcher in the client. It's not about the previous code: the previous code was not really using watchers; it was setting watchers without using them. The new code (after #5399 and #5573) does not use or set watchers. But when you have a fat client architecture like we have, it makes sense to share some global state information, and it scales better when the info is pushed vs. pulled. Having said that, there are many questions left: possible issues in how expensive it is with ZooKeeper today, maybe ZooKeeper is not really designed for this (it's not really global coordination work, as the clients would be readers only), and so on. FWIW, it seems that the current limit is around 10K sessions in ZK:

          Patrick Hunt / Nov 18, 2010; 8:57pm
          Re: number of clients/watchers

          fyi: I haven't heard of anyone running over 10k sessions. I've tried 20k before and had issues.
          [...]
          A session is represented by a "ZooKeeper" object. One session per object. So if you have 10 client hosts each creating it's own ZooKeeper instance you'll have 10 sessions. This is regardless of the number of znodes, watches, etc... Watches were designed to be lightweight and you can maintain a large number of them. (25million spread across 500 sessions in my example)

          There was also a discussion on the ZK mailing list about lightweight sessions. http://markmail.org/message/cyow2xkneh2t3juc

          stack added a comment -

          Ok on your just redoing what hbasefsck was doing anyways.

          Regarding the pseudo-code you drew out for me, where you show how RZK relates to ZKW relates to ZK: it makes my head hurt. If the 'Watcher' implementation was broken out into a standalone class that might help some, but my guess is that that'd be a big mess to untangle ("...exploded in my hands").

          So there's no proof in the code, just that actually all the functions we use on the client don't use a watcher.

          Excellent

          One of the design issues is that there are two APIs: you can directly use any of the ZKW, RZK, RK objects, or you can go through the static ZKUtils. Maybe the intermediate solutions lie around this area.

          OK. Any recommendation you can make here having been down deep in this code? We should make everyone go via ZKUtils and via ZKAssign, etc., and clean up any other errant use of zkw directly? Would that be good to do?

          But when you have a fat client architecture like we have, it makes sense to share some global state information, and it scales better when the info is pushed vs. pulled.

          I'd like to get to the case where we have no watchers – which this patch is finishing – and then have the above discussion subsequently. I'd think it's more scalable if clients do not keep open sessions and watchers. But we can talk about that some other time, after we've at least let go of watchers.

          Let me look at your last patch.

          stack added a comment -

          Patch looks good. Only question is the one I had yesterday: in HBaseTestingUtility#getZooKeeperWatcher, if it's aborted, it does nothing, but this ZKW is being used by test code, so I'd think an abort shouldn't be suppressed – rather, we should complain loudly? Rethrow as a RuntimeException?

          Do you want to be consistent? You call methods getZKW most times and then getZooKeeperWatcher in this test code (I prefer the latter).

          Nicolas Liochon added a comment -

          OK. Any recommendation you can make here having been down deep in this code? We should make everyone go via ZKUtils and via ZKAssign, etc., and clean up any other errant use of zkw directly? Would that be good to do?

          It would do no harm, as it's not good to have two APIs. It could be a first step to changing the internal design. I haven't checked the impact.

          Do you want to be consistent? You call methods getZKW most times and then getZooKeeperWatcher in this test code (I prefer the latter).

          Ok, I will change all this to getZooKeeperWatcher.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12520262/5573.v7.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1327//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12520266/5573.v8.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 12 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.coprocessor.TestMasterObserver
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1328//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1328//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1328//console

          This message is automatically generated.

          Nicolas Liochon added a comment -

          It can be committed imho.

          Ted Yu added a comment -

          I ran TestMasterObserver with patch v8 and didn't see failure.

          Integrated to trunk.

          Thanks for the patch, N.

          Thanks for the review, Stack.

          Hudson added a comment -

          Integrated in HBase-TRUNK #2699 (See https://builds.apache.org/job/HBase-TRUNK/2699/)
          HBASE-5573 Replace client ZooKeeper watchers by simple ZooKeeper reads (N Keywal) (Revision 1307549)

          Result = FAILURE
          tedyu :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java
          Hudson added a comment -

          Integrated in HBase-TRUNK-security #155 (See https://builds.apache.org/job/HBase-TRUNK-security/155/)
          HBASE-5573 Replace client ZooKeeper watchers by simple ZooKeeper reads (N Keywal) (Revision 1307549)

          Result = SUCCESS
          tedyu :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java
          stack added a comment -

          Marking closed.

          stack added a comment -

          Nicolas Liochon So we didn't completely undo watching in the client? What should the release note be here? Thanks.

          Mikhail Antonov added a comment -

          Linking, as there was a discussion on that CatalogTracker issue.


            People

            • Assignee:
              Nicolas Liochon
              Reporter:
              Nicolas Liochon
            • Votes:
              0
              Watchers:
              6
