HBase
  1. HBase
  2. HBASE-5790

ZKUtil deleteRecursively should be a recoverable operation

    Details

    • Type: Improvement Improvement
    • Status: Patch Available
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Unscheduling from 0.94.x, because ZK 3.4.x requirement.

      Description

      As of 3.4.3 Zookeeper now has full, multi-operation transaction. This means we can wholesale delete chunks of the zk tree and ensure that we don't have any pesky recursive delete issues where we delete the children of a node, but then a child joins before deletion of the parent. Even without transactions, this should be the behavior, but it is possible to make it much cleaner now that we have this new feature in zk.

      1. java_HBASE-5790.patch
        8 kB
        Jesse Yates
      2. java_HBASE-5790-v1.patch
        10 kB
        Jesse Yates

        Issue Links

          Activity

          Hide
          Jesse Yates added a comment -

          patch coming momentarily.

          Show
          Jesse Yates added a comment - patch coming momentarily.
          Hide
          Jesse Yates added a comment -

          Attaching patch and simple test case for recoverableZK. No need to update ZKUtil test since deleteRecurisive already covered.

          Wish there was a better way to test this out...tried doing a better version via Mockito, but seems to not be able to catch the actual method invocations and let me do extra work (though doAnswer should support it).

          Either way, did a hacked version where it dumps a node in middle while collecting children (setting a callable in the test that is called periodically in the delete), and seemed to work fine. Not recommending that we pursue the same course for general testing, but just putting this up here for peace of mind for all.

          Show
          Jesse Yates added a comment - Attaching patch and simple test case for recoverableZK. No need to update ZKUtil test since deleteRecurisive already covered. Wish there was a better way to test this out...tried doing a better version via Mockito, but seems to not be able to catch the actual method invocations and let me do extra work (though doAnswer should support it). Either way, did a hacked version where it dumps a node in middle while collecting children (setting a callable in the test that is called periodically in the delete), and seemed to work fine. Not recommending that we pursue the same course for general testing, but just putting this up here for peace of mind for all.
          Hide
          Ted Yu added a comment -
          +   * Recursively delete the path all children on that path.
          

          'the path all' -> 'all the'.

          +          retryOrThrow(retryCounter, e, "delete");
          

          "delete" should be "deleteRecursively".

          +  private void addAllChildrenToDeleteTransaction(Transaction trans, String path, int version)
          

          I don't see how version is used to filter child nodes in the above method.

          Show
          Ted Yu added a comment - + * Recursively delete the path all children on that path. 'the path all' -> 'all the'. + retryOrThrow(retryCounter, e, "delete" ); "delete" should be "deleteRecursively". + private void addAllChildrenToDeleteTransaction(Transaction trans, String path, int version) I don't see how version is used to filter child nodes in the above method.
          Hide
          Lars Hofhansl added a comment -

          Patch looks good.
          Will there be any request size problem if the subtree to delete is large?
          Super minor nit: Should point out in Javadoc that deleteRecursively and an idempotent operation.

          Show
          Lars Hofhansl added a comment - Patch looks good. Will there be any request size problem if the subtree to delete is large? Super minor nit: Should point out in Javadoc that deleteRecursively and an idempotent operation.
          Hide
          Jesse Yates added a comment -

          I don't see how version is used to filter child nodes in the above method.

          It works because of the line:

          trans.delete(path, version);
          

          in addAllChildrenToDeleteTransaction - only children with the same version will be deleted. Generally, we are just going to want to delete all versions (version == -1), so don't often expect funky cases, but this way also ensures that you just delete children with the same version as the parent (and fail if those versions don't match up). I didn't want to go with any assumptions on versions of children vs. parent since there was not need for it yet AFAIK.

          Will there be any request size problem if the subtree to delete is large?

          Not unless you are going to blow out the stack or memory. However, that seems incredibly unlikely. Seems a little excessive at this point to do the tail recursion optimization, but can switch to that if it seems a big issue.

          Show
          Jesse Yates added a comment - I don't see how version is used to filter child nodes in the above method. It works because of the line: trans.delete(path, version); in addAllChildrenToDeleteTransaction - only children with the same version will be deleted. Generally, we are just going to want to delete all versions (version == -1), so don't often expect funky cases, but this way also ensures that you just delete children with the same version as the parent (and fail if those versions don't match up). I didn't want to go with any assumptions on versions of children vs. parent since there was not need for it yet AFAIK. Will there be any request size problem if the subtree to delete is large? Not unless you are going to blow out the stack or memory. However, that seems incredibly unlikely. Seems a little excessive at this point to do the tail recursion optimization, but can switch to that if it seems a big issue.
          Hide
          Jesse Yates added a comment -

          Attaching new version addressing comments and now with more testing!

          Show
          Jesse Yates added a comment - Attaching new version addressing comments and now with more testing!
          Hide
          Lars Hofhansl added a comment -

          @Jesse: Was more referring to the request size that is shipped to ZK. Probably not a problem.

          Show
          Lars Hofhansl added a comment - @Jesse: Was more referring to the request size that is shipped to ZK. Probably not a problem.
          Hide
          Lars Hofhansl added a comment -

          +1 on V2

          Show
          Lars Hofhansl added a comment - +1 on V2
          Hide
          stack added a comment -

          This patch requires zk 3.4.x right but it doesn't check that version running before it goes and uses this new Transaction feature (I'm not sure if you even can ask zk its ensemble version from the client)? If a user puts 3.3.x under hbase, we'll hang doing this call?

          Show
          stack added a comment - This patch requires zk 3.4.x right but it doesn't check that version running before it goes and uses this new Transaction feature (I'm not sure if you even can ask zk its ensemble version from the client)? If a user puts 3.3.x under hbase, we'll hang doing this call?
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12522670/java_HBASE-5790-v1.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 7 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1559//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1559//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1559//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12522670/java_HBASE-5790-v1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 7 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1559//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1559//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1559//console This message is automatically generated.
          Hide
          Jesse Yates added a comment -

          @stack: in short, yes. I don't see an easy way to set zk version, though I guess we could add a config option if people want to run an old version (but this gets into some really strange corner cases where we expect transaction for newer features, but have to support non-transactional views; I'd rather just go wholesale into the new stuff).
          I didn't know that we have a requirement on supporting past versions of ZK. If we want to roll this into just 0.96 we can assume singularity and say people need to run at least zk-3.4.

          Show
          Jesse Yates added a comment - @stack: in short, yes. I don't see an easy way to set zk version, though I guess we could add a config option if people want to run an old version (but this gets into some really strange corner cases where we expect transaction for newer features, but have to support non-transactional views; I'd rather just go wholesale into the new stuff). I didn't know that we have a requirement on supporting past versions of ZK. If we want to roll this into just 0.96 we can assume singularity and say people need to run at least zk-3.4.
          Hide
          Jesse Yates added a comment -

          Looks like the tests pass on that QA run, but I guess its angry about the second stage (related to keywals's recent comments on dev?). Here is the output line from the console:

          Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12-TRUNK-HBASE-2:test (secondPartTestsExecution) on project hbase: Failure or timeout

          Is there an easy way to see which findbugs warnings were introduced? I figure since we are tracking them correctly, we should have a way to say what they are, but I couldn't see a way to do it besides checking against a good run and that is a huge pain...

          Show
          Jesse Yates added a comment - Looks like the tests pass on that QA run, but I guess its angry about the second stage (related to keywals's recent comments on dev?). Here is the output line from the console: Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12-TRUNK- HBASE-2 :test (secondPartTestsExecution) on project hbase: Failure or timeout Is there an easy way to see which findbugs warnings were introduced? I figure since we are tracking them correctly, we should have a way to say what they are, but I couldn't see a way to do it besides checking against a good run and that is a huge pain...
          Hide
          Jesse Yates added a comment -

          Adding link to transactionally createWithParents ticket that also uses the zookeeper.multi functionality.

          Show
          Jesse Yates added a comment - Adding link to transactionally createWithParents ticket that also uses the zookeeper.multi functionality.
          Hide
          Ted Yu added a comment -

          Our codebase uses NIOServerCnxnFactory which is not in zookeeper 3.3
          Meaning zookeeper 3.4 is required.

          Show
          Ted Yu added a comment - Our codebase uses NIOServerCnxnFactory which is not in zookeeper 3.3 Meaning zookeeper 3.4 is required.
          Hide
          stack added a comment -

          @Ted Its not a runtime requirement that client or ensemble be 3.4.x. 3.4.x client and ensemble is required if you run secure hbase else its not necessary and we should be wary requiring it; e.g. our ops didn't want to upgrade to 3.4.x ensemble just yet and so we run w/ a 3.4.x client against 3.3.x ensemble.

          @Jesse Sounds fine requiring 3.4.x in 0.96. Want to raise a conversation out on mailing list?

          Show
          stack added a comment - @Ted Its not a runtime requirement that client or ensemble be 3.4.x. 3.4.x client and ensemble is required if you run secure hbase else its not necessary and we should be wary requiring it; e.g. our ops didn't want to upgrade to 3.4.x ensemble just yet and so we run w/ a 3.4.x client against 3.3.x ensemble. @Jesse Sounds fine requiring 3.4.x in 0.96. Want to raise a conversation out on mailing list?
          Hide
          stack added a comment -

          Removing improvement that depends on 3.4zk.

          Our Greg added a flag where you can set it if the quorum is 3.4.

           12   <property>
           11     <name>hbase.zookeeper.useMulti</name>
           10     <value>false</value>
            9     <description>Instructs HBase to make use of ZooKeeper's multi-update functionality.
            8     This allows certain ZooKeeper operations to complete more quickly and prevents some issues
            7     with rare Replication failure scenarios (see the release note of HBASE-2611 for an example).
            6     IMPORTANT: only set this to true if all ZooKeeper servers in the cluster are on version 3.4+
            5     and will not be downgraded.  ZooKeeper versions before 3.4 do not support multi-update and will
            4     not fail gracefully if multi-update is invoked (see ZOOKEEPER-1495).
            3     </description>
            2   </property>
          

          If patch was refactored to use this config., I'd commit.

          Show
          stack added a comment - Removing improvement that depends on 3.4zk. Our Greg added a flag where you can set it if the quorum is 3.4. 12 <property> 11 <name>hbase.zookeeper.useMulti</name> 10 <value> false </value> 9 <description>Instructs HBase to make use of ZooKeeper's multi-update functionality. 8 This allows certain ZooKeeper operations to complete more quickly and prevents some issues 7 with rare Replication failure scenarios (see the release note of HBASE-2611 for an example). 6 IMPORTANT: only set this to true if all ZooKeeper servers in the cluster are on version 3.4+ 5 and will not be downgraded. ZooKeeper versions before 3.4 do not support multi-update and will 4 not fail gracefully if multi-update is invoked (see ZOOKEEPER-1495). 3 </description> 2 </property> If patch was refactored to use this config., I'd commit.

            People

            • Assignee:
              Jesse Yates
              Reporter:
              Jesse Yates
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:

                Development