Uploaded image for project: 'Accumulo'
  1. Accumulo
  2. ACCUMULO-842

Fate operations should be rolled into shell

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: shell
    • Labels:
      None

      Description

      Fate is nifty but the only ways to deal with it are by directly calling some utility jars. We should roll this functionality into the shell to make it easier to manage.

      1. FateCommand.patch
        17 kB
        Damon A. Brown
      2. FateCommand-2.patch
        21 kB
        Damon A. Brown
      3. FateCommand-3.patch
        21 kB
        Damon A. Brown
      4. FateCommand-4.patch
        21 kB
        Damon A. Brown

        Activity

        Hide
        medined David Medinets added a comment -

        Is there a way to describe this task more concretely so that others
        might attempt it? Can the work be phased? Why is this a bug? Can you
        estimate the difficulty level?

        Show
        medined David Medinets added a comment - Is there a way to describe this task more concretely so that others might attempt it? Can the work be phased? Why is this a bug? Can you estimate the difficulty level?
        Hide
        kturner Keith Turner added a comment -

        Is there a way to describe this task more concretely

        Currently you can work w/ FATE operations through the following command.

        ./bin/accumulo org.apache.accumulo.server.fate.Admin 
        Usage : Admin fail <txid> | delete <txid> | print
        

        If this were available through shell, it would be more convenient.

        Show
        kturner Keith Turner added a comment - Is there a way to describe this task more concretely Currently you can work w/ FATE operations through the following command. ./bin/accumulo org.apache.accumulo.server.fate.Admin Usage : Admin fail <txid> | delete <txid> | print If this were available through shell, it would be more convenient.
        Hide
        vines John Vines added a comment -

        Keith nailed it. It's basically tweaking that fate.Admin class to be a bit more multi-purpose and then writing a shell Command to utilize it.

        Show
        vines John Vines added a comment - Keith nailed it. It's basically tweaking that fate.Admin class to be a bit more multi-purpose and then writing a shell Command to utilize it.
        Hide
        dab5879 Damon A. Brown added a comment -

        Added patch containing Fate shell comands:
        txfail <txid>
        txdelete <txid>
        txprint

        fail|delete validate against the master lock and running transaction IDs. User can override master lock to change transaction state.

        Show
        dab5879 Damon A. Brown added a comment - Added patch containing Fate shell comands: txfail <txid> txdelete <txid> txprint fail|delete validate against the master lock and running transaction IDs. User can override master lock to change transaction state.
        Hide
        kturner Keith Turner added a comment -

        Damon,

        Thanks for the patch, I have a few comments.

        • I am not keen on having an option to run if the master lock is held. I think modifications to Fate ops should only run if the master lock is not held. If there is a buggy fate op thats stuck in an infinite loop, deleting it from the ZooStore will not necessarily kill the thread in the master. In the case a user would still need to kill the master. I would not want the user to get the impression that this shell command will clean up state in the master process, it will not. Also, could get strange behavior in the master from concurrent modification of zookeeper. I think always killing the master process is the safest option. Killing the master for a bit should not impact reads and writes to Accumulo, but it will stall table operations.
        • seems to duplicates some code in AdminUtil, consolidate?
        • TxFailCommand.prepFail() is a nice improvement, I think this could be moved to fate.AdminUtil and have shell call AdminUtil
        • TxDeleteCommand.prepDelete() should always delete, even its NEW or IN_PROGRESS. Consider a buggy fate op thats in progress and always gets stuck in an infinite loop when it tries to run. Maybe just use fate.AdminUtil.prepDelete().
        • consider using shellState.printLines(), it does pagination, in TxPrintCommand
        • javadoc for TxPrintCommand wrong
        • I think I am slightly in favor of having one shell command with options, less clutter in the shells list of commands. Was there a reason you chose to create three commands?
        • Did you use the Accumulo Eclipse formatter? The formatting seems mostly correct, but I am seeing some slight diffs like extra spaces in parametric types. Its odd.
        • May want to consider making the current Admin command log a warning saying its deperecated and point users to the shell commands.

        For testing, you can pick a random fate op and add an infinite loop to it

        Show
        kturner Keith Turner added a comment - Damon, Thanks for the patch, I have a few comments. I am not keen on having an option to run if the master lock is held. I think modifications to Fate ops should only run if the master lock is not held. If there is a buggy fate op thats stuck in an infinite loop, deleting it from the ZooStore will not necessarily kill the thread in the master. In the case a user would still need to kill the master. I would not want the user to get the impression that this shell command will clean up state in the master process, it will not. Also, could get strange behavior in the master from concurrent modification of zookeeper. I think always killing the master process is the safest option. Killing the master for a bit should not impact reads and writes to Accumulo, but it will stall table operations. seems to duplicates some code in AdminUtil, consolidate? TxFailCommand.prepFail() is a nice improvement, I think this could be moved to fate.AdminUtil and have shell call AdminUtil TxDeleteCommand.prepDelete() should always delete, even its NEW or IN_PROGRESS. Consider a buggy fate op thats in progress and always gets stuck in an infinite loop when it tries to run. Maybe just use fate.AdminUtil.prepDelete(). consider using shellState.printLines(), it does pagination, in TxPrintCommand javadoc for TxPrintCommand wrong I think I am slightly in favor of having one shell command with options, less clutter in the shells list of commands. Was there a reason you chose to create three commands? Did you use the Accumulo Eclipse formatter? The formatting seems mostly correct, but I am seeing some slight diffs like extra spaces in parametric types. Its odd. May want to consider making the current Admin command log a warning saying its deperecated and point users to the shell commands. For testing, you can pick a random fate op and add an infinite loop to it
        Hide
        dab5879 Damon A. Brown added a comment -

        The new patch incorporates your comments. I was confused about accumulo-server vs. accumulo-fate being available within accumulo-core.

        • Integrated and enhanced AdminUtil with the switch statement and to support running within the shell (no System.exit call)
        • Consolidated to one 'fate fail|delete|print' command within 'shell state commands' - not sure if this is the best place
        • Blocks fail|delete command on master lock
        • Fixed src code formatting
        • Pagination on 'fate print'
        • Added deprecation warning to Admin
        Show
        dab5879 Damon A. Brown added a comment - The new patch incorporates your comments. I was confused about accumulo-server vs. accumulo-fate being available within accumulo-core. Integrated and enhanced AdminUtil with the switch statement and to support running within the shell (no System.exit call) Consolidated to one 'fate fail|delete|print' command within 'shell state commands' - not sure if this is the best place Blocks fail|delete command on master lock Fixed src code formatting Pagination on 'fate print' Added deprecation warning to Admin
        Hide
        ecn Eric Newton added a comment -

        It would be good if you could run these commands without access to the site configuration file. That is, supply the secret, optionally, at the command prompt.

        Show
        ecn Eric Newton added a comment - It would be good if you could run these commands without access to the site configuration file. That is, supply the secret, optionally, at the command prompt.
        Hide
        kturner Keith Turner added a comment -

        Damon the latest patch looks good. A few more comments.

        • The patch introduced some java warnings, like unused imports. Does not bother me but introducing java warnings bugs some of the other committers.
        • The commands accept and ignore extra arguments. For example it will take "fate print foo" or "fate delete 6cf5ff841bcab678 0146bbf43749e62f".
        • I agree with the comment Eric made. Give the ability to to supply the secret as an option. If supplied, do not try to read local config file.

        I like just adding new patches to a ticket, rather than replacing patches. Like FateCommand-1.patch, FateCommand-2.patch, etc. I think it can makes following the discussion in the comments easier if someone is trying to correlate comments with a patch. But thats jsut my personal preference.

        Show
        kturner Keith Turner added a comment - Damon the latest patch looks good. A few more comments. The patch introduced some java warnings, like unused imports. Does not bother me but introducing java warnings bugs some of the other committers. The commands accept and ignore extra arguments. For example it will take "fate print foo" or "fate delete 6cf5ff841bcab678 0146bbf43749e62f". I agree with the comment Eric made. Give the ability to to supply the secret as an option. If supplied, do not try to read local config file. I like just adding new patches to a ticket, rather than replacing patches. Like FateCommand-1.patch, FateCommand-2.patch, etc. I think it can makes following the discussion in the comments easier if someone is trying to correlate comments with a patch. But thats jsut my personal preference.
        Hide
        dab5879 Damon A. Brown added a comment -

        Keith and Eric,

        Thanks for your input! I've incorporated your changes into FateCommand-2.patch.

        • Imports fixed - my Save actions don't appear to be working correctly...weird.
        • fail and delete both now take multiple transaction IDs
        • print can now filter on transaction ID and/or status type (-t arg)
        • Instance secret is specified via -s arg

        I hope this is getting closer to your vision. Thanks for letting me contribute!

        • Damon
        Show
        dab5879 Damon A. Brown added a comment - Keith and Eric, Thanks for your input! I've incorporated your changes into FateCommand-2.patch. Imports fixed - my Save actions don't appear to be working correctly...weird. fail and delete both now take multiple transaction IDs print can now filter on transaction ID and/or status type (-t arg) Instance secret is specified via -s arg I hope this is getting closer to your vision. Thanks for letting me contribute! Damon
        Hide
        kturner Keith Turner added a comment -

        I tried the latest patch. "fate print" never prints anything even when I know there are Fate operations.

        Show
        kturner Keith Turner added a comment - I tried the latest patch. "fate print" never prints anything even when I know there are Fate operations.
        Hide
        dab5879 Damon A. Brown added a comment -

        Off-by-one array error...fixed: args.length >= 2

        Show
        dab5879 Damon A. Brown added a comment - Off-by-one array error...fixed: args.length >= 2
        Hide
        kturner Keith Turner added a comment -

        Looking at patch #3, it seems the -t option does not work.

        root@test16> fate print
        txid: 5f5ca934ead74225  status: FAILED_IN_PROGRESS  op: CreateTable      locked: [W:6]           locking: []              top: FinishCreateTable
        txid: 46cb3e1491f1a4fe  status: IN_PROGRESS         op: CreateTable      locked: [W:4]           locking: []              top: FinishCreateTable
        txid: 63862690ab0a9b96  status: IN_PROGRESS         op: CreateTable      locked: [W:5]           locking: []              top: FinishCreateTable
         3 transactions
        root@test16> fate print -t IN_PROGRESS       
         0 transactions
        root@test16> fate print
        txid: 5f5ca934ead74225  status: FAILED_IN_PROGRESS  op: CreateTable      locked: [W:6]           locking: []              top: FinishCreateTable
        txid: 46cb3e1491f1a4fe  status: IN_PROGRESS         op: CreateTable      locked: [W:4]           locking: []              top: FinishCreateTable
        txid: 63862690ab0a9b96  status: IN_PROGRESS         op: CreateTable      locked: [W:5]           locking: []              top: FinishCreateTable
         3 transactions
        
        Show
        kturner Keith Turner added a comment - Looking at patch #3, it seems the -t option does not work. root@test16> fate print txid: 5f5ca934ead74225 status: FAILED_IN_PROGRESS op: CreateTable locked: [W:6] locking: [] top: FinishCreateTable txid: 46cb3e1491f1a4fe status: IN_PROGRESS op: CreateTable locked: [W:4] locking: [] top: FinishCreateTable txid: 63862690ab0a9b96 status: IN_PROGRESS op: CreateTable locked: [W:5] locking: [] top: FinishCreateTable 3 transactions root@test16> fate print -t IN_PROGRESS 0 transactions root@test16> fate print txid: 5f5ca934ead74225 status: FAILED_IN_PROGRESS op: CreateTable locked: [W:6] locking: [] top: FinishCreateTable txid: 46cb3e1491f1a4fe status: IN_PROGRESS op: CreateTable locked: [W:4] locking: [] top: FinishCreateTable txid: 63862690ab0a9b96 status: IN_PROGRESS op: CreateTable locked: [W:5] locking: [] top: FinishCreateTable 3 transactions
        Hide
        kturner Keith Turner added a comment -

        I just experimented with #4, it works great. I had not really looked at the code too closely since #2. I should have experimented and reviewed the code, instead of just experimenting. I noticed some code changes that would be nice. Instead of getting the zookeeper host and timeout from config, I think we should use the following two methods to get these

            shellState.getConnector().getInstance().getZooKeepers()
            shellState.getConnector().getInstance().getZooKeepersSessionTimeOut()
        

        If this is done, then I think the command will not depend on the config at all. The client code used to have a lot of dependencies on these config files, which are mainly intended for servers. We have been trying to remove these dependencies on the config files from client code.

        One other little thing, all of the shell commands do something like if(cl.hasOption(statusOption.getOpt())),where the code has if(cl.hasOption('t')).

        Show
        kturner Keith Turner added a comment - I just experimented with #4, it works great. I had not really looked at the code too closely since #2. I should have experimented and reviewed the code, instead of just experimenting. I noticed some code changes that would be nice. Instead of getting the zookeeper host and timeout from config, I think we should use the following two methods to get these shellState.getConnector().getInstance().getZooKeepers() shellState.getConnector().getInstance().getZooKeepersSessionTimeOut() If this is done, then I think the command will not depend on the config at all. The client code used to have a lot of dependencies on these config files, which are mainly intended for servers. We have been trying to remove these dependencies on the config files from client code. One other little thing, all of the shell commands do something like if(cl.hasOption(statusOption.getOpt())),where the code has if(cl.hasOption('t')).
        Hide
        kturner Keith Turner added a comment -

        I applied this patch and made the few changes it needed. I did not want to see a good patch get too stale. It did not apply cleanly when I first tried it, but "patch -p0 --merge" seemed to do the trick.

        Show
        kturner Keith Turner added a comment - I applied this patch and made the few changes it needed. I did not want to see a good patch get too stale. It did not apply cleanly when I first tried it, but "patch -p0 --merge" seemed to do the trick.
        Hide
        hudson Hudson added a comment -

        Integrated in Accumulo-Trunk #788 (See https://builds.apache.org/job/Accumulo-Trunk/788/)
        ACCUMULO-842 removed dependency on accumulo-site.xml from new fate shell command (Revision 1458888)
        ACCUMULO-842 applied patch from Damon Brown (Revision 1458882)

        Result = SUCCESS
        kturner :
        Files :

        • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/FateCommand.java

        kturner :
        Files :

        • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/Shell.java
        • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/FateCommand.java
        • /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/AdminUtil.java
        • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/fate/Admin.java
        Show
        hudson Hudson added a comment - Integrated in Accumulo-Trunk #788 (See https://builds.apache.org/job/Accumulo-Trunk/788/ ) ACCUMULO-842 removed dependency on accumulo-site.xml from new fate shell command (Revision 1458888) ACCUMULO-842 applied patch from Damon Brown (Revision 1458882) Result = SUCCESS kturner : Files : /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/FateCommand.java kturner : Files : /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/Shell.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/FateCommand.java /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/AdminUtil.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/fate/Admin.java
        Hide
        hudson Hudson added a comment -

        Integrated in Accumulo-Trunk-Hadoop-2.0 #147 (See https://builds.apache.org/job/Accumulo-Trunk-Hadoop-2.0/147/)
        ACCUMULO-842 removed dependency on accumulo-site.xml from new fate shell command (Revision 1458888)
        ACCUMULO-842 applied patch from Damon Brown (Revision 1458882)

        Result = SUCCESS
        kturner :
        Files :

        • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/FateCommand.java

        kturner :
        Files :

        • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/Shell.java
        • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/FateCommand.java
        • /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/AdminUtil.java
        • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/fate/Admin.java
        Show
        hudson Hudson added a comment - Integrated in Accumulo-Trunk-Hadoop-2.0 #147 (See https://builds.apache.org/job/Accumulo-Trunk-Hadoop-2.0/147/ ) ACCUMULO-842 removed dependency on accumulo-site.xml from new fate shell command (Revision 1458888) ACCUMULO-842 applied patch from Damon Brown (Revision 1458882) Result = SUCCESS kturner : Files : /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/FateCommand.java kturner : Files : /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/Shell.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/FateCommand.java /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/AdminUtil.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/fate/Admin.java

          People

          • Assignee:
            dab5879 Damon A. Brown
            Reporter:
            vines John Vines
          • Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development