HBase
  1. HBase
  2. HBASE-5583

Master restart on create table with splitkeys does not recreate table with all the splitkey regions

    Details

    • Type: Bug Bug
    • Status: Patch Available
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 0.99.0
    • Component/s: None
    • Labels:
      None

      Description

      -> Create table using splitkeys
      -> MAster goes down before all regions are added to meta
      -> On master restart the table is again enabled but with less number of regions than specified in splitkeys

      Anyway client will get an exception if i had called sync create table. But table exists or not check will say table exists.
      Is this scenario to be handled by client only or can we have some mechanism on the master side for this? Pls suggest.

      1. HBASE-5583_new_1_review.patch
        76 kB
        ramkrishna.s.vasudevan
      2. HBASE-5583_new_1.patch
        529 kB
        ramkrishna.s.vasudevan
      3. HBASE-5583_new_2.patch
        66 kB
        ramkrishna.s.vasudevan
      4. HBASE-5583_new_4_WIP.patch
        91 kB
        ramkrishna.s.vasudevan
      5. HBASE-5583_new_5_WIP_using_tableznode.patch
        74 kB
        ramkrishna.s.vasudevan

        Activity

        ramkrishna.s.vasudevan created issue -
        Hide
        Jonathan Hsieh added a comment -

        Maybe we should put plans for master actions into ZK, and then remove each step as they are completed. This would allow a backup master or the restarted master to pick up and continue where the previous master had died.

        So this we may have a zk structure act like a queue with region infos and then let the masters drain from there to create .META. rows.

        I was thinking about adding a flag to info:regioninfo or extra column ("info:unfinshedPresplitCreate") but this would seem to suffer from a similar problem if the master failed as these were being removed after all regions created.

        Show
        Jonathan Hsieh added a comment - Maybe we should put plans for master actions into ZK, and then remove each step as they are completed. This would allow a backup master or the restarted master to pick up and continue where the previous master had died. So this we may have a zk structure act like a queue with region infos and then let the masters drain from there to create .META. rows. I was thinking about adding a flag to info:regioninfo or extra column ("info:unfinshedPresplitCreate") but this would seem to suffer from a similar problem if the master failed as these were being removed after all regions created.
        Hide
        ramkrishna.s.vasudevan added a comment -

        @Jon
        Yes we should go with some form of persisting the splitkeys. ZK should be the best place for that.

        Show
        ramkrishna.s.vasudevan added a comment - @Jon Yes we should go with some form of persisting the splitkeys. ZK should be the best place for that.
        Hide
        ramkrishna.s.vasudevan added a comment -

        One more issue is there
        If the master goes down just after writing the table descriptors and before adding anything to the META, on master restart we move the enabling state tables to enabled only if they have entry in META. So this is another area we need to address.
        So ZK should ideally used in a better way to track the table creation steps.

        Show
        ramkrishna.s.vasudevan added a comment - One more issue is there If the master goes down just after writing the table descriptors and before adding anything to the META, on master restart we move the enabling state tables to enabled only if they have entry in META. So this is another area we need to address. So ZK should ideally used in a better way to track the table creation steps.
        Hide
        ramkrishna.s.vasudevan added a comment -

        First step update the status in the below to CREATETABLE
        /hbase/table/tableName
        /hbase/table/tableName/currentstep

        -> Create nodes with hregioninfo in the constructor of create table handler.
        Create those many nodes as in the splitkeys.

        -> /hbase/table/tableName/currentStep
        CREATED_SPLIT_KEYS (after the creation of all splitkey nodes)
        CREATING_TD
        ADD_TO_META
        ASSIGN_USER_REGIONS

        if all are done 'ENABLED' will be the state of the table.

        -> add regioninfo to META and on success delete the node created node one by one

        -> IF master fail over happens

        See in what step the node is
        ============================

        If failed in ASSIGN_USER_REGIONS
        Just reassign all the regions

        If failed in ADD_TO_META
        Check what regions are not yet added to meta and then do the assignment for all the regions
        We can ensure that the nodes that are available in the zk are the ones for which the meta entry is not updated

        If failed in CREATING_TD
        If tabledescriptor not found create it again. if not go to the next step

        If failed in CREATED_SPLIT_KEYS
        Delete the node so that atleast the next time creation is successful. Just log it as we cannot throw error back to the client.

        ===================================================================================

        We need to create one more api in client called 'isTableAvailable()' accepting splitKeys also. so that the user can use them to check if really
        the table is created with all the splitkeys.
        Currently the isTableAvailable() just checks if atleast one region is assigned to any RS.

        The current changes does not guarentee that the user table will be created in a synchronous way to the client createTable api.
        but will ensure that while creating any table if the master failover happens the table doesnot get created with less number of
        regions but still the user thinks table creation is sucessful.

        We now distinguish the states where a table was in create flow or in the enable flow.
        Previously the ENABLING state was in master fail over scenario.

        Pls provide your inputs.

        Show
        ramkrishna.s.vasudevan added a comment - First step update the status in the below to CREATETABLE /hbase/table/tableName /hbase/table/tableName/currentstep -> Create nodes with hregioninfo in the constructor of create table handler. Create those many nodes as in the splitkeys. -> /hbase/table/tableName/currentStep CREATED_SPLIT_KEYS (after the creation of all splitkey nodes) CREATING_TD ADD_TO_META ASSIGN_USER_REGIONS if all are done 'ENABLED' will be the state of the table. -> add regioninfo to META and on success delete the node created node one by one -> IF master fail over happens See in what step the node is ============================ If failed in ASSIGN_USER_REGIONS Just reassign all the regions If failed in ADD_TO_META Check what regions are not yet added to meta and then do the assignment for all the regions We can ensure that the nodes that are available in the zk are the ones for which the meta entry is not updated If failed in CREATING_TD If tabledescriptor not found create it again. if not go to the next step If failed in CREATED_SPLIT_KEYS Delete the node so that atleast the next time creation is successful. Just log it as we cannot throw error back to the client. =================================================================================== We need to create one more api in client called 'isTableAvailable()' accepting splitKeys also. so that the user can use them to check if really the table is created with all the splitkeys. Currently the isTableAvailable() just checks if atleast one region is assigned to any RS. The current changes does not guarentee that the user table will be created in a synchronous way to the client createTable api. but will ensure that while creating any table if the master failover happens the table doesnot get created with less number of regions but still the user thinks table creation is sucessful. We now distinguish the states where a table was in create flow or in the enable flow. Previously the ENABLING state was in master fail over scenario. Pls provide your inputs.
        Hide
        Ted Yu added a comment -

        Expanding table creation state coverage makes sense.

        Show
        Ted Yu added a comment - Expanding table creation state coverage makes sense.
        stack made changes -
        Field Original Value New Value
        Fix Version/s 0.95.0 [ 12324094 ]
        Fix Version/s 0.96.0 [ 12320040 ]
        Hide
        chunhui shen added a comment -

        Could we consider make creating table as a transaction, e.g. create files in tmp directory, atomic put regions to .META. ?

        Show
        chunhui shen added a comment - Could we consider make creating table as a transaction, e.g. create files in tmp directory, atomic put regions to .META. ?
        Hide
        ramkrishna.s.vasudevan added a comment -

        Ya that was my idea. I was working on that..had some rough working patch. But that also had some set of problems.
        Will revive that patch and get back on the issues/problems that we may face.

        Show
        ramkrishna.s.vasudevan added a comment - Ya that was my idea. I was working on that..had some rough working patch. But that also had some set of problems. Will revive that patch and get back on the issues/problems that we may face.
        Hide
        Matteo Bertozzi added a comment -

        Could we consider make creating table as a transaction, e.g. create files in tmp directory, atomic put regions to .META.?

        We have already that, the problem is that META and the filesystem are two separate operations... so you may fail in the middle and have META but not the fs dir moved...
        The idea of the file-tracking table (mentioned in others jira, e.g. HBASE-7806) was exactly to solve the problem and having a single source of truth... and have the table creation/deletion as transaction

        Show
        Matteo Bertozzi added a comment - Could we consider make creating table as a transaction, e.g. create files in tmp directory, atomic put regions to .META.? We have already that, the problem is that META and the filesystem are two separate operations... so you may fail in the middle and have META but not the fs dir moved... The idea of the file-tracking table (mentioned in others jira, e.g. HBASE-7806 ) was exactly to solve the problem and having a single source of truth... and have the table creation/deletion as transaction
        Hide
        ramkrishna.s.vasudevan added a comment -

        @Matteo
        My plan was to maintain the state of the operation in ZK or HDFS. Can i refresh the patch that i have and submit for review?
        Or you feel that will not be needed at all.

        Show
        ramkrishna.s.vasudevan added a comment - @Matteo My plan was to maintain the state of the operation in ZK or HDFS. Can i refresh the patch that i have and submit for review? Or you feel that will not be needed at all.
        Hide
        Matteo Bertozzi added a comment -

        My plan was to maintain the state of the operation in ZK or HDFS. Can i refresh the patch that i have and submit for review?
        Or you feel that will not be needed at all.

        There's a long way to the file-tracking table,
        so I'm +1 to have an alternative now based on a ZK or HDFS state

        Show
        Matteo Bertozzi added a comment - My plan was to maintain the state of the operation in ZK or HDFS. Can i refresh the patch that i have and submit for review? Or you feel that will not be needed at all. There's a long way to the file-tracking table, so I'm +1 to have an alternative now based on a ZK or HDFS state
        ramkrishna.s.vasudevan made changes -
        Status Open [ 1 ] In Progress [ 3 ]
        ramkrishna.s.vasudevan made changes -
        Attachment HBASE-5583_new_1.patch [ 12574840 ]
        Hide
        ramkrishna.s.vasudevan added a comment -

        I am attaching a patch here with the current state of work.
        This is not the final one.

        Still Exception handling and testing is pending.

        I would like to get myself clarified whether am going in the right direction.
        My first question would be
        -> In order to maintain different states at which we are in while creating table should we go for a special node under the table znode or can we add new states to the existing table states like (DISABLING, ENABLING, DISABLED etc).?

        BAsed on this i can rebase the patch.
        Also pls note that the ZooKeeperProto.java is showing as a complete change with out which this patch should be comparitively smaller.

        -> This patch basically tries to handle table creation on master failure and restart.
        -> Adds a new state CREATINGTABLE instead of ENABLING so that we know that this table is partially created.
        -> Adds a status node under /table to maintain various states of the table creation process (is this seperate node needed or not is my doubt).
        -> Table lock related things needs to be checked.

        Even after we do all this HBaseAdmin.isTableAvailable(tableName, splitKeys) should be used by the client to check if all the regions are created.
        Feed back is appreciated.
        I can do more testing and handle ZK exceptions in a better way in my further patches.
        Thanks.

        Show
        ramkrishna.s.vasudevan added a comment - I am attaching a patch here with the current state of work. This is not the final one. Still Exception handling and testing is pending. I would like to get myself clarified whether am going in the right direction. My first question would be -> In order to maintain different states at which we are in while creating table should we go for a special node under the table znode or can we add new states to the existing table states like (DISABLING, ENABLING, DISABLED etc).? BAsed on this i can rebase the patch. Also pls note that the ZooKeeperProto.java is showing as a complete change with out which this patch should be comparitively smaller. -> This patch basically tries to handle table creation on master failure and restart. -> Adds a new state CREATINGTABLE instead of ENABLING so that we know that this table is partially created. -> Adds a status node under /table to maintain various states of the table creation process (is this seperate node needed or not is my doubt). -> Table lock related things needs to be checked. Even after we do all this HBaseAdmin.isTableAvailable(tableName, splitKeys) should be used by the client to check if all the regions are created. Feed back is appreciated. I can do more testing and handle ZK exceptions in a better way in my further patches. Thanks.
        Hide
        Anoop Sam John added a comment -

        Ram
        Started going through the patch. It looks so big change but is it really? There generated file is fully rewritten!
        ZooKeeper.proto

        ENABLING = 3;
        +    CREATINGTABLE = 4;
        

        Can we make CREATINGTABLE -> CREATING?

        +  enum State {
        +    CREATED_TD = 0;
        +    CREATED_REGIONINFO = 1;
        +    MOVED_TO_ORIG_LOCATION = 2;
        +    ADD_TO_META = 3;
        +    ASSIGN_USER_REGIONS = 4;
        +    DONE = 4;
        +  }
        

        Repeated ordinal. Typo?
        Will continue..

        Show
        Anoop Sam John added a comment - Ram Started going through the patch. It looks so big change but is it really? There generated file is fully rewritten! ZooKeeper.proto ENABLING = 3; + CREATINGTABLE = 4; Can we make CREATINGTABLE -> CREATING? + enum State { + CREATED_TD = 0; + CREATED_REGIONINFO = 1; + MOVED_TO_ORIG_LOCATION = 2; + ADD_TO_META = 3; + ASSIGN_USER_REGIONS = 4; + DONE = 4; + } Repeated ordinal. Typo? Will continue..
        Hide
        Anoop Sam John added a comment -

        In order to maintain different states at which we are in while creating table should we go for a special node under the table znode or can we add new states to the existing table states like (DISABLING, ENABLING, DISABLED etc)

        IMO go with separate node.

        Show
        Anoop Sam John added a comment - In order to maintain different states at which we are in while creating table should we go for a special node under the table znode or can we add new states to the existing table states like (DISABLING, ENABLING, DISABLED etc) IMO go with separate node.
        Hide
        ramkrishna.s.vasudevan added a comment -

        It looks so big change but is it really? There generated file is fully rewritten!

        Yes.. i think i made some mess there.
        On the enum names and ordering i will finalise the names once the patch comes alright. BTW thanks for taking a look.

        Show
        ramkrishna.s.vasudevan added a comment - It looks so big change but is it really? There generated file is fully rewritten! Yes.. i think i made some mess there. On the enum names and ordering i will finalise the names once the patch comes alright. BTW thanks for taking a look.
        Hide
        ramkrishna.s.vasudevan added a comment -

        But going with seperate node we have to do the bookkeeping work for that new node.
        Also need to take care of exception that come out on setting data on that znode.
        Advantage that we get by updating the same table znode is that things are easy to maintain and we can do the updates automically like the deletion/updation of the table node alone would suffice to indicate some operation is done. But when we have another znode we need to update that znode also and things will not be automic.
        Particularly on restart scenarios we may face some problems. Anyway will list out the probs that i faced with one and 2 znodes.

        Show
        ramkrishna.s.vasudevan added a comment - But going with seperate node we have to do the bookkeeping work for that new node. Also need to take care of exception that come out on setting data on that znode. Advantage that we get by updating the same table znode is that things are easy to maintain and we can do the updates automically like the deletion/updation of the table node alone would suffice to indicate some operation is done. But when we have another znode we need to update that znode also and things will not be automic. Particularly on restart scenarios we may face some problems. Anyway will list out the probs that i faced with one and 2 znodes.
        Hide
        ramkrishna.s.vasudevan added a comment -

        Just added few more logs and ensured that the proto file is not fully changed.
        Made some changes to the enums. Yet not started testing and still need to validate some cases. Uploading for reference.

        Show
        ramkrishna.s.vasudevan added a comment - Just added few more logs and ensured that the proto file is not fully changed. Made some changes to the enums. Yet not started testing and still need to validate some cases. Uploading for reference.
        ramkrishna.s.vasudevan made changes -
        Attachment HBASE-5583_new_2.patch [ 12575005 ]
        ramkrishna.s.vasudevan made changes -
        Attachment HBASE-5583_new_2.patch [ 12575005 ]
        ramkrishna.s.vasudevan made changes -
        Attachment HBASE-5583_new_2.patch [ 12575007 ]
        Hide
        ramkrishna.s.vasudevan added a comment -

        This is still WIP in progress. But overall in better shape.
        Will add more tests with more split keys.
        May be some refactorings are needed still.

        Show
        ramkrishna.s.vasudevan added a comment - This is still WIP in progress. But overall in better shape. Will add more tests with more split keys. May be some refactorings are needed still.
        ramkrishna.s.vasudevan made changes -
        Attachment HBASE-5583_new_4_WIP.patch [ 12575722 ]
        Hide
        ramkrishna.s.vasudevan added a comment -

        Thinking on the problems of having two znodes is that
        -> Table is ENABLED but before the status znode is cleared the Master goes down. Now the new master may see that the znode is ENABLED but the status znode will remain there. We now have to check for every table if the status znode is present and if the table is in ENABLED state we need to delete the znode.
        -> The other case is - while the table creation was going on theere was a failure in table creation and so we go and try to delete the status and znode and also the table znode. Now assume the master goes down just after the status znode deletion. Now the new master knows it is a partially created table but it will not know what action to take if the status znode is not available. According to me this is the only scenario where having 2 status znodes will be problematic.
        -> The advantage of having only one znode is, our entire recovery just depends on the state present in the table znode. If the table is in any of the state other than ENABLED/ENABLING, DISABLING,DISABLED we go ahead with the existing code. If the znode is any of the the new states that is introduced by this patch then master would be clear in knowing what action to take.

        But introducing new state to the new existing table znode - am not sure if others will accept to it. So some feedback on this part will help me to proceed with the patch.

        Show
        ramkrishna.s.vasudevan added a comment - Thinking on the problems of having two znodes is that -> Table is ENABLED but before the status znode is cleared the Master goes down. Now the new master may see that the znode is ENABLED but the status znode will remain there. We now have to check for every table if the status znode is present and if the table is in ENABLED state we need to delete the znode. -> The other case is - while the table creation was going on theere was a failure in table creation and so we go and try to delete the status and znode and also the table znode. Now assume the master goes down just after the status znode deletion. Now the new master knows it is a partially created table but it will not know what action to take if the status znode is not available. According to me this is the only scenario where having 2 status znodes will be problematic. -> The advantage of having only one znode is, our entire recovery just depends on the state present in the table znode. If the table is in any of the state other than ENABLED/ENABLING, DISABLING,DISABLED we go ahead with the existing code. If the znode is any of the the new states that is introduced by this patch then master would be clear in knowing what action to take. But introducing new state to the new existing table znode - am not sure if others will accept to it. So some feedback on this part will help me to proceed with the patch.
        Hide
        ramkrishna.s.vasudevan added a comment -

        This patch _5 uses the table znode with new states that will be used in table creation.
        If you guys feel this is the best approach i can check out if this cause anyother impact.
        Also need to make the TesetcreateTableHandler to run reliably.

        Show
        ramkrishna.s.vasudevan added a comment - This patch _5 uses the table znode with new states that will be used in table creation. If you guys feel this is the best approach i can check out if this cause anyother impact. Also need to make the TesetcreateTableHandler to run reliably.
        ramkrishna.s.vasudevan made changes -
        Hide
        Ted Yu added a comment -
        +  public static Set<String> getCreatingTables(ZooKeeperWatcher zkw) throws KeeperException {
        

        I think getTablesInCreation would be better method name.

        +  public void removeCreateTableStates(final String tableName) {
        

        Suggest renaming removeCreateTableStates() as removeTableStateForTableInCreation().

        +  private final static Table.State createTableStates[] = { Table.State.CREATING, Table.State.CREATING_TD,
        

        Why do you use an array instead of a Set ? That way you don't need to iterate through the states in ZKTableReadOnly.isTableState()

        +         List<String> listChildrenNoWatch = ZKUtil.listChildrenNoWatch(this.watcher, this.watcher.tableZNode+"/"+child);
        +         for (String string : listChildrenNoWatch) {
        +       System.out.println("Status node should be prenet "+string);
        +     }
        

        I guess you would remove the above in the next patch.

        For createTableCreationStatusNode():

        +    LOG.info("Created children node "+numberOfChildren);
        

        If you keep the above log, please add tableName.

        +  public boolean checkAndSetCreatingTableStates(final String tableName) throws KeeperException {
        

        Remove the trailing 's' in method name.

        Show
        Ted Yu added a comment - + public static Set< String > getCreatingTables(ZooKeeperWatcher zkw) throws KeeperException { I think getTablesInCreation would be better method name. + public void removeCreateTableStates( final String tableName) { Suggest renaming removeCreateTableStates() as removeTableStateForTableInCreation(). + private final static Table.State createTableStates[] = { Table.State.CREATING, Table.State.CREATING_TD, Why do you use an array instead of a Set ? That way you don't need to iterate through the states in ZKTableReadOnly.isTableState() + List< String > listChildrenNoWatch = ZKUtil.listChildrenNoWatch( this .watcher, this .watcher.tableZNode+ "/" +child); + for ( String string : listChildrenNoWatch) { + System .out.println( "Status node should be prenet " +string); + } I guess you would remove the above in the next patch. For createTableCreationStatusNode(): + LOG.info( "Created children node " +numberOfChildren); If you keep the above log, please add tableName. + public boolean checkAndSetCreatingTableStates( final String tableName) throws KeeperException { Remove the trailing 's' in method name.
        Hide
        Jimmy Xiang added a comment -

        Please ignore if you have already considered about this: is it simpler to disable the table and let the user create the table again in such a case? I assume this won't happen many times. Or show the user a big warning in this case and let them decide how to move ahead? Users should use caution if the cluster fails in the middle when a table is created.

        Show
        Jimmy Xiang added a comment - Please ignore if you have already considered about this: is it simpler to disable the table and let the user create the table again in such a case? I assume this won't happen many times. Or show the user a big warning in this case and let them decide how to move ahead? Users should use caution if the cluster fails in the middle when a table is created.
        Hide
        ramkrishna.s.vasudevan added a comment -

        @Jimmy,
        When i was with my previous employer we actually wanted this feature. The reason was tables was dropped and created on a daily basis. So because of this problem, the developers had written a client code which creates the table by adding directly into META and then tries to enable those regions. I think that was in 0.90 code.
        The later version of hbase made the creation more reliable but often in our integration tests and Functional testing we end up with the tables getting half created and we end up in problems.
        That is why it prompted us to work on this.

        s it simpler to disable the table and let the user create the table again in such a case

        Yes this is possible. But involves the following changes that i can think of immediately
        -> Even if the table znode is in ENABLING we should still disable the table.
        > If we mark the table disabled, user should be very clear that the table he created is partial and that is what is DISABLED now. I think the better way is to use a new state say UNUSABLE (or something like this) which will make the user to take a clear course of action.
        I still feel the above steps are simpler.
        But table creation is the basic step and handling the failure scenarios in this should be more of automated rather than user interference is what i felt. Hence decided to go on with this way.
        @Ted
        Thanks for going thro the patch. I will make the necessary changes and upload it.
        Just waiting for more ideas and views on this.

        Show
        ramkrishna.s.vasudevan added a comment - @Jimmy, When i was with my previous employer we actually wanted this feature. The reason was tables was dropped and created on a daily basis. So because of this problem, the developers had written a client code which creates the table by adding directly into META and then tries to enable those regions. I think that was in 0.90 code. The later version of hbase made the creation more reliable but often in our integration tests and Functional testing we end up with the tables getting half created and we end up in problems. That is why it prompted us to work on this. s it simpler to disable the table and let the user create the table again in such a case Yes this is possible. But involves the following changes that i can think of immediately -> Even if the table znode is in ENABLING we should still disable the table. > If we mark the table disabled, user should be very clear that the table he created is partial and that is what is DISABLED now. I think the better way is to use a new state say UNUSABLE (or something like this) which will make the user to take a clear course of action. I still feel the above steps are simpler. But table creation is the basic step and handling the failure scenarios in this should be more of automated rather than user interference is what i felt. Hence decided to go on with this way. @Ted Thanks for going thro the patch. I will make the necessary changes and upload it. Just waiting for more ideas and views on this.
        Hide
        Jimmy Xiang added a comment -

        @Ram, I see. Thanks. When you are ready, can we put it on the review board?

        Show
        Jimmy Xiang added a comment - @Ram, I see. Thanks. When you are ready, can we put it on the review board?
        Hide
        ramkrishna.s.vasudevan added a comment -

        I am facing some problems in making the testcases run. Or else would have uploaded a patch. I am still figuring out the root cause. Will keep updated.

        Show
        ramkrishna.s.vasudevan added a comment - I am facing some problems in making the testcases run. Or else would have uploaded a patch. I am still figuring out the root cause. Will keep updated.
        Hide
        ramkrishna.s.vasudevan added a comment -

        The added testcases passes. Will wait for hadoop QA to run

        Show
        ramkrishna.s.vasudevan added a comment - The added testcases passes. Will wait for hadoop QA to run
        ramkrishna.s.vasudevan made changes -
        Attachment HBASE-5583_new_1_review.patch [ 12576542 ]
        Hide
        ramkrishna.s.vasudevan added a comment -

        Added review request in RB - https://reviews.apache.org/r/10237/

        Show
        ramkrishna.s.vasudevan added a comment - Added review request in RB - https://reviews.apache.org/r/10237/
        ramkrishna.s.vasudevan made changes -
        Attachment HBASE-5583_new_1_review.patch [ 12576542 ]
        ramkrishna.s.vasudevan made changes -
        Attachment HBASE-5583_new_1_review.patch [ 12576723 ]
        ramkrishna.s.vasudevan made changes -
        Status In Progress [ 3 ] Patch Available [ 10002 ]
        Hide
        Enis Soztutar added a comment -

        Sorry for coming in this late to this jira. Ram, have you looked at the generic master coordinated actions jira, HBASE-5487? What you are doing here, and what we need in general for operations(create table, disable table, etc) running in master is:

        • Client initiates the Master operation, gets a trackId in an async response.
        • Operations are written in a sequence of adempotent steps, where each step can be undone. Each step can work on top of partial results from a previous execution.
        • Master executes the steps of operation in sequence. Master either keeps the state of the whole execution stack, or uses a WAL for keeping the execution state.
        • Client query back the execution state by asking the master with the obtained trackId.

        This patch only solves the create table problem, for some cases, but cannot be generalized to other operations. Instead of doing it case-by-case, we should really solve this in HBASE-5487 I think. As I outlined there, there are 3 ways, that we can approach this:

        • Keeping the master state in zookeeper. Your patch does this for create table. However, we should go with the FATE approach if we decide on this.
        • Keeping the state in an WAL from master. This log will use hlog, but will be owned by the active master. recovery is done using this.
        • Keeping the state in a system table. Since HBase tables in turn use WAL's, this is just indirectly using a WAL.
        Show
        Enis Soztutar added a comment - Sorry for coming in this late to this jira. Ram, have you looked at the generic master coordinated actions jira, HBASE-5487 ? What you are doing here, and what we need in general for operations(create table, disable table, etc) running in master is: Client initiates the Master operation, gets a trackId in an async response. Operations are written in a sequence of adempotent steps, where each step can be undone. Each step can work on top of partial results from a previous execution. Master executes the steps of operation in sequence. Master either keeps the state of the whole execution stack, or uses a WAL for keeping the execution state. Client query back the execution state by asking the master with the obtained trackId. This patch only solves the create table problem, for some cases, but cannot be generalized to other operations. Instead of doing it case-by-case, we should really solve this in HBASE-5487 I think. As I outlined there, there are 3 ways, that we can approach this: Keeping the master state in zookeeper. Your patch does this for create table. However, we should go with the FATE approach if we decide on this. Keeping the state in an WAL from master. This log will use hlog, but will be owned by the active master. recovery is done using this. Keeping the state in a system table. Since HBase tables in turn use WAL's, this is just indirectly using a WAL.
        Hide
        ramkrishna.s.vasudevan added a comment - - edited

        @Enis
        Off late did not watch HBASE-5487 JIRA closely. I agree all the master originated table management operations should be generic. I will watch the JIRA and participate in discussions over there.
        So if we don't need this feature then i can create a patch for 0.94 if it makes sense to have it there.
        Till FATE is in place can we go ahead with this? Just asking your opinion.

        Show
        ramkrishna.s.vasudevan added a comment - - edited @Enis Off late did not watch HBASE-5487 JIRA closely. I agree all the master originated table management operations should be generic. I will watch the JIRA and participate in discussions over there. So if we don't need this feature then i can create a patch for 0.94 if it makes sense to have it there. Till FATE is in place can we go ahead with this? Just asking your opinion.
        Hide
        Enis Soztutar added a comment -

        Thanks Ram. Yes, the design in HBASE-5487 is not finalized yet unfortunately. We wanted to bring FATE, which will only handle the client initiated operations, not stuff like region assignment. Then the other WAL approaches came, and now Sergey's design. In any case, the issue probably won't cut it for 0.94.

        I would not object to getting the patch in this issue in, given that the design is still in discussion. My only concern would be that the table state is checked in a lot of places, so we should be careful. If you think that solving this issue only for create table with splits is important we can go with your approach. Otherwise, I would wait for a resolution on HBASE-5487.

        Show
        Enis Soztutar added a comment - Thanks Ram. Yes, the design in HBASE-5487 is not finalized yet unfortunately. We wanted to bring FATE, which will only handle the client initiated operations, not stuff like region assignment. Then the other WAL approaches came, and now Sergey's design. In any case, the issue probably won't cut it for 0.94. I would not object to getting the patch in this issue in, given that the design is still in discussion. My only concern would be that the table state is checked in a lot of places, so we should be careful. If you think that solving this issue only for create table with splits is important we can go with your approach. Otherwise, I would wait for a resolution on HBASE-5487 .
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12576723/HBASE-5583_new_1_review.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 lineLengths. The patch introduces lines longer than 100

        +1 site. The mvn site goal succeeds with this patch.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.master.TestAssignmentManager
        org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12576723/HBASE-5583_new_1_review.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 6 new or modified tests. +1 hadoop2.0 . The patch compiles against the hadoop 2.0 profile. +1 javadoc . The javadoc tool did not generate any warning messages. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 lineLengths . The patch introduces lines longer than 100 +1 site . The mvn site goal succeeds with this patch. -1 core tests . The patch failed these unit tests: org.apache.hadoop.hbase.master.TestAssignmentManager org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5112//console This message is automatically generated.
        Hide
        ramkrishna.s.vasudevan added a comment -

        I would not object to getting the patch in this issue in, given that the design is still in discussion. My only concern would be that the table state is checked in a lot of places, so we should be careful. If you think that solving this issue only for create table with splits is important we can go with your approach. Otherwise, I would wait for a resolution on HBASE-5487.

        Enis concern here is valid.
        What other folks think on this, i can work based on that.
        Stack,Ted Yu,Jimmy Xiang,rajeshbabu,Matteo Bertozzi - Your thoughts?

        Show
        ramkrishna.s.vasudevan added a comment - I would not object to getting the patch in this issue in, given that the design is still in discussion. My only concern would be that the table state is checked in a lot of places, so we should be careful. If you think that solving this issue only for create table with splits is important we can go with your approach. Otherwise, I would wait for a resolution on HBASE-5487 . Enis concern here is valid. What other folks think on this, i can work based on that. Stack , Ted Yu , Jimmy Xiang , rajeshbabu , Matteo Bertozzi - Your thoughts?
        Hide
        Jimmy Xiang added a comment -

        To me, I think we can wait for HBASE-5487, if this is not so urgent.

        Show
        Jimmy Xiang added a comment - To me, I think we can wait for HBASE-5487 , if this is not so urgent.
        stack made changes -
        Fix Version/s 0.95.1 [ 12324288 ]
        Fix Version/s 0.95.0 [ 12324094 ]
        Hide
        ramkrishna.s.vasudevan added a comment -

        If HBASE-5487 takes some more time can we check this ?

        Show
        ramkrishna.s.vasudevan added a comment - If HBASE-5487 takes some more time can we check this ?
        Hide
        Ted Yu added a comment -

        HBASE-5487 has no assignee and no Fix Version.

        I think we can revive discussion for this JIRA.

        Show
        Ted Yu added a comment - HBASE-5487 has no assignee and no Fix Version. I think we can revive discussion for this JIRA.
        Hide
        ramkrishna.s.vasudevan added a comment -

        Its more than one month, is it ok if we can revive now? Ted agreed to it a month ago itself. What other feel?

        Show
        ramkrishna.s.vasudevan added a comment - Its more than one month, is it ok if we can revive now? Ted agreed to it a month ago itself. What other feel?
        stack made changes -
        Fix Version/s 0.95.2 [ 12320040 ]
        Fix Version/s 0.95.1 [ 12324288 ]
        stack made changes -
        Fix Version/s 0.96.0 [ 12324822 ]
        Fix Version/s 0.95.2 [ 12320040 ]
        stack made changes -
        Fix Version/s 0.96.1 [ 12324961 ]
        Fix Version/s 0.96.0 [ 12324822 ]
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12576723/HBASE-5583_new_1_review.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7763//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12576723/HBASE-5583_new_1_review.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 6 new or modified tests. -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7763//console This message is automatically generated.
        stack made changes -
        Fix Version/s 0.99.0 [ 12325675 ]
        Fix Version/s 0.96.1 [ 12324961 ]
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12576723/HBASE-5583_new_1_review.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8186//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12576723/HBASE-5583_new_1_review.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 6 new or modified tests. -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8186//console This message is automatically generated.

          People

          • Assignee:
            ramkrishna.s.vasudevan
            Reporter:
            ramkrishna.s.vasudevan
          • Votes:
            0 Vote for this issue
            Watchers:
            14 Start watching this issue

            Dates

            • Created:
              Updated:

              Development