HBase
  1. HBase
  2. HBASE-1730

Near-instantaneous online schema and table state updates

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.92.0
    • Component/s: None
    • Labels:
      None

      Description

      We should not need to take a table offline to update HCD or HTD.

      One option for that is putting HTDs and HCDs up into ZK, with mirror on disk catalog tables to be used only for cold init scenarios, as discussed on IRC. In this scheme, regionservers hosting regions of a table would watch permanent nodes in ZK associated with that table for schema updates and take appropriate actions out of the watcher. In effect, schema updates become another item in the ToDo list.

      /hbase/tables/<table-name>/schema

      Must be associated with a write locking scheme also handled with ZK primitives to avoid situations where one concurrent update clobbers another.

      1. HBASE-1730-2.patch
        35 kB
        Nicolas Spiegelberg
      2. HBASE-1730.patch
        37 kB
        nileema shingte
      3. 1730-v3.patch
        191 kB
        stack
      4. 1730-v2.patch
        42 kB
        stack
      5. 1730.patch
        30 kB
        stack

        Issue Links

          Activity

          Hide
          Ted Yu added a comment -

          Latest patch has been integrated by Nicolas.

          Show
          Ted Yu added a comment - Latest patch has been integrated by Nicolas.
          Hide
          Hudson added a comment -

          Integrated in HBase-TRUNK #2175 (See https://builds.apache.org/job/HBase-TRUNK/2175/)
          HBASE-1730 Online schema changes for HBase

          nspiegelberg :
          Files :

          • /hbase/trunk/CHANGES.txt
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java
          • /hbase/trunk/src/main/ruby/hbase/admin.rb
          • /hbase/trunk/src/main/ruby/shell.rb
          • /hbase/trunk/src/main/ruby/shell/commands/alter.rb
          • /hbase/trunk/src/main/ruby/shell/commands/alter_async.rb
          • /hbase/trunk/src/main/ruby/shell/commands/alter_status.rb
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
          Show
          Hudson added a comment - Integrated in HBase-TRUNK #2175 (See https://builds.apache.org/job/HBase-TRUNK/2175/ ) HBASE-1730 Online schema changes for HBase nspiegelberg : Files : /hbase/trunk/CHANGES.txt /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java /hbase/trunk/src/main/ruby/hbase/admin.rb /hbase/trunk/src/main/ruby/shell.rb /hbase/trunk/src/main/ruby/shell/commands/alter.rb /hbase/trunk/src/main/ruby/shell/commands/alter_async.rb /hbase/trunk/src/main/ruby/shell/commands/alter_status.rb /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
          Hide
          Nicolas Spiegelberg added a comment -

          Nileema sent me this patch to address the peer feedback (note, she volunteered this after her internship, +1). Made minor modifications. Passes TestAdmin

          Show
          Nicolas Spiegelberg added a comment - Nileema sent me this patch to address the peer feedback (note, she volunteered this after her internship, +1). Made minor modifications. Passes TestAdmin
          Hide
          Ted Yu added a comment -

          @Nicolas:
          As Nileema indicated, she was no longer working on this issue.
          Please assign this to the next person so that progress can be made.

          Show
          Ted Yu added a comment - @Nicolas: As Nileema indicated, she was no longer working on this issue. Please assign this to the next person so that progress can be made.
          Hide
          Ted Yu added a comment -

          RegionPlan has become a top level class:

          /home/hadoop/hbase/src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java:[32,50] cannot find symbol
          symbol  : class RegionPlan
          location: interface org.apache.hadoop.hbase.master.LoadBalancer
          
          Show
          Ted Yu added a comment - RegionPlan has become a top level class: /home/hadoop/hbase/src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java:[32,50] cannot find symbol symbol : class RegionPlan location: interface org.apache.hadoop.hbase.master.LoadBalancer
          Hide
          Ted Yu added a comment -

          Please rebase patch on latest TRUNK:

          patching file src/main/ruby/shell.rb
          Hunk #1 FAILED at 227.
          1 out of 1 hunk FAILED -- saving rejects to file src/main/ruby/shell.rb.rej
          patching file src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java
          Hunk #1 FAILED at 26.
          1 out of 2 hunks FAILED -- saving rejects to file src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java.rej
          
          Show
          Ted Yu added a comment - Please rebase patch on latest TRUNK: patching file src/main/ruby/shell.rb Hunk #1 FAILED at 227. 1 out of 1 hunk FAILED -- saving rejects to file src/main/ruby/shell.rb.rej patching file src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java Hunk #1 FAILED at 26. 1 out of 2 hunks FAILED -- saving rejects to file src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java.rej
          Hide
          Subbu M Iyer added a comment -

          Thanks Nileema. I shall take it from here after your patch is commited to trunk.

          Show
          Subbu M Iyer added a comment - Thanks Nileema. I shall take it from here after your patch is commited to trunk.
          Hide
          nileema shingte added a comment -

          Subbu,

          I am not working on this. You can go ahead with this! :

          Show
          nileema shingte added a comment - Subbu, I am not working on this. You can go ahead with this! :
          Hide
          Subbu M Iyer added a comment -

          Nilemma/Nicholas,

          Are you (or some one) already working on incorporating the required changes from 4213 to trunk after your patch is applied?

          I would like to work on it as I already have most of the required changes, but would not want to step into your shoes if some one on your side is already working on it.

          Please let me know.

          Show
          Subbu M Iyer added a comment - Nilemma/Nicholas, Are you (or some one) already working on incorporating the required changes from 4213 to trunk after your patch is applied? I would like to work on it as I already have most of the required changes, but would not want to step into your shoes if some one on your side is already working on it. Please let me know.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/1479/#review1665
          -----------------------------------------------------------

          Ship it!

          This is good to go. Few questions in the below first though before commit. Thanks Nileema.

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
          <https://reviews.apache.org/r/1479/#comment3712>

          Good.

          src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          <https://reviews.apache.org/r/1479/#comment3713>

          nit: You don't need to say 'This function...' Its a given (smile).

          src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          <https://reviews.apache.org/r/1479/#comment3714>

          This code dups the method above, removeClosedRegions? Is that intentional?

          src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          <https://reviews.apache.org/r/1479/#comment3715>

          Why would we wait? Don't we want schema change to come on fast? Already, opening each region is series is going to run slow when table is 20k regions or 200k regions. Or do you folks think different?

          src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java
          <https://reviews.apache.org/r/1479/#comment3716>

          nit: curly braces

          src/main/ruby/shell.rb
          <https://reviews.apache.org/r/1479/#comment3717>

          So, old style alter works too?

          src/main/ruby/shell/commands/alter_async.rb
          <https://reviews.apache.org/r/1479/#comment3718>

          nice

          • Michael

          On 2011-08-26 17:25:21, Nileema Shingte wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/1479/

          -----------------------------------------------------------

          (Updated 2011-08-26 17:25:21)

          Review request for hbase, Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray.

          Summary

          -------

          When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes:

          1. Changes to reopen the regions when any of the above operations are performed.

          2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it.

          3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time.

          4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second.

          5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status.

          6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated.

          7. modification in the unit test for enabling alter without disabling the table.

          This addresses bug HBASE-1730.

          https://issues.apache.org/jira/browse/HBASE-1730

          Diffs

          -----

          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 5965919

          src/main/ruby/shell.rb 1ec330f

          src/main/ruby/shell/commands/alter.rb 1dd43ad

          src/main/ruby/shell/commands/alter_async.rb PRE-CREATION

          src/main/ruby/shell/commands/alter_status.rb PRE-CREATION

          src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa

          src/main/ruby/hbase/admin.rb 4460d6e

          src/main/java/org/apache/hadoop/hbase/master/HMaster.java 9784195

          src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837

          src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java 6332953

          src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java b5d65ce

          src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java 0c9ecce

          src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024

          src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c

          Diff: https://reviews.apache.org/r/1479/diff

          Testing

          -------

          I am putting this up for initial review. I have tested the functionality in a pseudo distributed mode.

          Need to run unit tests.

          Thanks,

          Nileema

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1479/#review1665 ----------------------------------------------------------- Ship it! This is good to go. Few questions in the below first though before commit. Thanks Nileema. src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/1479/#comment3712 > Good. src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java < https://reviews.apache.org/r/1479/#comment3713 > nit: You don't need to say 'This function...' Its a given (smile). src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java < https://reviews.apache.org/r/1479/#comment3714 > This code dups the method above, removeClosedRegions? Is that intentional? src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java < https://reviews.apache.org/r/1479/#comment3715 > Why would we wait? Don't we want schema change to come on fast? Already, opening each region is series is going to run slow when table is 20k regions or 200k regions. Or do you folks think different? src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java < https://reviews.apache.org/r/1479/#comment3716 > nit: curly braces src/main/ruby/shell.rb < https://reviews.apache.org/r/1479/#comment3717 > So, old style alter works too? src/main/ruby/shell/commands/alter_async.rb < https://reviews.apache.org/r/1479/#comment3718 > nice Michael On 2011-08-26 17:25:21, Nileema Shingte wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1479/ ----------------------------------------------------------- (Updated 2011-08-26 17:25:21) Review request for hbase, Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray. Summary ------- When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes: 1. Changes to reopen the regions when any of the above operations are performed. 2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it. 3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time. 4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second. 5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status. 6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated. 7. modification in the unit test for enabling alter without disabling the table. This addresses bug HBASE-1730 . https://issues.apache.org/jira/browse/HBASE-1730 Diffs ----- src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 5965919 src/main/ruby/shell.rb 1ec330f src/main/ruby/shell/commands/alter.rb 1dd43ad src/main/ruby/shell/commands/alter_async.rb PRE-CREATION src/main/ruby/shell/commands/alter_status.rb PRE-CREATION src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa src/main/ruby/hbase/admin.rb 4460d6e src/main/java/org/apache/hadoop/hbase/master/HMaster.java 9784195 src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837 src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java 6332953 src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java b5d65ce src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java 0c9ecce src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024 src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c Diff: https://reviews.apache.org/r/1479/diff Testing ------- I am putting this up for initial review. I have tested the functionality in a pseudo distributed mode. Need to run unit tests. Thanks, Nileema
          Hide
          jiraposter@reviews.apache.org added a comment -

          On 2011-08-18 23:15:10, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java, line 80

          > <https://reviews.apache.org/r/1479/diff/1/?file=32327#file32327line80>

          >

          > Javadoc does not agree w/ what is being actually returned.

          fixed this.

          On 2011-08-18 23:15:10, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java, line 1498

          > <https://reviews.apache.org/r/1479/diff/1/?file=32328#file32328line1498>

          >

          > Why are we waiting between unassigns? To lessen load on cluster?

          yes.

          On 2011-08-18 23:15:10, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/master/HMaster.java, line 1067

          > <https://reviews.apache.org/r/1479/diff/1/?file=32330#file32330line1067>

          >

          > Is there anything in here to stop someone disabling a table or preventing the load balancer moving regions while this is all going on?

          the problem with this is, if you run more than one alter concurrently, we do not have a way to restore the state of the loadbalancer.

          On 2011-08-18 23:15:10, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java, line 101

          > <https://reviews.apache.org/r/1479/diff/1/?file=32332#file32332line101>

          >

          > This looks like you are fixing a bug we have in hbase?

          No. this is required for the bulk reopen operation and impacts only reopen datastructures.

          • Nileema

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/1479/#review1535
          -----------------------------------------------------------

          On 2011-08-26 17:25:21, Nileema Shingte wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/1479/

          -----------------------------------------------------------

          (Updated 2011-08-26 17:25:21)

          Review request for hbase, Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray.

          Summary

          -------

          When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes:

          1. Changes to reopen the regions when any of the above operations are performed.

          2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it.

          3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time.

          4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second.

          5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status.

          6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated.

          7. modification in the unit test for enabling alter without disabling the table.

          This addresses bug HBASE-1730.

          https://issues.apache.org/jira/browse/HBASE-1730

          Diffs

          -----

          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 5965919

          src/main/ruby/shell.rb 1ec330f

          src/main/ruby/shell/commands/alter.rb 1dd43ad

          src/main/ruby/shell/commands/alter_async.rb PRE-CREATION

          src/main/ruby/shell/commands/alter_status.rb PRE-CREATION

          src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa

          src/main/ruby/hbase/admin.rb 4460d6e

          src/main/java/org/apache/hadoop/hbase/master/HMaster.java 9784195

          src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837

          src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java 6332953

          src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java b5d65ce

          src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java 0c9ecce

          src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024

          src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c

          Diff: https://reviews.apache.org/r/1479/diff

          Testing

          -------

          I am putting this up for initial review. I have tested the functionality in a pseudo distributed mode.

          Need to run unit tests.

          Thanks,

          Nileema

          Show
          jiraposter@reviews.apache.org added a comment - On 2011-08-18 23:15:10, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java, line 80 > < https://reviews.apache.org/r/1479/diff/1/?file=32327#file32327line80 > > > Javadoc does not agree w/ what is being actually returned. fixed this. On 2011-08-18 23:15:10, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java, line 1498 > < https://reviews.apache.org/r/1479/diff/1/?file=32328#file32328line1498 > > > Why are we waiting between unassigns? To lessen load on cluster? yes. On 2011-08-18 23:15:10, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/master/HMaster.java, line 1067 > < https://reviews.apache.org/r/1479/diff/1/?file=32330#file32330line1067 > > > Is there anything in here to stop someone disabling a table or preventing the load balancer moving regions while this is all going on? the problem with this is, if you run more than one alter concurrently, we do not have a way to restore the state of the loadbalancer. On 2011-08-18 23:15:10, Michael Stack wrote: > src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java, line 101 > < https://reviews.apache.org/r/1479/diff/1/?file=32332#file32332line101 > > > This looks like you are fixing a bug we have in hbase? No. this is required for the bulk reopen operation and impacts only reopen datastructures. Nileema ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1479/#review1535 ----------------------------------------------------------- On 2011-08-26 17:25:21, Nileema Shingte wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1479/ ----------------------------------------------------------- (Updated 2011-08-26 17:25:21) Review request for hbase, Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray. Summary ------- When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes: 1. Changes to reopen the regions when any of the above operations are performed. 2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it. 3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time. 4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second. 5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status. 6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated. 7. modification in the unit test for enabling alter without disabling the table. This addresses bug HBASE-1730 . https://issues.apache.org/jira/browse/HBASE-1730 Diffs ----- src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 5965919 src/main/ruby/shell.rb 1ec330f src/main/ruby/shell/commands/alter.rb 1dd43ad src/main/ruby/shell/commands/alter_async.rb PRE-CREATION src/main/ruby/shell/commands/alter_status.rb PRE-CREATION src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa src/main/ruby/hbase/admin.rb 4460d6e src/main/java/org/apache/hadoop/hbase/master/HMaster.java 9784195 src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837 src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java 6332953 src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java b5d65ce src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java 0c9ecce src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024 src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c Diff: https://reviews.apache.org/r/1479/diff Testing ------- I am putting this up for initial review. I have tested the functionality in a pseudo distributed mode. Need to run unit tests. Thanks, Nileema
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/1479/
          -----------------------------------------------------------

          (Updated 2011-08-26 17:25:21.197997)

          Review request for hbase, Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray.

          Summary
          -------

          When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes:

          1. Changes to reopen the regions when any of the above operations are performed.
          2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it.
          3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time.
          4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second.
          5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status.
          6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated.
          7. modification in the unit test for enabling alter without disabling the table.

          This addresses bug HBASE-1730.
          https://issues.apache.org/jira/browse/HBASE-1730

          Diffs


          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 5965919
          src/main/ruby/shell.rb 1ec330f
          src/main/ruby/shell/commands/alter.rb 1dd43ad
          src/main/ruby/shell/commands/alter_async.rb PRE-CREATION
          src/main/ruby/shell/commands/alter_status.rb PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa
          src/main/ruby/hbase/admin.rb 4460d6e
          src/main/java/org/apache/hadoop/hbase/master/HMaster.java 9784195
          src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837
          src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java 6332953
          src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java b5d65ce
          src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java 0c9ecce
          src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024
          src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c

          Diff: https://reviews.apache.org/r/1479/diff

          Testing
          -------

          I am putting this up for initial review. I have tested the functionality in a pseudo distributed mode.
          Need to run unit tests.

          Thanks,

          Nileema

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1479/ ----------------------------------------------------------- (Updated 2011-08-26 17:25:21.197997) Review request for hbase, Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray. Summary ------- When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes: 1. Changes to reopen the regions when any of the above operations are performed. 2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it. 3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time. 4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second. 5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status. 6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated. 7. modification in the unit test for enabling alter without disabling the table. This addresses bug HBASE-1730 . https://issues.apache.org/jira/browse/HBASE-1730 Diffs src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 5965919 src/main/ruby/shell.rb 1ec330f src/main/ruby/shell/commands/alter.rb 1dd43ad src/main/ruby/shell/commands/alter_async.rb PRE-CREATION src/main/ruby/shell/commands/alter_status.rb PRE-CREATION src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa src/main/ruby/hbase/admin.rb 4460d6e src/main/java/org/apache/hadoop/hbase/master/HMaster.java 9784195 src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837 src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java 6332953 src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java b5d65ce src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java 0c9ecce src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024 src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c Diff: https://reviews.apache.org/r/1479/diff Testing ------- I am putting this up for initial review. I have tested the functionality in a pseudo distributed mode. Need to run unit tests. Thanks, Nileema
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/1479/
          -----------------------------------------------------------

          (Updated 2011-08-26 17:21:22.585689)

          Review request for Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray.

          Changes
          -------

          1. Removed the changes to HBaseObjectWritable as Pair is already serializable
          2. fixed whitespaces
          3. added debug statements

          I have not disabled the balancer because the logic for restoring the state of the balancer breaks for multiple concurrent alter commands.

          Summary
          -------

          When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes:

          1. Changes to reopen the regions when any of the above operations are performed.
          2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it.
          3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time.
          4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second.
          5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status.
          6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated.
          7. modification in the unit test for enabling alter without disabling the table.

          This addresses bug HBASE-1730.
          https://issues.apache.org/jira/browse/HBASE-1730

          Diffs (updated)


          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 5965919
          src/main/ruby/shell.rb 1ec330f
          src/main/ruby/shell/commands/alter.rb 1dd43ad
          src/main/ruby/shell/commands/alter_async.rb PRE-CREATION
          src/main/ruby/shell/commands/alter_status.rb PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa
          src/main/ruby/hbase/admin.rb 4460d6e
          src/main/java/org/apache/hadoop/hbase/master/HMaster.java 9784195
          src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837
          src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java 6332953
          src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java b5d65ce
          src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java 0c9ecce
          src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024
          src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c

          Diff: https://reviews.apache.org/r/1479/diff

          Testing
          -------

          I am putting this up for initial review. I have tested the functionality in a pseudo distributed mode.
          Need to run unit tests.

          Thanks,

          Nileema

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1479/ ----------------------------------------------------------- (Updated 2011-08-26 17:21:22.585689) Review request for Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray. Changes ------- 1. Removed the changes to HBaseObjectWritable as Pair is already serializable 2. fixed whitespaces 3. added debug statements I have not disabled the balancer because the logic for restoring the state of the balancer breaks for multiple concurrent alter commands. Summary ------- When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes: 1. Changes to reopen the regions when any of the above operations are performed. 2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it. 3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time. 4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second. 5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status. 6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated. 7. modification in the unit test for enabling alter without disabling the table. This addresses bug HBASE-1730 . https://issues.apache.org/jira/browse/HBASE-1730 Diffs (updated) src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 5965919 src/main/ruby/shell.rb 1ec330f src/main/ruby/shell/commands/alter.rb 1dd43ad src/main/ruby/shell/commands/alter_async.rb PRE-CREATION src/main/ruby/shell/commands/alter_status.rb PRE-CREATION src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa src/main/ruby/hbase/admin.rb 4460d6e src/main/java/org/apache/hadoop/hbase/master/HMaster.java 9784195 src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837 src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java 6332953 src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java b5d65ce src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java 0c9ecce src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024 src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c Diff: https://reviews.apache.org/r/1479/diff Testing ------- I am putting this up for initial review. I have tested the functionality in a pseudo distributed mode. Need to run unit tests. Thanks, Nileema
          Hide
          stack added a comment -

          @Ted Would suggest you just point out error in the Nicolas comment rather than edit the comments of someone else. Its hard to grok why an edit (though in above you say why).

          @Subbu On 1., your soln. looks like it would be more 'scalable'. On 2., 1730, we learned yesterday, has been deployed in a production context and purportedly works (This true Nicolas?). On 3., IIRC, Karthik's reasoning was that with 1730 you can tell if it failed or not since master is keeping running tab on state (If failure – master dies or failed close/open of region – then manual intervention to disable/enable table was fine he reasoned). With 4213, the thought was that it was not as easy seeing whether operation succeeded or not.

          Another point bantered about was that 1730 has region granularity whereas 4213 goes by regionserver; it was thought that it'd be easier to figure the problem region in 1730 case.

          Show
          stack added a comment - @Ted Would suggest you just point out error in the Nicolas comment rather than edit the comments of someone else. Its hard to grok why an edit (though in above you say why). @Subbu On 1., your soln. looks like it would be more 'scalable'. On 2., 1730, we learned yesterday, has been deployed in a production context and purportedly works (This true Nicolas?). On 3., IIRC, Karthik's reasoning was that with 1730 you can tell if it failed or not since master is keeping running tab on state (If failure – master dies or failed close/open of region – then manual intervention to disable/enable table was fine he reasoned). With 4213, the thought was that it was not as easy seeing whether operation succeeded or not. Another point bantered about was that 1730 has region granularity whereas 4213 goes by regionserver; it was thought that it'd be easier to figure the problem region in 1730 case.
          Hide
          Subbu M Iyer added a comment -

          Hi Nicolas,

          Couple of call outs:

          1. 4213, just like 1730, also implements the online schema change as
          well. (but the pattern is slightly different, in that 4213 relies on
          the RS cloud to refresh the online regions applicable for the table
          with latest schema changes rather than master explicitly unassigning
          and reassigning all the regions of the table). When we alter a table
          with add/modify/delete CF the only thing that changes is the HTD for
          the table, and since the RS has access to this new version of HTD
          (after master completes the schema change) it can simply close/open a
          region for the changes to take effect. This was the fundamental
          assumption behind 4213's implementation.

          2. The testing and maturity for 4213 is TBD/Unknown, as I don't have
          the required resources (a real cluster setup) to perform the necessary
          acid tests to comment on that.

          3.  Karthik & other developers had a handful of design concerns about 4123:
          Could you please iterate those concerns so we can address them when we
          add the persistent capability to 1730?

          thanks
          Subbu

          On Tue, Aug 23, 2011 at 3:35 PM, Nicolas Spiegelberg (JIRA)

          Show
          Subbu M Iyer added a comment - Hi Nicolas, Couple of call outs: 1. 4213, just like 1730, also implements the online schema change as well. (but the pattern is slightly different, in that 4213 relies on the RS cloud to refresh the online regions applicable for the table with latest schema changes rather than master explicitly unassigning and reassigning all the regions of the table). When we alter a table with add/modify/delete CF the only thing that changes is the HTD for the table, and since the RS has access to this new version of HTD (after master completes the schema change) it can simply close/open a region for the changes to take effect. This was the fundamental assumption behind 4213's implementation. 2. The testing and maturity for 4213 is TBD/Unknown, as I don't have the required resources (a real cluster setup) to perform the necessary acid tests to comment on that. 3.  Karthik & other developers had a handful of design concerns about 4123: Could you please iterate those concerns so we can address them when we add the persistent capability to 1730? thanks Subbu On Tue, Aug 23, 2011 at 3:35 PM, Nicolas Spiegelberg (JIRA)
          Hide
          Ted Yu added a comment -

          @Nicholas:
          Pardon me for correcting the JIRA number.

          I am fine with 1730 going in at this moment.
          Subbu/Nileema can use 4213 to continue with master failover support.

          Please also publish the test cases for 1730: cluster size, table size, types of online schema changes (family addition/deletion, etc), values for knobs used, time taken for online schema changes.

          Show
          Ted Yu added a comment - @Nicholas: Pardon me for correcting the JIRA number. I am fine with 1730 going in at this moment. Subbu/Nileema can use 4213 to continue with master failover support. Please also publish the test cases for 1730: cluster size, table size, types of online schema changes (family addition/deletion, etc), values for knobs used, time taken for online schema changes.
          Hide
          Nicolas Spiegelberg added a comment - - edited

          Stack, Ted, Nileema, Karthik, and I discussed the HBASE-1730 & HBASE-4213 methodologies at the HBase Hackathon yesterday. Main Points:

          1. 1730 & 4213 are not the same feature. 1730 is implementing online schema changes, 4213 is adding persistence to online schema changes during master failover. We should think of the two JIRAs as disjoint in this way.
          2. 1730 meets it's looser requirements. Karthik & other developers had a handful of design concerns about 4213. We should make sure the design & failure analysis is solid before committing that feature.
          3. In the 4213/1730 overlap, there is some useful non-persistent functionality and design that should be incorporated into 1730 before commit.
          4. What is the testing/maturity of the 4213 patch? We have done extensive dev/cluster testing on 1730 internally & communicated this to the other committers. The persistence on 4213 requires even more edge case analysis testing than this feature to meet its requirements. None of us currently have any visibility into the 4213 testing & support. @Subbu: Could you please elaborate more in JIRA/IRC about the maturity of your patch?

          We would like to commit 1730 first since it is mature and iterate/solidify 4213 a little bit before committing.

          Show
          Nicolas Spiegelberg added a comment - - edited Stack, Ted, Nileema, Karthik, and I discussed the HBASE-1730 & HBASE-4213 methodologies at the HBase Hackathon yesterday. Main Points: 1. 1730 & 4213 are not the same feature. 1730 is implementing online schema changes, 4213 is adding persistence to online schema changes during master failover. We should think of the two JIRAs as disjoint in this way. 2. 1730 meets it's looser requirements. Karthik & other developers had a handful of design concerns about 4213. We should make sure the design & failure analysis is solid before committing that feature. 3. In the 4213/1730 overlap, there is some useful non-persistent functionality and design that should be incorporated into 1730 before commit. 4. What is the testing/maturity of the 4213 patch? We have done extensive dev/cluster testing on 1730 internally & communicated this to the other committers. The persistence on 4213 requires even more edge case analysis testing than this feature to meet its requirements. None of us currently have any visibility into the 4213 testing & support. @Subbu: Could you please elaborate more in JIRA/IRC about the maturity of your patch? We would like to commit 1730 first since it is mature and iterate/solidify 4213 a little bit before committing.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/1479/#review1535
          -----------------------------------------------------------

          I reviewed about half. Nice patch. Thank you.

          One feature that hbase-4213 has is that the state of the alter is tracked up in zk. 4213 doesn't do this but, going via zk would seem to make the transaction more robust against failure of master. What happens in this patch if master is shot in the head mid-way threw the alter? What you think Nileema?

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
          <https://reviews.apache.org/r/1479/#comment3489>

          White space.

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
          <https://reviews.apache.org/r/1479/#comment3490>

          Why catch exception to just rethrow it?

          src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
          <https://reviews.apache.org/r/1479/#comment3491>

          White space

          src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
          <https://reviews.apache.org/r/1479/#comment3492>

          You are adding white space.

          src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java
          <https://reviews.apache.org/r/1479/#comment3493>

          Javadoc does not agree w/ what is being actually returned.

          src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          <https://reviews.apache.org/r/1479/#comment3494>

          HSI is deprecated.

          src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          <https://reviews.apache.org/r/1479/#comment3497>

          White space.

          src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          <https://reviews.apache.org/r/1479/#comment3499>

          Why are we waiting between unassigns? To lessen load on cluster?

          src/main/java/org/apache/hadoop/hbase/master/HMaster.java
          <https://reviews.apache.org/r/1479/#comment3500>

          Is there anything in here to stop someone disabling a table or preventing the load balancer moving regions while this is all going on?

          src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java
          <https://reviews.apache.org/r/1479/#comment3501>

          This looks like you are fixing a bug we have in hbase?

          src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java
          <https://reviews.apache.org/r/1479/#comment3502>

          Missing curly braces

          • Michael

          On 2011-08-12 06:14:21, Nileema Shingte wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/1479/

          -----------------------------------------------------------

          (Updated 2011-08-12 06:14:21)

          Review request for Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray.

          Summary

          -------

          When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes:

          1. Changes to reopen the regions when any of the above operations are performed.

          2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it.

          3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time.

          4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second.

          5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status.

          6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated.

          7. modification in the unit test for enabling alter without disabling the table.

          This addresses bug HBASE-1730.

          https://issues.apache.org/jira/browse/HBASE-1730

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java f151c77

          src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 13c8b8c

          src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024

          src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c

          src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION

          src/main/java/org/apache/hadoop/hbase/master/HMaster.java 8beeb68

          src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 57c1140

          src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837

          src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa

          src/main/ruby/hbase/admin.rb 4460d6e

          src/main/ruby/shell.rb 1ec330f

          src/main/ruby/shell/commands/alter.rb 1dd43ad

          src/main/ruby/shell/commands/alter_async.rb PRE-CREATION

          src/main/ruby/shell/commands/alter_status.rb PRE-CREATION

          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java a32dc99

          Diff: https://reviews.apache.org/r/1479/diff

          Testing

          -------

          I am putting this up for initial review. I have tested the functionality in a pseudo distributed mode.

          Need to run unit tests.

          Thanks,

          Nileema

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1479/#review1535 ----------------------------------------------------------- I reviewed about half. Nice patch. Thank you. One feature that hbase-4213 has is that the state of the alter is tracked up in zk. 4213 doesn't do this but, going via zk would seem to make the transaction more robust against failure of master. What happens in this patch if master is shot in the head mid-way threw the alter? What you think Nileema? src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/1479/#comment3489 > White space. src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/1479/#comment3490 > Why catch exception to just rethrow it? src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java < https://reviews.apache.org/r/1479/#comment3491 > White space src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java < https://reviews.apache.org/r/1479/#comment3492 > You are adding white space. src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java < https://reviews.apache.org/r/1479/#comment3493 > Javadoc does not agree w/ what is being actually returned. src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java < https://reviews.apache.org/r/1479/#comment3494 > HSI is deprecated. src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java < https://reviews.apache.org/r/1479/#comment3497 > White space. src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java < https://reviews.apache.org/r/1479/#comment3499 > Why are we waiting between unassigns? To lessen load on cluster? src/main/java/org/apache/hadoop/hbase/master/HMaster.java < https://reviews.apache.org/r/1479/#comment3500 > Is there anything in here to stop someone disabling a table or preventing the load balancer moving regions while this is all going on? src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java < https://reviews.apache.org/r/1479/#comment3501 > This looks like you are fixing a bug we have in hbase? src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java < https://reviews.apache.org/r/1479/#comment3502 > Missing curly braces Michael On 2011-08-12 06:14:21, Nileema Shingte wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1479/ ----------------------------------------------------------- (Updated 2011-08-12 06:14:21) Review request for Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray. Summary ------- When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes: 1. Changes to reopen the regions when any of the above operations are performed. 2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it. 3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time. 4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second. 5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status. 6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated. 7. modification in the unit test for enabling alter without disabling the table. This addresses bug HBASE-1730 . https://issues.apache.org/jira/browse/HBASE-1730 Diffs ----- src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java f151c77 src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 13c8b8c src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024 src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/master/HMaster.java 8beeb68 src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 57c1140 src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837 src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa src/main/ruby/hbase/admin.rb 4460d6e src/main/ruby/shell.rb 1ec330f src/main/ruby/shell/commands/alter.rb 1dd43ad src/main/ruby/shell/commands/alter_async.rb PRE-CREATION src/main/ruby/shell/commands/alter_status.rb PRE-CREATION src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java a32dc99 Diff: https://reviews.apache.org/r/1479/diff Testing ------- I am putting this up for initial review. I have tested the functionality in a pseudo distributed mode. Need to run unit tests. Thanks, Nileema
          Hide
          Subbu M Iyer added a comment -

          In continuation of HBASE-451, I was working on a patch for this issue and just realized that there is already a patch submitted for this Jira. I created a related patch HBASE-4213 that follows a slightly different approach to the same problem and thought will any way submit my patch as well.

          Show
          Subbu M Iyer added a comment - In continuation of HBASE-451 , I was working on a patch for this issue and just realized that there is already a patch submitted for this Jira. I created a related patch HBASE-4213 that follows a slightly different approach to the same problem and thought will any way submit my patch as well.
          Hide
          Subbu M Iyer added a comment -

          Support instant schema updates with out master's intervention and bulk assign/unassign.

          Show
          Subbu M Iyer added a comment - Support instant schema updates with out master's intervention and bulk assign/unassign.
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/1479/#review1416
          -----------------------------------------------------------

          Please test this patch using a real cluster.

          src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
          <https://reviews.apache.org/r/1479/#comment3273>

          Pair implements Serializable which is handled specially by HbaseObjectWritable (see line 350 below). Why is this needed ?

          • Ted

          On 2011-08-12 06:14:21, Nileema Shingte wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/1479/

          -----------------------------------------------------------

          (Updated 2011-08-12 06:14:21)

          Review request for Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray.

          Summary

          -------

          When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes:

          1. Changes to reopen the regions when any of the above operations are performed.

          2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it.

          3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time.

          4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second.

          5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status.

          6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated.

          7. modification in the unit test for enabling alter without disabling the table.

          This addresses bug HBASE-1730.

          https://issues.apache.org/jira/browse/HBASE-1730

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java f151c77

          src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 13c8b8c

          src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024

          src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c

          src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION

          src/main/java/org/apache/hadoop/hbase/master/HMaster.java 8beeb68

          src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 57c1140

          src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837

          src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa

          src/main/ruby/hbase/admin.rb 4460d6e

          src/main/ruby/shell.rb 1ec330f

          src/main/ruby/shell/commands/alter.rb 1dd43ad

          src/main/ruby/shell/commands/alter_async.rb PRE-CREATION

          src/main/ruby/shell/commands/alter_status.rb PRE-CREATION

          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java a32dc99

          Diff: https://reviews.apache.org/r/1479/diff

          Testing

          -------

          I am putting this up for initial review. I have tested the functionality in a pseudo distributed mode.

          Need to run unit tests.

          Thanks,

          Nileema

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1479/#review1416 ----------------------------------------------------------- Please test this patch using a real cluster. src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java < https://reviews.apache.org/r/1479/#comment3273 > Pair implements Serializable which is handled specially by HbaseObjectWritable (see line 350 below). Why is this needed ? Ted On 2011-08-12 06:14:21, Nileema Shingte wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1479/ ----------------------------------------------------------- (Updated 2011-08-12 06:14:21) Review request for Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray. Summary ------- When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes: 1. Changes to reopen the regions when any of the above operations are performed. 2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it. 3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time. 4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second. 5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status. 6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated. 7. modification in the unit test for enabling alter without disabling the table. This addresses bug HBASE-1730 . https://issues.apache.org/jira/browse/HBASE-1730 Diffs ----- src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java f151c77 src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 13c8b8c src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024 src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/master/HMaster.java 8beeb68 src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 57c1140 src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837 src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa src/main/ruby/hbase/admin.rb 4460d6e src/main/ruby/shell.rb 1ec330f src/main/ruby/shell/commands/alter.rb 1dd43ad src/main/ruby/shell/commands/alter_async.rb PRE-CREATION src/main/ruby/shell/commands/alter_status.rb PRE-CREATION src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java a32dc99 Diff: https://reviews.apache.org/r/1479/diff Testing ------- I am putting this up for initial review. I have tested the functionality in a pseudo distributed mode. Need to run unit tests. Thanks, Nileema
          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/1479/
          -----------------------------------------------------------

          Review request for Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray.

          Summary
          -------

          When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes:

          1. Changes to reopen the regions when any of the above operations are performed.
          2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it.
          3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time.
          4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second.
          5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status.
          6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated.
          7. modification in the unit test for enabling alter without disabling the table.

          This addresses bug HBASE-1730.
          https://issues.apache.org/jira/browse/HBASE-1730

          Diffs


          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java f151c77
          src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 13c8b8c
          src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024
          src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c
          src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/master/HMaster.java 8beeb68
          src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 57c1140
          src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837
          src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa
          src/main/ruby/hbase/admin.rb 4460d6e
          src/main/ruby/shell.rb 1ec330f
          src/main/ruby/shell/commands/alter.rb 1dd43ad
          src/main/ruby/shell/commands/alter_async.rb PRE-CREATION
          src/main/ruby/shell/commands/alter_status.rb PRE-CREATION
          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java a32dc99

          Diff: https://reviews.apache.org/r/1479/diff

          Testing
          -------

          I am putting this up for initial review. I have tested the functionality in a pseudo distributed mode.
          Need to run unit tests.

          Thanks,

          Nileema

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1479/ ----------------------------------------------------------- Review request for Dhruba Borthakur, Ted Yu, Michael Stack, and Jonathan Gray. Summary ------- When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes: 1. Changes to reopen the regions when any of the above operations are performed. 2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it. 3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time. 4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second. 5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status. 6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated. 7. modification in the unit test for enabling alter without disabling the table. This addresses bug HBASE-1730 . https://issues.apache.org/jira/browse/HBASE-1730 Diffs src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java f151c77 src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 13c8b8c src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c0aa024 src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 49d1e7c src/main/java/org/apache/hadoop/hbase/master/BulkReOpen.java PRE-CREATION src/main/java/org/apache/hadoop/hbase/master/HMaster.java 8beeb68 src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 57c1140 src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java ae43837 src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java 09891aa src/main/ruby/hbase/admin.rb 4460d6e src/main/ruby/shell.rb 1ec330f src/main/ruby/shell/commands/alter.rb 1dd43ad src/main/ruby/shell/commands/alter_async.rb PRE-CREATION src/main/ruby/shell/commands/alter_status.rb PRE-CREATION src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java a32dc99 Diff: https://reviews.apache.org/r/1479/diff Testing ------- I am putting this up for initial review. I have tested the functionality in a pseudo distributed mode. Need to run unit tests. Thanks, Nileema
          Hide
          nileema shingte added a comment -

          Hi Ted,
          Thank you for reviewing the code!
          I have incorporated them and posted it on the review board here: https://reviews.apache.org/r/1479/

          Show
          nileema shingte added a comment - Hi Ted, Thank you for reviewing the code! I have incorporated them and posted it on the review board here: https://reviews.apache.org/r/1479/
          Hide
          Ted Yu added a comment -

          Please finish this code in BulkReOpen.java:

          +          } catch (InterruptedException e) {
          +            // TODO Auto-generated catch block
          +            e.printStackTrace();
          +          }
          

          The method getThreadCount() can be simplified to:

          +      return this.server.getConfiguration().getInt(
          +        "hbase.bulk.reopen.threadpool.size", defaultThreadCount);
          

          For TableEventHandler, why is the loop needed in reOpenAllRegions() ? BulkReOpen.waitUntilDone() always returns true for bulkReopen.bulkReOpen().

          Show
          Ted Yu added a comment - Please finish this code in BulkReOpen.java: + } catch (InterruptedException e) { + // TODO Auto-generated catch block + e.printStackTrace(); + } The method getThreadCount() can be simplified to: + return this .server.getConfiguration().getInt( + "hbase.bulk.reopen.threadpool.size" , defaultThreadCount); For TableEventHandler, why is the loop needed in reOpenAllRegions() ? BulkReOpen.waitUntilDone() always returns true for bulkReopen.bulkReOpen().
          Hide
          Ted Yu added a comment -

          Nice work. In the future, please use https://reviews.apache.org for code review.
          Is the following needed in HbaseObjectWritable.java where Serializable is handled already:

          +    addToMap(Pair.class, code++);
          

          Please remove log for debugging or change it to LOG.debug():

          +    LOG.info("======== in assignment manager");
          
          +    RegionState regionState = regionsInTransition.get(hri.getEncodedName());
          +    if (!regionsToReopen.isEmpty()) {
          +      regionsToReopen.remove(regionState.getRegion().getEncodedName());
          +    }  
               new ClosedRegionHandler(this.master, this, hri).process();
          

          Should regionsToReopen.remove() call be placed inside ClosedRegionHandler.process() (toward the end)?
          And why use regionState.getRegion().getEncodedName() where hri.getEncodedName() gives the same value ?

          +    int waitTime = this.master.getConfiguration().getInt(
          +        "hbase.bulk.reopen.waitafter.reopen", 0);
          

          Do we need to retrieve the wait time every time in this method ? A better name for the knob is hbase.bulk.waitbetween.reopen

          Show
          Ted Yu added a comment - Nice work. In the future, please use https://reviews.apache.org for code review. Is the following needed in HbaseObjectWritable.java where Serializable is handled already: + addToMap(Pair.class, code++); Please remove log for debugging or change it to LOG.debug(): + LOG.info( "======== in assignment manager" ); + RegionState regionState = regionsInTransition.get(hri.getEncodedName()); + if (!regionsToReopen.isEmpty()) { + regionsToReopen.remove(regionState.getRegion().getEncodedName()); + } new ClosedRegionHandler( this .master, this , hri).process(); Should regionsToReopen.remove() call be placed inside ClosedRegionHandler.process() (toward the end)? And why use regionState.getRegion().getEncodedName() where hri.getEncodedName() gives the same value ? + int waitTime = this .master.getConfiguration().getInt( + "hbase.bulk.reopen.waitafter.reopen" , 0); Do we need to retrieve the wait time every time in this method ? A better name for the knob is hbase.bulk.waitbetween.reopen
          Hide
          nileema shingte added a comment -

          The attached patch (HBASE-1730.patch) does the following:

          When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes:

          1. Changes to reopen the regions when any of the above operations are performed.
          2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it.
          3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time.
          4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second.
          5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status.
          6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated.
          7. modification in the unit test for enabling alter without disabling the table.

          Number of regions that are closed at a time (per region server) is configurable.

          Show
          nileema shingte added a comment - The attached patch ( HBASE-1730 .patch) does the following: When the master receives an alter table call (addColumn, modifyColumn, deleteColumn, modifyTable), it updates the .tableinfo and then closes all the regions of that table. The patch includes: 1. Changes to reopen the regions when any of the above operations are performed. 2. Best effort is made to preserve the locality of regions by assigning it a region plan before closing it. 3. Throttling logic that ensures that only a configurable number of regions are closed per region server at a time. 4. alter command in the hbase shell will block until all the regions are updated, providing a status "x/y regions updated" every second. 5. alter_async command that works exactly like alter, except that it does not block for completion or provide the status. 6. alter_status <table_name> which is a sync call and blocks to provide the "x/y regions updated" status per second until all regions are updated. 7. modification in the unit test for enabling alter without disabling the table. Number of regions that are closed at a time (per region server) is configurable.
          Hide
          nileema shingte added a comment -

          Here is a patch for Dhruba's proposal.

          Show
          nileema shingte added a comment - Here is a patch for Dhruba's proposal.
          Hide
          stack added a comment -

          For a first cut, yes. We can make it more smarty-pants later.

          Show
          stack added a comment - For a first cut, yes. We can make it more smarty-pants later.
          Hide
          dhruba borthakur added a comment -

          > I like Ryans' dictum that we store transitory info in zk only, not permanent info like table schemas.

          Completely agree. All schema info resides in .tableinfo and it will continue to reside there.

          My current proposal for this jira is to make the alter-table command go to the master to update the .tableinfo file. Then the master will trigger the normal close/open sequence for each region-id to all relevant regionservers. There will be some throttling built in so that all regions of the table do not encounter a close/open sequence at the same time.

          The master will not maintain persistant state on which region-ids has been closed/opened; if the master dies before all regionids were closed/opened, then an admin has to issue a disable/enable command to propage the schema change to all region servers.

          Does this sound reasonable?

          Show
          dhruba borthakur added a comment - > I like Ryans' dictum that we store transitory info in zk only, not permanent info like table schemas. Completely agree. All schema info resides in .tableinfo and it will continue to reside there. My current proposal for this jira is to make the alter-table command go to the master to update the .tableinfo file. Then the master will trigger the normal close/open sequence for each region-id to all relevant regionservers. There will be some throttling built in so that all regions of the table do not encounter a close/open sequence at the same time. The master will not maintain persistant state on which region-ids has been closed/opened; if the master dies before all regionids were closed/opened, then an admin has to issue a disable/enable command to propage the schema change to all region servers. Does this sound reasonable?
          Hide
          stack added a comment -

          Dhruba:

          You can ignore the attached patches. They and the design rotted. A few things I learned back then were that "complex" objects to json is not so easy; if class is made of anything more complicated than a base types and primitives then you need to write helper code for ser/deser. Because of this, I tried to dumb-down the schema classes making new versions of HTD and HCD. I didn't finish it.

          Regards the proposal, it will no longer work; i.e the bit about "updates .META.' because recently HTD was moved out of HRI (so .META. no longer holds table meta in each row). It is now to be found in a file named .tableinfo in the filesystem under each table directory.

          So, how this is implemented is pretty wide open. I like Ryans' dictum that we store transitory info in zk only, not permanent info like table schemas. I'd imagine then that a schema edit would change the .tableinfo content and then we'd signal regionservers to reread the .tableinfo (via zk?). What a RS did thereafter could be as dumb as close and reopen of all regions of the modified table or it could try and be smart about it and only do this if it really has to (otherwise, just update its inmemory copy of the .tableinfo with new settings).

          Show
          stack added a comment - Dhruba: You can ignore the attached patches. They and the design rotted. A few things I learned back then were that "complex" objects to json is not so easy; if class is made of anything more complicated than a base types and primitives then you need to write helper code for ser/deser. Because of this, I tried to dumb-down the schema classes making new versions of HTD and HCD. I didn't finish it. Regards the proposal, it will no longer work; i.e the bit about "updates .META.' because recently HTD was moved out of HRI (so .META. no longer holds table meta in each row). It is now to be found in a file named .tableinfo in the filesystem under each table directory. So, how this is implemented is pretty wide open. I like Ryans' dictum that we store transitory info in zk only, not permanent info like table schemas. I'd imagine then that a schema edit would change the .tableinfo content and then we'd signal regionservers to reread the .tableinfo (via zk?). What a RS did thereafter could be as dumb as close and reopen of all regions of the modified table or it could try and be smart about it and only do this if it really has to (otherwise, just update its inmemory copy of the .tableinfo with new settings).
          Hide
          dhruba borthakur added a comment -

          Can somebody pl explain the current proposal? Is it as follows?

          " The master receives a scheme-change command, updates META with the new schema and then issues a close/reopen (with some throttling) to all concerned regionservers. The master maintains current state of which region-servers have completed the close/reopen in a catalog file stored in HDFS".

          Show
          dhruba borthakur added a comment - Can somebody pl explain the current proposal? Is it as follows? " The master receives a scheme-change command, updates META with the new schema and then issues a close/reopen (with some throttling) to all concerned regionservers. The master maintains current state of which region-servers have completed the close/reopen in a catalog file stored in HDFS".
          Hide
          Ted Yu added a comment -

          This feature would save the trouble Eran experienced (HBASE-4060) because he wouldn't need to disable table in order to change TTL.

          Show
          Ted Yu added a comment - This feature would save the trouble Eran experienced ( HBASE-4060 ) because he wouldn't need to disable table in order to change TTL.
          Hide
          Andrew Purtell added a comment -

          Discussed today on IRC in context of 0.92:

          <St^Ack> apurtell: so, altering table online part of 0.92? done by may 1st?
          <St^Ack> apurtell: its pretty critical
          <St^Ack> not only for cps
          <apurtell> i would agree
          <apurtell> just the attrs right?
          <apurtell> we can overlay HTD and HCD attrs in ZK, update HRIs in meta write behind
          <St^Ack> apurtell: yes
          <apurtell> St^Ack: yes i think we need this, that would take care of the CP case (we can watch attr updates and trigger loads); having admin triggered online merge also highly desirable (after split threshold upward adjustment)

          <St^Ack> we can't put it up in zk as I used to think; dj_ryan points out that only transient data should be up in zk 'cos then we can just copy hdfs content to repro cluster elsewhere (otherwise have to copy out of zk too)
          <apurtell> i want to mirror them up in zk
          <apurtell> and change in zk first
          <apurtell> and trigger actions via watches on the attr znodes
          <apurtell> one action is write back to META of changes
          <apurtell> other action could be to load a CP
          <apurtell> other action would be to updated cached HRI or HRI-equivalent so we can e.g. change split threshold dynamically
          <apurtell> so what is in zk is indeed transient
          <apurtell> however we use it as a common point for online changes and use watches to replicate the changes cross cluster
          <apurtell> ... well technically to trigger changes cross cluster
          <St^Ack> apurtell: oh, there is hbase-1730
          <St^Ack> apurtell: stick your desire above into 1730... bit about mirroring

          Show
          Andrew Purtell added a comment - Discussed today on IRC in context of 0.92: <St^Ack> apurtell: so, altering table online part of 0.92? done by may 1st? <St^Ack> apurtell: its pretty critical <St^Ack> not only for cps <apurtell> i would agree <apurtell> just the attrs right? <apurtell> we can overlay HTD and HCD attrs in ZK, update HRIs in meta write behind <St^Ack> apurtell: yes <apurtell> St^Ack: yes i think we need this, that would take care of the CP case (we can watch attr updates and trigger loads); having admin triggered online merge also highly desirable (after split threshold upward adjustment) <St^Ack> we can't put it up in zk as I used to think; dj_ryan points out that only transient data should be up in zk 'cos then we can just copy hdfs content to repro cluster elsewhere (otherwise have to copy out of zk too) <apurtell> i want to mirror them up in zk <apurtell> and change in zk first <apurtell> and trigger actions via watches on the attr znodes <apurtell> one action is write back to META of changes <apurtell> other action could be to load a CP <apurtell> other action would be to updated cached HRI or HRI-equivalent so we can e.g. change split threshold dynamically <apurtell> so what is in zk is indeed transient <apurtell> however we use it as a common point for online changes and use watches to replicate the changes cross cluster <apurtell> ... well technically to trigger changes cross cluster <St^Ack> apurtell: oh, there is hbase-1730 <St^Ack> apurtell: stick your desire above into 1730... bit about mirroring
          Hide
          ryan rawson added a comment -

          this is because we dont have a lot of expertise in managing zk state. there are no dump tools, no restore tools, etc, etc. It would also make snapshotting a table harder, since there is now a secondary system to manage separately. By committing changes to WAL logs then having the regionservers execute those changes we have a better chance of doing snapshot & restore if, for example, HBASE-50 came back.

          Show
          ryan rawson added a comment - this is because we dont have a lot of expertise in managing zk state. there are no dump tools, no restore tools, etc, etc. It would also make snapshotting a table harder, since there is now a secondary system to manage separately. By committing changes to WAL logs then having the regionservers execute those changes we have a better chance of doing snapshot & restore if, for example, HBASE-50 came back.
          Hide
          stack added a comment -

          Ryan suggests that perhaps we not store the schema in ZK, but instead, in a schemas catalog table. We'd use zk though to notify all about schemas changes.

          Show
          stack added a comment - Ryan suggests that perhaps we not store the schema in ZK, but instead, in a schemas catalog table. We'd use zk though to notify all about schemas changes.
          Hide
          stack added a comment -
          Show
          stack added a comment - Design for this is up in http://wiki.apache.org/hadoop/Hbase/MasterRewrite
          Hide
          stack added a comment -

          Moved from 0.21 to 0.22 just after merge of old 0.20 branch into TRUNK.

          Show
          stack added a comment - Moved from 0.21 to 0.22 just after merge of old 0.20 branch into TRUNK.
          Hide
          stack added a comment -

          Here is latest state though a million miles from the finish. Doesn't all compile yet.

          + HCD becomes ColumnFamliyDefinition object.
          + HTD becomes TableDefinition
          + New TableState objection (offline, readonly, etc.)
          + HBaseAdmin now becomes nothing but moving stuff around in ZK, no need of a connection to cluster/master (Ryan noted that this means can do schema mods though hbase is not running). All that evening stuff run by the master goes away.
          + There are a bunch of additions to ZKWatcher but I think J-D has done the same ones to make replication work. I need to pick up his pattern of moving all of the master rewrite zk interactions to a class of their own rather than try and squeeze all into ZKWatcher.
          + Tests that play around with zk. New TableData object is made of a TableDefinition and a TableState. This is what we write out to a table znode.

          Show
          stack added a comment - Here is latest state though a million miles from the finish. Doesn't all compile yet. + HCD becomes ColumnFamliyDefinition object. + HTD becomes TableDefinition + New TableState objection (offline, readonly, etc.) + HBaseAdmin now becomes nothing but moving stuff around in ZK, no need of a connection to cluster/master (Ryan noted that this means can do schema mods though hbase is not running). All that evening stuff run by the master goes away. + There are a bunch of additions to ZKWatcher but I think J-D has done the same ones to make replication work. I need to pick up his pattern of moving all of the master rewrite zk interactions to a class of their own rather than try and squeeze all into ZKWatcher. + Tests that play around with zk. New TableData object is made of a TableDefinition and a TableState. This is what we write out to a table znode.
          Hide
          stack added a comment -

          Started a branch over in github for this work: http://github.com/saintstack/hbase/tree/1730

          Show
          stack added a comment - Started a branch over in github for this work: http://github.com/saintstack/hbase/tree/1730
          Hide
          Jean-Daniel Cryans added a comment -

          Looking good! I like how we get rid of all the doubled stuff because we had to maintain a string and a writable.

          Show
          Jean-Daniel Cryans added a comment - Looking good! I like how we get rid of all the doubled stuff because we had to maintain a string and a writable.
          Hide
          stack added a comment -

          More work on HCD. Removes UnmodifyableHCD. Since HCD now implements map, can just do a Collections.unmodifyableMap wrapper on delegate Map. Jackson is sweet in that for new HCD.toString, I ask jackson to serialize to a StringReader. It produces something like a ruby Map – e..g.

          {"NAME": "cf1", "IN_MEMORY": true, "COMPRESSION", "LZO"}

          – only I can then take this 'String' and deserialize to get me back the original HCD. Nice for shell, UI, and content of a znode (I don't think we need to get toBinaryString in there since I don't think we have binary data in schema and table state). Working on HTD next.

          Show
          stack added a comment - More work on HCD. Removes UnmodifyableHCD. Since HCD now implements map, can just do a Collections.unmodifyableMap wrapper on delegate Map. Jackson is sweet in that for new HCD.toString, I ask jackson to serialize to a StringReader. It produces something like a ruby Map – e..g. {"NAME": "cf1", "IN_MEMORY": true, "COMPRESSION", "LZO"} – only I can then take this 'String' and deserialize to get me back the original HCD. Nice for shell, UI, and content of a znode (I don't think we need to get toBinaryString in there since I don't think we have binary data in schema and table state). Working on HTD next.
          Hide
          stack added a comment -

          Started in on HCD. Added test for new serialization. Depends on jackson 1.0.1. Long way from finished.

          This work might have us move from ruby map representations to JSON in UI, shell and in znode. Will see. Shouldn't be prob. really since HRegionInfo will no longer be shown in shell as much since it will no longer be in .META. when this work is done.

          Show
          stack added a comment - Started in on HCD. Added test for new serialization. Depends on jackson 1.0.1. Long way from finished. This work might have us move from ruby map representations to JSON in UI, shell and in znode. Will see. Shouldn't be prob. really since HRegionInfo will no longer be shown in shell as much since it will no longer be in .META. when this work is done.
          Hide
          stack added a comment -

          Playing w/ Jackson, making it understand our HTD and HCD would take a bunch of ugly work. I'm thinking that we just make HTD and HCD be maps. Then Jackson can do it w/o customization. HTD and HCD don't need to be Writables anymore now they are to be serialized instead as JSON into znode.

          Show
          stack added a comment - Playing w/ Jackson, making it understand our HTD and HCD would take a bunch of ugly work. I'm thinking that we just make HTD and HCD be maps. Then Jackson can do it w/o customization. HTD and HCD don't need to be Writables anymore now they are to be serialized instead as JSON into znode.
          Hide
          stack added a comment -

          @apurtell nm... it seems like jackson is split in two in 1.0.1... if I have both jars in hbase/lib, then stargate tests all pass.

          Show
          stack added a comment - @apurtell nm... it seems like jackson is split in two in 1.0.1... if I have both jars in hbase/lib, then stargate tests all pass.
          Hide
          stack added a comment -

          @apurtell I tried putting jackson 1.0.1 up in hbase/lib dir but doesn't look like stargate picks it up from there (though looking at stargate build.xml, it should – maybe it dont like jackson-1.0.1).

          Show
          stack added a comment - @apurtell I tried putting jackson 1.0.1 up in hbase/lib dir but doesn't look like stargate picks it up from there (though looking at stargate build.xml, it should – maybe it dont like jackson-1.0.1).
          Hide
          stack added a comment -

          For sure @ryan. Thats what I'm about: schema and state changes that are near-instantaneous.

          Show
          stack added a comment - For sure @ryan. Thats what I'm about: schema and state changes that are near-instantaneous.
          Hide
          stack added a comment -

          Going to serialize HTD and HCD to zk as JSON; this seems easiest way to go for schema. Will remove notion of state change from HTD and HCD though. For instance if a region is offline, you won't know by reading the .META.; master will know because will have read state from zk.

          @apurtell I should use jackson for pojo 2 json because thats what you and avro are using. One concern is versions. Jackson is up to 1.2. avro is jackson-core-asl-1.0.1.jar. Stargate is 0.94. Let me start with the version in stargate. OK I pull it up to trunk/lib? You can get it when its there? Can I update to 1.0.1?

          Show
          stack added a comment - Going to serialize HTD and HCD to zk as JSON; this seems easiest way to go for schema. Will remove notion of state change from HTD and HCD though. For instance if a region is offline, you won't know by reading the .META.; master will know because will have read state from zk. @apurtell I should use jackson for pojo 2 json because thats what you and avro are using. One concern is versions. Jackson is up to 1.2. avro is jackson-core-asl-1.0.1.jar. Stargate is 0.94. Let me start with the version in stargate. OK I pull it up to trunk/lib? You can get it when its there? Can I update to 1.0.1?
          Hide
          ryan rawson added a comment -

          while we are thinking about this, i'd like to be able to do:

          • add new CF
          • remove existing CF
          • change compression
          • change other things

          without taking a table offline. that would be some killer features right there!

          Show
          ryan rawson added a comment - while we are thinking about this, i'd like to be able to do: add new CF remove existing CF change compression change other things without taking a table offline. that would be some killer features right there!
          Hide
          stack added a comment -

          Assigned myself and added online state changes to title.

          Show
          stack added a comment - Assigned myself and added online state changes to title.
          Hide
          Andrew Purtell added a comment -

          @Jim: If AVRO is JSON based, then why not just use JSON? Pulls in less external dependencies...

          Show
          Andrew Purtell added a comment - @Jim: If AVRO is JSON based, then why not just use JSON? Pulls in less external dependencies...
          Hide
          Jim Kellerman added a comment -

          +1 for putting descriptions both on disk for cold start and in ZK for on-line.

          As for format, how about Avro (which is json based)?

          Show
          Jim Kellerman added a comment - +1 for putting descriptions both on disk for cold start and in ZK for on-line. As for format, how about Avro (which is json based)?
          Hide
          Andrew Purtell added a comment -

          +1 on all of Ryan's points.

          Show
          Andrew Purtell added a comment - +1 on all of Ryan's points.
          Hide
          ryan rawson added a comment -

          lets decide on a serialization format to use within ZK, that is NOT hadoop writables. I would vote for JSON since the data is small, and it would be nice for it to be human readable without programmatic intervention.

          Show
          ryan rawson added a comment - lets decide on a serialization format to use within ZK, that is NOT hadoop writables. I would vote for JSON since the data is small, and it would be nice for it to be human readable without programmatic intervention.

            People

            • Assignee:
              Nicolas Spiegelberg
              Reporter:
              Andrew Purtell
            • Votes:
              0 Vote for this issue
              Watchers:
              15 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development