HBase / HBASE-2743

Script to drop N regions from a table and then patch the hole by inserting a new hole-spanning region into meta.

    Details

    • Type: Task
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      Script to help out our mozilla buddies.

        Activity

        stack added a comment -

        Script that offlines regions on the first run and then, on a subsequent run, closes and deletes the regions, finishing by adding a hole-plugging region to meta. Needs more testing.

        stack added a comment -

        After testing, the problem is better addressed with two scripts: one to do the offlining, close, and delete, and another to plug the hole.

        stack added a comment -

        This version records the changes it makes in a designated table so we can undo them at a later date.

        stack added a comment -

        Here is how the scripts work:

        Run excise_regions.rb as follows:

        ./bin/hbase org.jruby.Main bin/excise_regions.rb TestTable 0000107136 0104707253 archive
        

        The keys provided must be actual region start keys that are present in the table. 'archive' is the name of the table into which we archive what we've done. It must exist for this script to work, and it must have a column family named 'info'. Do:

        hbase> create 'archive', 'info'

        ... to create the table.

        Run the above multiple times until there are no more 'Offlined=' and 'Closed and delete=' messages.

        You may get an NPE from time to time. That's OK. The master is asked to run parts of this job, and when it goes to execute, the info it needs may not be present in .META. (the master needs a patch so it does not NPE).

        When there are no more messages, all regions in the passed key range will have been removed from .META. (and closed out on the regionservers).

        The regions that were removed from .META. will be listed in the passed archive table ('archive' in the cmdline above).
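
        The re-run step above could be automated with a small wrapper. This is only a sketch: run_until_quiet and its MAX_RUNS cap are hypothetical names that are not part of the scripts, and it assumes an NPE shows up in the output as the text 'NullPointerException'. A run that hit an NPE is treated as "not finished yet" so the loop does not stop early.

```shell
# Hypothetical helper, not part of excise_regions.rb.
# Usage: run_until_quiet MAX_RUNS command [args...]
run_until_quiet() {
  max_runs=$1
  shift
  run=1
  while [ "$run" -le "$max_runs" ]; do
    # Capture stdout and stderr. The exit status is ignored on purpose,
    # because an occasional NPE is expected and is not a reason to stop.
    output=$("$@" 2>&1) || true
    printf '%s\n' "$output"
    # Assumption: a run that still made progress, or that hit an NPE,
    # prints at least one of these strings.
    if ! printf '%s\n' "$output" |
        grep -q -e 'Offlined=' -e 'Closed and delete=' -e 'NullPointerException'; then
      echo "Nothing left to do after $run run(s)."
      return 0
    fi
    run=$((run + 1))
  done
  echo "Still seeing messages after $max_runs run(s); giving up." >&2
  return 1
}
```

        It would be called with the same command line as above, for example: run_until_quiet 20 ./bin/hbase org.jruby.Main bin/excise_regions.rb TestTable 0000107136 0104707253 archive (the cap of 20 is arbitrary).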

        You will now have a hole in your table.

        To plug the hole, run the following:

        ./bin/hbase org.jruby.Main bin/plug_hole.rb TestTable 0000107136 0104707253
        

        This adds a region that spans the passed keys.

        stack added a comment -

        I put the scripts here instead: http://github.com/saintstack/hbase_bin_scripts The latest versions have better documentation in their header comments.

        Daniel Einspanjer added a comment -

        I just used this script today to relieve burden on our cluster from too many regions. Details of the operation with some before/after charts are here:
        https://bugzilla.mozilla.org/show_bug.cgi?id=574998

        stack added a comment -

        @Daniel It looks like it worked then? 41k to 6k regions?

        Daniel Einspanjer added a comment -

        Yes. If we've generated 6k regions since 2010-06-10, that's around 200 per day. That means it would take us approximately 200 days of the same traffic shape to hit 40000 again. We might want to try to tune that number a little more, but it doesn't seem terrible where it is.
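
        As a rough check of the estimate above, the arithmetic can be written out. This sketch assumes growth stays at the quoted 200 regions per day; days_to_reach is a hypothetical helper name. Counting from an empty table it reproduces the 200 days quoted, and counting from the current 6k regions it gives a somewhat shorter figure.

```shell
# Hypothetical helper: whole days until TARGET regions at a constant rate.
# Usage: days_to_reach CURRENT TARGET PER_DAY
days_to_reach() {
  current=$1
  target=$2
  per_day=$3
  echo $(( (target - current) / per_day ))
}

days_to_reach 0 40000 200      # prints 200
days_to_reach 6000 40000 200   # prints 170
```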

        stack added a comment -

        Reopening to mark as noob. This comes up from time to time. Someone will do it.

        Esteban Gutierrez added a comment -

        What about having something like snapshot 'region' and then drop 'region' in the hbase shell, for consistency with other commands?

        Esteban Gutierrez added a comment -

        It could work in ranges also by specifying snapshot 'start_region','end_region' and drop 'start_region','end_region'.

        stack added a comment -

        I like the idea of dropping a range. Why would we snapshot a range?

        Esteban Gutierrez added a comment -

        You could snapshot the whole table if you want, but if you are trimming the table, I think a snapshot of just the range might be handy in case you change your mind and want to reload the data, for example if the operator chose the wrong range initially.


          People

          • Assignee:
            Unassigned
            Reporter:
            stack
          • Votes:
            1
            Watchers:
            3
