HBase / HBASE-42

Set region split size on table creation

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.2.0
    • Component/s: None
    • Labels:
      None

      Description

      Right now the region size that triggers a split is determined by a single global configuration parameter. It would be nice to configure this per table, independently of the global parameter.

          Activity

          stack added a comment -

          Looks like HBASE-62 includes this patch. Resolving. Let's reopen if I got it wrong.

          stack added a comment -

          One other thought Andrew:

          You might want to use VersionedWritable instead of adding your own version to HTableDescriptor.
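
          A minimal sketch of the idea behind this suggestion: write a version byte ahead of the payload and check it on read, which is the general pattern VersionedWritable provides. To stay self-contained it uses plain java.io rather than Hadoop, and VersionedDescriptor with its name and splitSize fields is a hypothetical stand-in, not the real HTableDescriptor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a table descriptor; NOT the real HTableDescriptor.
class VersionedDescriptor {
    // Bump this whenever the serialized layout changes.
    static final byte VERSION = 1;

    final String name;
    final long splitSize;

    VersionedDescriptor(String name, long splitSize) {
        this.name = name;
        this.splitSize = splitSize;
    }

    // The version byte goes first so a reader can check it before parsing.
    void write(DataOutputStream out) throws IOException {
        out.writeByte(VERSION);
        out.writeUTF(name);
        out.writeLong(splitSize);
    }

    // Rejects data written with a different layout version.
    static VersionedDescriptor read(DataInputStream in) throws IOException {
        byte found = in.readByte();
        if (found != VERSION) {
            throw new IOException("descriptor version mismatch: expected "
                    + VERSION + ", found " + found);
        }
        String name = in.readUTF();
        long splitSize = in.readLong();
        return new VersionedDescriptor(name, splitSize);
    }

    byte[] toBytes() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            write(out);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static VersionedDescriptor fromBytes(byte[] data) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class VersionedDescriptorDemo {
    public static void main(String[] args) {
        byte[] bytes = new VersionedDescriptor("t1", 64L * 1024 * 1024).toBytes();
        VersionedDescriptor copy = VersionedDescriptor.fromBytes(bytes);
        System.out.println(copy.name + " " + copy.splitSize);
    }
}
```

          A migration script would then only need to recognise the old layout, which has no leading version byte, and rewrite it once.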

          stack added a comment -

          Hey Andrew. Thanks for the patch. Looks good to me. Didn't try applying it.

          Let us know if you need any facility writing migration scripts (or help); migration scripts are not pretty.

          I'm fine w/ the single point for all table meta changes. Would make the API less brittle. If we're going to make the change, let's do it now rather than later. 0.2 already breaks backward compatibility in the API. We're operating under the delusion that we can make all API changes in the one release.

          Andrew Purtell added a comment -

          Preliminary patch attached. (Sorry for the delay, just back from Asia...)

          Todo:

          • Update migration script to rewrite HTableDescriptor with version tag.
          • Test case.
          • Consider extending "modifyTableMeta" to be a single point for changing all HTable metadata, with corresponding change to HMasterInterface to remove addColumn, deleteColumn, and similar. This wouldn't make much difference on the server side but on the client side having a single RPC through which all table metadata is changed would make it easier to move at some later time to Zookeeper or whatever is chosen. Probably the easiest thing to do is something like the following:
            1. Client disables table.
            2. Client uses RPC to retrieve a lease and a serialized HTableDescriptor from the HMaster, which is instantiated. If another lease is outstanding, client must retry.
            3. Client manipulates the HTableDescriptor via its methods.
            4. Client uses RPC to send the updated HTableDescriptor to the HMaster.
            5. Master records the updated HTableDescriptor.
            6. Client enables table.
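
          The six steps above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: TableMeta, MetaMaster, and all of their methods are hypothetical names that stand in for HTableDescriptor and the HMaster RPCs, and no real HBase API is used.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for HTableDescriptor.
class TableMeta {
    final String name;
    long splitSize;

    TableMeta(String name, long splitSize) {
        this.name = name;
        this.splitSize = splitSize;
    }

    TableMeta copy() {
        return new TableMeta(name, splitSize);
    }
}

// Hypothetical stand-in for the master side of the proposed protocol.
class MetaMaster {
    private final Map<String, TableMeta> tables = new HashMap<>();
    private final Set<String> disabled = new HashSet<>();
    private final Map<String, Long> leases = new HashMap<>();
    private long nextLease = 1;

    synchronized void createTable(TableMeta meta) {
        tables.put(meta.name, meta.copy());
    }

    synchronized void disable(String table) { disabled.add(table); }      // step 1
    synchronized void enable(String table) { disabled.remove(table); }    // step 6

    // Step 2a: returns a lease id, or -1 if another lease is outstanding
    // (the client is then expected to retry later).
    synchronized long acquireLease(String table) {
        if (!disabled.contains(table)) {
            throw new IllegalStateException("table must be disabled first");
        }
        if (leases.containsKey(table)) {
            return -1;
        }
        long id = nextLease++;
        leases.put(table, id);
        return id;
    }

    // Step 2b: hand the client its own copy of the descriptor.
    synchronized TableMeta fetch(String table) {
        return tables.get(table).copy();
    }

    // Steps 4 and 5: accept the update only from the lease holder, then release.
    synchronized void commit(String table, long leaseId, TableMeta updated) {
        Long held = leases.get(table);
        if (held == null || held != leaseId) {
            throw new IllegalStateException("lease not held for " + table);
        }
        tables.put(table, updated.copy());
        leases.remove(table);
    }
}

class MetaUpdateDemo {
    public static void main(String[] args) {
        MetaMaster master = new MetaMaster();
        master.createTable(new TableMeta("t1", 256L));
        master.disable("t1");                       // step 1
        long lease = master.acquireLease("t1");     // step 2
        TableMeta meta = master.fetch("t1");
        meta.splitSize = 512L;                      // step 3
        master.commit("t1", lease, meta);           // steps 4 and 5
        master.enable("t1");                        // step 6
        System.out.println(master.fetch("t1").splitSize);
    }
}
```

          The sketch leaves out lease expiry; a real implementation would need it so that a client that crashes while holding the lease cannot block later updates.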
          stack added a comment -

          Assigning Andrew (after adding him as contributor)

          Bryan Duxbury added a comment -

          This one is for Andrew Purtell. Jim or Stack, make Andrew assignable for issues please.

          Bryan Duxbury added a comment -

          We should be able to do this in 0.17.

          Todo:

          • Add split size to HTableDescriptor
          • Add new CREATE TABLE option for region split size (CREATE TABLE ... SPLIT_SIZE=NN)
          • Splitter/compactor should take this alternate size into account
          • Split size specified in hbase-site.xml or hbase-default.xml should be the default split_size value, but otherwise not used at runtime

          Also, this will require an update to the migration script.
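
          A rough sketch of how the todo items above could fit together: the configured value is consulted only when a table is created, and the split check afterwards reads nothing but the table's own descriptor. SplitSizeDescriptor, TableFactory, and SplitChecker are hypothetical names for illustration, not classes from the patch.

```java
// Hypothetical descriptor carrying its own split size.
class SplitSizeDescriptor {
    final String name;
    final long splitSize; // bytes

    SplitSizeDescriptor(String name, long splitSize) {
        this.name = name;
        this.splitSize = splitSize;
    }
}

// Hypothetical factory; defaultSplitSize stands in for the value that would be
// read from hbase-site.xml or hbase-default.xml.
class TableFactory {
    private final long defaultSplitSize;

    TableFactory(long defaultSplitSize) {
        this.defaultSplitSize = defaultSplitSize;
    }

    // requestedSplitSize is null when CREATE TABLE gave no SPLIT_SIZE option.
    SplitSizeDescriptor create(String name, Long requestedSplitSize) {
        long size = (requestedSplitSize != null) ? requestedSplitSize : defaultSplitSize;
        if (size <= 0) {
            throw new IllegalArgumentException("split size must be positive");
        }
        return new SplitSizeDescriptor(name, size);
    }
}

// The splitter consults only the descriptor, never the global configuration.
class SplitChecker {
    static boolean shouldSplit(SplitSizeDescriptor table, long regionBytes) {
        return regionBytes > table.splitSize;
    }
}

class SplitSizeDemo {
    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        TableFactory factory = new TableFactory(256 * mb);
        SplitSizeDescriptor small = factory.create("small", 64 * mb);
        SplitSizeDescriptor dflt = factory.create("dflt", null);
        System.out.println(SplitChecker.shouldSplit(small, 100 * mb));
        System.out.println(SplitChecker.shouldSplit(dflt, 100 * mb));
    }
}
```

          Copying the default into the descriptor at creation time means a later change to the configuration file does not silently alter existing tables, which matches the "otherwise not used at runtime" item.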

          Billy Pearson added a comment - - edited

          I would like to see this option added too. Table level would be fine for me.

          Jim Kellerman added a comment -

          The finest level of granularity for this parameter would be the table level, since a region split affects all the columns in a particular row range.


            People

            • Assignee: Andrew Purtell
            • Reporter: Paul Saab
            • Votes: 1
            • Watchers: 0
