HBASE-2600

Change how we name regions in the meta tables: from tablename+STARTROW+randomid to tablename+ENDROW+randomid

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      This is an idea that Ryan and I have been kicking around on and off for a while now.

      If region names were made of tablename+endrow instead of tablename+startrow, then to find the region that contains a wanted row in the meta tables we'd just have to open a scanner at the passed row; the first row found by the scan would be that of the region we need (if it is an offlined parent, we'd have to scan on to the next row).

      If we redid the meta tables in this format, we'd be using an access pattern that is natural to hbase, a scan, as opposed to the perverse, expensive getClosestRowBefore we currently have, which has to walk backward in meta to find the containing region.

      This issue is about changing the way we name regions.
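The difference between the two lookups can be sketched with a sorted map standing in for the meta table. This is a toy model, not HBase code: the class and method names are made up for illustration, keys are plain "table,row" strings without the randomid component, and the last region of the table (empty end row) is left out. It assumes end rows are exclusive, so a row equal to a region's end row belongs to the next region.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: a TreeMap stands in for the meta table.
final class MetaLookupSketch {

    // Start-row naming: the containing region is the closest key at or BEFORE
    // the wanted row, which is the backward walk getClosestRowBefore performs.
    static String lookupByStartRow(TreeMap<String, String> meta, String table, String row) {
        Map.Entry<String, String> e = meta.floorEntry(table + "," + row);
        return e == null ? null : e.getValue();
    }

    // End-row naming: the containing region is the first key strictly AFTER
    // the wanted row, i.e. the first row a forward scan would return.
    static String lookupByEndRow(TreeMap<String, String> meta, String table, String row) {
        Map.Entry<String, String> e = meta.higherEntry(table + "," + row);
        return e == null ? null : e.getValue();
    }

    // Regions A = ['', 'g') and B = ['g', 'p'), keyed by start row.
    static TreeMap<String, String> sampleByStartRow() {
        TreeMap<String, String> m = new TreeMap<>();
        m.put("t,", "A");
        m.put("t,g", "B");
        return m;
    }

    // The same two regions, keyed by end row.
    static TreeMap<String, String> sampleByEndRow() {
        TreeMap<String, String> m = new TreeMap<>();
        m.put("t,g", "A");
        m.put("t,p", "B");
        return m;
    }

    public static void main(String[] args) {
        System.out.println(lookupByStartRow(sampleByStartRow(), "t", "k"));
        System.out.println(lookupByEndRow(sampleByEndRow(), "t", "k"));
    }
}
```

Both lookups resolve row "k" to region B; only the second one is a plain forward seek.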

      If we were using scans, prewarming the client cache would be nearly free (as opposed to what we currently have to do, which is first a getClosestRowBefore and then a scan forward from the closest row before).

      Converting to the new method, we'd have to run a migration on startup that changes the content of meta.

      Up to now, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains the actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because, as is, the region name in the fs is a hash of the regionname; changing the format of the regionname would mean we generate a different hash, so we'd need HBASE-2531 to be in place before we could do this change.

        Issue Links

          Activity

          Lars Hofhansl added a comment -

          I ran into this partially when I worked on HBASE-4071.

          One thing I noticed is that getRowOrBefore(...) is actually exposed via HTableInterface. Do we really want this? If we ever want to get rid of this, should we at least deprecate this method in HTableInterface now before more - if any - folks start using it?

          Also a question regarding HBASE-2531: can we rely on the fact that all tables (except -ROOT- and .META.) have the encodedRegionName? Looking at the patch attached there, it seems that regions of old tables (and ROOT and META, but that's OK) could still have the old names.

          Lars Hofhansl added a comment -

          What I meant to say with the previous comment:
          The comment on Store.getRowKeyAtOrBefore lists a bunch of requirements that are true for the meta tables but might not hold in general. Somebody using HTableInterface will probably not understand that, and hence it might be better not to make this method publicly available at all.

          I realize this is not primarily what this issue is about, but it would be nice to rid ourselves of getClosestRowBefore altogether, maybe starting with removing it from the public interface and then fixing this issue.

          stack added a comment -

          +1 on deprecating getRowOrBefore (if we think it's on its way out). And yes, I think this method only 'works' going against .META.

          Also a question regarding HBASE-2531: can we rely on the fact that all tables (except -ROOT- and .META.) have the encodedRegionName?

          Yes. Only regions created by an hbase older than 0.90 AND that have not been compacted will have old-style names.

          Lars Hofhansl added a comment -

          One thing that is harder now is how to deal with tables with only one region. The endKey will be '', and hence it sorts before all keys we might be looking for. There is no key that reliably sorts after all other keys.
          So if a region cannot be found for a key there has to be a 2nd lookup to check if a row with '' endKey exists (and if we ever split .META. the regionName with the '' endKey might even be served by a different region server).
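A minimal sketch of that two-step lookup, again with a TreeMap standing in for meta and made-up names (`EmptyEndKeySketch`, `locate`); keys are simplified to "table,endKey" with no randomid, and end keys are treated as exclusive. It only illustrates why the second probe is needed, not how the client would actually do it.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: meta keyed by "table,endKey"; the last (or only) region of
// a table has end key '' and therefore the bare key "table,".
final class EmptyEndKeySketch {

    static String locate(TreeMap<String, String> meta, String table, String row) {
        String prefix = table + ",";
        // First lookup: forward seek to the first end key after the row.
        Map.Entry<String, String> e = meta.higherEntry(prefix + row);
        if (e != null && e.getKey().startsWith(prefix)) {
            return e.getValue();
        }
        // Second lookup: nothing found for this table, so probe for the
        // entry with the empty end key, which sorts before all other rows.
        return meta.get(prefix);
    }

    static TreeMap<String, String> sample() {
        TreeMap<String, String> m = new TreeMap<>();
        m.put("single,", "S0"); // table with exactly one region
        m.put("t,g", "A");      // ['', 'g')
        m.put("t,p", "B");      // ['g', 'p')
        m.put("t,", "C");       // ['p', ''), the last region of table t
        return m;
    }

    public static void main(String[] args) {
        TreeMap<String, String> meta = sample();
        System.out.println(locate(meta, "single", "anything"));
        System.out.println(locate(meta, "t", "k"));
        System.out.println(locate(meta, "t", "z"));
    }
}
```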

          stack added a comment -

          IIRC, we already special-case the '' start/end key. Your concern seems like a valid one, but I'd say that when implementation time comes, we can deal with it in a line or two of checks in a strategic location.

          Lars Hofhansl added a comment -

          I have something basic working (i.e. it does the conversion on startup and I can still do scans and gets on my various existing tables, unless the table has only one region, that is). The change is in no condition to be attached here, though.

          The startup conversion is idempotent (if the master dies during the conversion it can just be restarted), and in the end it writes a marker into the ROOT table.

          To solve the problem cited above I'll just add a 2nd scan for now.

          Todd Lipcon added a comment -

          It's a hack, but we could change the format of the rows in META to be <table_name>\x00<region end row>, and then have the special value <table_name>\x01 be used for the last region? Or something of that sort.

          stack added a comment -

          More hacks. I think table name should be fixed size. Hash or code or something. Would make the compare less of a mess.

          Todd Lipcon added a comment -

          Do we need fixed size, or just a reasonable escaping scheme? Maybe we can use something like consistent overhead byte stuffing: http://en.wikipedia.org/wiki/Consistent_Overhead_Byte_Stuffing

          Lars Hofhansl added a comment -

          I was thinking along that line too (I thought about replacing the first , with a ; in the key for the last region); then I convinced myself that it wouldn't work.

          But I think you are right, it would work.

          Lars Hofhansl added a comment -

          Ah... Comment crossing again.

          Stack and I had been discussing some kind of escaping, because the Meta comparator is the only reason why we pass around comparators everywhere.
          We'd have to guarantee that the escaped byte sequences still sort correctly.

          Todd Lipcon added a comment -

          Yea, I don't think COBS is even necessary... this seems like it would work:

          Region encoding is:
          <table_name>\x00<end key>
          The last region of a table is:
          <table_name>\x01

          If a table name or key has a \x00, \x01, or \x02 in it, it is represented as \x02\x00, \x02\x01, or \x02\x02, respectively.

          This should preserve ordering, right?
          foo\x00bar < foo\x01 and foo\x02\x00bar < foo\x02\x01
          foo\x02bar > foo\x02 and foo\x02\x02bar > foo\x02\x02
          foo\x00bar < foo\x03 and foo\x02\x00bar < foo\x03
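The ordering claim can be checked with a small sketch of the scheme. Everything here is an assumption for illustration: the class and method names are invented, the randomid component is omitted, and the comparison is a plain unsigned byte-wise compare rather than any existing HBase comparator.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Sketch of the proposed encoding: <table>\x00<end key> for ordinary regions,
// <table>\x01 for the last region, with 0x00..0x02 escaped by a 0x02 prefix.
final class MetaKeyEscapeSketch {

    // Escape 0x00, 0x01 and 0x02 as 0x02 followed by the original byte.
    static byte[] escape(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : raw) {
            if (b >= 0 && b <= 2) {
                out.write(2);
            }
            out.write(b);
        }
        return out.toByteArray();
    }

    static byte[] metaKey(byte[] table, byte[] endKey, boolean lastRegion) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] t = escape(table);
        out.write(t, 0, t.length);
        if (lastRegion) {
            out.write(1); // sorts after every <table>\x00... key
        } else {
            out.write(0);
            byte[] k = escape(endKey);
            out.write(k, 0, k.length);
        }
        return out.toByteArray();
    }

    // Unsigned lexicographic byte comparison, with no special casing.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] foo = "foo".getBytes(StandardCharsets.UTF_8);
        byte[] bar = "bar".getBytes(StandardCharsets.UTF_8);
        byte[] ordinary = metaKey(foo, bar, false);
        byte[] last = metaKey(foo, new byte[0], true);
        System.out.println(compare(ordinary, last) < 0);
    }
}
```

With this encoding the last region of a table sorts after all of that table's other regions and before the first row of any longer table name such as foo1.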

          Joe Pallas added a comment -

          How would a table name have 0x00 in it? HTableDescriptor says it will throw IllegalArgumentException "if passed a table name that is made of other than 'word' characters or underscores: i.e. [a-zA-Z_0-9]."

          Todd Lipcon added a comment -

          Oh, I forgot that we restrict table names...

          In that case we could just prefix all row keys with \x00, except for the special "end of region" which would be \x01?

          Lars Hofhansl added a comment -

          We can probably leave the format closer to what it is now by keeping , as separator (instead of 0x00), and using some other character - maybe ; - for the last region (neither , nor ; are allowed in table names). Still need to fix up MetaKeyComparator in either case.

          This is not for 0.92.

          Lars Hofhansl added a comment -

          This also has implications for asynchbase. I just looked through the code: the RegionInfo there only stores the stopKey because the startKey is implicit in the name. That will no longer be the case.

          Alex Newman added a comment -

          So I think the issue with ; is that it comes after [0-9]. What about using ! or something like that?

          Lars Hofhansl added a comment -

          I think it does not matter as long as it comes after ','.
          The current format is tableName','...; to enforce that the region with the empty end key sorts last, we can give that entry the form tableName';'..., which ensures it comes after all other keys for that table.

          Alex Newman added a comment -

          consider what would happen if you had two tables, one named foo and one named foo1

          Wouldn't you have a row order like

          foo,A
          foo,B
          foo,C
          foo1,A <- notice 1 comes before ; in ascii
          foo1,B
          foo1,C
          foo1;,
          foo;,
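The interleaving is easy to reproduce by sorting the rows byte-wise. The sketch below is illustrative only (the class and method names are invented, and the rows carry no randomid); it builds the rows for tables foo and foo1 with a chosen marker character for the last region and sorts them in plain string order, which matches byte order for ASCII.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class LastRegionMarkerSketch {

    // Three ordinary regions per table plus one last-region row that uses
    // `marker` in place of the ',' separator.
    static List<String> sortedRows(char marker) {
        List<String> rows = new ArrayList<>();
        for (String table : new String[] {"foo", "foo1"}) {
            for (String end : new String[] {"A", "B", "C"}) {
                rows.add(table + "," + end);
            }
            rows.add(table + marker + ",");
        }
        Collections.sort(rows);
        return rows;
    }

    public static void main(String[] args) {
        // ';' (0x3B) sorts after '1' (0x31): foo's last region ends up
        // behind every foo1 row.
        System.out.println(sortedRows(';'));
        // '-' (0x2D) sorts after ',' (0x2C) but before '0' (0x30): foo's
        // last region stays directly behind foo's other rows.
        System.out.println(sortedRows('-'));
    }
}
```

This only shows the sort order; it says nothing about whether the marker character may itself appear in a table name.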

          Lars Hofhansl added a comment -

          True. Could use '-' or '/' then. '!' is before ',', so that would not work.
          Maybe Todd's suggestion above is best then: \x00 for all regions but the last one, and \x01 for the last one.

          Ted Yu added a comment -

          '-' or '/' makes row key human-friendly and easier to enter by users.

          Alex Newman added a comment -

          Are they valid tableNames?

          Alex Newman added a comment -

          Just as a heads up I am very close on this patch. It was a larger change than I wanted but it touches a bunch of places.

          Ted Yu added a comment -

          @Alex:
          Cannot wait to see your work!

          Friendly reminder: .META. table conversion and its unit tests are included, I guess.

          Alex Newman added a comment -

          Indeed.

          Lars Hofhansl added a comment -

          @Alex, awesome.

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2965/
          -----------------------------------------------------------

          (Updated 2011-11-29 22:58:47.832063)

          Review request for hbase.

          Summary (updated)
          -------

          The issue is that we have to have a custom comparator for metakey/rootkey scanning to work. One of the reasons this is required is that the table names are currently lexically sorted.

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs


          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 0c1fa3f

          Diff: https://reviews.apache.org/r/2965/diff

          Testing
          -------

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2968/
          -----------------------------------------------------------

          Review request for hbase.

          This addresses bug hbase-2600.
          https://issues.apache.org/jira/browse/hbase-2600

          Diffs


          src/main/java/org/apache/hadoop/hbase/HConstants.java d22f50a
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 0c1fa3f
          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java 1c49dc5
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java aa8512b
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 6af1f82
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 4135e55
          src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java 08b7de3
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/main/resources/hbase-default.xml 7059c60
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 66d808f
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 7af4db4
          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 940d726
          src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java b579b29
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 49bfc5a
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 477e772
          src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java 24903f3
          src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java 4a8bb69
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java 60e0e41
          src/test/ruby/hbase/admin_test.rb 0c2672b

          Diff: https://reviews.apache.org/r/2968/diff

          Testing
          -------

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2968/#review3568
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/2968/#comment7985>

          Please ignore me.

          • Alex

          On 2011-11-29 23:00:08, Alex Newman wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2968/

          -----------------------------------------------------------

          (Updated 2011-11-29 23:00:08)

          Review request for hbase.

          Summary

          -------

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.

          This addresses bug hbase-2600.

          https://issues.apache.org/jira/browse/hbase-2600

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java d22f50a

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 0c1fa3f

          src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java 1c49dc5

          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java aa8512b

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 6af1f82

          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 4135e55

          src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java PRE-CREATION

          src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java 08b7de3

          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1

          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04

          src/main/resources/hbase-default.xml 7059c60

          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 66d808f

          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 7af4db4

          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 940d726

          src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java b579b29

          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 49bfc5a

          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 477e772

          src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java 24903f3

          src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java 4a8bb69

          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java 60e0e41

          src/test/ruby/hbase/admin_test.rb 0c2672b

          Diff: https://reviews.apache.org/r/2968/diff

          Testing

          -------

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2965/#review3794
          -----------------------------------------------------------

          This patch doesn't seem to be coherent. It seems to be a mix of things Alex. Good on you.

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/2965/#comment8601>

          This is odd.

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/2965/#comment8600>

          This should do Bytes.toBytes() and pass the String version (else you are susceptible to the machines' locale – toBytes does UTF-8 all the time).
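          To illustrate the locale point, here is a minimal sketch in plain Java using only the standard library. The class name EncodingSketch and the helper toUtf8 are hypothetical and not part of the patch. String.getBytes() with no argument uses the JVM default charset, whereas naming UTF-8 explicitly yields the same bytes on every machine, which is the behaviour the comment attributes to Bytes.toBytes().

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingSketch {
    // Hypothetical helper: encode with an explicit charset so the result
    // does not depend on the host's locale or default encoding.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "t\u00e9st";           // contains one non-ASCII character
        byte[] explicit = toUtf8(name);      // UTF-8: the accented letter takes two bytes
        byte[] platform = name.getBytes();   // depends on the JVM default charset
        System.out.println("utf-8 length: " + explicit.length);
        System.out.println("matches platform default: " + Arrays.equals(explicit, platform));
    }
}
```

          Whether the second line prints true depends on the machine, which is exactly the problem with the no-argument form.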

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/2965/#comment8602>

          Not sure I grok this change.

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/2965/#comment8603>

          Is this change related to this patch?

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/2965/#comment8604>

          Should the javadoc change? Can we no longer get the table name from HRI once this change goes in? Would we have to do a lookup in a Map of UUID to tablename?

          Yeah, what is a tableNameUUID? Is it just a UUID?

          • Michael

          On 2011-11-29 22:58:47, Alex Newman wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2965/

          -----------------------------------------------------------

          (Updated 2011-11-29 22:58:47)

          Review request for hbase.

          Summary

          -------

          The issue is that we need a custom comparator for metakey/rootkey scanning to work. One of the reasons this is required is that the tablenames are currently lexically sorted.

          This addresses bug HBASE-2600.

          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 0c1fa3f

          Diff: https://reviews.apache.org/r/2965/diff

          Testing

          -------

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          On 2011-12-09 19:15:28, Michael Stack wrote:

          > This patch doesn't seem to be coherent. It seems to be a mix of things Alex. Good on you.

          Correct, it was discarded and I reposted a different review. But your comments seem reasonable. Can you move them to the active review?

          • Alex

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/2965/#review3794
          -----------------------------------------------------------

          On 2011-11-29 22:58:47, Alex Newman wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/2965/

          -----------------------------------------------------------

          (Updated 2011-11-29 22:58:47)

          Review request for hbase.

          Summary

          -------

          The issue is that we need a custom comparator for metakey/rootkey scanning to work. One of the reasons this is required is that the tablenames are currently lexically sorted.

          This addresses bug HBASE-2600.

          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 0c1fa3f

          Diff: https://reviews.apache.org/r/2965/diff

          Testing

          -------

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3186/
          -----------------------------------------------------------

          Review request for hbase.

          Summary
          -------

          PART 1 of hbase-4616

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.
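          The difference between the two lookups can be sketched with a sorted map standing in for the meta table. This is an illustration under stated assumptions, not the patch's code: the class MetaLookupSketch, its methods, and the sample regions are hypothetical; keys hold only the start or end row (no tablename or randomid); and "\uffff" stands in for the empty end row of the last region, which would need to sort last.

```java
import java.util.Map;
import java.util.TreeMap;

public class MetaLookupSketch {
    // Three hypothetical regions: ["", "g"), ["g", "p"), ["p", end).
    static TreeMap<String, String> sampleByStartRow() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("", "region-1");
        meta.put("g", "region-2");
        meta.put("p", "region-3");
        return meta;
    }

    static TreeMap<String, String> sampleByEndRow() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("g", "region-1");
        meta.put("p", "region-2");
        meta.put("\uffff", "region-3"); // stand-in for the empty end row
        return meta;
    }

    // Keyed by START row: the containing region is the closest key at or
    // before the wanted row, i.e. a backward-looking lookup.
    static String findByStartRow(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    // Keyed by END row (exclusive): the containing region is the first key
    // strictly after the wanted row, i.e. the first hit of a forward scan.
    static String findByEndRow(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.higherEntry(row);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        for (String row : new String[] {"a", "g", "k", "z"}) {
            System.out.println(row + " -> " + findByStartRow(sampleByStartRow(), row)
                    + " / " + findByEndRow(sampleByEndRow(), row));
        }
    }
}
```

          Both lookups return the same region for every row; the end-row layout simply gets there by moving forward. A real scanner's start row is inclusive, so an actual implementation would have to handle a wanted row that equals a region's end row, which this sketch sidesteps by using higherEntry.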

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs


          src/main/java/org/apache/hadoop/hbase/HConstants.java 1cf58a9
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 6bff130
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 6f19d21
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 4135e55
          src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 6fca020
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 95712dd
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java a092cf0
          src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java 1997abd
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3186/diff

          Testing
          -------

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3186/
          -----------------------------------------------------------

          (Updated 2011-12-13 21:12:33.352768)

          Review request for hbase.

          Summary
          -------

          PART 1 of hbase-4616

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/HConstants.java 1cf58a9
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 6bff130
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 6f19d21
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 4135e55
          src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 6fca020
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 95712dd
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java a092cf0
          src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java 1997abd
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3186/diff

          Testing
          -------

          Thanks,

          Alex

          Alex Newman added a comment -

          I updated the review for this change, although it should still be considered a talking point: we still need to figure out migrations and a couple of tests. I am just curious whether the approach makes sense.

          Alex Newman added a comment -

          So we have a couple of choices after a split. Currently, daughter regions can sort after the parent while splitting, because of the timestamp (ts) taken when the split happened. I could always grab two rows and use the second if the first one is offline. Or we could call next again. Or we could change it so the daughter always comes before the parent.

          stack added a comment -

          Seems like daughters should come before parent if we are to be consistent.

          We could change the timestamp to be max - current time and, as we do now, enforce an ordering: in this case the daughter's timestamp would always be less than the parent's (currently we always ensure daughter > parent).
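          The "max - current time" idea can be sketched as follows. This is a hedged illustration, not the proposed patch: the class ReverseTimestampSketch, its method names, and the zero-padded decimal rendering of the id are all assumptions made for the example.

```java
public class ReverseTimestampSketch {
    // Later creation times map to smaller ids, so newer entries sort first.
    static long reverseId(long creationTimeMillis) {
        return Long.MAX_VALUE - creationTimeMillis;
    }

    // Fixed-width, zero-padded decimal so that string order matches numeric
    // order (Long.MAX_VALUE has 19 digits).
    static String idString(long creationTimeMillis) {
        return String.format("%019d", reverseId(creationTimeMillis));
    }

    public static void main(String[] args) {
        long parentCreated = 1000L;    // hypothetical creation times
        long daughterCreated = 2000L;  // the daughter is created after the parent
        String parentId = idString(parentCreated);
        String daughterId = idString(daughterCreated);
        System.out.println("daughter sorts before parent: "
                + (daughterId.compareTo(parentId) < 0));
    }
}
```

          With ids built this way, a daughter that shares its parent's end row would sort ahead of the parent in a forward scan, provided the rest of the key compares equal.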

          stack added a comment -

          On the other hand, why not have daughters go in after the offlined parent? We'd just call next again, as you say above?

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3186/#review3887
          -----------------------------------------------------------

          I thought the change would be bigger than this. Does it work?

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/3186/#comment8723>

          Should this be 'starts with'? Are 0x01 and 0x02 good characters to have here? They are unprintable. Would it be better to have printables? More friendly to, you know, those humans that have to look at this stuff.

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/3186/#comment8724>

          What's this define for?

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/3186/#comment8722>

          Why this stray ';'?

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/3186/#comment8725>

          This define name is hard to grok too.

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/3186/#comment8726>

          What's up with your formatting here? Here and a few lines down for the @return?

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/3186/#comment8727>

          A bit of a comment here on why this math would help the reader.

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/3186/#comment8728>

          Can you not just put DELIMITER here? Ditto for the puts above? Do you have to put it into oneByte first?

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/3186/#comment8729>

          Would it be an error if the id is null?

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/3186/#comment8730>

          Method names do not begin with capital letters.

          Line is too long.

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/3186/#comment8731>

          Formatting is off in this method?

          src/main/java/org/apache/hadoop/hbase/KeyValue.java
          <https://reviews.apache.org/r/3186/#comment8732>

          Will this always give same name? Doesn't uuid have time and machine name inputs?

          Does this belong in here anyways?

          src/main/java/org/apache/hadoop/hbase/KeyValue.java
          <https://reviews.apache.org/r/3186/#comment8734>

          line too long

          src/main/java/org/apache/hadoop/hbase/KeyValue.java
          <https://reviews.apache.org/r/3186/#comment8735>

          I'd suggest you not change the formatting already in place; blend in instead (the lines are too long anyway).

          src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java
          <https://reviews.apache.org/r/3186/#comment8737>

          Needs class comment.

          What is this class replacing?

          src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java
          <https://reviews.apache.org/r/3186/#comment8738>

          public methods need javadoc?

          src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java
          <https://reviews.apache.org/r/3186/#comment8739>

          Do we need zeros?

          Is this tablename its uuid?

          Maybe we can't use a uuid if it has host and time factors? Maybe we need to sha1/md5 it instead? Something that will always give us the same answer regardless of when or where we hash?

          • Michael

          On 2011-12-13 21:12:33, Alex Newman wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/3186/

          -----------------------------------------------------------

          (Updated 2011-12-13 21:12:33)

          Review request for hbase.

          Summary

          -------

          PART 1 of hbase-4616

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.
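To make the difference concrete, here is a minimal sketch, not taken from the patch, that models meta for a single table as a sorted map. The class and method names are hypothetical, String keys stand in for byte[] row keys, and LAST is an invented sentinel for the last region's missing end row. With start-row keys the lookup is a floor (closest row at or before the wanted row); with end-row keys it is the first entry after the wanted row, which is what a forward scan would return.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of meta for one table; not HBase code. */
final class MetaLookupSketch {
    // Hypothetical sentinel that sorts after any ordinary key in this toy model.
    static final String LAST = "\uffff";

    /** Meta keyed by START row: needs the greatest key <= row (a backward lookup). */
    static String findByStartKey(TreeMap<String, String> metaByStart, String row) {
        Map.Entry<String, String> e = metaByStart.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    /** Meta keyed by END row (exclusive): the first key > row names the containing region. */
    static String findByEndKey(TreeMap<String, String> metaByEnd, String row) {
        Map.Entry<String, String> e = metaByEnd.higherEntry(row);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        // Three regions: ["", "g"), ["g", "p"), ["p", end of table)
        TreeMap<String, String> byStart = new TreeMap<>();
        byStart.put("", "region-1");
        byStart.put("g", "region-2");
        byStart.put("p", "region-3");

        TreeMap<String, String> byEnd = new TreeMap<>();
        byEnd.put("g", "region-1");
        byEnd.put("p", "region-2");
        byEnd.put(LAST, "region-3");

        // Both layouts must agree on which region holds each row.
        for (String row : new String[] {"a", "g", "k", "z"}) {
            System.out.println(row + " -> " + findByStartKey(byStart, row)
                + " / " + findByEndKey(byEnd, row));
        }
    }
}
```

The sketch ignores offlined parents; as noted above, a real scan would have to skip forward to the next row in that case.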

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.

          This addresses bug HBASE-2600.

          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java 1cf58a9

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821

          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8

          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 6bff130

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 6f19d21

          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 4135e55

          src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java PRE-CREATION

          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1

          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04

          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 6fca020

          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d

          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 95712dd

          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167

          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b

          src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java a092cf0

          src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java 1997abd

          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3186/diff

          Testing

          -------

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > I thought the change would be bigger than this. Does it work?

          Mostly. We still need the migration to work, plus a small change around the questions still open in the JIRA. It also depends on the two other reviews I filed.

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/HRegionInfo.java, line 142

          > <https://reviews.apache.org/r/3186/diff/2/?file=64438#file64438line142>

          >

          > Should this be 'starts with'? Are 0x01 and 0x02 good characters to have here? They are unprintable. Would it be better to have printables? More friendly to, you know, those humans that have to look at this stuff.

          It should say that the tablename encoded in the region name ends with 0x01, except for the last region, where it ends with 0x02. I'm fine with changing these to anything; it's super easy now that we uuid the tablename.
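A minimal sketch of why the two delimiter bytes matter for sort order. This is an illustration only, not the code under review: the class, the constants, and the key layout (raw table name, then one delimiter byte, then the end row) are assumptions, and the real patch encodes the table name as a uuid. The point is that a last region with no end row would otherwise sort first, while a larger delimiter byte pushes it after every other region of the same table.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only; names and layout are assumptions, not the patch's code. */
final class RegionKeySortSketch {
    static final byte DELIMITER = 0x01;      // assumed: follows the table name for ordinary regions
    static final byte LAST_DELIMITER = 0x02; // assumed: follows the table name for the last region

    /** Builds tablename + delimiter + endRow; pass null for the last region (no end row). */
    static byte[] key(String table, String endRow) {
        byte[] t = table.getBytes(StandardCharsets.UTF_8);
        byte[] e = endRow == null ? new byte[0] : endRow.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[t.length + 1 + e.length];
        System.arraycopy(t, 0, out, 0, t.length);
        out[t.length] = endRow == null ? LAST_DELIMITER : DELIMITER;
        System.arraycopy(e, 0, out, t.length + 1, e.length);
        return out;
    }

    /** Lexicographic comparison that treats each byte as unsigned. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] middle = key("t1", "zzzz");
        byte[] last = key("t1", null);
        // The last region sorts after every region that has an end row.
        System.out.println(compare(middle, last) < 0); // prints "true"
    }
}
```

Printable delimiters would work the same way as long as the last-region byte is the larger of the two and both sort below any byte that can follow in a table name.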

          • Alex

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3186/#review3887
          -----------------------------------------------------------


          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3186/#review3893
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/KeyValue.java
          <https://reviews.apache.org/r/3186/#comment8759>

          According to http://kickjava.com/src/java/util/UUID.java.htm, UUID.nameUUIDFromBytes() uses an MD5 MessageDigest.

          From http://www.docjar.com/html/api/java/security/MessageDigest.java.html, I don't see machine name or time being involved.
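A quick way to check this is a small sketch like the one below. The class and helper names are made up for illustration, and the table name is arbitrary. UUID.nameUUIDFromBytes() produces a version 3 (name-based, MD5) UUID, so the same input bytes give the same UUID on any host at any time, unlike UUID.randomUUID(), which produces version 4.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

/** Illustrative check; the class and method names are hypothetical. */
final class NameUuidSketch {
    /** Derives a name-based UUID from a table name's UTF-8 bytes. */
    static UUID tableUuid(String tableName) {
        return UUID.nameUUIDFromBytes(tableName.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        UUID a = tableUuid("mytable");
        UUID b = tableUuid("mytable");
        System.out.println(a.equals(b));                 // prints "true": same bytes in, same UUID out
        System.out.println(a.version());                 // prints "3": name-based (MD5)
        System.out.println(UUID.randomUUID().version()); // prints "4": random
    }
}
```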

          • Ted


          jiraposter@reviews.apache.org added a comment -

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > I thought the change would be bigger than this. Does it work?

          Alex Newman wrote:

          Mostly. We still need the migration to work, plus a small change around the questions still open in the JIRA. It also depends on the two other reviews I filed.

          Expect an update shortly.

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/HRegionInfo.java, line 145

          > <https://reviews.apache.org/r/3186/diff/2/?file=64438#file64438line145>

          >

          > What's this define for?

          This is the 0x01 at the end of the table name.

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/HRegionInfo.java, line 146

          > <https://reviews.apache.org/r/3186/diff/2/?file=64438#file64438line146>

          >

          > Why this stray ';'?

          fixed

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/HRegionInfo.java, line 147

          > <https://reviews.apache.org/r/3186/diff/2/?file=64438#file64438line147>

          >

          > This define name is hard to grok too

          It was actually suggested by Ted. I am totally open-minded.

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/HRegionInfo.java, line 335

          > <https://reviews.apache.org/r/3186/diff/2/?file=64438#file64438line335>

          >

          > What's up with your formatting here? Here and a few lines down for the @return?

          Fixed, I think.

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/HRegionInfo.java, line 347

          > <https://reviews.apache.org/r/3186/diff/2/?file=64438#file64438line347>

          >

          > A bit of a comment here on why this math would help the reader.

          done

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/HRegionInfo.java, line 366

          > <https://reviews.apache.org/r/3186/diff/2/?file=64438#file64438line366>

          >

          > Can you not just put DELIMITER here? Ditto for the puts above? Do you have to put it into oneByte first?

          Indeed, as far as I can tell you do. I could probably add it to one of the strings which is appended above if that is preferable.

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/HRegionInfo.java, line 372

          > <https://reviews.apache.org/r/3186/diff/2/?file=64438#file64438line372>

          >

          > Would it be an error if the id is null?

          I think the code actually does exercise this, but I will double-check. It might only be exercised by tests.

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/HRegionInfo.java, line 470

          > <https://reviews.apache.org/r/3186/diff/2/?file=64438#file64438line470>

          >

          > method names do not begin with capital letters.

          >

          > Line is too long

          woopsie, fixed

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/HRegionInfo.java, line 477

          > <https://reviews.apache.org/r/3186/diff/2/?file=64438#file64438line477>

          >

          > Formatting is off in this method?

          fixed

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1907

          > <https://reviews.apache.org/r/3186/diff/2/?file=64439#file64439line1907>

          >

          > line too long

          done

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 1991

          > <https://reviews.apache.org/r/3186/diff/2/?file=64439#file64439line1991>

          >

          > Would suggest you not change the formatting already in place; blend in instead (lines too long anyway)

          done. But it's hard to read

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java, line 47

          > <https://reviews.apache.org/r/3186/diff/2/?file=64444#file64444line47>

          >

          > Do we need zeros?

          >

          > Is this tablename its uuid?

          >

          > Maybe we can't do uuid if it has host and time factors? Maybe need to sha1/md5 it? Something that will always give us same answer regardless of when we hash or where?

          It doesn't have those factors as far as I can tell. I reflected this in the API to hint that it's a UUID of the table name.

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 68

          > <https://reviews.apache.org/r/3186/diff/2/?file=64439#file64439line68>

          >

          > Will this always give same name? Doesn't uuid have time and machine name inputs?

          >

          > Does this belong in here anyways?

          I ran the test:

          import java.nio.charset.StandardCharsets;
          import java.util.UUID;

          public class Test {

            public static void main(String[] args) {
              String test = "test123";
              // Name-based UUID: derived only from the input bytes.
              // An explicit charset keeps the bytes identical across machines.
              UUID foo = UUID.nameUUIDFromBytes(test.getBytes(StandardCharsets.UTF_8));
              System.out.println(foo.toString());
            }

          }

          on a couple of machines a couple of times, and it always produced the same string. I think it's only a newly generated UUID that has those components.
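The observation above can be checked a little more directly. The following is a minimal sketch, not code from the patch: the class name `NameUuidCheck` and the helper `nameUuid` are hypothetical. It relies only on `java.util.UUID`, whose `nameUUIDFromBytes` returns a version 3 (name-based) UUID and whose `randomUUID` returns a version 4 (random) one.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

public class NameUuidCheck {

    // Hypothetical helper: derive a name-based UUID from a table name.
    static UUID nameUuid(String tableName) {
        return UUID.nameUUIDFromBytes(tableName.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        UUID first = nameUuid("test123");
        UUID second = nameUuid("test123");

        // The same input bytes give the same UUID on every call.
        System.out.println(first.equals(second));

        // Version 3 means name-based; no clock or host input is involved.
        System.out.println(first.version());

        // A randomly generated UUID reports version 4 instead.
        System.out.println(UUID.randomUUID().version());
    }
}
```

If the table name bytes are fixed, the result should therefore be stable across hosts and runs, which is what the manual test suggested.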

          On 2011-12-13 22:35:41, Michael Stack wrote:

          > src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java, line 30

          > <https://reviews.apache.org/r/3186/diff/2/?file=64444#file64444line30>

          >

          > public methods need javadoc?

          done.

          • Alex

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3186/#review3887
          -----------------------------------------------------------

          On 2011-12-13 21:12:33, Alex Newman wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/3186/

          -----------------------------------------------------------

          (Updated 2011-12-13 21:12:33)

          Review request for hbase.

          Summary

          -------

          PART 1 of hbase-4616

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.
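The difference between the two lookups can be sketched with a sorted map standing in for meta. This is an illustration under stated assumptions, not HBase code: the class `RegionLookupSketch`, the region labels, and the `LAST` sentinel (a stand-in for whatever marker makes the final region sort after every real end key) are all hypothetical. Keying by start row needs a floor lookup, the analogue of getClosestRowBefore, while keying by end row needs only the first entry found when moving forward from the wanted row.

```java
import java.util.TreeMap;

public class RegionLookupSketch {

    // Hypothetical sentinel that sorts after every real end key in this toy example.
    static final String LAST = "\uffff";

    static final TreeMap<String, String> BY_START = new TreeMap<>();
    static final TreeMap<String, String> BY_END = new TreeMap<>();

    static {
        // Three regions: ["", "g"), ["g", "p"), ["p", end of table)
        BY_START.put("", "region-1");
        BY_START.put("g", "region-2");
        BY_START.put("p", "region-3");

        BY_END.put("g", "region-1");
        BY_END.put("p", "region-2");
        BY_END.put(LAST, "region-3");
    }

    // Start-key naming: find the closest key at or before the row (backward search).
    static String byStartKey(String row) {
        return BY_START.floorEntry(row).getValue();
    }

    // End-key naming: take the first key after the row (forward search).
    // higherEntry is used because end keys are exclusive.
    static String byEndKey(String row) {
        return BY_END.higherEntry(row).getValue();
    }

    public static void main(String[] args) {
        for (String row : new String[] {"a", "g", "m", "z"}) {
            System.out.println(row + " -> " + byStartKey(row) + " / " + byEndKey(row));
        }
    }
}
```

Both lookups return the same region for every row; the point is only that the end-key form maps onto a forward scan.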

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to now, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains the actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta: as is, the region name in the filesystem is a hash of the regionname, so changing the format of the regionname would generate a different hash. We'd therefore need HBASE-2531 to be in place before we could make this change.

          This addresses bug HBASE-2600.

          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java 1cf58a9

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821

          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8

          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 6bff130

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 6f19d21

          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 4135e55

          src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java PRE-CREATION

          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1

          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04

          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 6fca020

          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d

          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 95712dd

          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167

          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b

          src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java a092cf0

          src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java 1997abd

          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3186/diff

          Testing

          -------

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          On 2011-12-13 23:12:20, Ted Yu wrote:

          > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 68

          > <https://reviews.apache.org/r/3186/diff/2/?file=64439#file64439line68>

          >

          > According to http://kickjava.com/src/java/util/UUID.java.htm, UUID.nameUUIDFromBytes() resorts to MD5 MessageDigest.

          >

          > From http://www.docjar.com/html/api/java/security/MessageDigest.java.html, I don't see machine name or time being involved.

          Thanks, Ted. Yeah, I saw that when I went to read up on it. I was going to suggest we just MD5 it altogether, but it could go either way (and it's not important at this stage anyway).

          • Michael
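To make the relationship between the two options concrete, here is a minimal sketch showing that a name-based UUID is an MD5 digest with the version and variant bits overwritten. The class `Md5UuidSketch` and the method `md5Uuid` are hypothetical names used for illustration; the bit layout follows the version 3 UUID definition in RFC 4122.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

public class Md5UuidSketch {

    // Hypothetical helper: build a version 3 UUID by hand from an MD5 digest.
    static UUID md5Uuid(byte[] name) {
        byte[] digest;
        try {
            digest = MessageDigest.getInstance("MD5").digest(name);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
        digest[6] &= 0x0f; // clear the version nibble
        digest[6] |= 0x30; // set version 3 (name-based, MD5)
        digest[8] &= 0x3f; // clear the variant bits
        digest[8] |= 0x80; // set the IETF variant

        // The 16 digest bytes are read as two big-endian longs.
        ByteBuffer buffer = ByteBuffer.wrap(digest);
        long mostSignificant = buffer.getLong();
        long leastSignificant = buffer.getLong();
        return new UUID(mostSignificant, leastSignificant);
    }

    public static void main(String[] args) {
        byte[] name = "test123".getBytes(StandardCharsets.UTF_8);
        // Compare the hand-built value with the JDK's own result.
        System.out.println(md5Uuid(name).equals(UUID.nameUUIDFromBytes(name)));
    }
}
```

Either choice depends only on the input bytes; the UUID form overwrites six bits of the digest with fixed version and variant values.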

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3186/#review3893
          -----------------------------------------------------------

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3186/#review3895
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/3186/#comment8762>

          Yeah, Michael should blame me.

          This marker marks the last region for the underlying table where endkey is empty, hence the choice of value 0x02.

          • Ted
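The ordering argument behind the two byte values can be sketched as follows. This is an assumption-laden illustration, not the layout used in the patch: the key format (table name, then 0x01 and the end key, or 0x02 alone for the last region), the class `EndKeyMarkerSketch`, and all method names are hypothetical, and the region id suffix is left out. The sketch only shows that, under unsigned byte ordering, a last-region key ending in 0x02 sorts after every other key of the same table.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.TreeMap;

public class EndKeyMarkerSketch {

    static final byte DELIMITER = 0x01;          // assumed: follows the table name
    static final byte LAST_REGION_MARKER = 0x02; // assumed: used when the end key is empty

    // Unsigned lexicographic byte order.
    static final Comparator<byte[]> UNSIGNED_LEX = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return a.length - b.length;
    };

    static byte[] concat(String table, byte marker, String suffix) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] t = table.getBytes(StandardCharsets.UTF_8);
        byte[] s = suffix.getBytes(StandardCharsets.UTF_8);
        out.write(t, 0, t.length);
        out.write(marker);
        out.write(s, 0, s.length);
        return out.toByteArray();
    }

    // Key stored in meta for a region (simplified: no region id).
    static byte[] metaKey(String table, String endKey) {
        return endKey.isEmpty()
            ? concat(table, LAST_REGION_MARKER, "")
            : concat(table, DELIMITER, endKey);
    }

    // Forward lookup: first stored key strictly after table + 0x01 + row.
    static String locate(TreeMap<byte[], String> meta, String table, String row) {
        return meta.higherEntry(concat(table, DELIMITER, row)).getValue();
    }

    static TreeMap<byte[], String> sampleMeta() {
        TreeMap<byte[], String> meta = new TreeMap<>(UNSIGNED_LEX);
        meta.put(metaKey("t", "g"), "t-region-1");
        meta.put(metaKey("t", "p"), "t-region-2");
        meta.put(metaKey("t", ""), "t-region-3"); // last region, empty end key
        meta.put(metaKey("u", "k"), "u-region-1");
        meta.put(metaKey("u", ""), "u-region-2");
        return meta;
    }

    public static void main(String[] args) {
        TreeMap<byte[], String> meta = sampleMeta();
        System.out.println(locate(meta, "t", "a"));
        System.out.println(locate(meta, "t", "z"));
        System.out.println(locate(meta, "u", "z"));
    }
}
```

Had the last region been stored as table name plus 0x01 plus an empty end key, it would sort before every other region of its table, and a forward lookup for a row past the final split point would land in the next table.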

          On 2011-12-13 21:12:33, Alex Newman wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/3186/

          -----------------------------------------------------------

          (Updated 2011-12-13 21:12:33)

          Review request for hbase.

          Summary

          -------

          PART 1 of hbase-4616

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.

          This addresses bug HBASE-2600.

          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java 1cf58a9

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821

          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8

          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 6bff130

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 6f19d21

          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 4135e55

          src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java PRE-CREATION

          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1

          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04

          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 6fca020

          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d

          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 95712dd

          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167

          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b

          src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java a092cf0

          src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java 1997abd

          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3186/diff

          Testing

          -------

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3186/
          -----------------------------------------------------------

          (Updated 2011-12-13 23:36:54.674313)

          Review request for hbase.

          Summary
          -------

          PART 1 of hbase-4616

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.
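
A minimal sketch of the two lookup styles described above, using a Java `TreeMap` as a stand-in for the meta table. The class name, the sample regions, and the `LAST` sentinel (a placeholder for the last region's empty end key, chosen only so that it sorts after the sample rows) are assumptions for illustration and are not taken from the patch. `floorEntry` stands in for getClosestRowBefore, and `higherEntry` stands in for opening a scanner at the wanted row and taking the first result.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a meta table: a sorted map from a region key to a region label. */
final class MetaLookupSketch {
    // Hypothetical placeholder for the last region's empty end key.
    // It only needs to sort after every row key used in this sketch.
    static final String LAST = "\uffff";

    /** Current layout: regions keyed by start row (inclusive). */
    static TreeMap<String, String> sampleByStart() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("", "region[,g)");
        meta.put("g", "region[g,p)");
        meta.put("p", "region[p,)");
        return meta;
    }

    /** Proposed layout: the same regions keyed by end row (exclusive). */
    static TreeMap<String, String> sampleByEnd() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("g", "region[,g)");
        meta.put("p", "region[g,p)");
        meta.put(LAST, "region[p,)");
        return meta;
    }

    /** Start-key layout: find the closest key at or before the wanted row. */
    static String findByStartKey(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    /** End-key layout: the first key strictly after the wanted row is the containing region. */
    static String findByEndKey(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.higherEntry(row);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        for (String row : new String[] {"a", "g", "k", "z"}) {
            System.out.println(row + " -> " + findByStartKey(sampleByStart(), row)
                + " / " + findByEndKey(sampleByEnd(), row));
        }
    }
}
```

Both methods resolve a row to the same region. The difference is only in the direction of the search: the end-key layout needs a forward seek, which is the access pattern a scanner already provides.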

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/HConstants.java 1cf58a9
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 6bff130
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 6f19d21
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 4135e55
          src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 6fca020
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 95712dd
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java a092cf0
          src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java 1997abd
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3186/diff

          Testing
          -------

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3186/
          -----------------------------------------------------------

          (Updated 2011-12-14 00:07:41.488978)

          Review request for hbase.

          Changes
          -------

          This change handles scanning while splitting better.

          Summary
          -------

          PART 1 of hbase-4616

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).
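
The parenthetical about an offlined parent can be sketched as follows. This is an illustration only: the `MetaRow` fields, the sort order (end key, then creation id), and the sample split are assumptions, not the patch's data model. After a split, the parent and its second daughter share an end key, so a forward scan may meet the offlined parent first and has to move on to the next row.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SplitScanSketch {
    /** Hypothetical meta row: region [startKey, endKey), a creation id, and an offline flag. */
    static final class MetaRow {
        final String startKey;
        final String endKey;
        final long id;
        final boolean offline;

        MetaRow(String startKey, String endKey, long id, boolean offline) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.id = id;
            this.offline = offline;
        }
    }

    /** Parent [a,m) was split into daughters [a,g) and [g,m); the parent stays in meta, offlined. */
    static List<MetaRow> sample() {
        List<MetaRow> meta = new ArrayList<>();
        meta.add(new MetaRow("a", "m", 1L, true));
        meta.add(new MetaRow("a", "g", 2L, false));
        meta.add(new MetaRow("g", "m", 3L, false));
        return meta;
    }

    /** Scan forward from the wanted row, skipping offlined parents, until a live containing region is found. */
    static MetaRow locate(List<MetaRow> meta, String row) {
        List<MetaRow> sorted = new ArrayList<>(meta);
        // Assumed ordering: by end key, then by creation id (older regions first).
        sorted.sort(Comparator.comparing((MetaRow r) -> r.endKey).thenComparingLong(r -> r.id));
        for (MetaRow r : sorted) {
            if (r.endKey.compareTo(row) <= 0) {
                continue; // region ends at or before the wanted row
            }
            if (r.offline) {
                continue; // offlined parent: scan to the next row
            }
            if (r.startKey.compareTo(row) <= 0) {
                return r; // live region whose range contains the row
            }
        }
        return null; // no live region contains the row
    }

    public static void main(String[] args) {
        for (String row : new String[] {"c", "g", "h"}) {
            MetaRow r = locate(sample(), row);
            System.out.println(row + " -> id " + r.id + " [" + r.startKey + "," + r.endKey + ")");
        }
    }
}
```

In this sample, rows in [g,m) reach the offlined parent before the second daughter, which is the one extra step the summary describes.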

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/HConstants.java 1cf58a9
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 6bff130
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 6f19d21
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 4135e55
          src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 6fca020
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 95712dd
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java a092cf0
          src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java 1997abd
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3186/diff

          Testing
          -------

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3186/
          -----------------------------------------------------------

          (Updated 2011-12-19 21:25:51.965234)

          Review request for hbase.

          Summary
          -------

          PART 1 of hbase-4616

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.
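
A small sketch of why the name format matters for the directory name, assuming the directory is derived by hashing the full region name. MD5 is used here purely as a stand-in (the hash function HBase uses is not stated in this comment), and the class name, table name, keys, and id are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class RegionDirHashSketch {
    /** Hex digest of the region name; a stand-in for "directory name = hash of region name". */
    static String dirName(String regionName) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(regionName.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The same hypothetical region [a,m) of table t1, named both ways.
        String startKeyName = "t1,a,1300000000000";
        String endKeyName = "t1,m,1300000000000";
        System.out.println(startKeyName + " -> " + dirName(startKeyName));
        System.out.println(endKeyName + " -> " + dirName(endKeyName));
    }
}
```

The two names hash to different values, so a directory named after the old hash would no longer match the renamed region. Carrying the directory name inside the region name, as HBASE-2531 proposes, would avoid recomputing it.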

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 6bff130
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java d475a1d
          src/main/java/org/apache/hadoop/hbase/client/HTable.java 8cc6444
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 4135e55
          src/main/java/org/apache/hadoop/hbase/client/MetaSearchRow.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/HConstants.java 3c83846
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 9ea19e5
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java 9f66880
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfoGetTableName.java PRE-CREATION
          src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java a092cf0
          src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java 1997abd
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3186/diff

          Testing
          -------

          Thanks,

          Alex

          Alex Newman added a comment -

          There's lots of discussion at https://issues.apache.org/jira/browse/HBASE-4616 as well.

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/
          -----------------------------------------------------------

          Review request for hbase and Michael Stack.

          Summary
          -------

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs


          src/main/java/org/apache/hadoop/hbase/HConstants.java 5120a3c
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 99f90b2
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3466/diff

          Testing
          -------

          Unit tests started table.

          Tests in error:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

          I need to know how to update/recreate the tar ball which is the source for that test.

          Thanks,

          Alex

          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3466/ ----------------------------------------------------------- Review request for hbase and Michael Stack. Summary ------- This is an idea that Ryan and I have been kicking around on and off for a while now. If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row). If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region. This issue is about changing the way we name regions. If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward). Converting to the new method, we'd have to run a migration on startup changing the content in meta. Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change. 
This addresses bug HBASE-2600 . https://issues.apache.org/jira/browse/HBASE-2600 Diffs src/main/java/org/apache/hadoop/hbase/HConstants.java 5120a3c src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821 src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8 src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8 src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 99f90b2 src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828 src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1 src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04 src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167 src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6 src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5 Diff: https://reviews.apache.org/r/3466/diff Testing ------- Unit tests started table. Tests in error: org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META. I need to know how to update/recreate the tar ball which is the source for that test. Thanks, Alex
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12510274/0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 20 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -147 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 80 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD
          org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/736//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/736//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/736//console

          This message is automatically generated.

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/
          -----------------------------------------------------------

          (Updated 2012-01-16 18:26:39.949854)

          Review request for hbase and Michael Stack.

          Changes
          -------

          Updated the patch so that
          src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
          uses the end key instead of the start key, as the end key is populated more often.

          This fixes the occasional breakage of org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster#testShutdownSimpleFixup.

          Summary
          -------

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.
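
To make the difference between the two lookups concrete, here is a minimal sketch that models the meta table as a sorted in-memory map. It is not HBase code: the class `MetaLookupSketch`, both `locate...` methods, the region labels, and the `LAST` sentinel standing in for the last region's empty end key are all made up for illustration. Keyed by start key, finding the containing region means finding the closest key at or before the row, which is the backward walk getClosestRowBefore has to do. Keyed by end key, it is simply the first key after the row, which is what a forward scan hands back.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a meta table as a sorted map. Illustrative only, not HBase code. */
public class MetaLookupSketch {

    /** Hypothetical sentinel standing in for the empty end key of the last region. */
    public static final String LAST = "\uffff";

    /** Meta keyed by region START key: containing region = closest key at or before the row. */
    public static String locateByStartKey(TreeMap<String, String> metaByStart, String row) {
        Map.Entry<String, String> e = metaByStart.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    /**
     * Meta keyed by region END key: containing region = first key after the row.
     * higherEntry (strictly greater) is used because end keys are exclusive.
     */
    public static String locateByEndKey(TreeMap<String, String> metaByEnd, String row) {
        Map.Entry<String, String> e = metaByEnd.higherEntry(row);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        // Three regions: ["", "g"), ["g", "p"), ["p", end of table)
        TreeMap<String, String> byStart = new TreeMap<>();
        byStart.put("", "region-1");
        byStart.put("g", "region-2");
        byStart.put("p", "region-3");

        TreeMap<String, String> byEnd = new TreeMap<>();
        byEnd.put("g", "region-1");
        byEnd.put("p", "region-2");
        byEnd.put(LAST, "region-3");

        for (String row : new String[] {"apple", "g", "kiwi", "zebra"}) {
            System.out.println(row + " -> " + locateByStartKey(byStart, row)
                    + " / " + locateByEndKey(byEnd, row));
        }
    }
}
```

Both lookups agree on every row in the example. How a real scan start key would be built so that a region ending exactly at the searched row is skipped is outside this sketch.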

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/HConstants.java 904e2d2
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 99f90b2
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828
          src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 8f4f4b8
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3466/diff

          Testing
          -------

          Unit tests started table.

          Tests in error:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

          I need to know how to update/recreate the tarball that is the source for that test.

          Thanks,

          Alex

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12510727/0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 20 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -145 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 83 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD
          org.apache.hadoop.hbase.replication.TestReplicationPeer
          org.apache.hadoop.hbase.replication.TestReplication
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/780//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/780//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/780//console

          This message is automatically generated.

          Alex Newman added a comment -

          I'll take a look at these broken tests. Weird that these didn't break on my jenkins.

          Lars Hofhansl added a comment -

          These three always fail it seems:
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Alex Newman added a comment -

          On all jenkins?

          Lars Hofhansl added a comment -

          Something to do with the Hadoop version on the jenkins machines.
          Ted might know the details.

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/#review4418
          -----------------------------------------------------------

          This looks pretty good. Thanks for being persistent and patient, Alex!
          The devil is probably still in the details.

          All the getClosestBefore hoo-ha can now be removed from HTable/Region[Server]/Store, right?

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/3466/#comment9924>

          ! and "
          Although it's not very intuitive.
          So the encoded region is now?
          <tableName>!,<endKey>,...
          <tableName>",<endKey>,...

          Is that simpler than replacing the separator?
          That could look like this:
          <tableName>,<endKey>,...
          <tableName>/<endKey>,...
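
The ordering argument behind the two marker characters can be checked with a small sketch. The assumption here, which this thread does not confirm, is that `!` marks regions with a real end key and `"` marks the last region, whose end key is empty. `RegionNameOrderSketch`, `plainName`, `markedName`, `scanOrder`, and the ids are hypothetical names for illustration. Java compares strings by UTF-16 code unit, which matches byte order for these ASCII names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy check of region-name sort order. Illustrative only, not HBase code. */
public class RegionNameOrderSketch {

    /** Plain <table>,<endKey>,<id> with no marker character. */
    public static String plainName(String table, String endKey, long id) {
        return table + "," + endKey + "," + id;
    }

    /** Assumed scheme: '!' before a real end key, '"' for the empty (last) end key. */
    public static String markedName(String table, String endKey, long id) {
        char marker = endKey.isEmpty() ? '"' : '!';
        return table + marker + "," + endKey + "," + id;
    }

    /** Returns the three example region names in sorted (scan) order. */
    public static List<String> scanOrder(boolean marked) {
        String[] endKeys = {"g", "p", ""}; // "" is the open-ended last region
        List<String> names = new ArrayList<>();
        for (int i = 0; i < endKeys.length; i++) {
            names.add(marked
                    ? markedName("TestTable", endKeys[i], i + 1)
                    : plainName("TestTable", endKeys[i], i + 1));
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println("plain : " + scanOrder(false));
        System.out.println("marked: " + scanOrder(true));
    }
}
```

Without a marker, the open-ended last region sorts first, because `,` compares lower than any letter of a real end key, so a forward scan would reach it before the regions it should follow. With the assumed markers it sorts last, since `!` (33) is below `"` (34). Both markers also sit between space (32) and `#` (35), the two bounds that show up in the TestMetaMigrationRemovingHTD error message.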

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/3466/#comment9923>

          addEncoding does not use the startKey. Could just remove it from there, and hence from here as well so that this method just needs to know the endKey.

          src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java
          <https://reviews.apache.org/r/3466/#comment9925>

          I like this. Captures what it is doing without being too complicated.

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/3466/#comment9926>

          Why is this needed?

          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java
          <https://reviews.apache.org/r/3466/#comment9927>

          Yeah... Be gone!

          • Lars

          Ted Yu added a comment -

          See MAPREDUCE-3583 for background on test failures for:

          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
          

          TestMetaMigrationRemovingHTD needs attention for this feature.

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/#review4420
          -----------------------------------------------------------

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
          <https://reviews.apache.org/r/3466/#comment9931>

          Since we don't support meta splitting, this is a lot simpler than the previous code. Supporting meta splitting without getClosestRowBefore would require a decent number of changes.

          • Alex

          On 2012-01-17 01:17:10, Alex Newman wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/3466/

          -----------------------------------------------------------

          (Updated 2012-01-17 01:17:10)

          Review request for hbase and Michael Stack.

          Summary

          -------

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.

          This addresses bug HBASE-2600.

          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs

          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java 904e2d2

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821

          src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d

          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8

          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8

          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f

          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 99f90b2

          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828

          src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 8f4f4b8

          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1

          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04

          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d

          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167

          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b

          src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6

          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3466/diff

          Testing

          -------

          Unit tests started table.

          Tests in error:

          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

          I need to know how to update/recreate the tarball that is the source for that test.

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/
          -----------------------------------------------------------

          (Updated 2012-01-17 01:17:10.937632)

          Review request for hbase and Michael Stack.

          Summary
          -------

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.
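The difference between the two lookups can be sketched with a sorted map standing in for the meta table. Everything here is illustrative: the class and method names are hypothetical, the sentinel for the last region's empty end key is an assumption, and `floorEntry` only models the "closest row at or before" semantics rather than the real getClosestRowBefore code path.

```java
import java.util.Map;
import java.util.TreeMap;

public class MetaLookupSketch {
    // Hypothetical sentinel standing in for the empty end key of a table's last region.
    static final String END_OF_TABLE = String.valueOf(Character.MAX_VALUE);

    // Current layout: keyed by start row, so the lookup is "closest key at or before the row".
    static String findByStartRow(TreeMap<String, String> metaByStartRow, String row) {
        Map.Entry<String, String> e = metaByStartRow.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    // Proposed layout: keyed by exclusive end row, so the first key past the row is the answer.
    static String findByEndRow(TreeMap<String, String> metaByEndRow, String row) {
        Map.Entry<String, String> e = metaByEndRow.higherEntry(row);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        // Three regions: ["", "g"), ["g", "p"), ["p", end of table).
        TreeMap<String, String> byStart = new TreeMap<>();
        byStart.put("", "region-1");
        byStart.put("g", "region-2");
        byStart.put("p", "region-3");

        TreeMap<String, String> byEnd = new TreeMap<>();
        byEnd.put("g", "region-1");
        byEnd.put("p", "region-2");
        byEnd.put(END_OF_TABLE, "region-3");

        for (String row : new String[] {"apple", "g", "kiwi", "zebra"}) {
            System.out.println(row + " -> " + findByStartRow(byStart, row)
                + " / " + findByEndRow(byEnd, row));
        }
    }
}
```

Both lookups return the same region for every row. The end-row version needs only a forward seek, which is what a plain scanner provides; the sketch uses `higherEntry` rather than `ceilingEntry` because the end row is treated as exclusive.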

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs (updated)
          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java 904e2d2
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 99f90b2
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828
          src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 8f4f4b8
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3466/diff

          Testing
          -------

          Unit tests started table.

          Tests in error:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

          I need to know how to update/recreate the tarball that is the source for that test.

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          On 2012-01-17 00:08:40, Lars Hofhansl wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 826

          > <https://reviews.apache.org/r/3466/diff/2/?file=69001#file69001line826>

          >

          > Why is this needed?

          Alex Newman wrote:

          Ooh, you're right, it's not needed anymore. Removed.

          Whoops, wrong reply. Basically this would require a large change, and we don't support splitting meta anyway. I would argue this is much more efficient.

          • Alex

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/#review4418
          -----------------------------------------------------------


          jiraposter@reviews.apache.org added a comment -

          On 2012-01-17 00:08:40, Lars Hofhansl wrote:

          > This looks pretty good. Thanks for being persistent and patient Alex!

          > Devil is probably still in the details.

          >

          > All the getClosestBefore huh hah can now be removed from HTable/Region[Server]/Store, right?

          Sounds good. Should I remove it or mark it deprecated?

          On 2012-01-17 00:08:40, Lars Hofhansl wrote:

          > src/main/java/org/apache/hadoop/hbase/HRegionInfo.java, line 146

          > <https://reviews.apache.org/r/3466/diff/2/?file=68996#file68996line146>

          >

          > ! and "

          > Although it's not very intuitive.

          > So the encoded region is now?

          > <tableName>!,<endKey>,...

          > <tableName>",<endKey>,...

          >

          > Is that simpler than replacing the separator?

          > That could look like this:

          > <tableName>,<endKey>,...

          > <tableName>/<endKey>,...

          >

          I believe it is much simpler than replacing the separator. In addition, I have a feeling that the format of these keys is going to change after I get this through. There is no reason why we can't move to a fixed-size lhs/rhs, but I wanted to keep this patch simple.
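A small sketch may help show why appending `!` or `"` to the table name keeps a plain scan workable. The byte values are facts (space is 32, `!` is 33, `"` is 34, `#` is 35) and match the start and stop keys in the test failure reported in this review. What the two markers mean is an assumption here: the sketch supposes `"` tags the last region, whose end key is empty, so that it sorts after every `!` row. The row keys and class name are made up.

```java
import java.util.SortedSet;
import java.util.TreeSet;

public class DelimiterOrderSketch {
    // Hypothetical helper: all meta rows for one table, via a single bounded range.
    static SortedSet<String> rowsForTable(TreeSet<String> meta, String table) {
        // ' ' (32) sorts just below '!' (33); '#' (35) sorts just above '"' (34).
        return meta.subSet(table + " ,,", table + "#,,");
    }

    public static void main(String[] args) {
        // String ordering matches byte ordering for these ASCII-only keys.
        TreeSet<String> meta = new TreeSet<>();
        meta.add("TestTable!,g,1");
        meta.add("TestTable!,p,2");
        meta.add("TestTable\",,3");   // assumed: last region, empty end key
        meta.add("TestTable2!,k,4");  // a different table sharing the prefix
        meta.add("Other!,m,5");

        for (String row : rowsForTable(meta, "TestTable")) {
            System.out.println(row);
        }
    }
}
```

Under that assumption the empty end key never has to sort as "greatest" by itself; the marker byte does the ordering, and rows of `TestTable2` fall outside the range because `2` (50) is greater than `#` (35).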

          On 2012-01-17 00:08:40, Lars Hofhansl wrote:

          > src/main/java/org/apache/hadoop/hbase/HRegionInfo.java, line 436

          > <https://reviews.apache.org/r/3466/diff/2/?file=68996#file68996line436>

          >

          > addEncoding does not use the startKey. Could just remove it from there, and hence from here as well so that this method just needs to know the endKey.

          Ooh, right, sorry. Fixed in the patch.

          On 2012-01-17 00:08:40, Lars Hofhansl wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 826

          > <https://reviews.apache.org/r/3466/diff/2/?file=69001#file69001line826>

          >

          > Why is this needed?

          Ooh, you're right, it's not needed anymore. Removed.

          • Alex

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/#review4418
          -----------------------------------------------------------


          Show
          jiraposter@reviews.apache.org added a comment -

          On 2012-01-17 00:08:40, Lars Hofhansl wrote:

          > This looks pretty good. Thanks for being persistent and patient Alex!

          > Devil is probably still in the details.

          >

          > All the getClosestBefore huh hah can now be removed from HTable/Region[Server]/Store, right?

          Alex Newman wrote:

          Sounds good. Should I remove it or mark it deprecated?

          In 0.92 I marked HTableInterface.getRowOrBefore as deprecated, so now we can go ahead and kill it, along with the methods in RegionServer, Store, etc.

          On 2012-01-17 00:08:40, Lars Hofhansl wrote:

          > src/main/java/org/apache/hadoop/hbase/HRegionInfo.java, line 146

          > <https://reviews.apache.org/r/3466/diff/2/?file=68996#file68996line146>

          >

          > ! and "

          > Although it's not very intuitive.

          > So the encoded region is now?

          > <tableName>!,<endKey>,...

          > <tableName>",<endKey>,...

          >

          > Is that simpler than replacing the separator?

          > That could look like this:

          > <tableName>,<endKey>,...

          > <tableName>/<endKey>,...

          >

          Alex Newman wrote:

          I believe it is much simpler than replacing the separator. In addition, I have a feeling that the format of these keys is going to change after I get this through. There is no reason why we can't move to a fixed-size lhs/rhs, but I wanted to keep this patch simple.

          That's fine then.
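
          To illustrate the ordering argument behind the `!` and `"` markers, here is a minimal sketch. It is not taken from the patch. It assumes that `!` marks a region with a real end key and `"` marks the last region of a table (empty end key), which this thread does not confirm. The key strings and the class name MarkerOrderSketch are hypothetical. The bounds `TestTable ` (last char 32) and `TestTable#` (last char 35) are the ones reported in the failing test output.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of how marker bytes affect row-key ordering. */
class MarkerOrderSketch {

    /** Unsigned lexicographic byte comparison, the ordering assumed for row keys here. */
    static int compareKeys(String a, String b) {
        return Arrays.compareUnsigned(
            a.getBytes(StandardCharsets.US_ASCII),
            b.getBytes(StandardCharsets.US_ASCII));
    }

    /** Returns a sorted copy of the given keys. */
    static List<String> sortKeys(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(MarkerOrderSketch::compareKeys);
        return sorted;
    }

    /** True if startRow <= key < stopRow, i.e. the key would be seen by a scan over that range. */
    static boolean inScanRange(String key, String startRow, String stopRow) {
        return compareKeys(key, startRow) >= 0 && compareKeys(key, stopRow) < 0;
    }

    public static void main(String[] args) {
        // Assumed layout: <tableName><marker>,<endKey>,<id>
        List<String> keys = Arrays.asList(
            "TestTable\",,3",   // assumed: last region, empty end key
            "TestTable!,p,2",
            "TestTable!,g,1");
        for (String key : sortKeys(keys)) {
            System.out.println(key + " in range: "
                + inScanRange(key, "TestTable ", "TestTable#"));
        }
    }
}
```

          Because ' ' (32) < '!' (33) < '"' (34) < '#' (35), every key of the table falls between the two scan bounds, and under the stated assumption the empty-end-key region sorts after all the others.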

          On 2012-01-17 00:08:40, Lars Hofhansl wrote:

          > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 826

          > <https://reviews.apache.org/r/3466/diff/2/?file=69001#file69001line826>

          >

          > Why is this needed?

          Alex Newman wrote:

          Ooh, you're right, it's not needed anymore. Removed.

          Alex Newman wrote:

          Whoops, wrong reply. Basically this would require a large change, and we don't support splitting meta anyway. I would argue this is much more efficient.

          Yeah.. I hope we never ever have to go back and require META splits.

          • Lars

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/#review4418
          -----------------------------------------------------------

          On 2012-01-17 01:17:10, Alex Newman wrote:

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/
          -----------------------------------------------------------

          (Updated 2012-01-17 01:17:10)

          Review request for hbase and Michael Stack.

          Summary
          -------

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.
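
          The difference between the two lookups can be sketched with a sorted map standing in for the meta table. This is a toy model under stated assumptions, not HBase code: the class MetaLookupSketch, the single-letter region names, and the LAST sentinel (a stand-in for the empty end key of the last region) are all hypothetical, and real meta row keys also carry the table name and an id.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a meta table: sorted keys mapped to region names. */
class MetaLookupSketch {

    // Hypothetical stand-in for "end of table"; it only needs to sort
    // after every real row key used in this sketch.
    static final String LAST = "\uffff";

    /** Meta keyed by region start key: A=["", g), B=[g, p), C=[p, end). */
    static TreeMap<String, String> startKeyMeta() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("", "A");
        meta.put("g", "B");
        meta.put("p", "C");
        return meta;
    }

    /** The same three regions keyed by their (exclusive) end key. */
    static TreeMap<String, String> endKeyMeta() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("g", "A");
        meta.put("p", "B");
        meta.put(LAST, "C");
        return meta;
    }

    /** Start-key layout: greatest key <= row, i.e. a "closest row before" lookup. */
    static String locateByStartKey(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    /** End-key layout: first key strictly greater than row, i.e. a forward seek. */
    static String locateByEndKey(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.higherEntry(row);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        for (String row : new String[] {"a", "g", "h", "p", "z"}) {
            System.out.println(row + " -> start-key layout: "
                + locateByStartKey(startKeyMeta(), row)
                + ", end-key layout: " + locateByEndKey(endKeyMeta(), row));
        }
    }
}
```

          Both layouts resolve each row to the same region. The end-key layout gets there by seeking forward from the row, which is the direction a scanner already moves.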

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).
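
          The prewarming case can be sketched the same way. With end-key ordering, the entry that locates the row and the entries for the following regions come out of one forward pass. Again this is a toy model: PrewarmSketch, its region names, and the "\uffff" sentinel for the last region's empty end key are hypothetical, and a sorted map stands in for a meta scanner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model of prewarming a client region cache from an end-key-ordered meta. */
class PrewarmSketch {

    /** Regions A=["", g), B=[g, p), C=[p, end), keyed by exclusive end key. */
    static TreeMap<String, String> endKeyMeta() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("g", "A");
        meta.put("p", "B");
        meta.put("\uffff", "C"); // hypothetical sentinel for the empty end key
        return meta;
    }

    /**
     * Returns the region containing row followed by up to limit - 1 of its
     * successors, all read in a single forward pass over the sorted keys.
     */
    static List<String> prefetch(TreeMap<String, String> meta, String row, int limit) {
        List<String> regions = new ArrayList<>();
        for (String region : meta.tailMap(row, false).values()) {
            if (regions.size() == limit) {
                break;
            }
            regions.add(region);
        }
        return regions;
    }

    public static void main(String[] args) {
        System.out.println(prefetch(endKeyMeta(), "h", 2));
    }
}
```

          With the start-key layout, the same result would need a backward lookup to find the containing region first, and only then a forward scan.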

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.
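
          The dependency on HBASE-2531 can be illustrated with a small sketch. If the on-disk directory name is derived by hashing the region name, then renaming a region from start-key form to end-key form yields a different directory name for the same data. MD5 is used here purely as a stand-in; this sketch makes no claim about which hash HBase applies. The class RegionDirNameSketch and the example region names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical illustration: a directory name derived from a hash of the region name. */
class RegionDirNameSketch {

    /** Hex-encoded MD5 of the region name (MD5 is a stand-in, not necessarily HBase's choice). */
    static String encode(String regionName) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(regionName.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same region, named by start key and then by end key.
        String byStartKey = "TestTable,aaa,1326758920000";
        String byEndKey = "TestTable,mmm,1326758920000";
        System.out.println(byStartKey + " -> " + encode(byStartKey));
        System.out.println(byEndKey + " -> " + encode(byEndKey));
    }
}
```

          If the directory name were stored in the region name itself, as HBASE-2531 proposes, a migration could change the key format without the derived directory name moving.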

          This addresses bug HBASE-2600.

          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs
          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java 904e2d2
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 99f90b2
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828
          src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 8f4f4b8
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3466/diff

          Testing
          -------

          Unit tests started table.

          Tests in error:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

          I need to know how to update/recreate the tar ball which is the source for that test.

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/
          -----------------------------------------------------------

          (Updated 2012-01-17 02:37:58.199761)

          Review request for hbase and Michael Stack.

          Changes
          -------

          Removed the deprecated API.

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/HConstants.java 904e2d2
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 99f90b2
          src/main/java/org/apache/hadoop/hbase/client/HTable.java 8aeccb6
          src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 8af8c28
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java a5c198f
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828
          src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java 7a7b896
          src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java b47423c
          src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java c0a4184
          src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 25ae9a8
          src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 8f4f4b8
          src/main/java/org/apache/hadoop/hbase/regionserver/GetClosestRowBeforeTracker.java 3a26bbb
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 48f6d77
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 79373e1
          src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 50e7fe0
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java a3850e5
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 636e533
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java f93c81d
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 3fa5d41
          src/main/java/org/apache/hadoop/hbase/thrift/generated/Hbase.java 9e31c61
          src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift 5821d31
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java ab80020
          src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java dacb936
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 33c78ab
          src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3466/diff

          Testing
          -------

          Unit tests started table.

          Tests in error:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

          I need to know how to update/recreate the tar ball which is the source for that test.

          Thanks,

          Alex

          Alex Newman added a comment -

          Added a patch that removes the old API.

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/
          -----------------------------------------------------------

          (Updated 2012-01-17 02:40:00.376730)

          Review request for hbase, Michael Stack and Lars Hofhansl.

          Changes
          -------

          Added Lars as a reviewer.

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs


          src/main/java/org/apache/hadoop/hbase/HConstants.java 904e2d2
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 99f90b2
          src/main/java/org/apache/hadoop/hbase/client/HTable.java 8aeccb6
          src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 8af8c28
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java a5c198f
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828
          src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java 7a7b896
          src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java b47423c
          src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java c0a4184
          src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 25ae9a8
          src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 8f4f4b8
          src/main/java/org/apache/hadoop/hbase/regionserver/GetClosestRowBeforeTracker.java 3a26bbb
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 48f6d77
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 79373e1
          src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 50e7fe0
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java a3850e5
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 636e533
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java f93c81d
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 3fa5d41
          src/main/java/org/apache/hadoop/hbase/thrift/generated/Hbase.java 9e31c61
          src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift 5821d31
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java ab80020
          src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java dacb936
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 33c78ab
          src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3466/diff

          Testing
          -------

          Unit tests started table.

          Tests in error:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

          I need to know how to update/recreate the tar ball which is the source for that test.

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/#review4423
          -----------------------------------------------------------

          I'm slightly worried that this patch has gotten overly big. Maybe we should break the removal of the old API out into its own patch? Or is this fine?

          • Alex


          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12510791/0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 31 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -145 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 81 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestImportTsv

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/784//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/784//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/784//console

          This message is automatically generated.

          Alex Newman added a comment -

          I attached the results of my jenkins run.

          Ted Yu added a comment -

          TestMetaMigrationRemovingHTD is the only failed test.

          Alex Newman added a comment -

          @Zhihong Indeed, I need Stack's help fixing that one.

          Alex Newman added a comment -

          I realized that manually editing the generated thrift files might not be the best approach. Any suggestions?

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/#review4427
          -----------------------------------------------------------

          I love all that removed code!!
          I think it's fine to keep it in this patch (in fact, removing all that code is the main reason why we're doing this, right?)

          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
          <https://reviews.apache.org/r/3466/#comment9937>

          This still has the startKey, but it is not needed. It seems we can simplify the code further by requiring only the endKey here.

          • Lars


          Show
          jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3466/#review4427 ----------------------------------------------------------- I love all that removed code!! I think it's fine to have it with patch (in fact removing all that code and the main reason why we're doing this, right?) src/main/java/org/apache/hadoop/hbase/HRegionInfo.java < https://reviews.apache.org/r/3466/#comment9937 > This still has the startKey, but it is not needed. Seems we can simplify the code further by only requiring the endKey here Lars On 2012-01-17 02:40:00, Alex Newman wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3466/ ----------------------------------------------------------- (Updated 2012-01-17 02:40:00) Review request for hbase, Michael Stack and Lars Hofhansl. Summary ------- This is an idea that Ryan and I have been kicking around on and off for a while now. If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row). If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region. This issue is about changing the way we name regions. If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward). 
Converting to the new method, we'd have to run a migration on startup changing the content in meta. Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change. This addresses bug HBASE-2600 . https://issues.apache.org/jira/browse/HBASE-2600 Diffs ----- src/main/java/org/apache/hadoop/hbase/HConstants.java 904e2d2 src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821 src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8 src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8 src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 99f90b2 src/main/java/org/apache/hadoop/hbase/client/HTable.java 8aeccb6 src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 8af8c28 src/main/java/org/apache/hadoop/hbase/client/HTablePool.java a5c198f src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828 src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java 7a7b896 src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java b47423c src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java c0a4184 src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 25ae9a8 src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 8f4f4b8 
src/main/java/org/apache/hadoop/hbase/regionserver/GetClosestRowBeforeTracker.java 3a26bbb src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 48f6d77 src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 79373e1 src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 50e7fe0 src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java a3850e5 src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 636e533 src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1 src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java f93c81d src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04 src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 3fa5d41 src/main/java/org/apache/hadoop/hbase/thrift/generated/Hbase.java 9e31c61 src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift 5821d31 src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java ab80020 src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java dacb936 src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167 src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 33c78ab src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6 src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5 Diff: https://reviews.apache.org/r/3466/diff Testing ------- Unit tests started table. Tests in error: org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META. I need to know how to update/recreate the tar ball which is the source for that test. Thanks, Alex
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/
          -----------------------------------------------------------

          (Updated 2012-01-17 20:35:34.796686)

          Review request for hbase, Michael Stack and Lars Hofhansl.

          Summary
          -------

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.
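A minimal sketch of the lookup difference, using an in-memory sorted map as a stand-in for the meta table. Everything here is an assumption made for illustration: the class and method names are hypothetical, keys are plain strings rather than real region-name bytes, end keys are treated as exclusive, and the last region's empty end row is modelled with a made-up sentinel.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the meta table: a sorted map from a region boundary key to a region label.
final class MetaLookupSketch {
    // Assumed sentinel that sorts after every real row; it models the empty end row of the last region.
    static final String LAST = "\uffff";

    // Start-key layout: the containing region is the closest entry AT OR BEFORE the row,
    // which is the backward search that getClosestRowBefore performs.
    static String locateByStartKey(TreeMap<String, String> metaByStart, String row) {
        Map.Entry<String, String> e = metaByStart.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    // End-key layout: the containing region is the first entry strictly AFTER the row
    // (end keys assumed exclusive), i.e. the first result of a forward scan.
    static String locateByEndKey(TreeMap<String, String> metaByEnd, String row) {
        Map.Entry<String, String> e = metaByEnd.higherEntry(row);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        // Three regions: ["", "g"), ["g", "p"), ["p", end of table)
        TreeMap<String, String> byStart = new TreeMap<>();
        byStart.put("", "r1");
        byStart.put("g", "r2");
        byStart.put("p", "r3");

        TreeMap<String, String> byEnd = new TreeMap<>();
        byEnd.put("g", "r1");
        byEnd.put("p", "r2");
        byEnd.put(LAST, "r3");

        for (String row : new String[] {"a", "g", "k", "z"}) {
            System.out.println(row + " -> " + locateByStartKey(byStart, row)
                + " / " + locateByEndKey(byEnd, row));
        }
    }
}
```

Both layouts resolve a row to the same region; the difference is only the direction of the search, which is the point of the proposal.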

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).
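To illustrate why prewarming becomes nearly free, here is a hedged sketch in which one forward pass yields both the wanted region and the regions that follow it. The map is again a hypothetical stand-in for meta keyed by end row, and `locateAndPrefetch` is an invented name, not an HBase call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class PrewarmSketch {
    // Returns the region containing `row` plus up to `extra` following regions,
    // all taken from one forward pass over an end-key-ordered map.
    static List<String> locateAndPrefetch(TreeMap<String, String> metaByEnd, String row, int extra) {
        List<String> regions = new ArrayList<>();
        // tailMap(row, false) starts strictly after `row`, mirroring a scan opened at the row.
        for (String region : metaByEnd.tailMap(row, false).values()) {
            regions.add(region);
            if (regions.size() > extra) {
                break; // the first hit is the wanted region, the rest warm the cache
            }
        }
        return regions;
    }
}
```

With the start-key layout the same result needs a backward lookup first and only then a forward scan.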

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.
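The dependency on HBASE-2531 can be seen in a small sketch: if the directory name is derived by hashing the region name, renaming the region changes the derived directory. CRC32 is used below only because it ships with the JDK; it is an assumption standing in for whatever hash HBase applies, and the region names and timestamp are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class EncodedNameSketch {
    // Stand-in 32-bit hash of a region name. This is NOT claimed to be the hash
    // HBase uses for encoded names; it only shows the dependency on the name.
    static long encode(String regionName) {
        CRC32 crc = new CRC32();
        crc.update(regionName.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) {
        // The same region [a, m), with the same made-up id, named two ways.
        String byStartKey = "TestTable,a,1326832534796";
        String byEndKey = "TestTable,m,1326832534796";
        System.out.println(Long.toHexString(encode(byStartKey)));
        System.out.println(Long.toHexString(encode(byEndKey)));
    }
}
```

If the directory name were stored in the region name instead of being recomputed from it, a change of name format would not move the data on disk.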

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/HConstants.java 904e2d2
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 99f90b2
          src/main/java/org/apache/hadoop/hbase/client/HTable.java 8aeccb6
          src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 8af8c28
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java a5c198f
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828
          src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java 7a7b896
          src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java b47423c
          src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java c0a4184
          src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 25ae9a8
          src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 8f4f4b8
          src/main/java/org/apache/hadoop/hbase/regionserver/GetClosestRowBeforeTracker.java 3a26bbb
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 48f6d77
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 79373e1
          src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 50e7fe0
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java a3850e5
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 636e533
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java f93c81d
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 3fa5d41
          src/main/java/org/apache/hadoop/hbase/thrift/generated/Hbase.java 9e31c61
          src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift 5821d31
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java ab80020
          src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java dacb936
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 33c78ab
          src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3466/diff

          Testing
          -------

          Unit tests started table.

          Tests in error:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

          I need to know how to update or recreate the tarball that is the source for that test.

          Thanks,

          Alex

          Ted Yu added a comment -

          For HRegionInfo.createRegionName():

            public static byte [] createRegionName(final byte [] tableName,
                final byte [] startKey, final long regionid, boolean newFormat) {
          

          I don't see it deprecated in 0.92.

          I suggest creating a sub-task in 0.92.1 for this JIRA which deprecates the APIs whose semantics are changed in this JIRA.

          Some (though few) users may utilize the above API in their codebase.
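One possible shape for such a deprecation, sketched with a hypothetical helper class rather than the real HRegionInfo: the old entry point stays but is marked deprecated, and a new method whose name states the new semantics is added beside it. The `<table>,<key>,<id>` layout is assumed for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the real HRegionInfo: it only illustrates the deprecation shape.
final class RegionNameApiSketch {
    /**
     * Old-style entry point kept for source compatibility.
     * @deprecated the key argument changes meaning; use {@link #createRegionNameFromEndKey}.
     */
    @Deprecated
    static byte[] createRegionName(byte[] tableName, byte[] startKey, long regionId) {
        return join(tableName, startKey, regionId);
    }

    /** New-style entry point whose parameter name states the new semantics. */
    static byte[] createRegionNameFromEndKey(byte[] tableName, byte[] endKey, long regionId) {
        return join(tableName, endKey, regionId);
    }

    // Assumed layout for illustration: <table>,<key>,<id>
    private static byte[] join(byte[] tableName, byte[] key, long regionId) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(tableName, 0, tableName.length);
        out.write(',');
        out.write(key, 0, key.length);
        out.write(',');
        byte[] id = Long.toString(regionId).getBytes(StandardCharsets.UTF_8);
        out.write(id, 0, id.length);
        return out.toByteArray();
    }
}
```

Callers compiled against the old signature keep working and get a compiler warning that points them at the replacement.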

          Ted Yu added a comment -

          Patch rebased for the latest TRUNK.

          Ted Yu added a comment -

          Still need to understand the test failure.
          Since TestMetaMigrationRemovingHTD migrates from 0.90 HBase, I wonder if the test itself should be maintained in TRUNK.
          Migration from 0.90 to 0.94 isn't supported.

          Alex Newman added a comment -

          I will in fact fix this, but I am going to need stack's help. I know what the issue is.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12510912/2600-trunk-01-17.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 24 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -145 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 81 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/798//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/798//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/798//console

          This message is automatically generated.

          stack added a comment -

          I like what Ted says above about the need to deprecate a call before removing it.

          Here's some feedback. I'm half-way done. Patch is shaping up nicely.

          What is the change in TestHRegionInfo? You change a startkey to an endkey? I'm not sure I follow why this is done.

          I like the removal of the testGetClosestBefore from TestMinVersions.java
          and of TestGetClosestAtOrBefore.java (hurray!)

          I love all the removed code.

          So in HConstants, ZEROS is deprecated but NINES is not? How is NINES still used (later I see it used but I'm not sure what it's doing)?

          I like removal of META_ROW_DELIMITER

          Spacing is wacky here:

          -  this.startKey, this.id,
          -  !HTableDescriptor.isMetaTable(tableNameAsBytes));
          -  return Bytes.toString(nameAsBytes);
          +  this.endKey,
          +  Long.toString(this.id).getBytes(),
          +  !HTableDescriptor.isMetaTable(tableNameAsBytes));
          +  return Bytes.toStringBinary(nameAsBytes);

          What happens if the last region in a table is missing for whatever reason?

          Is the javadoc on getStartRow in HTableDescriptor right? It says it's returning the first
          possible region that could match a tablename + searchrow? Is it the first possible row
          in meta?

          Lars Hofhansl added a comment -

          Is HRegionInfo part of the public API?

          Alex Newman added a comment -

          > What is the change in TestHRegionInfo? You change a startkey to an endkey? I'm not sure I follow why this is done.
          It's because region-name creation now uses the end key instead of the start key.

          > So in HConstants, ZEROS is deprecated but NINES is not? How is NINES used still (later I see it used but I'm not sure what its doing?)
          Fixed

          > Spacing
          fixed

          > What if the last region is missing
          The stoprow should catch it.

          >Is the javadoc on getStartRow in HTableDescriptor right? Its says its returning first possible region that could match a tablename + searchrow? Is it first possible row in meta?
          Oh, you're right. Fixed.

          Patch inbound

          Alex Newman added a comment -

          @Lars it is public; I have no idea why.

          stack added a comment -

          HRI can be exposed to the client via HTable (as part of HRegionLocation), in HBaseAdmin when doing a closeRegion (unnecessary; we should deprecate it), and when we do getTableRegions. We could redo this and hide HRI. HRegionLocation would become name and servername only instead of HRI.

          stack added a comment -

          Where is the migration issue, Alex?

          Alex Newman added a comment -

          I was going to handle it as a part of this one. I can create a specialized migration issue if you think that's better.

          Alex Newman added a comment -

          rebased

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/
          -----------------------------------------------------------

          (Updated 2012-01-18 01:22:37.824212)

          Review request for hbase, Michael Stack and Lars Hofhansl.

          Changes
          -------

          Stack's changes

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/HConstants.java 8370ef8
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 99f90b2
          src/main/java/org/apache/hadoop/hbase/client/HTable.java aec7af2
          src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 8af8c28
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java a5c198f
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828
          src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java 7a7b896
          src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java b47423c
          src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java c0a4184
          src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 25ae9a8
          src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 4307d89
          src/main/java/org/apache/hadoop/hbase/regionserver/GetClosestRowBeforeTracker.java 3a26bbb
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 48f6d77
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 79373e1
          src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 50e7fe0
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java a3850e5
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 636e533
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java f93c81d
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 3fa5d41
          src/main/java/org/apache/hadoop/hbase/thrift/generated/Hbase.java 9e31c61
          src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift 5821d31
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 4b2b97c
          src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java dacb936
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 33c78ab
          src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3466/diff

          Testing
          -------

          Unit tests started table.

          Tests in error:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

          I need to know how to update or recreate the tarball that is the source for that test.

          Thanks,

          Alex

          stack added a comment -

          Why does this have to be hardcoded?

          -        return locateRegionInMeta(HConstants.ROOT_TABLE_NAME, tableName, row,
          -            useCache, metaRegionLock);
          +
          +        //HARD CODED TO POINT TO THE FIRST META TABLE
          +        return locateRegionInMeta(HConstants.ROOT_TABLE_NAME,
          +                                  HConstants.META_TABLE_NAME,
          +                                  HConstants.EMPTY_BYTE_ARRAY,
          +                                  useCache,
          +                                  metaRegionLock);
          

          It works right?

          I'm looking at NINES in HConnectionManager... we don't need this anymore now we are scanning in the 'natural' direction?

          Is this enough?

          +            // We always try to get two rows just in case one of them is a split.
          +            Result[] result = server.next(scannerId, 2);
          

          What if the split has split? Then you'd have two offlined regions in meta... so you'd have to scan a third to get the live one (and so on... if the split is split is split....)
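The split-of-a-split concern can be sketched in plain Java. This is only an illustration, not code from the patch: the `Region` class, the comma-delimited key layout and the `firstLiveRegion` helper are all hypothetical, and a `TreeMap` stands in for the sorted meta table. It assumes regions are keyed by end key, so an offlined parent and its last daughter share the same end key and sort next to each other. Rather than fetching a fixed two rows, the loop keeps scanning forward until it reaches a region that is not offline.

```java
import java.util.TreeMap;

public class SplitChainScanSketch {
    /** Hypothetical stand-in for a meta row: a region label plus an offline flag. */
    static final class Region {
        final String label;
        final boolean offline;

        Region(String label, boolean offline) {
            this.label = label;
            this.offline = offline;
        }
    }

    /**
     * Scans forward from startKey and returns the first region that is not
     * offline, however many offlined split parents precede it.
     * Simplification: a row exactly equal to an end key is not handled,
     * and table boundaries are not checked.
     */
    static Region firstLiveRegion(TreeMap<String, Region> meta, String startKey) {
        for (Region r : meta.tailMap(startKey, true).values()) {
            if (!r.offline) {
                return r;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumed key layout: <table>,<endKey>,<id>.
        // Region [a,m) split into [a,g) and [g,m); then [g,m) split again
        // into [g,j) and [j,m). Both parents are still in meta, offlined.
        TreeMap<String, Region> meta = new TreeMap<>();
        meta.put("t,g,002", new Region("[a,g)", false));
        meta.put("t,j,004", new Region("[g,j)", false));
        meta.put("t,m,001", new Region("[a,m) parent", true));
        meta.put("t,m,003", new Region("[g,m) parent", true));
        meta.put("t,m,005", new Region("[j,m)", false));

        // Row "k" needs three rows scanned before the live region shows up.
        System.out.println(firstLiveRegion(meta, "t,k").label);
    }
}
```

In this toy layout the lookup for row `k` passes two offlined rows before it reaches the live region, so a fixed batch of two would not be enough; an open-ended scan (or a retry with a larger batch) would be.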

          Is this comment right?

          -      // <tableName>,<startKey>,<regionIdTimeStamp>/encodedName/
          +      // <tableName>,<endKey>,<regionIdTimeStamp>/encodedName/
          

          Should there be a '!' in there?

          This I think I follow, but it's kind of an important change so it should be crystal clear:

          +  // It should say, the tablename encoded in the region ends with !,
          +  // but the last region's tablename ends with "
          +  public static final int END_OF_TABLE_NAME = 33;  // The ascii for !
          +  public static final int END_OF_TABLE_NAME_FOR_EMPTY_ENDKEY =
          +          END_OF_TABLE_NAME + 1;
          

          So, last region in a table has a '!' delimiter between it and its empty endrow rather than a ','?

          Is the comment above complete? What's the '"' about?

          Oh, I see. Let's discuss the actual characters used. Hopefully they can be better ones than '!' and '"' (but this is minor).
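To make the role of the two delimiters concrete, here is a small Java sketch of why ASCII 33 ('!') and 34 ('"') would give the desired sort order. The `regionName` layout below is an assumption for illustration only, not the format used in the patch; only the two constant values come from the quoted diff, and the 32/35 scan bounds mirror the start and stop keys reported in the failing test message.

```java
import java.util.TreeSet;

public class DelimiterOrderSketch {
    static final char END_OF_TABLE_NAME = 33;                  // '!'
    static final char END_OF_TABLE_NAME_FOR_EMPTY_ENDKEY = 34; // '"'

    /**
     * Hypothetical region-name layout:
     *   <table>!<endKey>,<id>  for regions with a non-empty end key
     *   <table>",<id>          for the last region (empty end key)
     */
    static String regionName(String table, String endKey, String id) {
        if (endKey.isEmpty()) {
            return table + END_OF_TABLE_NAME_FOR_EMPTY_ENDKEY + "," + id;
        }
        return table + END_OF_TABLE_NAME + endKey + "," + id;
    }

    public static void main(String[] args) {
        TreeSet<String> meta = new TreeSet<>();
        meta.add(regionName("TestTable", "", "3"));
        meta.add(regionName("TestTable", "g", "1"));
        meta.add(regionName("TestTable", "p", "2"));

        // Because '"' (34) sorts after '!' (33), the empty-end-key region
        // lands after every other region of the same table.
        System.out.println(meta.last());

        // Bounds one below '!' and one above '"' bracket the whole table.
        String start = "TestTable" + (char) (END_OF_TABLE_NAME - 1);
        String stop = "TestTable" + (char) (END_OF_TABLE_NAME_FOR_EMPTY_ENDKEY + 1);
        System.out.println(meta.subSet(start, stop).size());
    }
}
```

Whatever characters end up being chosen, the property that matters in this sketch is that the empty-end-key delimiter is the next code point after the regular one, and that table names cannot contain either of them.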

          You do this a bunch in your patch:

          -    return createRegionName(tableName, startKey, Bytes.toBytes(id), newFormat);
          +      final byte [] endKey, final String id, boolean newFormat) {
          +    return createRegionName(tableName, endKey, Bytes.toBytes(id), newFormat);
             }
          +    /**
          +     * Make a region name of passed parameters.
          +     *
          

          I'm referring to the spacing. It should be two spaces, not four.

          How is this so?

          -      final int metalength = 7; // '.META.' length
          +      final int metalength = 8; // '.META.' length
          

          Update the comment to explain 8 I'd say.

          stack added a comment -

          @Alex I'd think a separate issue would work. Otherwise this becomes a monster issue. We can commit this w/o the migration technically... the migration would just have to follow soon after. But we'd need a viable migration before this could be committed; we can work that out distinct from what is going on here. Good stuff.

          Alex Newman added a comment -

          I think we should be able to hammer out the migration next week. I would rather not commit (and would put this on hold) until we have a migration.

          Alex Newman added a comment -

          Remerged

          jiraposter@reviews.apache.org added a comment -

          On 2012-01-17 19:54:30, Lars Hofhansl wrote:

          > I love all that removed code!!

          > I think it's fine to have it with the patch (in fact, removing all that code is the main reason why we're doing this, right?)

          We need this patch also to get rid of the perverse backtracking.

          - Michael

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/#review4427
          -----------------------------------------------------------

          On 2012-01-18 01:22:37, Alex Newman wrote:

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/
          -----------------------------------------------------------

          (Updated 2012-01-18 01:22:37)

          Review request for hbase, Michael Stack and Lars Hofhansl.

          Summary
          -------

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.
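The difference between the two naming schemes can be sketched with a sorted map in plain Java. This is a toy model under stated assumptions, not HBase code: a `TreeMap` stands in for the meta table, the helper names are hypothetical, and `"\uffff"` is used as a stand-in for the empty end key of the last region. With start-key naming, the containing region is the greatest key at or before the row (a backward lookup, as getClosestRowBefore does); with end-key naming, it is the first key after the row, which is what a forward scan returns first.

```java
import java.util.Map;
import java.util.TreeMap;

public class MetaLookupSketch {
    // Keyed by start key: find the greatest key <= row (backward lookup).
    static String locateByStartKey(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    // Keyed by (exclusive) end key: find the first key > row (forward scan).
    static String locateByEndKey(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.higherEntry(row);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        // Three regions: ["", "g"), ["g", "p"), ["p", end of table).
        TreeMap<String, String> byStart = new TreeMap<>();
        byStart.put("", "region-1");
        byStart.put("g", "region-2");
        byStart.put("p", "region-3");

        TreeMap<String, String> byEnd = new TreeMap<>();
        byEnd.put("g", "region-1");
        byEnd.put("p", "region-2");
        byEnd.put("\uffff", "region-3"); // stand-in for the empty end key

        // Both lookups agree on the containing region for every row.
        for (String row : new String[] {"a", "g", "k", "z"}) {
            System.out.println(row + " -> " + locateByStartKey(byStart, row)
                + " / " + locateByEndKey(byEnd, row));
        }
    }
}
```

Both lookups return the same region; the point of the proposal is that the second one maps onto an ordinary forward scan, so the same scanner can keep reading subsequent rows to prewarm a client cache.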

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs
          -----

          src/main/java/org/apache/hadoop/hbase/HConstants.java 8370ef8
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 99f90b2
          src/main/java/org/apache/hadoop/hbase/client/HTable.java aec7af2
          src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 8af8c28
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java a5c198f
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828
          src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java 7a7b896
          src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java b47423c
          src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java c0a4184
          src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 25ae9a8
          src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 4307d89
          src/main/java/org/apache/hadoop/hbase/regionserver/GetClosestRowBeforeTracker.java 3a26bbb
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 48f6d77
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 79373e1
          src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 50e7fe0
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java a3850e5
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 636e533
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java f93c81d
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java 3fa5d41
          src/main/java/org/apache/hadoop/hbase/thrift/generated/Hbase.java 9e31c61
          src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift 5821d31
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 4b2b97c
          src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java dacb936
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 33c78ab
          src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3466/diff

          Testing
          -------

          Unit tests started table.

          Tests in error:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

          I need to know how to update/recreate the tar ball which is the source for that test.

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/
          -----------------------------------------------------------

          (Updated 2012-01-18 01:45:19.120980)

          Review request for hbase, Michael Stack and Lars Hofhansl.


          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs (updated)


          src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServer.java c3be6e3

          Diff: https://reviews.apache.org/r/3466/diff

          Testing
          -------

          Unit tests started table.

          Tests in error:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

          I need to know how to update/recreate the tar ball which is the source for that test.

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/
          -----------------------------------------------------------

          (Updated 2012-01-18 01:54:42.329073)

          Review request for hbase, Michael Stack and Lars Hofhansl.


          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs (updated)


          src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServer.java c3be6e3

          Diff: https://reviews.apache.org/r/3466/diff

          Testing
          -------

          Unit tests started table.

          Tests in error:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

          I need to know how to update/recreate the tar ball which is the source for that test.

          Thanks,

          Alex

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12510929/0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -145 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/801//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/801//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/801//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12510930/0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8.1
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -145 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/802//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/802//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/802//console

          This message is automatically generated.

          Lars Hofhansl added a comment -

          HadoopQA says there were no changes?

          Lars Hofhansl added a comment -

          @Stack:

          "We need this patch also to get rid of the perverse backtracking."

          What is this "backtracking"? Is that different from the code that Alex removed from Store.java?

          Alex Newman added a comment -

          Fixed spacing.

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/
          -----------------------------------------------------------

          (Updated 2012-01-18 19:19:27.118173)

          Review request for hbase, Michael Stack and Lars Hofhansl.

          Changes
          -------

          Fixed the spacing and got most of stack's comments in.

          Summary
          -------

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.
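
A rough sketch of the difference between the two lookups, using a plain sorted map as a stand-in for meta. This is not HBase code: the class, the sample regions, and the `LAST_END` sentinel for the last region's end key are all made up for illustration. With start keys, finding the containing region means looking backward for the greatest key at or before the row; with end keys, it is simply the first key after the row, which is what a forward scan gives you.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of a meta table: a sorted map from a region boundary key to a region label.
// Not HBase code; all names and the LAST_END sentinel are assumptions for this sketch.
class RegionLookupSketch {

    // Stand-in end key for the last region so that it sorts after every real row.
    static final String LAST_END = "\uffff";

    // Regions: ["", "g") = region-A, ["g", "p") = region-B, ["p", end) = region-C.
    static TreeMap<String, String> sampleMetaByStart() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("", "region-A");
        meta.put("g", "region-B");
        meta.put("p", "region-C");
        return meta;
    }

    // The same three regions, keyed by their (exclusive) end row instead.
    static TreeMap<String, String> sampleMetaByEnd() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("g", "region-A");
        meta.put("p", "region-B");
        meta.put(LAST_END, "region-C");
        return meta;
    }

    // Start-key layout: the containing region is the greatest key <= row (a backward lookup).
    static String locateByStartKey(TreeMap<String, String> metaByStart, String row) {
        Map.Entry<String, String> e = metaByStart.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    // End-key layout: the containing region is the first key strictly greater than row,
    // i.e. the first entry a forward scan positioned at row would return.
    static String locateByEndKey(TreeMap<String, String> metaByEnd, String row) {
        Map.Entry<String, String> e = metaByEnd.higherEntry(row);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        for (String row : new String[] {"apple", "g", "hello", "zebra"}) {
            System.out.println(row + " -> " + locateByStartKey(sampleMetaByStart(), row)
                + " / " + locateByEndKey(sampleMetaByEnd(), row));
        }
    }
}
```

Both methods return the same region for every row; the point is only that the end-key version needs nothing but a forward seek. The offlined-parent case mentioned above is not modelled here.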

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.
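
To illustrate why the name format matters for the filesystem layout, here is a small sketch. It assumes, purely for illustration, that the directory name is a 32-bit checksum of the full region name; `CRC32` is a stand-in and is not claimed to be the hash HBase actually uses, and the table name, rows, and id are invented.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Sketch only: shows that a directory name derived from the region name changes
// when the boundary row inside that name changes. CRC32 is a stand-in hash.
class RegionDirNameSketch {

    // Builds a region name in the tablename,boundaryrow,id shape discussed in this issue.
    static String regionName(String table, String boundaryRow, long id) {
        return table + "," + boundaryRow + "," + id;
    }

    // Stand-in for the real encoding: a 32-bit checksum over the full region name.
    static long encode(String regionName) {
        CRC32 crc = new CRC32();
        crc.update(regionName.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) {
        long id = 1326914367000L; // hypothetical creation timestamp used as the id
        String byStartRow = regionName("TestTable", "g", id); // region ["g", "p") named by start row
        String byEndRow = regionName("TestTable", "p", id);   // same region named by end row
        System.out.println(byStartRow + " -> " + encode(byStartRow));
        System.out.println(byEndRow + " -> " + encode(byEndRow));
    }
}
```

The two printed values differ, so under this assumption a region renamed from start row to end row would no longer map to its existing directory unless the directory name is carried in the region name itself, which is what HBASE-2531 proposes.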

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs (updated)


          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/HConstants.java 8370ef8
          src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java d857538
          src/main/java/org/apache/hadoop/hbase/client/HTable.java 57605e6
          src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 784fdc2
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java a5c198f
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828
          src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java 7a7b896
          src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java b1b5a78
          src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java c0a4184
          src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 0431444
          src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 4307d89
          src/main/java/org/apache/hadoop/hbase/regionserver/GetClosestRowBeforeTracker.java 3a26bbb
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java c7cc402
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 5cb606f
          src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 50e7fe0
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java a3850e5
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 636e533
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java 3919985
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java b6a6349
          src/main/java/org/apache/hadoop/hbase/thrift/generated/Hbase.java 9e31c61
          src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift 5821d31
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95ab8e6
          src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java dacb936
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 33c78ab
          src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3466/diff

          Testing
          -------

          Unit tests started table.

          Tests in error:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

          I need to know how to update/recreate the tar ball which is the source for that test.

          Thanks,

          Alex

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/
          -----------------------------------------------------------

          (Updated 2012-01-18 19:21:37.796315)

          Review request for hbase, Michael Stack and Lars Hofhansl.

          Summary (updated)
          -------

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs


          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
          src/main/java/org/apache/hadoop/hbase/HConstants.java 8370ef8
          src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d
          src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java d857538
          src/main/java/org/apache/hadoop/hbase/client/HTable.java 57605e6
          src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 784fdc2
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java a5c198f
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828
          src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java 7a7b896
          src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java b1b5a78
          src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java c0a4184
          src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 0431444
          src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 4307d89
          src/main/java/org/apache/hadoop/hbase/regionserver/GetClosestRowBeforeTracker.java 3a26bbb
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java c7cc402
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 5cb606f
          src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 50e7fe0
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java a3850e5
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 636e533
          src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
          src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java 3919985
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
          src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java b6a6349
          src/main/java/org/apache/hadoop/hbase/thrift/generated/Hbase.java 9e31c61
          src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift 5821d31
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
          src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95ab8e6
          src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java dacb936
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
          src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 33c78ab
          src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3466/diff

          Testing
          -------

          Unit tests started table.

          Tests in error:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

          I need to know how to update/recreate the tar ball which is the source for that test.

          Thanks,

          Alex

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12511025/0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v9.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 31 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated -145 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 81 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD
          org.apache.hadoop.hbase.replication.TestReplication
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestImportTsv

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/808//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/808//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/808//console

          This message is automatically generated.

          Lars Hofhansl added a comment - - edited

          @Alex: What is the problem with TestMetaMigrationRemovingHTD? The tar file with that version of HBase is in ./src/test/data/hbase-4388-root.dir.tgz.
          I am actually not sure what it means if that test fails.
          @Stack: That seems to be the last stumbling block to get a successful test run. I assume this test failing indicates something bad (that we cannot migrate meta with this patch applied).

          Ted Yu added a comment -

          I think we should shift our attention from TestMetaMigrationRemovingHTD to creating a new migration test.
          The reason is that 0.90 to 0.94 is not a supported migration path.

          1. A tar ball of 0.92 HBase should be generated.
          2. Verify that we can migrate from 0.92 .META. table to the new format.

          Lars Hofhansl added a comment - - edited

          Oh? Did not know that.
          If that's the case, I agree we should remove that test and create a similar test for this issue.

          stack added a comment -

          Fixing this will fix HBASE-1841. Or, we need to do something weird like keep a block going if it has same key multiple times...

          Schubert Zhang added a comment -

          Fixing this will also fix HBASE-1978, since I have no ability to complete that.

          Schubert Zhang added a comment -

          Change the range/block index scheme from [start, end) to (start, end], and index each range/block by its endKey, especially in HFile.
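
A minimal sketch of what an end-key index with (start, end] ranges could look like, assuming a sorted array of unique block end keys. This is not the HFile block index; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Sketch of an index keyed by inclusive end key: block i covers (endKeys[i-1], endKeys[i]].
class EndKeyBlockIndexSketch {

    // Returns the index of the block whose (start, end] range holds key,
    // or -1 if key sorts after the last block's end key.
    static int findBlock(String[] sortedEndKeys, String key) {
        int pos = Arrays.binarySearch(sortedEndKeys, key);
        // Exact hit: key equals a block's end key, which is inclusive, so that block holds it.
        // Miss: binarySearch returns -(insertionPoint) - 1, and the insertion point is the
        // first end key greater than key.
        int idx = pos >= 0 ? pos : -(pos + 1);
        return idx < sortedEndKeys.length ? idx : -1;
    }

    public static void main(String[] args) {
        String[] endKeys = {"f", "m", "t"};
        for (String key : new String[] {"a", "f", "g", "t", "z"}) {
            System.out.println(key + " -> block " + findBlock(endKeys, key));
        }
    }
}
```

With the end key inclusive, a key equal to a block boundary belongs to the block that ends there, so the lookup is a single "first end key at or after the key" search.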

          Alex Newman added a comment -

          Oh thanks for the reminder. I think this patch is ready. I just need to rebase and retest on my jenkins setup. Expect a patch soon.

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/3466/
          -----------------------------------------------------------

          (Updated 2012-03-25 20:11:32.746962)

          Review request for hbase, Michael Stack and Lars Hofhansl.

          Summary (updated)
          -------

          This is an idea that Ryan and I have been kicking around on and off for a while now.

          If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).
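The lookup described in the paragraph above can be sketched as a forward scan over a sorted map. This is a minimal sketch under stated assumptions, not the patch's code: a TreeMap stands in for the meta table, the Region class, sampleMeta(), and locate() are hypothetical names, row keys are written as table,endrow,id, and the sample keys are single letters so that plain string ordering behaves like the real ordering. It also ignores the last region of a table, whose end key is open-ended.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a TreeMap stands in for a meta table keyed by table,endrow,id.
final class MetaScanSketch {
    // Hypothetical stand-in for the value stored in a meta row.
    static final class Region {
        final String name;
        final boolean offline;

        Region(String name, boolean offline) {
            this.name = name;
            this.offline = offline;
        }
    }

    // Table "t": one untouched region, plus a parent (g, s] that was split
    // into two daughters and left in meta as an offlined row.
    static TreeMap<String, Region> sampleMeta() {
        TreeMap<String, Region> meta = new TreeMap<>();
        meta.put("t,g,1", new Region("first", false));       // (-inf, g]
        meta.put("t,m,3", new Region("daughter-a", false));  // (g, m]
        meta.put("t,s,2", new Region("parent", true));       // (g, s], offlined
        meta.put("t,s,3", new Region("daughter-b", false));  // (m, s]
        meta.put("u,k,1", new Region("other-table", false));
        return meta;
    }

    // Open a "scanner" at table,row and take the first online region found.
    static String locate(TreeMap<String, Region> meta, String table, String row) {
        String prefix = table + ",";
        for (Map.Entry<String, Region> e : meta.tailMap(prefix + row, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // ran past the last row of this table
            }
            if (!e.getValue().offline) {
                return e.getValue().name;
            }
            // Offlined parent: keep scanning to the next row.
        }
        return null;
    }

    public static void main(String[] args) {
        TreeMap<String, Region> meta = sampleMeta();
        for (String row : new String[] {"c", "h", "p", "z"}) {
            System.out.println(row + " -> " + locate(meta, "t", row));
        }
    }
}
```

No backward walk is needed in any of these cases; the only extra work is stepping over the offlined parent row.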

          If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.

          This issue is about changing the way we name regions.

          If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

          Converting to the new method, we'd have to run a migration on startup changing the content in meta.

          Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.

          public TRegionInfo getRegionInfo(ByteBuffer searchRow) throws IOError { was nulled out and enabled with https://reviews.apache.org/r/3514/. They are listed as dependencies in the jira and will be committed together.

          This addresses bug HBASE-2600.
          https://issues.apache.org/jira/browse/HBASE-2600

          Diffs (updated)


          security/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java c1f20de
          src/main/java/org/apache/hadoop/hbase/HConstants.java 8888347
          src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 8d83ff3
          src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java fc5e53e
          src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f
          src/main/java/org/apache/hadoop/hbase/catalog/MetaMigratev2.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java 0129ee9
          src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 16e4017
          src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b2a5463
          src/main/java/org/apache/hadoop/hbase/client/HTable.java 8e7d7f7
          src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 04150ad
          src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 47381f4
          src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f404999
          src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java 197eb71
          src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 18c13c4
          src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java 30c61ca
          src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 757f98e
          src/main/java/org/apache/hadoop/hbase/master/HMaster.java dbc9251
          src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 2ec6677
          src/main/java/org/apache/hadoop/hbase/migration/HRegionInfo090x2.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/regionserver/GetClosestRowBeforeTracker.java 8174cf5
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 02d55d4
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e0af8fb
          src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 0592f40
          src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java a3850e5
          src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 0c7b396
          src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java 56e31e1
          src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 3535595
          src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java 60eb426
          src/main/java/org/apache/hadoop/hbase/thrift/generated/AlreadyExists.java a5b81f5
          src/main/java/org/apache/hadoop/hbase/thrift/generated/BatchMutation.java d5df940
          src/main/java/org/apache/hadoop/hbase/thrift/generated/ColumnDescriptor.java 4ce85e7
          src/main/java/org/apache/hadoop/hbase/thrift/generated/Hbase.java 6c505c0
          src/main/java/org/apache/hadoop/hbase/thrift/generated/IOError.java 11e31e3
          src/main/java/org/apache/hadoop/hbase/thrift/generated/IllegalArgument.java ede215f
          src/main/java/org/apache/hadoop/hbase/thrift/generated/Mutation.java ef1817f
          src/main/java/org/apache/hadoop/hbase/thrift/generated/TCell.java 6ee8ca7
          src/main/java/org/apache/hadoop/hbase/thrift/generated/TRegionInfo.java ed251e8
          src/main/java/org/apache/hadoop/hbase/thrift/generated/TRowResult.java e1709b5
          src/main/java/org/apache/hadoop/hbase/thrift/generated/TScan.java f7cc05d
          src/main/java/org/apache/hadoop/hbase/util/FSUtils.java aebe5b0
          src/main/java/org/apache/hadoop/hbase/util/Writables.java 3d20723
          src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift f698a6c
          src/test/data/generate-hbase-2600-root-in-tmp.sh PRE-CREATION
          src/test/data/hbase-2600-root.dir.tgz PRE-CREATION
          src/test/data/hbase-4388-root.dir.tgz da2244e8097d3fd3b0cb04d49cbc615406f7e809
          src/test/java/org/apache/hadoop/hbase/TestKeyValue.java fae6902
          src/test/java/org/apache/hadoop/hbase/catalog/TestMetaUpdate.java PRE-CREATION
          src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java f7430ee
          src/test/java/org/apache/hadoop/hbase/client/TestMetaMigrationRemovingHTD.java d1c15af
          src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java dacb936
          src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java d2b3060
          src/test/java/org/apache/hadoop/hbase/migration/TestMigration.java PRE-CREATION
          src/test/java/org/apache/hadoop/hbase/migration/TestMigrationFrom090To092.java c3651ac
          src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
          src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6dfba41
          src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 33c78ab
          src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6
          src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

          Diff: https://reviews.apache.org/r/3466/diff

          Testing
          -------

          Unit tests started table.

          Tests in error:
          org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

          I need to know how to update/recreate the tar ball which is the source for that test.

          Thanks,

          Alex

          Alex Newman added a comment -

          Combined with 5217 as they need to be committed at the same time.

          Alex Newman added a comment -

          Sorry I apparently don't know how to generate patches with git. That was just the HBASE-2600 patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12519880/HBASE-2600%2B5217-Sun-Mar-25-2012-v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 50 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 11 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks
          org.apache.hadoop.hbase.catalog.TestMetaUpdate
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1303//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1303//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1303//console

          This message is automatically generated.

          Alex Newman added a comment -

          I bungled the patch. I've been using git diff --no-prefix HEAD^^ > bla . That doesn't seem to include my binary tar ball. Any ideas?

          Ted Yu added a comment -

          @Alex:
          Try this:

          git diff --no-prefix --binary 
          

          Thanks

          Alex Newman added a comment -

          Binary included

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12519888/HBASE-2600%2B5217-Sun-Mar-25-2012-v4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 48 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 11 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.hbase.catalog.TestMetaUpdate
          org.apache.hadoop.hbase.mapreduce.TestImportTsv
          org.apache.hadoop.hbase.mapred.TestTableMapReduce
          org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1304//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1304//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1304//console

          This message is automatically generated.

          Ted Yu added a comment -

          dev-support/test-patch.sh doesn't use '--binary' option when applying patches.

          I tried the following command:

          patch -p0 --binary -i HBASE-2600+5217-Sun-Mar-25-2012-v4.patch
          

          But src/test/data/hbase-2600-root.dir.tgz wasn't unpacked from patch.

          Alex Newman added a comment -

          I really have no idea what's going on here. I can't seem to create patch from svn or git. Also, I've noticed the patch does have the binary snippet, svn and patch just aren't applying it. My jenkins job runs out of memory(/dev/shm). So I gave the build machine a reboot and the branch at http://github.com/posix4e/hbase (branch jenkins) built fine. Can someone with a commit bit just pull it into a svn branch?

          Ted Yu added a comment -

          @Alex:
          Can you attach hbase-2600-root.dir.tgz to this JIRA ?
          Please briefly describe how you generated the tar ball.

          Thanks

          Alex Newman added a comment -

          generate-hbase-2600-root-in-tmp.sh was used to generate this tarball

          Ted Yu added a comment -

          TestMetaUpdate passes with the binary provided by Alex.

          Alex Newman added a comment -

          Are any tests failing?

          Ted Yu added a comment -

          I was about to go over the patch on review board - I didn't run whole suite.
          Since the tar ball is only used by TestMetaUpdate, I wanted to get some clarification for TestMetaUpdate first.

          Alex Newman added a comment -

          What do I need to do to move this forward?

          Alex Newman added a comment -

          I rebased today.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12523233/0001-HBASE-2600.v10.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 65 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The patch appears to cause mvn compile goal to fail.

          -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1568//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1568//console

          This message is automatically generated.

          Alex Newman added a comment -

          Removed getclosestrow use

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12523242/0001-HBASE-2600-v11.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 65 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 18 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1570//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1570//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1570//console

          This message is automatically generated.

          Alex Newman added a comment -

          looks like it's failing tests on my jira. I'll have to look into this further.

          Alex Newman added a comment -

          Err I mean it's failing tests on my jenkins

          Alex Newman added a comment -

          Never mind, the second jenkins run ran fine. I think it was just bad job interaction.

          Alex Newman added a comment -

          I needed to rebase again. I'll upload a patch in a second after I run it through my jenkins.

          Matt Corgan added a comment -

          I saw some discussion earlier in this jira about this patch removing the need for the custom MetaKeyComparator... does it end up doing that in its current form?

          Matt Corgan added a comment -

          At the risk of confusing this huge issue even more... the current (0.94) META rows are formatted like

          [table],[region start key],[region id]

          Then I believe this jira changes them to

          [table],[region end key],[region id]

          Is that correct?

          what if we:
          1) replace the first comma with \x00 which sorts before all legal filesystem characters
          2) move the regionId to be a prefix of the qualifier

          Would that format allow us to get rid of the custom MetaKeyComparator? I mention this now as I'm trying to future-proof HBASE-4676 so it can work with META at some point.

          stack added a comment -

          Yes, Matt. We'd use the end key over the start key so that a query on .META. would not need to do any backing up.
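To make the "no backing up" point concrete, here is a minimal sketch, assuming a toy meta for a single table held in a `TreeMap` with plain string keys. The class, the method names, and the `LAST` sentinel are hypothetical and are not HBase code. With start keys the lookup is a floor query (roughly what getClosestRowBefore has to do); with exclusive end keys it is a forward query for the first key after the wanted row.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of meta for one table; keys are plain strings, not real HBase row keys. */
final class MetaLookupSketch {
    // Hypothetical sentinel standing in for "no end key" on the last region.
    static final String LAST = "\uffff";

    /** Meta keyed by region START key: needs the closest key at or before the wanted row. */
    static String findByStartKey(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    /** Meta keyed by exclusive region END key: the first key after the wanted row wins. */
    static String findByEndKey(TreeMap<String, String> meta, String row) {
        Map.Entry<String, String> e = meta.higherEntry(row);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        // Three regions: r1 = ["", g), r2 = [g, p), r3 = [p, end of table)
        TreeMap<String, String> byStart = new TreeMap<>();
        byStart.put("", "r1");
        byStart.put("g", "r2");
        byStart.put("p", "r3");

        TreeMap<String, String> byEnd = new TreeMap<>();
        byEnd.put("g", "r1");
        byEnd.put("p", "r2");
        byEnd.put(LAST, "r3");

        for (String row : new String[] {"a", "g", "k", "z"}) {
            System.out.println(row + " -> " + findByStartKey(byStart, row)
                + " / " + findByEndKey(byEnd, row));
        }
    }
}
```

Both layouts resolve every row to the same region; only the direction of the search differs, and the forward direction is the one a scanner gives for free.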

          1. I think the \x00 would work for first delimiter. Rows would be sorted first on table. \x00 could be part of a region key but the sort on table name first should make it so the \x00 delimiter would be found first (We could too make the table name fixed size, a 'code' with its string value kept elsewhere perhaps in another table. This way table renames would be easy. Then we'd need no delimiter).

          2. Regionid as a column qualifier prefix? That's radical. Tell me more what you are thinking. It'd be sweet if we could do memcmp on row keys. BIG SIMPLIFICATION. Region id as qualifier would make for some interesting changes. On split, for the bottom half of the split, we'd be adding a new column with the new qualifier. There'd be one less delete and add? Is that right? It's a radical notion. Let's tease it out. It could be really good.

          stack added a comment -

          Elliott just made an interesting suggestion which was that we could put the regionid into a .META. column of its own... info:regionid (need to think this through more)

          Matt Corgan added a comment -

          > \x00 could be part of a region key but the sort on table name first should make it so the \x00 delimiter would be found first

          yep - in general, this is how i build compound primary keys with variable length strings. you shouldn't need any padding or anything. the only complication is if your string somehow contains \x00, but that can't happen in this case
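A small sketch of the compound-key point, assuming row keys of the form table + \x00 + end key and a plain unsigned byte comparison (`Arrays.compareUnsigned`, Java 9+) standing in for memcmp. The class and helper names are made up for illustration. Because a table name never contains \x00, the first \x00 in a key is always the delimiter, even when the end key itself contains \x00, and rows group by table under the plain comparison.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DelimitedKeySketch {
    /** Builds table + 0x00 + endKey. The end key may itself contain 0x00 bytes. */
    static byte[] rowKey(String table, byte[] endKey) {
        byte[] t = table.getBytes(StandardCharsets.UTF_8);
        byte[] key = new byte[t.length + 1 + endKey.length];
        System.arraycopy(t, 0, key, 0, t.length);
        key[t.length] = 0x00; // delimiter
        System.arraycopy(endKey, 0, key, t.length + 1, endKey.length);
        return key;
    }

    /** The table name is everything before the FIRST 0x00. */
    static String tableOf(byte[] key) {
        int i = 0;
        while (i < key.length && key[i] != 0x00) {
            i++;
        }
        return new String(key, 0, i, StandardCharsets.UTF_8);
    }

    /** Sorts keys with a plain unsigned byte comparison and reports the table of each key. */
    static List<String> tablesInSortedOrder(List<byte[]> keys) {
        List<byte[]> sorted = new ArrayList<>(keys);
        sorted.sort((a, b) -> Arrays.compareUnsigned(a, b));
        List<String> tables = new ArrayList<>();
        for (byte[] k : sorted) {
            tables.add(tableOf(k));
        }
        return tables;
    }

    public static void main(String[] args) {
        List<byte[]> keys = Arrays.asList(
            rowKey("t1", new byte[] {'a'}),
            rowKey("t", new byte[] {0x00, 0x05}),
            rowKey("t", new byte[] {'z'}),
            rowKey("t", new byte[] {0x01}));
        System.out.println(tablesInSortedOrder(keys));
    }
}
```

This only shows the grouping by table; whether it is enough to drop MetaKeyComparator also depends on where the region id ends up, which is the open question in this thread.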

          As for moving the regionId to the qualifier, I don't really know enough about how it's used to give detailed ideas, but some thoughts:

          • there will not be many daughter regions at a given time, so we are not talking about wide rows
          • perhaps putting the daughters into the same row adds some transactional benefits that we didn't previously have?
          • as for qualifier-prefix vs separate-qualifier, i actually don't know enough about usage to say if neither/either/both would work. seems like either could work given that each row will be small enough to easily hold in memory and parse however. i first proposed prefixing to keep the KV sort order intact, but if that isn't required then separate-qualifier is cleaner.
          Alex Newman added a comment -

          Sorry everyone, I was lost at burning man.

          > perhaps putting the daughters into the same row adds some transactional benefits that we didn't previously have?
          Indeed. Currently we can't split meta, and even still, I think we can do atomic operations within a region easily.

          @Stack I like the info:regionid idea. I'll also put on my thinking cap about it. This patch requires a big rework to get it to work.

          Jesse Yates added a comment -

          We've been doing a lot of thinking over here at Salesforce about this issue and were thinking about picking up work on this, if Alex is busy. The current approach is pretty good, and has a lot of merits. We also discussed the option of using the multi-row transaction stuff (which will be another reason why we couldn't split META). I did a full write-up/analysis of the options (see https://dl.dropbox.com/u/6147077/Proposal-HBASE-2600.docx).

          What I ended up coming up with is a little bit crazy, but I think it works. (I'm not dealing with tablenames as hashes, but that is pretty trivial). What I'm looking to solve are:

          (1) replacing start keys with endkeys
          (2) ensuring correct sorting
          (3) ensuring correct split behavior to avoid META holes
          (4) moving the compound key to their own family/qualifier

          There seem to be a couple of pieces we can put together to ensure we meet all the above goals.
          First, row keys are encoded as:

          For all non-terminal regions:

          	
          	<table>0x00<endkey> 
          

          For the terminal region:

          		<table>0x01
          

          Then we can move the encoded name into its own cell, under the “info:encodedname” column. Next, the regionid is moved to the timestamp and used for all updates to the region in META (this includes offlining and marking the parent as split). Since regionids are already timestamps by convention, this doesn't stray that far afield.

          META then looks something like:

          <table>0x00<endkey> | info |
                              | encodedname      | <regionid>  | <md5 hash>
                              | regioninfo       | <regionid>  | <hri – 1>
                              | server           | <regionid>  | <server:port>
                              | server.startcode | <regionid>  | <startcode>
                              | splitA           | <regionid>  | <hri – 3>
                              | splitB           | <regionid>  | <hri – 4>
          <table>0x01         | info |
                              | encodedname      | <regionid2> | <hri-4>
                              | ...              | <regionid2> | ...
          

          Obviously there are some serious implications for how lookups and splits work.

          Splits need to take the opposite approach with respect to putting children in META. Currently, we write the bottom and then the top child, counting on the htable to retry when it finds an offlined region. Now, we just flip the ordering by: (1) offline the parent, (2) put the 'top' child and then (3) insert the bottom child.

          The problem lies in making sure that the bottom child sorts before the parent. In the previous scheme we ensured that ordering by putting a regionid in the row key. With the above scheme, the 'top' child will always sort before the parent because it has a lower endkey. The 'bottom' child actually has exactly the same row key as the parent. However, the bottom child still sorts first because it has a larger regionid (which is also already baked into the code).

          We also must do a check of the timestamp vs. the expected regionid to ensure that we can get the correct region, but that is a minor overhead.

          NOTE: this also gives us provenance of regions, at least until the catalog janitor cleans up parent regions.

          For lookups, you would query for the first region that matches (similar to the current mechanism):

          	<table>0x00<desired key>999999……
          

          which still finds the correct (bottom) child because its regionid must be greater than its parent causing it to sort before its parent in the same row.

          This gives us correct sorting, an easily readable META, and no holes. Oh, and we can remove all the backward scanning.
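A rough sketch of the scheme above, assuming an in-memory stand-in for META: rows are `<table>0x00<endkey>` (or `<table>0x01` for the terminal region), the regionid plays the role of the cell timestamp, and versions within a row are kept newest first. All class and method names are hypothetical, the lookup key here is simply `<table>0x00<row>` rather than the padded form mentioned above, and none of this is HBase code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

final class EndKeyMetaSketch {
    // Plain unsigned byte order, i.e. what a memcmp on row keys would give.
    static final Comparator<byte[]> UNSIGNED = (a, b) -> Arrays.compareUnsigned(a, b);

    // row key -> (regionid used as timestamp -> region name), newest regionid first
    private final TreeMap<byte[], TreeMap<Long, String>> rows = new TreeMap<>(UNSIGNED);

    /** endKey == null marks the terminal region, encoded as table + 0x01. */
    static byte[] rowKey(String table, String endKey) {
        String s = endKey == null ? table + '\u0001' : table + '\u0000' + endKey;
        return s.getBytes(StandardCharsets.UTF_8);
    }

    void put(String table, String endKey, long regionId, String region) {
        rows.computeIfAbsent(rowKey(table, endKey),
                k -> new TreeMap<Long, String>(Collections.<Long>reverseOrder()))
            .put(regionId, region);
    }

    /** Forward lookup: first row after table + 0x00 + row, newest version in that row. */
    String locate(String table, String row) {
        Map.Entry<byte[], TreeMap<Long, String>> e = rows.higherEntry(rowKey(table, row));
        if (e == null) {
            return null;
        }
        byte[] t = table.getBytes(StandardCharsets.UTF_8);
        byte[] k = e.getKey();
        boolean sameTable = k.length > t.length
            && Arrays.equals(Arrays.copyOf(k, t.length), t)
            && (k[t.length] == 0x00 || k[t.length] == 0x01);
        return sameTable ? e.getValue().firstEntry().getValue() : null;
    }

    public static void main(String[] args) {
        EndKeyMetaSketch meta = new EndKeyMetaSketch();
        meta.put("t", "p", 100L, "parent");
        meta.put("t", null, 100L, "last");
        // Split "parent" at "g": one child gets the lower end key, the other reuses the parent's row.
        meta.put("t", "g", 200L, "childLow");
        meta.put("t", "p", 200L, "childHigh");
        for (String row : new String[] {"a", "h", "z"}) {
            System.out.println(row + " -> " + meta.locate("t", row));
        }
    }
}
```

After the split, the row that used to hold only the parent also holds the child with the larger regionid, and the newest-first ordering is what makes the lookup return the child without any backward scan.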

          Jesse Yates added a comment -

          As an aside, if we don't roll the hashed tablenames into this, we can do easy end-key extraction by encoding the length of the table name into the row key as the last 4 bytes of the key. Then you would read an int from the last 4 bytes to jump right to the correct location in the key for the endkey. This still sorts correctly because the prefix to that length will always sort the same way, so the suffix doesn't affect sorting.
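The extraction step could look something like the following, assuming a key laid out as table + 0x00 + end key + a 4-byte big-endian table-name length. The layout details and all names are assumptions made for illustration. The sketch only covers encoding and extraction and does not check the sorting claim.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class LengthSuffixSketch {
    /** Assumed layout: table bytes, 0x00 delimiter, end key, then table length as a 4-byte int. */
    static byte[] encode(String table, byte[] endKey) {
        byte[] t = table.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(t.length + 1 + endKey.length + 4)
            .put(t)
            .put((byte) 0x00)
            .put(endKey)
            .putInt(t.length)
            .array();
    }

    /** Reads the trailing int to learn where the table name stops. */
    static int tableLength(byte[] key) {
        return ByteBuffer.wrap(key, key.length - 4, 4).getInt();
    }

    static String tableOf(byte[] key) {
        return new String(key, 0, tableLength(key), StandardCharsets.UTF_8);
    }

    /** The end key sits between the delimiter and the 4-byte suffix; no scan for 0x00 is needed. */
    static byte[] endKeyOf(byte[] key) {
        return Arrays.copyOfRange(key, tableLength(key) + 1, key.length - 4);
    }

    public static void main(String[] args) {
        byte[] key = encode("usertable", new byte[] {0x00, 0x05, 0x00});
        System.out.println(tableOf(key) + " " + Arrays.toString(endKeyOf(key)));
    }
}
```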

          Jean-Daniel Cryans added a comment -

          Patch is stale, all the code moved, unmarking as available.


            People

            • Assignee:
              Unassigned
              Reporter:
              stack