HBase
  1. HBase
  2. HBASE-2468

Improvements to prewarm META cache on clients

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.90.0
    • Component/s: Client
    • Hadoop Flags:
      Reviewed

      Description

      A couple different use cases cause storms of reads to META during startup. For example, a large MR job will cause each map task to hit meta since it starts with an empty cache.

      A couple possible improvements have been proposed:

      • MR jobs could ship a copy of META for the table in the DistributedCache
      • Clients could prewarm cache by doing a large scan of all the meta for the table instead of random reads for each miss
      • Each miss could fetch ahead some number of rows in META

        Activity

        Hide
        stack added a comment -

        Committed. Thank you for the patch Mingjie Lai.

        Show
        stack added a comment - Committed. Thank you for the patch Mingjie Lai.
        Hide
        HBase Review Board added a comment -

        Message from: stack@duboce.net

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/#review210
        -----------------------------------------------------------

        Ship it!

        +1

        I did a quick skim. All looks good. This is a weird one in that to prewarm we are going to do a getClosestOrBefore (which we'll be doing anyway) and then we'll open scanner, and scan next 10 rows... then close scanner; i.e. extra rpcs inline w/ first lookup against tale. This latter will actually slow down first lookup some but we can make it faster by making open scanner return results so we don't have to then go call next (already an issue to do this) and also, making it so we scan forward always by changing keys in .META. such that region names are keyed by endkey rather than startkey... also another issue to do this.

        • stack
        Show
        HBase Review Board added a comment - Message from: stack@duboce.net ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/#review210 ----------------------------------------------------------- Ship it! +1 I did a quick skim. All looks good. This is a weird one in that to prewarm we are going to do a getClosestOrBefore (which we'll be doing anyway) and then we'll open scanner, and scan next 10 rows... then close scanner; i.e. extra rpcs inline w/ first lookup against tale. This latter will actually slow down first lookup some but we can make it faster by making open scanner return results so we don't have to then go call next (already an issue to do this) and also, making it so we scan forward always by changing keys in .META. such that region names are keyed by endkey rather than startkey... also another issue to do this. stack
        Hide
        HBase Review Board added a comment -

        Message from: "Mingjie Lai" <mjlai09@gmail.com>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/
        -----------------------------------------------------------

        (Updated 2010-06-13 00:49:50.977413)

        Review request for hbase, Todd Lipcon and stack.

        Changes
        -------

        1) resolved conflicts from the latest code commits
        2) added get/setRegionCachePrefetch() in HTable, removed isRegionCachePrefetchEnabled(), enableRegionCachePrefetch(), etc.
        3) in MetaScanner, added Math.min(rowLimit, configuration.getInt(...))

        Summary
        -------

        HBASE-2468: Improvements to prewarm META cache on clients.

        Changes:
        1. Add new HTable methods which support region info de/serialation, and region cache prewarm:

        • void serializeRegionInfo(): clients could perform a large scan for all the meta for the table, serialize the meta to a file. MR job can ship a copy of the meta for the table in the DistributedCache
        • Map<HRegionInfo, HServerAddress> deserializeRegionInfo(): MR job can deserialize the region info from the DistributedCache
        • prewarmRegionCache(Map<HRegionInfo, HServerAddress> regionMap): MR job can prewarm local region cache by the deserialized region info.

        2. For each client, each region cache read-miss could trigger read-ahead some number of rows in META. This option could be turned on and off for one particular table.

        This addresses bug HBASE-2468.
        http://issues.apache.org/jira/browse/HBASE-2468

        Diffs (updated)


        src/main/java/org/apache/hadoop/hbase/client/HConnection.java 853164d
        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 03cbf8d
        src/main/java/org/apache/hadoop/hbase/client/HTable.java 8e47f2d
        src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f11ada7
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95e494a

        Diff: http://review.hbase.org/r/98/diff

        Testing
        -------

        Unit tests passed locally for me.

        Thanks,

        Mingjie

        Show
        HBase Review Board added a comment - Message from: "Mingjie Lai" <mjlai09@gmail.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/ ----------------------------------------------------------- (Updated 2010-06-13 00:49:50.977413) Review request for hbase, Todd Lipcon and stack. Changes ------- 1) resolved conflicts from the latest code commits 2) added get/setRegionCachePrefetch() in HTable, removed isRegionCachePrefetchEnabled(), enableRegionCachePrefetch(), etc. 3) in MetaScanner, added Math.min(rowLimit, configuration.getInt(...)) Summary ------- HBASE-2468 : Improvements to prewarm META cache on clients. Changes: 1. Add new HTable methods which support region info de/serialation, and region cache prewarm: void serializeRegionInfo(): clients could perform a large scan for all the meta for the table, serialize the meta to a file. MR job can ship a copy of the meta for the table in the DistributedCache Map<HRegionInfo, HServerAddress> deserializeRegionInfo(): MR job can deserialize the region info from the DistributedCache prewarmRegionCache(Map<HRegionInfo, HServerAddress> regionMap): MR job can prewarm local region cache by the deserialized region info. 2. For each client, each region cache read-miss could trigger read-ahead some number of rows in META. This option could be turned on and off for one particular table. This addresses bug HBASE-2468 . http://issues.apache.org/jira/browse/HBASE-2468 Diffs (updated) src/main/java/org/apache/hadoop/hbase/client/HConnection.java 853164d src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 03cbf8d src/main/java/org/apache/hadoop/hbase/client/HTable.java 8e47f2d src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f11ada7 src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95e494a Diff: http://review.hbase.org/r/98/diff Testing ------- Unit tests passed locally for me. Thanks, Mingjie
        Hide
        HBase Review Board added a comment -

        Message from: "Jonathan Gray" <jgray@apache.org>

        On 2010-06-10 08:53:41, Jonathan Gray wrote:

        > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1079

        > <http://review.hbase.org/r/98/diff/5/?file=943#file943line1079>

        >

        > I guess these are static because of how HTables all share a single HCM per conf. The setting of prefetching is set at the HCM level not the HTable level, however clients are usually not exposed to HCM and only deal with HTable.

        >

        > We should probably make it clear in the javadoc for these methods that they apply to all HTable instances, though that may be clear from being static.

        >

        > Maybe since these are more advanced calls, they shouldn't be in HTable? If we provide proper documentation, it should be easy enough for a user to grab the HCM and apply the config at that level?

        Mingjie Lai wrote:

        > I guess these are static because of how HTables all share a single HCM per conf...

        Yes.

        > Maybe since these are more advanced calls, they shouldn't be in HTable?

        Two alternatives:

        1) HCM: as jgray said, ``however clients are usually not exposed to HCM and only deal with HTable.''

        2) HBaseAdmin: it is a more reasonable design choice since these operation are at HCM level.

        3) or, make it a configuration. It would be one global configuration applied to all tables, and cannot be changed dynamically.

        I like 2) better, but not really sure whether we want to expose it there or not.

        What do you think?

        Jonathan Gray wrote:

        Adding it to HBaseAdmin could make sense. This one is a bit of an odd one because it's a client-side configuration parameter done at the per-client-jvm level. Typically we have per-query or per-htable-instance configs. HBaseAdmin is generally made up of remote administration commands not local client config.

        If we provide sufficient javadoc (including as a class comment on HTable) it doesn't matter so much where we put it. Since it's distinct from what's currently in HTable and HBaseAdmin, maybe it does make sense as a static in HCM?

        Todd Lipcon wrote:

        I think keeping HConnectionManager an internal interface is a good idea, so kind of -0 there. -1 on HBaseAdmin, since we should keep that for administrative functions that really change something on the cluster. So I'd prefer HTable, but wouldn't cry over HCM.

        Since we can knock it down to just two methods, get/set, let's just put it in HTable.

        But it will be static, right, so people understand it's not per-instance. Let's also make sure there is javadoc that also explains that it is for all HTable instances for the tables you configure.

        • Jonathan

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/#review176
        -----------------------------------------------------------

        Show
        HBase Review Board added a comment - Message from: "Jonathan Gray" <jgray@apache.org> On 2010-06-10 08:53:41, Jonathan Gray wrote: > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1079 > < http://review.hbase.org/r/98/diff/5/?file=943#file943line1079 > > > I guess these are static because of how HTables all share a single HCM per conf. The setting of prefetching is set at the HCM level not the HTable level, however clients are usually not exposed to HCM and only deal with HTable. > > We should probably make it clear in the javadoc for these methods that they apply to all HTable instances, though that may be clear from being static. > > Maybe since these are more advanced calls, they shouldn't be in HTable? If we provide proper documentation, it should be easy enough for a user to grab the HCM and apply the config at that level? Mingjie Lai wrote: > I guess these are static because of how HTables all share a single HCM per conf... Yes. > Maybe since these are more advanced calls, they shouldn't be in HTable? Two alternatives: 1) HCM: as jgray said, ``however clients are usually not exposed to HCM and only deal with HTable.'' 2) HBaseAdmin: it is a more reasonable design choice since these operation are at HCM level. 3) or, make it a configuration. It would be one global configuration applied to all tables, and cannot be changed dynamically. I like 2) better, but not really sure whether we want to expose it there or not. What do you think? Jonathan Gray wrote: Adding it to HBaseAdmin could make sense. This one is a bit of an odd one because it's a client-side configuration parameter done at the per-client-jvm level. Typically we have per-query or per-htable-instance configs. HBaseAdmin is generally made up of remote administration commands not local client config. If we provide sufficient javadoc (including as a class comment on HTable) it doesn't matter so much where we put it. Since it's distinct from what's currently in HTable and HBaseAdmin, maybe it does make sense as a static in HCM? Todd Lipcon wrote: I think keeping HConnectionManager an internal interface is a good idea, so kind of -0 there. -1 on HBaseAdmin, since we should keep that for administrative functions that really change something on the cluster. So I'd prefer HTable, but wouldn't cry over HCM. Since we can knock it down to just two methods, get/set, let's just put it in HTable. But it will be static, right, so people understand it's not per-instance. Let's also make sure there is javadoc that also explains that it is for all HTable instances for the tables you configure. Jonathan ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/#review176 -----------------------------------------------------------
        Hide
        HBase Review Board added a comment -

        Message from: "Todd Lipcon" <todd@cloudera.com>

        On 2010-06-10 08:53:41, Jonathan Gray wrote:

        > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1079

        > <http://review.hbase.org/r/98/diff/5/?file=943#file943line1079>

        >

        > I guess these are static because of how HTables all share a single HCM per conf. The setting of prefetching is set at the HCM level not the HTable level, however clients are usually not exposed to HCM and only deal with HTable.

        >

        > We should probably make it clear in the javadoc for these methods that they apply to all HTable instances, though that may be clear from being static.

        >

        > Maybe since these are more advanced calls, they shouldn't be in HTable? If we provide proper documentation, it should be easy enough for a user to grab the HCM and apply the config at that level?

        Mingjie Lai wrote:

        > I guess these are static because of how HTables all share a single HCM per conf...

        Yes.

        > Maybe since these are more advanced calls, they shouldn't be in HTable?

        Two alternatives:

        1) HCM: as jgray said, ``however clients are usually not exposed to HCM and only deal with HTable.''

        2) HBaseAdmin: it is a more reasonable design choice since these operation are at HCM level.

        3) or, make it a configuration. It would be one global configuration applied to all tables, and cannot be changed dynamically.

        I like 2) better, but not really sure whether we want to expose it there or not.

        What do you think?

        Jonathan Gray wrote:

        Adding it to HBaseAdmin could make sense. This one is a bit of an odd one because it's a client-side configuration parameter done at the per-client-jvm level. Typically we have per-query or per-htable-instance configs. HBaseAdmin is generally made up of remote administration commands not local client config.

        If we provide sufficient javadoc (including as a class comment on HTable) it doesn't matter so much where we put it. Since it's distinct from what's currently in HTable and HBaseAdmin, maybe it does make sense as a static in HCM?

        I think keeping HConnectionManager an internal interface is a good idea, so kind of -0 there. -1 on HBaseAdmin, since we should keep that for administrative functions that really change something on the cluster. So I'd prefer HTable, but wouldn't cry over HCM.

        • Todd

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/#review176
        -----------------------------------------------------------

        Show
        HBase Review Board added a comment - Message from: "Todd Lipcon" <todd@cloudera.com> On 2010-06-10 08:53:41, Jonathan Gray wrote: > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1079 > < http://review.hbase.org/r/98/diff/5/?file=943#file943line1079 > > > I guess these are static because of how HTables all share a single HCM per conf. The setting of prefetching is set at the HCM level not the HTable level, however clients are usually not exposed to HCM and only deal with HTable. > > We should probably make it clear in the javadoc for these methods that they apply to all HTable instances, though that may be clear from being static. > > Maybe since these are more advanced calls, they shouldn't be in HTable? If we provide proper documentation, it should be easy enough for a user to grab the HCM and apply the config at that level? Mingjie Lai wrote: > I guess these are static because of how HTables all share a single HCM per conf... Yes. > Maybe since these are more advanced calls, they shouldn't be in HTable? Two alternatives: 1) HCM: as jgray said, ``however clients are usually not exposed to HCM and only deal with HTable.'' 2) HBaseAdmin: it is a more reasonable design choice since these operation are at HCM level. 3) or, make it a configuration. It would be one global configuration applied to all tables, and cannot be changed dynamically. I like 2) better, but not really sure whether we want to expose it there or not. What do you think? Jonathan Gray wrote: Adding it to HBaseAdmin could make sense. This one is a bit of an odd one because it's a client-side configuration parameter done at the per-client-jvm level. Typically we have per-query or per-htable-instance configs. HBaseAdmin is generally made up of remote administration commands not local client config. If we provide sufficient javadoc (including as a class comment on HTable) it doesn't matter so much where we put it. Since it's distinct from what's currently in HTable and HBaseAdmin, maybe it does make sense as a static in HCM? I think keeping HConnectionManager an internal interface is a good idea, so kind of -0 there. -1 on HBaseAdmin, since we should keep that for administrative functions that really change something on the cluster. So I'd prefer HTable, but wouldn't cry over HCM. Todd ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/#review176 -----------------------------------------------------------
        Hide
        HBase Review Board added a comment -

        Message from: "Jonathan Gray" <jgray@apache.org>

        On 2010-06-10 08:53:41, Jonathan Gray wrote:

        > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1079

        > <http://review.hbase.org/r/98/diff/5/?file=943#file943line1079>

        >

        > I guess these are static because of how HTables all share a single HCM per conf. The setting of prefetching is set at the HCM level not the HTable level, however clients are usually not exposed to HCM and only deal with HTable.

        >

        > We should probably make it clear in the javadoc for these methods that they apply to all HTable instances, though that may be clear from being static.

        >

        > Maybe since these are more advanced calls, they shouldn't be in HTable? If we provide proper documentation, it should be easy enough for a user to grab the HCM and apply the config at that level?

        Mingjie Lai wrote:

        > I guess these are static because of how HTables all share a single HCM per conf...

        Yes.

        > Maybe since these are more advanced calls, they shouldn't be in HTable?

        Two alternatives:

        1) HCM: as jgray said, ``however clients are usually not exposed to HCM and only deal with HTable.''

        2) HBaseAdmin: it is a more reasonable design choice since these operation are at HCM level.

        3) or, make it a configuration. It would be one global configuration applied to all tables, and cannot be changed dynamically.

        I like 2) better, but not really sure whether we want to expose it there or not.

        What do you think?

        Adding it to HBaseAdmin could make sense. This one is a bit of an odd one because it's a client-side configuration parameter done at the per-client-jvm level. Typically we have per-query or per-htable-instance configs. HBaseAdmin is generally made up of remote administration commands not local client config.

        If we provide sufficient javadoc (including as a class comment on HTable) it doesn't matter so much where we put it. Since it's distinct from what's currently in HTable and HBaseAdmin, maybe it does make sense as a static in HCM?

        • Jonathan

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/#review176
        -----------------------------------------------------------

        Show
        HBase Review Board added a comment - Message from: "Jonathan Gray" <jgray@apache.org> On 2010-06-10 08:53:41, Jonathan Gray wrote: > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1079 > < http://review.hbase.org/r/98/diff/5/?file=943#file943line1079 > > > I guess these are static because of how HTables all share a single HCM per conf. The setting of prefetching is set at the HCM level not the HTable level, however clients are usually not exposed to HCM and only deal with HTable. > > We should probably make it clear in the javadoc for these methods that they apply to all HTable instances, though that may be clear from being static. > > Maybe since these are more advanced calls, they shouldn't be in HTable? If we provide proper documentation, it should be easy enough for a user to grab the HCM and apply the config at that level? Mingjie Lai wrote: > I guess these are static because of how HTables all share a single HCM per conf... Yes. > Maybe since these are more advanced calls, they shouldn't be in HTable? Two alternatives: 1) HCM: as jgray said, ``however clients are usually not exposed to HCM and only deal with HTable.'' 2) HBaseAdmin: it is a more reasonable design choice since these operation are at HCM level. 3) or, make it a configuration. It would be one global configuration applied to all tables, and cannot be changed dynamically. I like 2) better, but not really sure whether we want to expose it there or not. What do you think? Adding it to HBaseAdmin could make sense. This one is a bit of an odd one because it's a client-side configuration parameter done at the per-client-jvm level. Typically we have per-query or per-htable-instance configs. HBaseAdmin is generally made up of remote administration commands not local client config. If we provide sufficient javadoc (including as a class comment on HTable) it doesn't matter so much where we put it. Since it's distinct from what's currently in HTable and HBaseAdmin, maybe it does make sense as a static in HCM? Jonathan ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/#review176 -----------------------------------------------------------
        Hide
        HBase Review Board added a comment -

        Message from: "Mingjie Lai" <mjlai09@gmail.com>

        On 2010-06-10 08:53:41, Jonathan Gray wrote:

        > src/main/java/org/apache/hadoop/hbase/client/HConnection.java, line 235

        > <http://review.hbase.org/r/98/diff/5/?file=941#file941line235>

        >

        > +1 on collapsing with a boolean. setRegionCachePrefetch(table, enable)? Seems self descriptive and with a couple lines of javadoc should be clear.

        yes sir. will modify to use set...() and get...().

        On 2010-06-10 08:53:41, Jonathan Gray wrote:

        > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1079

        > <http://review.hbase.org/r/98/diff/5/?file=943#file943line1079>

        >

        > I guess these are static because of how HTables all share a single HCM per conf. The setting of prefetching is set at the HCM level not the HTable level, however clients are usually not exposed to HCM and only deal with HTable.

        >

        > We should probably make it clear in the javadoc for these methods that they apply to all HTable instances, though that may be clear from being static.

        >

        > Maybe since these are more advanced calls, they shouldn't be in HTable? If we provide proper documentation, it should be easy enough for a user to grab the HCM and apply the config at that level?

        I guess these are static because of how HTables all share a single HCM per conf...

        Yes.

        Maybe since these are more advanced calls, they shouldn't be in HTable?

        Two alternatives:
        1) HCM: as jgray said, ``however clients are usually not exposed to HCM and only deal with HTable.''
        2) HBaseAdmin: it is a more reasonable design choice since these operation are at HCM level.
        3) or, make it a configuration. It would be one global configuration applied to all tables, and cannot be changed dynamically.

        I like 2) better, but not really sure whether we want to expose it there or not.

        What do you think?

        • Mingjie

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/#review176
        -----------------------------------------------------------

        Show
        HBase Review Board added a comment - Message from: "Mingjie Lai" <mjlai09@gmail.com> On 2010-06-10 08:53:41, Jonathan Gray wrote: > src/main/java/org/apache/hadoop/hbase/client/HConnection.java, line 235 > < http://review.hbase.org/r/98/diff/5/?file=941#file941line235 > > > +1 on collapsing with a boolean. setRegionCachePrefetch(table, enable)? Seems self descriptive and with a couple lines of javadoc should be clear. yes sir. will modify to use set...() and get...(). On 2010-06-10 08:53:41, Jonathan Gray wrote: > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1079 > < http://review.hbase.org/r/98/diff/5/?file=943#file943line1079 > > > I guess these are static because of how HTables all share a single HCM per conf. The setting of prefetching is set at the HCM level not the HTable level, however clients are usually not exposed to HCM and only deal with HTable. > > We should probably make it clear in the javadoc for these methods that they apply to all HTable instances, though that may be clear from being static. > > Maybe since these are more advanced calls, they shouldn't be in HTable? If we provide proper documentation, it should be easy enough for a user to grab the HCM and apply the config at that level? I guess these are static because of how HTables all share a single HCM per conf... Yes. Maybe since these are more advanced calls, they shouldn't be in HTable? Two alternatives: 1) HCM: as jgray said, ``however clients are usually not exposed to HCM and only deal with HTable.'' 2) HBaseAdmin: it is a more reasonable design choice since these operation are at HCM level. 3) or, make it a configuration. It would be one global configuration applied to all tables, and cannot be changed dynamically. I like 2) better, but not really sure whether we want to expose it there or not. What do you think? Mingjie ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/#review176 -----------------------------------------------------------
        Hide
        Mingjie Lai added a comment -

        @stack: Yes, I will create another patch. But before that, I still need to finish the discussion about where to put the static method HTable.setRegionCachePrefetch(table, enable). After we make an agreement it, I will generate a new patch right away.

        Show
        Mingjie Lai added a comment - @stack: Yes, I will create another patch. But before that, I still need to finish the discussion about where to put the static method HTable.setRegionCachePrefetch(table, enable). After we make an agreement it, I will generate a new patch right away.
        Hide
        stack added a comment -

        @ Mingjie Are you going to put up another patch to address the above comments? If you do I'll commit it.

        Show
        stack added a comment - @ Mingjie Are you going to put up another patch to address the above comments? If you do I'll commit it.
        Hide
        HBase Review Board added a comment -

        Message from: "Todd Lipcon" <todd@cloudera.com>

        On 2010-06-10 08:53:41, Jonathan Gray wrote:

        > src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java, line 109

        > <http://review.hbase.org/r/98/diff/5/?file=944#file944line109>

        >

        > Hmm, wouldn't that be Math.min then? If rowLimit = 5 and scanner.caching = 100, we want to only do 5?

        woops, yes

        • Todd

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/#review176
        -----------------------------------------------------------

        Show
        HBase Review Board added a comment - Message from: "Todd Lipcon" <todd@cloudera.com> On 2010-06-10 08:53:41, Jonathan Gray wrote: > src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java, line 109 > < http://review.hbase.org/r/98/diff/5/?file=944#file944line109 > > > Hmm, wouldn't that be Math.min then? If rowLimit = 5 and scanner.caching = 100, we want to only do 5? woops, yes Todd ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/#review176 -----------------------------------------------------------
        Hide
        HBase Review Board added a comment -

        Message from: "Jonathan Gray" <jgray@apache.org>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/#review176
        -----------------------------------------------------------

        This looks great! I think we should think more about where to expose this, and also think about using more of a get/set API to reduce the method calls and make it look more like the other client APIs.

        src/main/java/org/apache/hadoop/hbase/client/HConnection.java
        <http://review.hbase.org/r/98/#comment841>

        +1 on collapsing with a boolean. setRegionCachePrefetch(table, enable)? Seems self descriptive and with a couple lines of javadoc should be clear.

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment840>

        I guess these are static because of how HTables all share a single HCM per conf. The setting of prefetching is set at the HCM level not the HTable level, however clients are usually not exposed to HCM and only deal with HTable.

        We should probably make it clear in the javadoc for these methods that they apply to all HTable instances, though that may be clear from being static.

        Maybe since these are more advanced calls, they shouldn't be in HTable? If we provide proper documentation, it should be easy enough for a user to grab the HCM and apply the config at that level?

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment842>

        Should we also make these methods (if we even leave it in HTable) more like setRegionCachePrefetch / getRegionCachePrefetch? HTable is the core client class so the less noise we add the better. We have other methods in client APIs of the get/set form like Scan.setCacheBlocks and Put.setWriteToWAL

        src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java
        <http://review.hbase.org/r/98/#comment843>

        Hmm, wouldn't that be Math.min then? If rowLimit = 5 and scanner.caching = 100, we want to only do 5?

        • Jonathan
        Show
        HBase Review Board added a comment - Message from: "Jonathan Gray" <jgray@apache.org> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/#review176 ----------------------------------------------------------- This looks great! I think we should think more about where to expose this, and also think about using more of a get/set API to reduce the method calls and make it look more like the other client APIs. src/main/java/org/apache/hadoop/hbase/client/HConnection.java < http://review.hbase.org/r/98/#comment841 > +1 on collapsing with a boolean. setRegionCachePrefetch(table, enable)? Seems self descriptive and with a couple lines of javadoc should be clear. src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment840 > I guess these are static because of how HTables all share a single HCM per conf. The setting of prefetching is set at the HCM level not the HTable level, however clients are usually not exposed to HCM and only deal with HTable. We should probably make it clear in the javadoc for these methods that they apply to all HTable instances, though that may be clear from being static. Maybe since these are more advanced calls, they shouldn't be in HTable? If we provide proper documentation, it should be easy enough for a user to grab the HCM and apply the config at that level? src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment842 > Should we also make these methods (if we even leave it in HTable) more like setRegionCachePrefetch / getRegionCachePrefetch? HTable is the core client class so the less noise we add the better. We have other methods in client APIs of the get/set form like Scan.setCacheBlocks and Put.setWriteToWAL src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java < http://review.hbase.org/r/98/#comment843 > Hmm, wouldn't that be Math.min then? If rowLimit = 5 and scanner.caching = 100, we want to only do 5? Jonathan
        Hide
        HBase Review Board added a comment -

        Message from: "Todd Lipcon" <todd@cloudera.com>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/#review165
        -----------------------------------------------------------

        Looking good! Just a few notes.

        src/main/java/org/apache/hadoop/hbase/client/HConnection.java
        <http://review.hbase.org/r/98/#comment813>

        I thought we were collapsing these two calls into setRegionCachePrefetchEnabled(tableName, enabled)?

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment816>

        I don't entirely understand why we key these hashes by integer, but it seems like you're following the status quo, so doesn't need to be addressed in this patch.

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment822>

        I still don't quite understand the logic about why these should be static. Previously you pointed to the enable/disable calls, but those are more like admin calls, not calls that affect client behavior. Anyone else have an opinion?

        src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java
        <http://review.hbase.org/r/98/#comment823>

        I think this should be Math.max(rowLimit, configuration.getInt(...)) - if we only want to scan 5 rows, we don't want the scanner to prefetch 100 for us.

        • Todd
        Show
        HBase Review Board added a comment - Message from: "Todd Lipcon" <todd@cloudera.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/#review165 ----------------------------------------------------------- Looking good! Just a few notes. src/main/java/org/apache/hadoop/hbase/client/HConnection.java < http://review.hbase.org/r/98/#comment813 > I thought we were collapsing these two calls into setRegionCachePrefetchEnabled(tableName, enabled)? src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment816 > I don't entirely understand why we key these hashes by integer, but it seems like you're following the status quo, so doesn't need to be addressed in this patch. src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment822 > I still don't quite understand the logic about why these should be static. Previously you pointed to the enable/disable calls, but those are more like admin calls, not calls that affect client behavior. Anyone else have an opinion? src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java < http://review.hbase.org/r/98/#comment823 > I think this should be Math.max(rowLimit, configuration.getInt(...)) - if we only want to scan 5 rows, we don't want the scanner to prefetch 100 for us. Todd
        Hide
        HBase Review Board added a comment -

        Message from: "Mingjie Lai" <mjlai09@gmail.com>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/
        -----------------------------------------------------------

        (Updated 2010-06-09 15:50:59.084657)

        Review request for hbase, Todd Lipcon and stack.

        Changes
        -------

        @St^ack: please see my comments for your feedback regarding getRowOrBefore() issue.

        Invite Todd as a reviewer.

        Summary
        -------

        HBASE-2468: Improvements to prewarm META cache on clients.

        Changes:
        1. Add new HTable methods which support region info de/serialation, and region cache prewarm:

        • void serializeRegionInfo(): clients could perform a large scan for all the meta for the table, serialize the meta to a file. MR job can ship a copy of the meta for the table in the DistributedCache
        • Map<HRegionInfo, HServerAddress> deserializeRegionInfo(): MR job can deserialize the region info from the DistributedCache
        • prewarmRegionCache(Map<HRegionInfo, HServerAddress> regionMap): MR job can prewarm local region cache by the deserialized region info.

        2. For each client, each region cache read-miss could trigger read-ahead some number of rows in META. This option could be turned on and off for one particular table.

        This addresses bug HBASE-2468.
        http://issues.apache.org/jira/browse/HBASE-2468

        Diffs


        src/main/java/org/apache/hadoop/hbase/client/HConnection.java 853164d
        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ed18092
        src/main/java/org/apache/hadoop/hbase/client/HTable.java 7ec95cb
        src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java d3a0c07
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95e494a

        Diff: http://review.hbase.org/r/98/diff

        Testing
        -------

        Unit tests passed locally for me.

        Thanks,

        Mingjie

        Show
        HBase Review Board added a comment - Message from: "Mingjie Lai" <mjlai09@gmail.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/ ----------------------------------------------------------- (Updated 2010-06-09 15:50:59.084657) Review request for hbase, Todd Lipcon and stack. Changes ------- @St^ack: please see my comments for your feedback regarding getRowOrBefore() issue. Invite Todd as a reviewer. Summary ------- HBASE-2468 : Improvements to prewarm META cache on clients. Changes: 1. Add new HTable methods which support region info de/serialation, and region cache prewarm: void serializeRegionInfo(): clients could perform a large scan for all the meta for the table, serialize the meta to a file. MR job can ship a copy of the meta for the table in the DistributedCache Map<HRegionInfo, HServerAddress> deserializeRegionInfo(): MR job can deserialize the region info from the DistributedCache prewarmRegionCache(Map<HRegionInfo, HServerAddress> regionMap): MR job can prewarm local region cache by the deserialized region info. 2. For each client, each region cache read-miss could trigger read-ahead some number of rows in META. This option could be turned on and off for one particular table. This addresses bug HBASE-2468 . http://issues.apache.org/jira/browse/HBASE-2468 Diffs src/main/java/org/apache/hadoop/hbase/client/HConnection.java 853164d src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ed18092 src/main/java/org/apache/hadoop/hbase/client/HTable.java 7ec95cb src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java d3a0c07 src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95e494a Diff: http://review.hbase.org/r/98/diff Testing ------- Unit tests passed locally for me. Thanks, Mingjie
        Hide
        HBase Review Board added a comment -

        Message from: "Mingjie Lai" <mjlai09@gmail.com>

        On 2010-06-07 14:23:42, stack wrote:

        > src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java, line 96

        > <http://review.hbase.org/r/98/diff/5/?file=944#file944line96>

        >

        > getRowOrBefore is an expensive call. Are we sure we are not calling this too often?

        I agree it is an expensive call.

        However I don't think it would bring any performance penalty for existing and potential use cases:
        Use case 1 – existing MetaScanner users: since this method is newly added, existing users won't be affected;
        Use case 2 – hbase clients when locating a region :
        1) if prefetch is on, it calls this MetaScanner with [table + row combination], which calls getRowOrBefore() to get current region info, then number of following regions from meta. After that, the client can get the region info directly from cache.
        2) if prefetch is disabled (current behavior), it eventually calls similar method getClosestRowBefore() to get desired region.

        So no matter prefetch is on or not, getRowOrBefore(or getClosestRowBefore) eventually is called. The only difference is whether to scan following regions from meta or not.

        For future MetaScanner users which scan from one region with desired use table row, it has to take the effort since it is the expected behavior.

        • Mingjie

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/#review144
        -----------------------------------------------------------

        Show
        HBase Review Board added a comment - Message from: "Mingjie Lai" <mjlai09@gmail.com> On 2010-06-07 14:23:42, stack wrote: > src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java, line 96 > < http://review.hbase.org/r/98/diff/5/?file=944#file944line96 > > > getRowOrBefore is an expensive call. Are we sure we are not calling this too often? I agree it is an expensive call. However I don't think it would bring any performance penalty for existing and potential use cases: Use case 1 – existing MetaScanner users: since this method is newly added, existing users won't be affected; Use case 2 – hbase clients when locating a region : 1) if prefetch is on, it calls this MetaScanner with [table + row combination] , which calls getRowOrBefore() to get current region info, then number of following regions from meta. After that, the client can get the region info directly from cache. 2) if prefetch is disabled (current behavior), it eventually calls similar method getClosestRowBefore() to get desired region. So no matter prefetch is on or not, getRowOrBefore(or getClosestRowBefore) eventually is called. The only difference is whether to scan following regions from meta or not. For future MetaScanner users which scan from one region with desired use table row, it has to take the effort since it is the expected behavior. Mingjie ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/#review144 -----------------------------------------------------------
        Hide
        HBase Review Board added a comment -

        Message from: stack@duboce.net

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/#review144
        -----------------------------------------------------------

        Ship it!

        I think this good to go. Seem my comments below. See what you think. My one concern is the number of calls to getRowOrBefore... hopefully this patch cuts down overall on our need to use this function. I'd like to hear your opinion on that.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment744>

        This is code duplicated from elsewhere. Can I help make it so we don't have to do this duplication? Or, for now, since this your fist patch, we can put it off IF you file a JIRA to fix the duplication (smile).

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment745>

        So we start scanning at 'row'? Is this the 'row' the user asked for? No, it needs to be the row in the .META. table, right? We need to find the row in .META. that contains the asked for row first? NM, I see below how the row here is made.. .this looks right.

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment746>

        This is a nice little facility.

        src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java
        <http://review.hbase.org/r/98/#comment747>

        OK. This looks right.

        src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java
        <http://review.hbase.org/r/98/#comment748>

        getRowOrBefore is an expensive call. Are we sure we are not calling this too often?

        • stack
        Show
        HBase Review Board added a comment - Message from: stack@duboce.net ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/#review144 ----------------------------------------------------------- Ship it! I think this good to go. Seem my comments below. See what you think. My one concern is the number of calls to getRowOrBefore... hopefully this patch cuts down overall on our need to use this function. I'd like to hear your opinion on that. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment744 > This is code duplicated from elsewhere. Can I help make it so we don't have to do this duplication? Or, for now, since this your fist patch, we can put it off IF you file a JIRA to fix the duplication (smile). src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment745 > So we start scanning at 'row'? Is this the 'row' the user asked for? No, it needs to be the row in the .META. table, right? We need to find the row in .META. that contains the asked for row first? NM, I see below how the row here is made.. .this looks right. src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment746 > This is a nice little facility. src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java < http://review.hbase.org/r/98/#comment747 > OK. This looks right. src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java < http://review.hbase.org/r/98/#comment748 > getRowOrBefore is an expensive call. Are we sure we are not calling this too often? stack
        Hide
        HBase Review Board added a comment -

        Message from: "Mingjie Lai" <mjlai09@gmail.com>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/
        -----------------------------------------------------------

        (Updated 2010-06-02 01:13:42.272750)

        Review request for hbase.

        Changes
        -------

        1. Fix some issues pointed out by Todd.
        2. Re-do MetaScanner so it will start scanning Meta from the region where the given user table row resides, instead of scanning from the next region row in Meta.

        Summary
        -------

        HBASE-2468: Improvements to prewarm META cache on clients.

        Changes:
        1. Add new HTable methods which support region info de/serialation, and region cache prewarm:

        • void serializeRegionInfo(): clients could perform a large scan for all the meta for the table, serialize the meta to a file. MR job can ship a copy of the meta for the table in the DistributedCache
        • Map<HRegionInfo, HServerAddress> deserializeRegionInfo(): MR job can deserialize the region info from the DistributedCache
        • prewarmRegionCache(Map<HRegionInfo, HServerAddress> regionMap): MR job can prewarm local region cache by the deserialized region info.

        2. For each client, each region cache read-miss could trigger read-ahead some number of rows in META. This option could be turned on and off for one particular table.

        This addresses bug HBASE-2468.
        http://issues.apache.org/jira/browse/HBASE-2468

        Diffs (updated)


        src/main/java/org/apache/hadoop/hbase/client/HConnection.java 853164d
        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ed18092
        src/main/java/org/apache/hadoop/hbase/client/HTable.java 7ec95cb
        src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java d3a0c07
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95e494a

        Diff: http://review.hbase.org/r/98/diff

        Testing
        -------

        Unit tests passed locally for me.

        Thanks,

        Mingjie

        Show
        HBase Review Board added a comment - Message from: "Mingjie Lai" <mjlai09@gmail.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/ ----------------------------------------------------------- (Updated 2010-06-02 01:13:42.272750) Review request for hbase. Changes ------- 1. Fix some issues pointed out by Todd. 2. Re-do MetaScanner so it will start scanning Meta from the region where the given user table row resides, instead of scanning from the next region row in Meta. Summary ------- HBASE-2468 : Improvements to prewarm META cache on clients. Changes: 1. Add new HTable methods which support region info de/serialation, and region cache prewarm: void serializeRegionInfo(): clients could perform a large scan for all the meta for the table, serialize the meta to a file. MR job can ship a copy of the meta for the table in the DistributedCache Map<HRegionInfo, HServerAddress> deserializeRegionInfo(): MR job can deserialize the region info from the DistributedCache prewarmRegionCache(Map<HRegionInfo, HServerAddress> regionMap): MR job can prewarm local region cache by the deserialized region info. 2. For each client, each region cache read-miss could trigger read-ahead some number of rows in META. This option could be turned on and off for one particular table. This addresses bug HBASE-2468 . http://issues.apache.org/jira/browse/HBASE-2468 Diffs (updated) src/main/java/org/apache/hadoop/hbase/client/HConnection.java 853164d src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ed18092 src/main/java/org/apache/hadoop/hbase/client/HTable.java 7ec95cb src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java d3a0c07 src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95e494a Diff: http://review.hbase.org/r/98/diff Testing ------- Unit tests passed locally for me. Thanks, Mingjie
        Hide
        HBase Review Board added a comment -

        Message from: "Mingjie Lai" <mjlai09@gmail.com>

        On 2010-05-31 21:52:12, Todd Lipcon wrote:

        > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1075

        > <http://review.hbase.org/r/98/diff/4/?file=872#file872line1075>

        >

        > I feel like it would be cleaner to have the following methods be non-static, so they don't need any arguments. that would reduce the number of functions by factor of two

        That was what I wanted to do in the very beginning. I don't like so many functions either.

        But there is one existing method:
        public static boolean isTableEnabled(String tableName) ...
        public static boolean isTableEnabled(Configuration conf, String tableName) ...

        It's one of the reason that I just want to keep the original coding style to be consistent.

        In addition, I agree we can make isRegionCachePrefetchEnabled() etc. to be non-static. However, enableRegionCachePrefetch(), and disableRegionCachePrefetch() must to be static since we want to enable/disable a table without instantiate HTable. In another word, we cannot really dis/enable cache prefetch for each HTable instance who have the same table name. While we can only enable/disable based on table name. It is pretty similar as table enable/disable.

        On 2010-05-31 21:52:12, Todd Lipcon wrote:

        > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 1608

        > <http://review.hbase.org/r/98/diff/4/?file=871#file871line1608>

        >

        > kill this function

        I don't like it either. I will kill all ``is...Disabled()'' methods.

        On 2010-05-31 21:52:12, Todd Lipcon wrote:

        > src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java, line 3639

        > <http://review.hbase.org/r/98/diff/4/?file=874#file874line3639>

        >

        > I really like these unit tests, but I think you should use a row key for the Get that isn't also a region start key, as it may expose different behavior. eg I think you will end up with 11 cached regions instead of 10

        As mentioned above, I reimplemented MetaScanner so it will start scanning Meta from the region row where the given table row resides, instead of scanning from the next region row in Meta. HTable.getRowOrBefore() is called in MetaScanner to achieve it, (however I'm not very sure whether it is the most efficient way to do it or not).

        So for this unit test, no matter the passed row is a region start key or not, we will always get 10 here.

        On 2010-05-31 21:52:12, Todd Lipcon wrote:

        > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, lines 776-781

        > <http://review.hbase.org/r/98/diff/4/?file=871#file871line776>

        >

        > this block should go after the cache check below, right?

        I reimplemented MetaScanner so it will start scanning Meta from the region where the given user table row resides, instead of scanning from the next region row in Meta.

        In this case, prefetchRegionCache() also fetches the queried table+row to the cache. Here I put it before the cache check block, so it can load the result from cache directly. Otherwise it may do an extra meta scan if cache prefetch is enabled.

        However if multiple threads accessing this block concurrently, this way will cause cache-prefetch executed twice. But I think this case is pretty rare, so the penalty should be relatively smaller.

        What do you think?

        • Mingjie

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/#review107
        -----------------------------------------------------------

        Show
        HBase Review Board added a comment - Message from: "Mingjie Lai" <mjlai09@gmail.com> On 2010-05-31 21:52:12, Todd Lipcon wrote: > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1075 > < http://review.hbase.org/r/98/diff/4/?file=872#file872line1075 > > > I feel like it would be cleaner to have the following methods be non-static, so they don't need any arguments. that would reduce the number of functions by factor of two That was what I wanted to do in the very beginning. I don't like so many functions either. But there is one existing method: public static boolean isTableEnabled(String tableName) ... public static boolean isTableEnabled(Configuration conf, String tableName) ... It's one of the reason that I just want to keep the original coding style to be consistent. In addition, I agree we can make isRegionCachePrefetchEnabled() etc. to be non-static. However, enableRegionCachePrefetch(), and disableRegionCachePrefetch() must to be static since we want to enable/disable a table without instantiate HTable. In another word, we cannot really dis/enable cache prefetch for each HTable instance who have the same table name. While we can only enable/disable based on table name. It is pretty similar as table enable/disable. On 2010-05-31 21:52:12, Todd Lipcon wrote: > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 1608 > < http://review.hbase.org/r/98/diff/4/?file=871#file871line1608 > > > kill this function I don't like it either. I will kill all ``is...Disabled()'' methods. On 2010-05-31 21:52:12, Todd Lipcon wrote: > src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java, line 3639 > < http://review.hbase.org/r/98/diff/4/?file=874#file874line3639 > > > I really like these unit tests, but I think you should use a row key for the Get that isn't also a region start key, as it may expose different behavior. eg I think you will end up with 11 cached regions instead of 10 As mentioned above, I reimplemented MetaScanner so it will start scanning Meta from the region row where the given table row resides, instead of scanning from the next region row in Meta. HTable.getRowOrBefore() is called in MetaScanner to achieve it, (however I'm not very sure whether it is the most efficient way to do it or not). So for this unit test, no matter the passed row is a region start key or not, we will always get 10 here. On 2010-05-31 21:52:12, Todd Lipcon wrote: > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, lines 776-781 > < http://review.hbase.org/r/98/diff/4/?file=871#file871line776 > > > this block should go after the cache check below, right? I reimplemented MetaScanner so it will start scanning Meta from the region where the given user table row resides, instead of scanning from the next region row in Meta. In this case, prefetchRegionCache() also fetches the queried table+row to the cache. Here I put it before the cache check block, so it can load the result from cache directly. Otherwise it may do an extra meta scan if cache prefetch is enabled. However if multiple threads accessing this block concurrently, this way will cause cache-prefetch executed twice. But I think this case is pretty rare, so the penalty should be relatively smaller. What do you think? Mingjie ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/#review107 -----------------------------------------------------------
        Hide
        HBase Review Board added a comment -

        Message from: "Todd Lipcon" <todd@cloudera.com>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/#review107
        -----------------------------------------------------------

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment635>

        typo: locationS

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment633>

        I think any of getCachedRegionCount, countCachedRegions, or getNumCachedRegions would be a clearer name.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment634>

        style-wise, why "final" here?

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment636>

        remove these @param javadocs - they just take up space if the param names are self-explanatory (which these are)

        same thing above

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment637>

        this needs a Comparator argument, otherwise it does object identity rather than bytewise comparison of the table names

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment640>

        javadoc out of date - it prefetches the region for the given row, and prefetchLimit regions ahead

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment638>

        should return false to stop scanning at this point, right?

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment639>

        // cache this meta entry

        (it's not caching all)

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment641>

        this block should go after the cache check below, right?

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment647>

        this function needs to synchronized on cachedRegionLocations which is an unsynchronized HashMap

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment648>

        return location != null;

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment654>

        kill this function

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment649>

        you should prefix the file with writeInt(allRegions.size()) so you don't have to use EOFException catching below

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment650>

        yuck, see above

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment651>

        I feel like it would be cleaner to have the following methods be non-static, so they don't need any arguments. that would reduce the number of functions by factor of two

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment653>

        incorrect javadoc, also a few places below

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment652>

        why do we need a separate function for enabled and disabled? aren't they always inverse of each other?

        src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java
        <http://review.hbase.org/r/98/#comment642>

        should make clear this is the row name in the user table, not the meta table.

        should also be clarified that it will start scanning with the region after row, because it doesn't use getClosestRowBefore

        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        <http://review.hbase.org/r/98/#comment643>

        useless @throws

        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        <http://review.hbase.org/r/98/#comment645>

        you should use util.getTestDir here

        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        <http://review.hbase.org/r/98/#comment644>

        import java.io.File

        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        <http://review.hbase.org/r/98/#comment646>

        I really like these unit tests, but I think you should use a row key for the Get that isn't also a region start key, as it may expose different behavior. eg I think you will end up with 11 cached regions instead of 10

        • Todd
        Show
        HBase Review Board added a comment - Message from: "Todd Lipcon" <todd@cloudera.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/#review107 ----------------------------------------------------------- src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment635 > typo: locationS src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment633 > I think any of getCachedRegionCount, countCachedRegions, or getNumCachedRegions would be a clearer name. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment634 > style-wise, why "final" here? src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment636 > remove these @param javadocs - they just take up space if the param names are self-explanatory (which these are) same thing above src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment637 > this needs a Comparator argument, otherwise it does object identity rather than bytewise comparison of the table names src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment640 > javadoc out of date - it prefetches the region for the given row, and prefetchLimit regions ahead src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment638 > should return false to stop scanning at this point, right? src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment639 > // cache this meta entry (it's not caching all) src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment641 > this block should go after the cache check below, right? src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment647 > this function needs to synchronized on cachedRegionLocations which is an unsynchronized HashMap src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment648 > return location != null; src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment654 > kill this function src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment649 > you should prefix the file with writeInt(allRegions.size()) so you don't have to use EOFException catching below src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment650 > yuck, see above src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment651 > I feel like it would be cleaner to have the following methods be non-static, so they don't need any arguments. that would reduce the number of functions by factor of two src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment653 > incorrect javadoc, also a few places below src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment652 > why do we need a separate function for enabled and disabled? aren't they always inverse of each other? src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java < http://review.hbase.org/r/98/#comment642 > should make clear this is the row name in the user table, not the meta table. should also be clarified that it will start scanning with the region after row, because it doesn't use getClosestRowBefore src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java < http://review.hbase.org/r/98/#comment643 > useless @throws src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java < http://review.hbase.org/r/98/#comment645 > you should use util.getTestDir here src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java < http://review.hbase.org/r/98/#comment644 > import java.io.File src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java < http://review.hbase.org/r/98/#comment646 > I really like these unit tests, but I think you should use a row key for the Get that isn't also a region start key, as it may expose different behavior. eg I think you will end up with 11 cached regions instead of 10 Todd
        Hide
        HBase Review Board added a comment -

        Message from: "Mingjie Lai" <mjlai09@gmail.com>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/
        -----------------------------------------------------------

        (Updated 2010-05-31 19:49:20.012863)

        Review request for hbase.

        Changes
        -------

        Improved to address comments from tsuna, Paul Crown, and jdcryans.

        Summary
        -------

        HBASE-2468: Improvements to prewarm META cache on clients.

        Changes:
        1. Add new HTable methods which support region info de/serialation, and region cache prewarm:

        • void serializeRegionInfo(): clients could perform a large scan for all the meta for the table, serialize the meta to a file. MR job can ship a copy of the meta for the table in the DistributedCache
        • Map<HRegionInfo, HServerAddress> deserializeRegionInfo(): MR job can deserialize the region info from the DistributedCache
        • prewarmRegionCache(Map<HRegionInfo, HServerAddress> regionMap): MR job can prewarm local region cache by the deserialized region info.

        2. For each client, each region cache read-miss could trigger read-ahead some number of rows in META. This option could be turned on and off for one particular table.

        This addresses bug HBASE-2468.
        http://issues.apache.org/jira/browse/HBASE-2468

        Diffs (updated)


        src/main/java/org/apache/hadoop/hbase/client/HConnection.java 853164d
        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 09de2ac
        src/main/java/org/apache/hadoop/hbase/client/HTable.java 7ec95cb
        src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 3de661e
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95e494a

        Diff: http://review.hbase.org/r/98/diff

        Testing
        -------

        Unit tests passed locally for me.

        Thanks,

        Mingjie

        Show
        HBase Review Board added a comment - Message from: "Mingjie Lai" <mjlai09@gmail.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/ ----------------------------------------------------------- (Updated 2010-05-31 19:49:20.012863) Review request for hbase. Changes ------- Improved to address comments from tsuna, Paul Crown, and jdcryans. Summary ------- HBASE-2468 : Improvements to prewarm META cache on clients. Changes: 1. Add new HTable methods which support region info de/serialation, and region cache prewarm: void serializeRegionInfo(): clients could perform a large scan for all the meta for the table, serialize the meta to a file. MR job can ship a copy of the meta for the table in the DistributedCache Map<HRegionInfo, HServerAddress> deserializeRegionInfo(): MR job can deserialize the region info from the DistributedCache prewarmRegionCache(Map<HRegionInfo, HServerAddress> regionMap): MR job can prewarm local region cache by the deserialized region info. 2. For each client, each region cache read-miss could trigger read-ahead some number of rows in META. This option could be turned on and off for one particular table. This addresses bug HBASE-2468 . http://issues.apache.org/jira/browse/HBASE-2468 Diffs (updated) src/main/java/org/apache/hadoop/hbase/client/HConnection.java 853164d src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 09de2ac src/main/java/org/apache/hadoop/hbase/client/HTable.java 7ec95cb src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 3de661e src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95e494a Diff: http://review.hbase.org/r/98/diff Testing ------- Unit tests passed locally for me. Thanks, Mingjie
        Hide
        HBase Review Board added a comment -

        Message from: "Mingjie Lai" <mjlai09@gmail.com>

        On 2010-05-28 15:05:08, Ryan Rawson wrote:

        > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 1604

        > <http://review.hbase.org/r/98/diff/3/?file=741#file741line1604>

        >

        > What does this method actually do? If its a predicate start it with 'is'. Also precaching should be off by default...

        > Is it?

        Will add ``is'' prefix to this method.

        Pre-caching (in case of a cache miss) is set to ``on'' by default right now, according to stack:

        ``I'm for default behavior being grabbing more than just the one row – ten or something. Regards full-table scan as default, I think it should be an option (Its a nice option to have). ''

        https://issues.apache.org/jira/browse/HBASE-2468?focusedCommentId=12870321&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12870321

        • Mingjie

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/#review95
        -----------------------------------------------------------

        Show
        HBase Review Board added a comment - Message from: "Mingjie Lai" <mjlai09@gmail.com> On 2010-05-28 15:05:08, Ryan Rawson wrote: > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 1604 > < http://review.hbase.org/r/98/diff/3/?file=741#file741line1604 > > > What does this method actually do? If its a predicate start it with 'is'. Also precaching should be off by default... > Is it? Will add ``is'' prefix to this method. Pre-caching (in case of a cache miss) is set to ``on'' by default right now, according to stack: ``I'm for default behavior being grabbing more than just the one row – ten or something. Regards full-table scan as default, I think it should be an option (Its a nice option to have). '' https://issues.apache.org/jira/browse/HBASE-2468?focusedCommentId=12870321&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12870321 Mingjie ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/#review95 -----------------------------------------------------------
        Hide
        HBase Review Board added a comment -

        Message from: "Ryan Rawson" <ryanobjc@gmail.com>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/#review95
        -----------------------------------------------------------

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment553>

        What does this method actually do? If its a predicate start it with 'is'. Also precaching should be off by default...
        Is it?

        • Ryan
        Show
        HBase Review Board added a comment - Message from: "Ryan Rawson" <ryanobjc@gmail.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/#review95 ----------------------------------------------------------- src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment553 > What does this method actually do? If its a predicate start it with 'is'. Also precaching should be off by default... Is it? Ryan
        Hide
        HBase Review Board added a comment -

        Message from: "Ryan Rawson" <ryanobjc@gmail.com>

        Show
        HBase Review Board added a comment - Message from: "Ryan Rawson" <ryanobjc@gmail.com>
        Hide
        HBase Review Board added a comment -

        Message from: "Benoit Sigoure" <tsunanet@gmail.com>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/#review93
        -----------------------------------------------------------

        src/main/java/org/apache/hadoop/hbase/client/HConnection.java
        <http://review.hbase.org/r/98/#comment531>

        Properly document this argument.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment534>

        Please use a concurrent collection and remove the synchronized blocks. For guidance, see around slide 30 of this presentation:
        http://www.cs.umd.edu/class/fall2009/cmsc132H/slides/still-effective.pdf

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment532>

        Coding style: put the catch on the previous line.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment533>

        We don't typically call methods using `this.methodname(args)' – remove the `this.'

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment535>

        Add a space before the `:'.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment536>

        Use `Boolean.TRUE' instead of `new Boolean(true)'.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment537>

        Use `Boolean.FALSE' instead of `new Boolean(false)'.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment538>

        Use `Boolean.TRUE' instead of `new Boolean(true)'.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment539>

        Remove the outer parentheses.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment540>

        Add a space before the `:'.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment541>

        If this fits on the previous line while staying under 80 columns, please wrap it around.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/98/#comment542>

        Remove the space after `cacheLocation'

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment546>

        Use

        {@link #readFields readFields}

        instead of <code>readFields</code>

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment544>

        You can remove this <p>

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment543>

        Use <pre> not <code>

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment545>

        Add a space before the `:'.

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment547>

        Use

        {@link #getRegionsInfo getRegionsInfo}

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment548>

        Use

        {@link ...}

        here and below.

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment550>

        Wrap this around with the previous line.

        src/main/java/org/apache/hadoop/hbase/client/HTable.java
        <http://review.hbase.org/r/98/#comment549>

        I believe you can remove this block.

        • Benoit
        Show
        HBase Review Board added a comment - Message from: "Benoit Sigoure" <tsunanet@gmail.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/#review93 ----------------------------------------------------------- src/main/java/org/apache/hadoop/hbase/client/HConnection.java < http://review.hbase.org/r/98/#comment531 > Properly document this argument. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment534 > Please use a concurrent collection and remove the synchronized blocks. For guidance, see around slide 30 of this presentation: http://www.cs.umd.edu/class/fall2009/cmsc132H/slides/still-effective.pdf src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment532 > Coding style: put the catch on the previous line. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment533 > We don't typically call methods using `this.methodname(args)' – remove the `this.' src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment535 > Add a space before the `:'. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment536 > Use `Boolean.TRUE' instead of `new Boolean(true)'. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment537 > Use `Boolean.FALSE' instead of `new Boolean(false)'. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment538 > Use `Boolean.TRUE' instead of `new Boolean(true)'. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment539 > Remove the outer parentheses. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment540 > Add a space before the `:'. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment541 > If this fits on the previous line while staying under 80 columns, please wrap it around. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/98/#comment542 > Remove the space after `cacheLocation' src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment546 > Use {@link #readFields readFields} instead of <code>readFields</code> src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment544 > You can remove this <p> src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment543 > Use <pre> not <code> src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment545 > Add a space before the `:'. src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment547 > Use {@link #getRegionsInfo getRegionsInfo} src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment548 > Use {@link ...} here and below. src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment550 > Wrap this around with the previous line. src/main/java/org/apache/hadoop/hbase/client/HTable.java < http://review.hbase.org/r/98/#comment549 > I believe you can remove this block. Benoit
        Hide
        HBase Review Board added a comment -

        Message from: "Mingjie Lai" <mjlai09@gmail.com>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/
        -----------------------------------------------------------

        (Updated 2010-05-28 13:20:33.843332)

        Review request for hbase.

        Changes
        -------

        Improved a little, to address the comments from Benoit Sigoure, at http://review.hbase.org/r/78/#review91.

        Summary
        -------

        HBASE-2468: Improvements to prewarm META cache on clients.

        Changes:
        1. Add new HTable methods which support region info de/serialation, and region cache prewarm:

        • void serializeRegionInfo(): clients could perform a large scan for all the meta for the table, serialize the meta to a file. MR job can ship a copy of the meta for the table in the DistributedCache
        • Map<HRegionInfo, HServerAddress> deserializeRegionInfo(): MR job can deserialize the region info from the DistributedCache
        • prewarmRegionCache(Map<HRegionInfo, HServerAddress> regionMap): MR job can prewarm local region cache by the deserialized region info.

        2. For each client, each region cache read-miss could trigger read-ahead some number of rows in META. This option could be turned on and off for one particular table.

        This addresses bug HBASE-2468.
        http://issues.apache.org/jira/browse/HBASE-2468

        Diffs (updated)


        src/main/java/org/apache/hadoop/hbase/client/HConnection.java 853164d
        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 09de2ac
        src/main/java/org/apache/hadoop/hbase/client/HTable.java 7ec95cb
        src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 3de661e
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95e494a

        Diff: http://review.hbase.org/r/98/diff

        Testing
        -------

        Unit tests passed locally for me.

        Thanks,

        Mingjie

        Show
        HBase Review Board added a comment - Message from: "Mingjie Lai" <mjlai09@gmail.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/ ----------------------------------------------------------- (Updated 2010-05-28 13:20:33.843332) Review request for hbase. Changes ------- Improved a little, to address the comments from Benoit Sigoure, at http://review.hbase.org/r/78/#review91 . Summary ------- HBASE-2468 : Improvements to prewarm META cache on clients. Changes: 1. Add new HTable methods which support region info de/serialation, and region cache prewarm: void serializeRegionInfo(): clients could perform a large scan for all the meta for the table, serialize the meta to a file. MR job can ship a copy of the meta for the table in the DistributedCache Map<HRegionInfo, HServerAddress> deserializeRegionInfo(): MR job can deserialize the region info from the DistributedCache prewarmRegionCache(Map<HRegionInfo, HServerAddress> regionMap): MR job can prewarm local region cache by the deserialized region info. 2. For each client, each region cache read-miss could trigger read-ahead some number of rows in META. This option could be turned on and off for one particular table. This addresses bug HBASE-2468 . http://issues.apache.org/jira/browse/HBASE-2468 Diffs (updated) src/main/java/org/apache/hadoop/hbase/client/HConnection.java 853164d src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 09de2ac src/main/java/org/apache/hadoop/hbase/client/HTable.java 7ec95cb src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 3de661e src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95e494a Diff: http://review.hbase.org/r/98/diff Testing ------- Unit tests passed locally for me. Thanks, Mingjie
        Hide
        HBase Review Board added a comment -

        Message from: "Benoit Sigoure" <tsunanet@gmail.com>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/78/#review91
        -----------------------------------------------------------

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/78/#comment515>

        Remove the spurious spaces around the parenthesis.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/78/#comment516>

        No space before `[]' on this line and the previous line.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/78/#comment517>

        Move the declaration of this variable to line 675 where it's initialized.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/78/#comment518>

        I'm not sure I understand this comment but it could be because I'm not very familiar with this part of the code.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/78/#comment519>

        Do this instead:

        if (value == null)

        { return true; // don't cache it }

        final String serverAddress = Bytes.toString(value);

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/78/#comment520>

        I don't understand the last part of the comment ("fetch ahead number of rows in META").

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/78/#comment521>

        No space before `[]'.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/78/#comment522>

        Wrap the `else' on the previous line.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/78/#comment523>

        Instead of doing `+ e.getMessage()', pass `e' in second argument.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/78/#comment524>

        No space before `[]'.

        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
        <http://review.hbase.org/r/78/#comment525>

        Remove the second part of the check (tableLogs.values() == null). This can't happen. If it happens, there's a bug in SoftValueSortedMap and we shouldn't hide it.

        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
        <http://review.hbase.org/r/78/#comment526>

        No space before `[]'.

        • Benoit
        Show
        HBase Review Board added a comment - Message from: "Benoit Sigoure" <tsunanet@gmail.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/78/#review91 ----------------------------------------------------------- src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/78/#comment515 > Remove the spurious spaces around the parenthesis. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/78/#comment516 > No space before `[]' on this line and the previous line. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/78/#comment517 > Move the declaration of this variable to line 675 where it's initialized. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/78/#comment518 > I'm not sure I understand this comment but it could be because I'm not very familiar with this part of the code. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/78/#comment519 > Do this instead: if (value == null) { return true; // don't cache it } final String serverAddress = Bytes.toString(value); src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/78/#comment520 > I don't understand the last part of the comment ("fetch ahead number of rows in META"). src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/78/#comment521 > No space before `[]'. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/78/#comment522 > Wrap the `else' on the previous line. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/78/#comment523 > Instead of doing `+ e.getMessage()', pass `e' in second argument. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/78/#comment524 > No space before `[]'. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java < http://review.hbase.org/r/78/#comment525 > Remove the second part of the check (tableLogs.values() == null). This can't happen. If it happens, there's a bug in SoftValueSortedMap and we shouldn't hide it. src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java < http://review.hbase.org/r/78/#comment526 > No space before `[]'. Benoit
        Hide
        HBase Review Board added a comment -

        Message from: "Mingjie Lai" <mjlai09@gmail.com>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/98/
        -----------------------------------------------------------

        Review request for hbase.

        Summary
        -------

        HBASE-2468: Improvements to prewarm META cache on clients.

        Changes:
        1. Add new HTable methods which support region info de/serialation, and region cache prewarm:

        • void serializeRegionInfo(): clients could perform a large scan for all the meta for the table, serialize the meta to a file. MR job can ship a copy of the meta for the table in the DistributedCache
        • Map<HRegionInfo, HServerAddress> deserializeRegionInfo(): MR job can deserialize the region info from the DistributedCache
        • prewarmRegionCache(Map<HRegionInfo, HServerAddress> regionMap): MR job can prewarm local region cache by the deserialized region info.

        2. For each client, each region cache read-miss could trigger read-ahead some number of rows in META. This option could be turned on and off for one particular table.

        This addresses bug HBASE-2468.
        http://issues.apache.org/jira/browse/HBASE-2468

        Diffs


        src/main/java/org/apache/hadoop/hbase/client/HConnection.java 853164d
        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 09de2ac
        src/main/java/org/apache/hadoop/hbase/client/HTable.java 7ec95cb
        src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 3de661e
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95e494a

        Diff: http://review.hbase.org/r/98/diff

        Testing
        -------

        Unit tests passed locally for me.

        Thanks,

        Mingjie

        Show
        HBase Review Board added a comment - Message from: "Mingjie Lai" <mjlai09@gmail.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/98/ ----------------------------------------------------------- Review request for hbase. Summary ------- HBASE-2468 : Improvements to prewarm META cache on clients. Changes: 1. Add new HTable methods which support region info de/serialation, and region cache prewarm: void serializeRegionInfo(): clients could perform a large scan for all the meta for the table, serialize the meta to a file. MR job can ship a copy of the meta for the table in the DistributedCache Map<HRegionInfo, HServerAddress> deserializeRegionInfo(): MR job can deserialize the region info from the DistributedCache prewarmRegionCache(Map<HRegionInfo, HServerAddress> regionMap): MR job can prewarm local region cache by the deserialized region info. 2. For each client, each region cache read-miss could trigger read-ahead some number of rows in META. This option could be turned on and off for one particular table. This addresses bug HBASE-2468 . http://issues.apache.org/jira/browse/HBASE-2468 Diffs src/main/java/org/apache/hadoop/hbase/client/HConnection.java 853164d src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 09de2ac src/main/java/org/apache/hadoop/hbase/client/HTable.java 7ec95cb src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 3de661e src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95e494a Diff: http://review.hbase.org/r/98/diff Testing ------- Unit tests passed locally for me. Thanks, Mingjie
        Hide
        Mingjie Lai added a comment -

        Thanks for your responses. Based on the discussions especially what were suggected by Todd and Andrew, here are my new proposal to address this issue.

        0) (The biggest argument is regarding whole cache prewarm when table initialization. So it won't be supported any more. )

        1) Give clients the option to warm the cache for each table. New interfaces are provided to enable/disable cache prefetch in case of a cache miss.

        public interface HConnection {
          // ... 
        
          /**
           * Enable or disable region cache prewarm for the <i>tableName</i>. 
           */
          public void setRegionCachePreWarm(final byte[] tableName, boolean enable); 
        
          public boolean getRegionCachePreWarm(final byte[] tableName); 
        }
        
        public class HTable implements HTableInterface {
          // ...
          //
          private boolean isRegionCachePreWarmEnabled; // default true; 
          public void setRegionCachePreWarm(boolean enable) {
            this.isRegionCachePreWarmEnabled = enable; 
            this.connection.setRegionCachePreWarm(this.tableName, 
                this.isRegionCachePreWarmEnabled);
          }
        
          public boolean getRegionCachePreWarm() {
            return this.connection.getRegionCachePreWarm(this.tableName);
          } 
        }
        
        HTable t1 = new HTable("foo");
        t1.setRegionCachePreWarm(false); 
        

        2) Prefetch a certain number of regions locations on cache miss when performing location lookup. (This piece has been implemented in the previous patch except configurable readahead number)

        // HConnectionManager.TableServer
        this.preFetchRegionLimit = conf.getInt("hbase.client.prefetch.limit", 10);
        

        3) Serialize META to disk, put it in the DistributedCache, and then prewarm the meta cache from there.

        // getRegionsInfo does not update the region location cache for the table
        // we don't want to change that
        Map<HRegionInfo,HServerAddress> regionMap = table.getRegionsInfo();
        
        // added: 
        table.serizlizeRegionInfo(regionMap);
        
        // added: deserialize region info from DC
        regionMap = table.deserizlizeRegionInfo();
        
        // added: regionMap will be set to HConnectionManager region cache. 
        table.setRegionsInfo(regionMap);
        

        Please help to review and provide your feedback. If there is no objection, I will provide a new patch based on the proposal.

        Show
        Mingjie Lai added a comment - Thanks for your responses. Based on the discussions especially what were suggected by Todd and Andrew, here are my new proposal to address this issue. 0) (The biggest argument is regarding whole cache prewarm when table initialization. So it won't be supported any more. ) 1) Give clients the option to warm the cache for each table . New interfaces are provided to enable/disable cache prefetch in case of a cache miss. public interface HConnection { // ... /** * Enable or disable region cache prewarm for the <i>tableName</i>. */ public void setRegionCachePreWarm( final byte [] tableName, boolean enable); public boolean getRegionCachePreWarm( final byte [] tableName); } public class HTable implements HTableInterface { // ... // private boolean isRegionCachePreWarmEnabled; // default true ; public void setRegionCachePreWarm( boolean enable) { this .isRegionCachePreWarmEnabled = enable; this .connection.setRegionCachePreWarm( this .tableName, this .isRegionCachePreWarmEnabled); } public boolean getRegionCachePreWarm() { return this .connection.getRegionCachePreWarm( this .tableName); } } HTable t1 = new HTable( "foo" ); t1.setRegionCachePreWarm( false ); 2) Prefetch a certain number of regions locations on cache miss when performing location lookup. (This piece has been implemented in the previous patch except configurable readahead number) // HConnectionManager.TableServer this .preFetchRegionLimit = conf.getInt( "hbase.client.prefetch.limit" , 10); 3) Serialize META to disk, put it in the DistributedCache, and then prewarm the meta cache from there. // getRegionsInfo does not update the region location cache for the table // we don't want to change that Map<HRegionInfo,HServerAddress> regionMap = table.getRegionsInfo(); // added: table.serizlizeRegionInfo(regionMap); // added: deserialize region info from DC regionMap = table.deserizlizeRegionInfo(); // added: regionMap will be set to HConnectionManager region cache. table.setRegionsInfo(regionMap); Please help to review and provide your feedback. If there is no objection, I will provide a new patch based on the proposal.
        Show
        stack added a comment - I already posted a review above https://issues.apache.org/jira/browse/HBASE-2468?focusedCommentId=12870321&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12870321
        Hide
        HBase Review Board added a comment -

        Message from: "Todd Lipcon" <todd@cloudera.com>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/78/
        -----------------------------------------------------------

        Review request for hbase.

        Summary
        -------

        Posting review board for this JIRA on behalf of Mingjie Lai

        This addresses bug HBASE-2468.
        http://issues.apache.org/jira/browse/HBASE-2468

        Diffs


        src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 09de2ac
        src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 3de661e
        src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95e494a

        Diff: http://review.hbase.org/r/78/diff

        Testing
        -------

        Thanks,

        Todd

        Show
        HBase Review Board added a comment - Message from: "Todd Lipcon" <todd@cloudera.com> ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/78/ ----------------------------------------------------------- Review request for hbase. Summary ------- Posting review board for this JIRA on behalf of Mingjie Lai This addresses bug HBASE-2468 . http://issues.apache.org/jira/browse/HBASE-2468 Diffs src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 09de2ac src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 3de661e src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95e494a Diff: http://review.hbase.org/r/78/diff Testing ------- Thanks, Todd
        Hide
        Andrew Purtell added a comment -

        For TIF, I don't see any reason to prefetch meta. Each mapper taking input from a table accesses only a small subset of regions, and scanning the whole thing is total overkill

        Yes, good point. I was just generally talking about Table*Format so muddied the waters.

        Your option B is the use case most interesting IMHO but not just long lived clients would benefit; any client writing a reasonable amount of data with well distributed keys would get off the ground faster while being more efficient about META accesses.

        So the current patch can 1) prefetch ahead a few rows on META miss and 2) can be reworked slightly to provide clients an option for preloading region locations for a table from META. I suggest we get the current patch cleaned up and committed as something which can optionally be enabled for a table:

        HTable table = new HTable("foo");
        table.setLocationPrefetch(true);
        

        Prefetch would scan ahead some small configurable number of rows upon cache miss as is discussed in comments above.

        Additionally, we could add HTable#setRegionsInfo(Map<HRegionInfo,HServerAddress> regionMap)

        // getRegionsInfo does not update the region location cache for the table
        // we don't want to change that
        Map<HRegionInfo,HServerAddress> regionMap = table.getRegionsInfo();
        // but we can use the result to do so at the client's option
        table.setRegionsInfo(regionMap);
        

        Regarding whether or not to prefetch all region info for a table when starting a MR job, anticipating it would be a toggle, if the default is 'true' I think this would help common cases especially encountered by newcomers if TOF did that but could/should always be turned off if the operator is running jobs with high frequency/concurrency (so only some limited readahead on miss would be active then). But I do think this is a better idea:

        So, for the MR case, I think we should provide the option to serialize META to disk, put it in the DistributedCache, and then prewarm the meta cache from there.

        Serialize the result of HTable#getRegionsInfo when setting up the job (in TableMapReduceUtil). Load from DistributedCache and pass to HTable#setRegionsInfo in TOF is jobconf indicates it is available.

        Show
        Andrew Purtell added a comment - For TIF, I don't see any reason to prefetch meta. Each mapper taking input from a table accesses only a small subset of regions, and scanning the whole thing is total overkill Yes, good point. I was just generally talking about Table*Format so muddied the waters. Your option B is the use case most interesting IMHO but not just long lived clients would benefit; any client writing a reasonable amount of data with well distributed keys would get off the ground faster while being more efficient about META accesses. So the current patch can 1) prefetch ahead a few rows on META miss and 2) can be reworked slightly to provide clients an option for preloading region locations for a table from META. I suggest we get the current patch cleaned up and committed as something which can optionally be enabled for a table: HTable table = new HTable( "foo" ); table.setLocationPrefetch( true ); Prefetch would scan ahead some small configurable number of rows upon cache miss as is discussed in comments above. Additionally, we could add HTable#setRegionsInfo(Map<HRegionInfo,HServerAddress> regionMap) // getRegionsInfo does not update the region location cache for the table // we don't want to change that Map<HRegionInfo,HServerAddress> regionMap = table.getRegionsInfo(); // but we can use the result to do so at the client's option table.setRegionsInfo(regionMap); Regarding whether or not to prefetch all region info for a table when starting a MR job, anticipating it would be a toggle, if the default is 'true' I think this would help common cases especially encountered by newcomers if TOF did that but could/should always be turned off if the operator is running jobs with high frequency/concurrency (so only some limited readahead on miss would be active then). But I do think this is a better idea: So, for the MR case, I think we should provide the option to serialize META to disk, put it in the DistributedCache, and then prewarm the meta cache from there. Serialize the result of HTable#getRegionsInfo when setting up the job (in TableMapReduceUtil ). Load from DistributedCache and pass to HTable#setRegionsInfo in TOF is jobconf indicates it is available.
        Hide
        stack added a comment -

        If we had hbase-2600, this'd be easy to do.

        Show
        stack added a comment - If we had hbase-2600, this'd be easy to do.
        Hide
        Todd Lipcon added a comment -

        For TIF, I don't see any reason to prefetch meta. Each mapper taking input from a table accesses only a small subset of regions, and scanning the whole thing is total overkill.

        For TOF, or jobs that simply access HBase but don't use either of our provided output formats, having a prewarmed meta cache is primarily avoid DDOSing the META server. On a 500 node MR cluster, you might well have 4000 mappers start within the period of just a few seconds and overwhelm META pretty fast. This is a real use case for many people - imagine storing a dimension table in HBase and doing a hash join against it from a MR job processing many TB of logs.

        So, for the MR case, I think we should provide the option to serialize META to disk, put it in the DistributedCache, and then prewarm the meta cache from there. This would reduce number of mappers actually hitting META to nearly 0.

        Doing a scan of META to fetch all rows for this table, though, probably makes the problem even worse, especially for jobs that only access a subset of the content.

        My opinions are that we should:
        a) prefetch ahead a few rows on any META miss, since it will fill the cache up faster and catch split children. Perhaps we can do a benchmark to see whether a 10-row scan is any harder to service than a 2-row scan - my hunch is that the load on the server is mostly dominated by constant time, so we may as well scan ahead a bit.
        b) allow the option to fetch a row range (which includes the full table range) into the cache. This could be used in startup of long-running processes (eg the thrift gateway or stargate may prefer to warm up its cache before it starts accepting any user requests). This should not be default. I think of this as the equivalent of posix_fadvise(..., POSIX_FADV_WILLNEED). Providing the API as a range will also allow us to do it for multiregion scans, etc.
        c) allow the option to serialize meta rows into a sequencefile and then load them back. This provides the improvement I mentioned above for large MR jobs randomly accessing a cluster.

        I see the above 3 things as separate tasks. It sounds like the current patch can do the first of the three, so maybe we should separate that out, get it committed, and then move on to b and c?

        Show
        Todd Lipcon added a comment - For TIF, I don't see any reason to prefetch meta. Each mapper taking input from a table accesses only a small subset of regions, and scanning the whole thing is total overkill. For TOF, or jobs that simply access HBase but don't use either of our provided output formats, having a prewarmed meta cache is primarily avoid DDOSing the META server. On a 500 node MR cluster, you might well have 4000 mappers start within the period of just a few seconds and overwhelm META pretty fast. This is a real use case for many people - imagine storing a dimension table in HBase and doing a hash join against it from a MR job processing many TB of logs. So, for the MR case, I think we should provide the option to serialize META to disk, put it in the DistributedCache, and then prewarm the meta cache from there. This would reduce number of mappers actually hitting META to nearly 0. Doing a scan of META to fetch all rows for this table, though, probably makes the problem even worse, especially for jobs that only access a subset of the content. My opinions are that we should: a) prefetch ahead a few rows on any META miss, since it will fill the cache up faster and catch split children. Perhaps we can do a benchmark to see whether a 10-row scan is any harder to service than a 2-row scan - my hunch is that the load on the server is mostly dominated by constant time, so we may as well scan ahead a bit. b) allow the option to fetch a row range (which includes the full table range) into the cache. This could be used in startup of long-running processes (eg the thrift gateway or stargate may prefer to warm up its cache before it starts accepting any user requests). This should not be default. I think of this as the equivalent of posix_fadvise(..., POSIX_FADV_WILLNEED). Providing the API as a range will also allow us to do it for multiregion scans, etc. c) allow the option to serialize meta rows into a sequencefile and then load them back. This provides the improvement I mentioned above for large MR jobs randomly accessing a cluster. I see the above 3 things as separate tasks. It sounds like the current patch can do the first of the three, so maybe we should separate that out, get it committed, and then move on to b and c?
        Hide
        Jonathan Gray added a comment -

        Then for the general case that's fine but TIF and TOF should set some flag to true which causes the prefetch on open to kick in. Reasonable?

        That sounds completely reasonable. Along the same lines as doing things like disabling block cache or increasing scanner caching.

        Show
        Jonathan Gray added a comment - Then for the general case that's fine but TIF and TOF should set some flag to true which causes the prefetch on open to kick in. Reasonable? That sounds completely reasonable. Along the same lines as doing things like disabling block cache or increasing scanner caching.
        Hide
        Andrew Purtell added a comment -

        A single table could be made up of many thousands even hundreds of thousands of regions? It could be significant. Even if 1000 regions it could add a huge overhead to instantiating an HTable a user may only use for a single Get.

        Yes but this issue talks about a more likely use case, where someone runs a MR job over a table and causes a storm of individual Gets to META.

        I love these features and have wanted them for a while. Just don't think we should change default HTable behavior significantly.

        Then for the general case that's fine but TIF and TOF should set some flag to true which causes the prefetch on open to kick in. Reasonable?

        Show
        Andrew Purtell added a comment - A single table could be made up of many thousands even hundreds of thousands of regions? It could be significant. Even if 1000 regions it could add a huge overhead to instantiating an HTable a user may only use for a single Get. Yes but this issue talks about a more likely use case, where someone runs a MR job over a table and causes a storm of individual Gets to META. I love these features and have wanted them for a while. Just don't think we should change default HTable behavior significantly. Then for the general case that's fine but TIF and TOF should set some flag to true which causes the prefetch on open to kick in. Reasonable?
        Hide
        stack added a comment -

        Hmm... if you can't scan the whole table, then this patch gets more complicated I believe. You'll need to prefix the scan with a getClosestsRowBefore to find where to start the short scan of the next N items. Also, the way that rowUpperLimit works in this patch seems like it might be broke. We'll only ever get the first rowUpperLimit items in the table... is that right?

        Show
        stack added a comment - Hmm... if you can't scan the whole table, then this patch gets more complicated I believe. You'll need to prefix the scan with a getClosestsRowBefore to find where to start the short scan of the next N items. Also, the way that rowUpperLimit works in this patch seems like it might be broke. We'll only ever get the first rowUpperLimit items in the table... is that right?
        Hide
        Jonathan Gray added a comment -

        A single table could be made up of many thousands even hundreds of thousands of regions? It could be significant. Even if 1000 regions it could add a huge overhead to instantiating an HTable a user may only use for a single Get.

        As for the look-ahead, I think it makes sense if there is a cached value that gets evicted to do a look-ahead of an extra row so you have a good chance of picking up the other side of a split. Beyond that, I don't think it's a good idea to add a default overhead because this fetch is likely blocking a user data request waiting for a region to open and refresh the location from meta.

        I love these features and have wanted them for a while. Just don't think we should change default HTable behavior significantly.

        Show
        Jonathan Gray added a comment - A single table could be made up of many thousands even hundreds of thousands of regions? It could be significant. Even if 1000 regions it could add a huge overhead to instantiating an HTable a user may only use for a single Get. As for the look-ahead, I think it makes sense if there is a cached value that gets evicted to do a look-ahead of an extra row so you have a good chance of picking up the other side of a split. Beyond that, I don't think it's a good idea to add a default overhead because this fetch is likely blocking a user data request waiting for a region to open and refresh the location from meta. I love these features and have wanted them for a while. Just don't think we should change default HTable behavior significantly.
        Hide
        stack added a comment -

        @Andrew Todd was author of the original description. Its not explicit about whether prewarm is optional or not. Sorry for any confusion.

        Show
        stack added a comment - @Andrew Todd was author of the original description. Its not explicit about whether prewarm is optional or not. Sorry for any confusion.
        Hide
        Andrew Purtell added a comment -

        I agree with Jonathan that the "prefetch entire META" should be optional, default false.

        Well you are the reporter so I will defer to that, though the issue description implies otherwise to me.

        Just caught the "entire" META. I'm not suggesting prefetch of entire META, the fetch should just be of the rows involving the table.

        The patch only does a prefetch for the rows involving the table being opened. I believe this is the desired behavior. Why should it be false?

        Show
        Andrew Purtell added a comment - I agree with Jonathan that the "prefetch entire META" should be optional, default false. Well you are the reporter so I will defer to that, though the issue description implies otherwise to me. Just caught the " entire " META. I'm not suggesting prefetch of entire META, the fetch should just be of the rows involving the table. The patch only does a prefetch for the rows involving the table being opened. I believe this is the desired behavior. Why should it be false?
        Hide
        Andrew Purtell added a comment -

        I agree with Jonathan that the "prefetch entire META" should be optional, default false.

        Well you are the reporter so I will defer to that, though the issue description implies otherwise to me.

        Show
        Andrew Purtell added a comment - I agree with Jonathan that the "prefetch entire META" should be optional, default false. Well you are the reporter so I will defer to that, though the issue description implies otherwise to me.
        Hide
        stack added a comment -

        I'm for default behavior being grabbing more than just the one row – ten or something. Regards full-table scan as default, I think it should be an option (Its a nice option to have).

        On code, here's some comments (Congrats Mingjie on first submission):

        Change

        +      this.preFetchRegionLimit = 10; 
        

        .. to be

        +      this.preFetchRegionLimit = conf.getInt("hbase.client.prefetch.limit", 10); 
        

        IMO, never hardcode things like this, and IMO, you don't need to put the config out in the hbse-default.xml.... just do the above. Someone who really needs to change it can read code and figure its possible. If it becomes a popular change people make, then we can move it out to hbase-default.xml.

        Regards methods that you only call from a unit test, should they be package protected rather than protected; i.e. narrower access than protected?

        Nit. You don't need to explain in unit test javadoc why the fix: e.g.:

        +   * This fix is provided to prevent potential storm of META reads for a 
        +   * very large table.  
        

        Otherwise, patch looks great.

        Show
        stack added a comment - I'm for default behavior being grabbing more than just the one row – ten or something. Regards full-table scan as default, I think it should be an option (Its a nice option to have). On code, here's some comments (Congrats Mingjie on first submission): Change + this .preFetchRegionLimit = 10; .. to be + this .preFetchRegionLimit = conf.getInt( "hbase.client.prefetch.limit" , 10); IMO, never hardcode things like this, and IMO, you don't need to put the config out in the hbse-default.xml.... just do the above. Someone who really needs to change it can read code and figure its possible. If it becomes a popular change people make, then we can move it out to hbase-default.xml. Regards methods that you only call from a unit test, should they be package protected rather than protected; i.e. narrower access than protected? Nit. You don't need to explain in unit test javadoc why the fix: e.g.: + * This fix is provided to prevent potential storm of META reads for a + * very large table. Otherwise, patch looks great.
        Hide
        Todd Lipcon added a comment -

        I agree with Jonathan that the "prefetch entire META" should be optional, default false. If we anticipate multi-hundred-TB tables, we're talking on the order of 100K regions at least, and a full scan is quite expensive. For long-running access it can sometimes make sense, but for other cases it certainly does not. I would prefer it to be a call like table.prefetchRegionLocations() or something.

        Regarding the fetch-ahead of META, it seems to make sense to scan forward a couple rows by default if we can measure that there isn't much extra cost. In addition to the split scenario, it will help for the case of longer scans which are quite likely to cross regions. But again, it should be configurable.

        [context: haven't had a chance to look at the patch yet!]

        Show
        Todd Lipcon added a comment - I agree with Jonathan that the "prefetch entire META" should be optional, default false. If we anticipate multi-hundred-TB tables, we're talking on the order of 100K regions at least, and a full scan is quite expensive. For long-running access it can sometimes make sense, but for other cases it certainly does not. I would prefer it to be a call like table.prefetchRegionLocations() or something. Regarding the fetch-ahead of META, it seems to make sense to scan forward a couple rows by default if we can measure that there isn't much extra cost. In addition to the split scenario, it will help for the case of longer scans which are quite likely to cross regions. But again, it should be configurable. [context: haven't had a chance to look at the patch yet!]
        Hide
        Jonathan Gray added a comment -

        IMO this jira is about adding functionality not necessarily changing the default functionality. I think non-warming, non-prefetching clients should still be possible, if not default. Seems odd to make a big change that all of a sudden doing your existing new HTable(name) could possibly result in the scan of potentially thousands of rows or more and now you need concept of preventing this from happening.

        If a split, scanning forward one row gives you 50% chance of also finding the second half of split, don't see what going further than that would give you. This would be at the cost of single-row optimizations that are possible. Going forward one extra row sounds reasonable to me but again it's not with impact so should be optional.

        Show
        Jonathan Gray added a comment - IMO this jira is about adding functionality not necessarily changing the default functionality. I think non-warming, non-prefetching clients should still be possible, if not default. Seems odd to make a big change that all of a sudden doing your existing new HTable(name) could possibly result in the scan of potentially thousands of rows or more and now you need concept of preventing this from happening. If a split, scanning forward one row gives you 50% chance of also finding the second half of split, don't see what going further than that would give you. This would be at the cost of single-row optimizations that are possible. Going forward one extra row sounds reasonable to me but again it's not with impact so should be optional.
        Hide
        Andrew Purtell added a comment -

        I'm unsure if I like the approach of aggressively pre-warming though you don't ask for it (prefetch on HTable instantiation, and auto look-ahead in locateRegionInMeta

        This is what this issue is asking for as I read it.

        I think I would prefer adding an additional constructor w/ a warm boolean to determine whether META should be warmed for that table or not.

        That makes sense to me as long as the boolean defaults to true.

        I don't quite understand the prefetching in locateRegionInMeta(). If you want to pre-cache meta, you should warm the HTable on construction. Is this to try to update neighboring regions in case they got stale?

        If the cache misses if due to a split, scanning forward a couple of entries has a reasonable chance of picking up related changes. Should be less than 10. 1? 2?

        Show
        Andrew Purtell added a comment - I'm unsure if I like the approach of aggressively pre-warming though you don't ask for it (prefetch on HTable instantiation, and auto look-ahead in locateRegionInMeta This is what this issue is asking for as I read it. I think I would prefer adding an additional constructor w/ a warm boolean to determine whether META should be warmed for that table or not. That makes sense to me as long as the boolean defaults to true . I don't quite understand the prefetching in locateRegionInMeta(). If you want to pre-cache meta, you should warm the HTable on construction. Is this to try to update neighboring regions in case they got stale? If the cache misses if due to a split, scanning forward a couple of entries has a reasonable chance of picking up related changes. Should be less than 10. 1? 2?
        Hide
        Jonathan Gray added a comment -

        I'm unsure if I like the approach of aggressively pre-warming though you don't ask for it (prefetch on HTable instantiation, and auto look-ahead in locateRegionInMeta)

        I think I would prefer adding an additional constructor w/ a warm boolean to determine whether META should be warmed for that table or not.

        I don't quite understand the prefetching in locateRegionInMeta(). If you want to pre-cache meta, you should warm the HTable on construction. Is this to try to update neighboring regions in case they got stale? Since 10 seems arbitrary, if we do have this behavior it should probably be configurable.

        Grabbing 9 extra rows could incur several additional block reads across multiple files. It would also eliminate any chances of using blooms or other optimizations if the META retrievals were never single row reads. So there is a cost to this being the default behavior.

        Otherwise the patch itself looks good. I'm in the process of changing MetaScanner a bit for some master stuff but should be okay.

        Good stuff, this has been a long-desired feature!

        Show
        Jonathan Gray added a comment - I'm unsure if I like the approach of aggressively pre-warming though you don't ask for it (prefetch on HTable instantiation, and auto look-ahead in locateRegionInMeta) I think I would prefer adding an additional constructor w/ a warm boolean to determine whether META should be warmed for that table or not. I don't quite understand the prefetching in locateRegionInMeta(). If you want to pre-cache meta, you should warm the HTable on construction. Is this to try to update neighboring regions in case they got stale? Since 10 seems arbitrary, if we do have this behavior it should probably be configurable. Grabbing 9 extra rows could incur several additional block reads across multiple files. It would also eliminate any chances of using blooms or other optimizations if the META retrievals were never single row reads. So there is a cost to this being the default behavior. Otherwise the patch itself looks good. I'm in the process of changing MetaScanner a bit for some master stuff but should be okay. Good stuff, this has been a long-desired feature!
        Hide
        Mingjie Lai added a comment -
        Show
        Mingjie Lai added a comment - Patch https://issues.apache.org/jira/secure/attachment/12445217/HBASE-2468-trunk.patch passed all unit tests locally for me.
        Hide
        Mingjie Lai added a comment -

        Basic ideas:
        1. At o.a.h.h.client.HConnectionManager.locateRegionInMeta(), prefetch certain number of table regions in global region loc cache when scanning META, instead of only caching the desired queried table+row region.
        2. For a fresh table load, all region locations of the table will be prefetched/cached. While after initialization, each miss could fetch ahead some number of rows from META.

        Patched files:

        • o.a.h.h.client.HConnectionManager: added a new method to perform META table prewarm.
        • o.a.h.h.client.MetaScanner: overloaded MetaScanner.metaScan, to support scanning meta table within certain steps:
          public static void metaScan(HBaseConfiguration configuration,
          MetaScannerVisitor visitor, byte[] tableName, int rowLimit)
        • o.a.h.h.client.TestFromClientSide: added a test case for region cache
          prewarm.

        Potential issues:

        • o.a.h.h.client.HTable constructor triggers the cache prewarm right now, by calling HConnection.locateRegion(). I was suggested to defer calling locateRegion() from HTable constructor, but some of existing unit test cases failed because they implicitly rely on the constructor to do something, i.e., waiting for a table to be fully created after a table creation.
        • For 2, ``each miss could fetch ahead some number of rows'': right now, this number is hard-coded as 10. Is it necessary to make it configurable?
        Show
        Mingjie Lai added a comment - Basic ideas: 1. At o.a.h.h.client.HConnectionManager.locateRegionInMeta(), prefetch certain number of table regions in global region loc cache when scanning META, instead of only caching the desired queried table+row region. 2. For a fresh table load, all region locations of the table will be prefetched/cached. While after initialization, each miss could fetch ahead some number of rows from META. Patched files: o.a.h.h.client.HConnectionManager: added a new method to perform META table prewarm. o.a.h.h.client.MetaScanner: overloaded MetaScanner.metaScan, to support scanning meta table within certain steps: public static void metaScan(HBaseConfiguration configuration, MetaScannerVisitor visitor, byte[] tableName, int rowLimit) o.a.h.h.client.TestFromClientSide: added a test case for region cache prewarm. Potential issues: o.a.h.h.client.HTable constructor triggers the cache prewarm right now, by calling HConnection.locateRegion(). I was suggested to defer calling locateRegion() from HTable constructor, but some of existing unit test cases failed because they implicitly rely on the constructor to do something, i.e., waiting for a table to be fully created after a table creation. For 2, ``each miss could fetch ahead some number of rows'': right now, this number is hard-coded as 10. Is it necessary to make it configurable?
        Hide
        stack added a comment -

        Bulk move of 0.20.5 issues into 0.21.0 after vote that we merge branch into TRUNK up on list.

        Show
        stack added a comment - Bulk move of 0.20.5 issues into 0.21.0 after vote that we merge branch into TRUNK up on list.
        Hide
        Mingjie Lai added a comment -

        I'm looking at this issue right now, and want to provide a patch it as a warmup.

        My plan is to add a new HConnectionManager::locateRegionInMeta() which is dedicated for .META. table. It keeps the features of the original version (locate region for a table+row pair), in addition it puts all this table's region info to local cache by usig MetaScanner.

        Show
        Mingjie Lai added a comment - I'm looking at this issue right now, and want to provide a patch it as a warmup. My plan is to add a new HConnectionManager::locateRegionInMeta() which is dedicated for .META. table. It keeps the features of the original version (locate region for a table+row pair), in addition it puts all this table's region info to local cache by usig MetaScanner.

          People

          • Assignee:
            Mingjie Lai
            Reporter:
            Todd Lipcon
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development