HBase
  1. HBase
  2. HBASE-4224

Need a flush by regionserver rather than by table option

    Details

    • Type: Bug Bug
    • Status: In Progress
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: shell
    • Labels:
      None

      Description

      This evening needed to clean out logs on the cluster. logs are by regionserver. to let go of logs, we need to have all edits emptied from memory. only flush is by table or region. We need to be able to flush the regionserver. Need to add this.

      1. HBase-4224.patch
        12 kB
        Akash Ashok
      2. HBase-4224-v2.patch
        22 kB
        Akash Ashok

        Activity

        Hide
        stack added a comment -

        We need by regionserver because doing it by table, its not possible to control how much flush is done. Imagine cluster has one table only. If you flush it, then all regionservers all flush at same time killing any other server going on concurrently because of the i/o load on hdfs. If you could do it by regionserver, you'd have some means of doing other than a big-bang flush, and so on.

        Show
        stack added a comment - We need by regionserver because doing it by table, its not possible to control how much flush is done. Imagine cluster has one table only. If you flush it, then all regionservers all flush at same time killing any other server going on concurrently because of the i/o load on hdfs. If you could do it by regionserver, you'd have some means of doing other than a big-bang flush, and so on.
        Hide
        stack added a comment -

        An important bit I left off the top is that this 'flush by regionserver' needs to also clean up WAL logs too after the flushing is done. If everything has been flushed, then should be able to clean up any outstanding WAL logs.

        Scenario is the following. Your hosting service tells you that they are going to do a risky operation that could cause you to lose network connectivity or power to your racks. You know that a disaster may be coming and you want to mitigate the damage.

        A facility that flushed and cleaned up all WAL logs by the regionserver (so could throttle i/o) so if bad crash there is little to replay is what I'm talking about here.

        Show
        stack added a comment - An important bit I left off the top is that this 'flush by regionserver' needs to also clean up WAL logs too after the flushing is done. If everything has been flushed, then should be able to clean up any outstanding WAL logs. Scenario is the following. Your hosting service tells you that they are going to do a risky operation that could cause you to lose network connectivity or power to your racks. You know that a disaster may be coming and you want to mitigate the damage. A facility that flushed and cleaned up all WAL logs by the regionserver (so could throttle i/o) so if bad crash there is little to replay is what I'm talking about here.
        Hide
        stack added a comment -

        I'm seeing that a script that flushes all tables on the cluster does not bring on a log cleanup when expectation is that it should.

        Show
        stack added a comment - I'm seeing that a script that flushes all tables on the cluster does not bring on a log cleanup when expectation is that it should.
        Hide
        stack added a comment -

        We need a means of forcing a roll of wal log. That'll bring on check for whether we can let go of any old WALs.

        Show
        stack added a comment - We need a means of forcing a roll of wal log. That'll bring on check for whether we can let go of any old WALs.
        Hide
        Akash Ashok added a comment -

        Hey Stack,
        I would actually like to work on this. I don't know If I am able to grasp the true gravity and understand how big a problem this is or even If I am equipped enough to handle this. But with a little help I could work on this.

        Please let me know.

        Show
        Akash Ashok added a comment - Hey Stack, I would actually like to work on this. I don't know If I am able to grasp the true gravity and understand how big a problem this is or even If I am equipped enough to handle this. But with a little help I could work on this. Please let me know.
        Hide
        stack added a comment -

        @Akash Thanks for volunteering. I don't think this one too hard to do. You'll be able to figure it. Study what happens when you flush in the shell. See how it passes a command to the master. In master, see how it fans it out by region or by table. I think whats needed is a flush command that takes a machine name. The shell code would figure distingush the flush of a region or table from a flush of a regionserver (a bit of regexing should be able to distingush whats wanted – I'd imagine the flush a machine would take a ServerName; i.e. <hostname> ',' <port> ',' <startcode>). The flush of a regionserver would require adding a new API to HRegionInterface I'd imagine. It'd flush all regions on the regionserver. Then on the end, after all flushed, it'd run the check if we can let go of WAL logs. This method currently is checked on WAL log roll. See in the HLog under the wal package. Ask questions if you help figuring stuff. Thanks Akash.

        Show
        stack added a comment - @Akash Thanks for volunteering. I don't think this one too hard to do. You'll be able to figure it. Study what happens when you flush in the shell. See how it passes a command to the master. In master, see how it fans it out by region or by table. I think whats needed is a flush command that takes a machine name. The shell code would figure distingush the flush of a region or table from a flush of a regionserver (a bit of regexing should be able to distingush whats wanted – I'd imagine the flush a machine would take a ServerName; i.e. <hostname> ',' <port> ',' <startcode>). The flush of a regionserver would require adding a new API to HRegionInterface I'd imagine. It'd flush all regions on the regionserver. Then on the end, after all flushed, it'd run the check if we can let go of WAL logs. This method currently is checked on WAL log roll. See in the HLog under the wal package. Ask questions if you help figuring stuff. Thanks Akash.
        Hide
        Akash Ashok added a comment -

        Thanks Stack. I'll start workin on it then.

        Show
        Akash Ashok added a comment - Thanks Stack. I'll start workin on it then.
        Hide
        Akash Ashok added a comment -

        This one had totally slipped my mind. I'll start workin on this right away

        Show
        Akash Ashok added a comment - This one had totally slipped my mind. I'll start workin on this right away
        Hide
        Akash Ashok added a comment -

        My assumption is that I'd do a logRoll after "Table Flush" or "Region Server Flush" and not for a "Ragion Flush" . For "Region Server Flush" I'd roll the logs if there has been atleast one successful "RegionFlush".

        I'd presume that's a fair assumption. Please do let me know in case of any concerns.

        Show
        Akash Ashok added a comment - My assumption is that I'd do a logRoll after "Table Flush" or "Region Server Flush" and not for a "Ragion Flush" . For "Region Server Flush" I'd roll the logs if there has been atleast one successful "RegionFlush". I'd presume that's a fair assumption. Please do let me know in case of any concerns.
        Hide
        Akash Ashok added a comment -

        This is not tested completly. I'm yet to do a complete round of testing and write test cases.
        I've attached the patch so that If there are any review comments I could incorporate them.

        Cheers

        Show
        Akash Ashok added a comment - This is not tested completly. I'm yet to do a complete round of testing and write test cases. I've attached the patch so that If there are any review comments I could incorporate them. Cheers
        Hide
        Ted Yu added a comment -

        Overall, patch looks good.

        A better name for getRegionServerPairs() would be getRegionServerPairsForTable().
        The LOG.info() in isValidRegionServerPair() should be LOG.debug(). Also, offline region isn't covered in the log message.

        There are some formatting improvements, such as:

                if(flush(serverName, pair.getFirst())) {
        

        There should be a space between if and (.

        Please also add support for this feature to hbase shell.

        Show
        Ted Yu added a comment - Overall, patch looks good. A better name for getRegionServerPairs() would be getRegionServerPairsForTable(). The LOG.info() in isValidRegionServerPair() should be LOG.debug(). Also, offline region isn't covered in the log message. There are some formatting improvements, such as: if (flush(serverName, pair.getFirst())) { There should be a space between if and (. Please also add support for this feature to hbase shell.
        Hide
        Akash Ashok added a comment -

        Thanks Ted. I shall make those changes.

        I was testing a few more things and have a few queries

        As per the Javadoc the flush() method in HBaseAdmin is supposed to be asynchronous but its actually not.
        1. It might make sense to make this async for table flushes, where the regions on different Region Servers can be flushed simultaneously instead of waiting for each flush to complete.
        2. Same holds good for log rolling

        Also it might even make sense to have a method to flush a list of regions belonging to a particular region server to support table flushes, instead of making round trip to and from the server for every region.

        Does this sound reasonable ?

        Show
        Akash Ashok added a comment - Thanks Ted. I shall make those changes. I was testing a few more things and have a few queries As per the Javadoc the flush() method in HBaseAdmin is supposed to be asynchronous but its actually not. 1. It might make sense to make this async for table flushes, where the regions on different Region Servers can be flushed simultaneously instead of waiting for each flush to complete. 2. Same holds good for log rolling Also it might even make sense to have a method to flush a list of regions belonging to a particular region server to support table flushes, instead of making round trip to and from the server for every region. Does this sound reasonable ?
        Hide
        Ted Yu added a comment -

        For #1 above, I think we should keep the current behavior. People may have become dependent on the current behavior.
        For #2, we can introduce async log rolling.
        For #3, supporting list of regions is nice to have.

        All new commands and parameters should be documented in respective .rb files.

        Thanks

        Show
        Ted Yu added a comment - For #1 above, I think we should keep the current behavior. People may have become dependent on the current behavior. For #2, we can introduce async log rolling. For #3, supporting list of regions is nice to have. All new commands and parameters should be documented in respective .rb files. Thanks
        Hide
        stack added a comment -

        On #1, sounds great. Add a new API?
        On #2, ditto.
        On #3, that'd be useful, for sure.

        Show
        stack added a comment - On #1, sounds great. Add a new API? On #2, ditto. On #3, that'd be useful, for sure.
        Hide
        Akash Ashok added a comment -

        For #1: I was thinking I'd be better to incorporate withing the same flush() function instead of a new API.
        Considering Ted's comment, It actually makes sense to retain the behavior, thus we could still have the same behavior for the flush() function but make sure it only exit after all the required regions have been flushed, by having waits within the function. That way we don't have to add a new API. The behavior would remain the same. Just that thing would be a little faster as flushes happen in parallel.

        Show
        Akash Ashok added a comment - For #1: I was thinking I'd be better to incorporate withing the same flush() function instead of a new API. Considering Ted's comment, It actually makes sense to retain the behavior, thus we could still have the same behavior for the flush() function but make sure it only exit after all the required regions have been flushed, by having waits within the function. That way we don't have to add a new API. The behavior would remain the same. Just that thing would be a little faster as flushes happen in parallel.
        Hide
        Ted Yu added a comment -

        That sounds good, Akash.

        Show
        Ted Yu added a comment - That sounds good, Akash.
        Hide
        Akash Ashok added a comment -

        Cool. Just taking a deeper look so that we make sure we're making the right decision

        1. New API
        -> Use of RegionServer side threading
        -> Synchronize on HBaseAdmin side using zookeeper
        2. Existing API
        -> Threading on client side

        I am in favour of 2. Any suggestions or concerns ?

        Show
        Akash Ashok added a comment - Cool. Just taking a deeper look so that we make sure we're making the right decision 1. New API -> Use of RegionServer side threading -> Synchronize on HBaseAdmin side using zookeeper 2. Existing API -> Threading on client side I am in favour of 2. Any suggestions or concerns ?
        Hide
        Akash Ashok added a comment -

        Also in HBaseAdmin.java

        public synchronized  byte[][] rollHLogWriter(String serverName)
        

        I don't find a need for this to be synchronized. These are anyways synchronized on the HLog side. Am I missing something here or can I remove "synchronized" ?

        Show
        Akash Ashok added a comment - Also in HBaseAdmin.java public synchronized byte [][] rollHLogWriter( String serverName) I don't find a need for this to be synchronized. These are anyways synchronized on the HLog side. Am I missing something here or can I remove "synchronized" ?
        Hide
        Jean-Daniel Cryans added a comment -

        Regarding how flush currently works with HBA, I did this analysis a few months ago: HBASE-4198

        About this:

        I don't find a need for this to be synchronized. These are anyways synchronized on the HLog side. Am I missing something here or can I remove "synchronized" ?

        Actually there are a few synchronized methods in HBA, and I have no clue why they are.

        Show
        Jean-Daniel Cryans added a comment - Regarding how flush currently works with HBA, I did this analysis a few months ago: HBASE-4198 About this: I don't find a need for this to be synchronized. These are anyways synchronized on the HLog side. Am I missing something here or can I remove "synchronized" ? Actually there are a few synchronized methods in HBA, and I have no clue why they are.
        Hide
        Akash Ashok added a comment -

        I am done with the testing for this patch. But though when I was writing the test cases I could only think of one. Could some1 please help me out as to what different test cases should I be looking to cover for this patch ?

        Show
        Akash Ashok added a comment - I am done with the testing for this patch. But though when I was writing the test cases I could only think of one. Could some1 please help me out as to what different test cases should I be looking to cover for this patch ?
        Hide
        Ted Yu added a comment -

        @Akash:
        Do you have a newer patch ?
        If so, please upload to this JIRA.

        Show
        Ted Yu added a comment - @Akash: Do you have a newer patch ? If so, please upload to this JIRA.
        Hide
        Akash Ashok added a comment -

        2nd Version of Patch. Just looked into HBase-5041. Have to incorporate those changes too for better condition handling.

        Show
        Akash Ashok added a comment - 2nd Version of Patch. Just looked into HBase-5041. Have to incorporate those changes too for better condition handling.
        Hide
        Ted Yu added a comment -

        Just started looking at patch v2.
        For HBaseAdmin.java:

        +import com.google.common.collect.Lists;
        +import com.google.common.collect.Maps;
        

        Dependency on third party jar shouldn't be introduced into client package.

        Show
        Ted Yu added a comment - Just started looking at patch v2. For HBaseAdmin.java: + import com.google.common.collect.Lists; + import com.google.common.collect.Maps; Dependency on third party jar shouldn't be introduced into client package.
        Hide
        Ted Yu added a comment -

        Putting the diff on review board allows me to see the changes easily.
        Please submit review request there.

        For ServerName.java:

        +   * 2. post > 0
        

        The above should read 'port > 0'

        +  public boolean isValid() {
        +    if(StringUtils.isNotBlank(hostname) && port > 0 && startcode > 0)
        +      return true;
        +    return false;
        +  }
        

        A single return statement should be enough.

        For HBaseAdmin.java:

          private ExecutorService executorService= Executors.newCachedThreadPool();
        

        Please give ExecutorService better name and add javadoc for it.

        Show
        Ted Yu added a comment - Putting the diff on review board allows me to see the changes easily. Please submit review request there. For ServerName.java: + * 2. post > 0 The above should read 'port > 0' + public boolean isValid() { + if (StringUtils.isNotBlank(hostname) && port > 0 && startcode > 0) + return true ; + return false ; + } A single return statement should be enough. For HBaseAdmin.java: private ExecutorService executorService= Executors.newCachedThreadPool(); Please give ExecutorService better name and add javadoc for it.
        Hide
        Ted Yu added a comment -

        On review board, the white spaces are easy to see.
        Please remove them.

        Show
        Ted Yu added a comment - On review board, the white spaces are easy to see. Please remove them.
        Hide
        Akash Ashok added a comment -

        "Dependency on third party jar shouldn't be introduced into client package."
        We already have dependency on Apache Commons for logging. When we run HBase admin its usually through the HBaseor hadoop command which anyways load the libs into the environment right? So I presume it should be ok ?

        I shall post to the review board. Thanks.

        Show
        Akash Ashok added a comment - "Dependency on third party jar shouldn't be introduced into client package." We already have dependency on Apache Commons for logging. When we run HBase admin its usually through the HBaseor hadoop command which anyways load the libs into the environment right? So I presume it should be ok ? I shall post to the review board. Thanks.
        Hide
        Ted Yu added a comment -

        com.google.common.collect isn't Apache Commons. No dependency on additional third party jar should be introduced.

        Show
        Ted Yu added a comment - com.google.common.collect isn't Apache Commons. No dependency on additional third party jar should be introduced.
        Hide
        Akash Ashok added a comment -

        Com.google.common.collect is guava I know, but what I meant to say is that ApacheCommons itself is a third party jar with respect to HBase. Is there any particular reason why we shouldn't add additional third party dependencies other than the fact that its a client side code ?

        Let me give an example:
        Assume we are using HBaseAdmin in our code to create a table. Now to run this code we need to add dependency on HBase*.jar. Now since there's a dependency on apache commons logging so this additional step of adding this into your classpath is anyways already there, along with which u need to add one more dependency.

        Here since it's its basic usage of guava I could remove the dependency. But could you help me understand as to why this is to be done ?

        Show
        Akash Ashok added a comment - Com.google.common.collect is guava I know, but what I meant to say is that ApacheCommons itself is a third party jar with respect to HBase. Is there any particular reason why we shouldn't add additional third party dependencies other than the fact that its a client side code ? Let me give an example: Assume we are using HBaseAdmin in our code to create a table. Now to run this code we need to add dependency on HBase*.jar. Now since there's a dependency on apache commons logging so this additional step of adding this into your classpath is anyways already there, along with which u need to add one more dependency. Here since it's its basic usage of guava I could remove the dependency. But could you help me understand as to why this is to be done ?
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3308/
        -----------------------------------------------------------

        Review request for hbase.

        Summary
        -------

        Flush by RegionServer

        This addresses bug HBase-4224.
        https://issues.apache.org/jira/browse/HBase-4224

        Diffs


        /src/main/java/org/apache/hadoop/hbase/ServerName.java 1222902
        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1222902
        /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1222902
        /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1222902

        Diff: https://reviews.apache.org/r/3308/diff

        Testing
        -------

        Thanks,

        Akash

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3308/ ----------------------------------------------------------- Review request for hbase. Summary ------- Flush by RegionServer This addresses bug HBase-4224. https://issues.apache.org/jira/browse/HBase-4224 Diffs /src/main/java/org/apache/hadoop/hbase/ServerName.java 1222902 /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1222902 /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1222902 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1222902 Diff: https://reviews.apache.org/r/3308/diff Testing ------- Thanks, Akash
        Hide
        Akash Ashok added a comment -

        Have subbmitted on the review board. https://reviews.apache.org/r/3308/.
        Thanks

        Show
        Akash Ashok added a comment - Have subbmitted on the review board. https://reviews.apache.org/r/3308/ . Thanks
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3308/#review4116
        -----------------------------------------------------------

        Please give us some results from testing in cluster.

        /src/main/java/org/apache/hadoop/hbase/ServerName.java
        <https://reviews.apache.org/r/3308/#comment9269>

        I think we should perform stricter checking on hostname, without using DNS.
        See http://regexlib.com/DisplayPatterns.aspx?cattabindex=1&categoryId=2&AspxAutoDetectCookieSupport=1

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9270>

        The 'execute' after 'the' should be removed.

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9271>

        The ctor with ThreadFactory parameter should be used so that threads in this pool can have names.

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9272>

        Second component should read 'all regions on a region server'

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9273>

        serverRegionsMap might be null upon return.
        I don't see null check below.

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9274>

        Should read 'every region server'

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9275>

        Should read 'whose WAL'

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9276>

        I think Future.get(long timeout, TimeUnit unit) should be used here so that we don't wait indefinitely.

        • Ted

        On 2011-12-24 04:31:50, Akash Ashok wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/3308/

        -----------------------------------------------------------

        (Updated 2011-12-24 04:31:50)

        Review request for hbase.

        Summary

        -------

        Flush by RegionServer

        This addresses bug HBase-4224.

        https://issues.apache.org/jira/browse/HBase-4224

        Diffs

        -----

        /src/main/java/org/apache/hadoop/hbase/ServerName.java 1222902

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1222902

        /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1222902

        /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1222902

        Diff: https://reviews.apache.org/r/3308/diff

        Testing

        -------

        Thanks,

        Akash

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3308/#review4116 ----------------------------------------------------------- Please give us some results from testing in cluster. /src/main/java/org/apache/hadoop/hbase/ServerName.java < https://reviews.apache.org/r/3308/#comment9269 > I think we should perform stricter checking on hostname, without using DNS. See http://regexlib.com/DisplayPatterns.aspx?cattabindex=1&categoryId=2&AspxAutoDetectCookieSupport=1 /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9270 > The 'execute' after 'the' should be removed. /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9271 > The ctor with ThreadFactory parameter should be used so that threads in this pool can have names. /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9272 > Second component should read 'all regions on a region server' /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9273 > serverRegionsMap might be null upon return. I don't see null check below. /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9274 > Should read 'every region server' /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9275 > Should read 'whose WAL' /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9276 > I think Future.get(long timeout, TimeUnit unit) should be used here so that we don't wait indefinitely. Ted On 2011-12-24 04:31:50, Akash Ashok wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3308/ ----------------------------------------------------------- (Updated 2011-12-24 04:31:50) Review request for hbase. Summary ------- Flush by RegionServer This addresses bug HBase-4224. https://issues.apache.org/jira/browse/HBase-4224 Diffs ----- /src/main/java/org/apache/hadoop/hbase/ServerName.java 1222902 /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1222902 /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1222902 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1222902 Diff: https://reviews.apache.org/r/3308/diff Testing ------- Thanks, Akash
        Hide
        Lars Hofhansl added a comment -

        Sorry for coming late to this.

        It seems we're bundling two unrelated changes into the same issue:
        1. flush per region server
        2. asynchronous flush for tables (with tread pool in HBaseAdmin).

        I think these should be two separate issues.

        Show
        Lars Hofhansl added a comment - Sorry for coming late to this. It seems we're bundling two unrelated changes into the same issue: 1. flush per region server 2. asynchronous flush for tables (with tread pool in HBaseAdmin). I think these should be two separate issues.
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3308/#review4120
        -----------------------------------------------------------

        See minor comments inline.
        Generally I don't understand why this change needs to be so big (but I haven't had time to look through the new code in detail, yet).

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9293>

        I would echo Ted's sentiment here that we shouldn't add more dependencies to the client side code.

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9294>

        Also, should only instantiate this pool when needed.

        • Lars

        On 2011-12-24 04:31:50, Akash Ashok wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/3308/

        -----------------------------------------------------------

        (Updated 2011-12-24 04:31:50)

        Review request for hbase.

        Summary

        -------

        Flush by RegionServer

        This addresses bug HBase-4224.

        https://issues.apache.org/jira/browse/HBase-4224

        Diffs

        -----

        /src/main/java/org/apache/hadoop/hbase/ServerName.java 1222902

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1222902

        /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1222902

        /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1222902

        Diff: https://reviews.apache.org/r/3308/diff

        Testing

        -------

        Thanks,

        Akash

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3308/#review4120 ----------------------------------------------------------- See minor comments inline. Generally I don't understand why this change needs to be so big (but I haven't had time to look through the new code in detail, yet). /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9293 > I would echo Ted's sentiment here that we shouldn't add more dependencies to the client side code. /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9294 > Also, should only instantiate this pool when needed. Lars On 2011-12-24 04:31:50, Akash Ashok wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3308/ ----------------------------------------------------------- (Updated 2011-12-24 04:31:50) Review request for hbase. Summary ------- Flush by RegionServer This addresses bug HBase-4224. https://issues.apache.org/jira/browse/HBase-4224 Diffs ----- /src/main/java/org/apache/hadoop/hbase/ServerName.java 1222902 /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1222902 /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1222902 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1222902 Diff: https://reviews.apache.org/r/3308/diff Testing ------- Thanks, Akash
        Hide
        Ted Yu added a comment -

        @Akash:
        Can you address the review comments and upload a new patch ?

        Thanks

        Show
        Ted Yu added a comment - @Akash: Can you address the review comments and upload a new patch ? Thanks
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2011-12-25 17:53:31, Ted Yu wrote:

        > /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java, line 1392

        > <https://reviews.apache.org/r/3308/diff/2/?file=65853#file65853line1392>

        >

        > I think Future.get(long timeout, TimeUnit unit) should be used here so that we don't wait indefinitely.

        The flush would depend on the number of regions the Regions Server is hosting and ofcourse other system dependent factors. So how do we decide the timeouts. Secondly if the get times out then that RegionServer's WAL will not be rolled, as the logic to roll is on the client side currently. Could move this to server side so that roll happens after a flush ?

        • Akash

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3308/#review4116
        -----------------------------------------------------------

        On 2011-12-24 04:31:50, Akash Ashok wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/3308/

        -----------------------------------------------------------

        (Updated 2011-12-24 04:31:50)

        Review request for hbase.

        Summary

        -------

        Flush by RegionServer

        This addresses bug HBase-4224.

        https://issues.apache.org/jira/browse/HBase-4224

        Diffs

        -----

        /src/main/java/org/apache/hadoop/hbase/ServerName.java 1222902

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1222902

        /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1222902

        /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1222902

        Diff: https://reviews.apache.org/r/3308/diff

        Testing

        -------

        Thanks,

        Akash

        Show
        jiraposter@reviews.apache.org added a comment - On 2011-12-25 17:53:31, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java, line 1392 > < https://reviews.apache.org/r/3308/diff/2/?file=65853#file65853line1392 > > > I think Future.get(long timeout, TimeUnit unit) should be used here so that we don't wait indefinitely. The flush would depend on the number of regions the Regions Server is hosting and ofcourse other system dependent factors. So how do we decide the timeouts. Secondly if the get times out then that RegionServer's WAL will not be rolled, as the logic to roll is on the client side currently. Could move this to server side so that roll happens after a flush ? Akash ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3308/#review4116 ----------------------------------------------------------- On 2011-12-24 04:31:50, Akash Ashok wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3308/ ----------------------------------------------------------- (Updated 2011-12-24 04:31:50) Review request for hbase. Summary ------- Flush by RegionServer This addresses bug HBase-4224. https://issues.apache.org/jira/browse/HBase-4224 Diffs ----- /src/main/java/org/apache/hadoop/hbase/ServerName.java 1222902 /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1222902 /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1222902 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1222902 Diff: https://reviews.apache.org/r/3308/diff Testing ------- Thanks, Akash
        Hide
        Ted Yu added a comment -

        I searched 0.92 codebase for HBA.rollHLogWriter() and only found two references in test code.

        Show
        Ted Yu added a comment - I searched 0.92 codebase for HBA.rollHLogWriter() and only found two references in test code.
        Hide
        Akash Ashok added a comment -

        I was talking about the call to rollHLogWriter() from the HBA.flush() function. This call is made after the flush is successful. If we use Future.get(timeout). Then even if the flush is taking long time due to some reason it would time out and the log roll wouldn't happen. Should we move this to flush on the Region Server side so that even if the client doesn't want the roll happens after the flush finishes ?

        Show
        Akash Ashok added a comment - I was talking about the call to rollHLogWriter() from the HBA.flush() function. This call is made after the flush is successful. If we use Future.get(timeout). Then even if the flush is taking long time due to some reason it would time out and the log roll wouldn't happen. Should we move this to flush on the Region Server side so that even if the client doesn't want the roll happens after the flush finishes ?
        Hide
        Akash Ashok added a comment -

        I'm sorry It came out wrong. What I meant to ask is that
        "Should we move the call to Log Roll to the flush on the Region Server side so that even if the client times out the roll happens after the flush finishes ?"

        Show
        Akash Ashok added a comment - I'm sorry It came out wrong. What I meant to ask is that "Should we move the call to Log Roll to the flush on the Region Server side so that even if the client times out the roll happens after the flush finishes ?"
        Hide
        Ted Yu added a comment -

        @Akash:
        As Stack commented @ 04/Sep/11 03:22, we should be adding a new method to HRegionInterface.
        In the patch, you have:

              isRollWAL = regionServer.flushRegions(regions);
        

        I think your suggestion @ 02/Jan/12 02:55 resonates with Stack's comment. That is, the log roll should be carried out on region server side.

        Thanks

        Show
        Ted Yu added a comment - @Akash: As Stack commented @ 04/Sep/11 03:22, we should be adding a new method to HRegionInterface. In the patch, you have: isRollWAL = regionServer.flushRegions(regions); I think your suggestion @ 02/Jan/12 02:55 resonates with Stack's comment. That is, the log roll should be carried out on region server side. Thanks
        Hide
        Akash Ashok added a comment -

        Coming back to the timeout on Future.get(), which can be so unpredictable, I feel it makes sense not to have a timeout and just let it be completly async and not wait for the flush to finish.

        Show
        Akash Ashok added a comment - Coming back to the timeout on Future.get(), which can be so unpredictable, I feel it makes sense not to have a timeout and just let it be completly async and not wait for the flush to finish.
        Hide
        Ted Yu added a comment -

        If you read Stack's comment again, you should find that it may not be necessary to use Future.get() since you should have a new method in HRegionInterface.

        Show
        Ted Yu added a comment - If you read Stack's comment again, you should find that it may not be necessary to use Future.get() since you should have a new method in HRegionInterface.
        Hide
        Akash Ashok added a comment -

        Agreed Ted. But on the client side the method is still the same. Considering your concern regarding async flushes

        "For #1 above, I think we should keep the current behavior. People may have become dependent on the current behavior."

        This behavior would change for table flushes too. Earlier they were sync flushes, now they becomes async if we remove Future.get().

        This was the reason why I revisited this issue.

        Show
        Akash Ashok added a comment - Agreed Ted. But on the client side the method is still the same. Considering your concern regarding async flushes "For #1 above, I think we should keep the current behavior. People may have become dependent on the current behavior." This behavior would change for table flushes too. Earlier they were sync flushes, now they becomes async if we remove Future.get(). This was the reason why I revisited this issue.
        Hide
        Jean-Daniel Cryans added a comment -

        There's already a jira opened for the async flushes and friends, HBASE-4198.

        Sync flush in my experience is useless. On a big enough table it takes hours to run and by the time it's done data might have been inserted already and other flushes might have run. Plus, compactions are still async.

        Show
        Jean-Daniel Cryans added a comment - There's already a jira opened for the async flushes and friends, HBASE-4198 . Sync flush in my experience is useless. On a big enough table it takes hours to run and by the time it's done data might have been inserted already and other flushes might have run. Plus, compactions are still async.
        Hide
        Akash Ashok added a comment -

        JD can we make HBASE-4198 as sub-task of this as Its already included in the current patch ?

        Show
        Akash Ashok added a comment - JD can we make HBASE-4198 as sub-task of this as Its already included in the current patch ?
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2011-12-25 17:53:31, Ted Yu wrote:

        > /src/main/java/org/apache/hadoop/hbase/ServerName.java, line 277

        > <https://reviews.apache.org/r/3308/diff/2/?file=65852#file65852line277>

        >

        > I think we should perform stricter checking on hostname, without using DNS.

        > See http://regexlib.com/DisplayPatterns.aspx?cattabindex=1&categoryId=2&AspxAutoDetectCookieSupport=1

        I am assuming hostname can be either a DNS address or IPv4 address. Is IPv6 supported yet ?

        • Akash

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3308/#review4116
        -----------------------------------------------------------

        On 2011-12-24 04:31:50, Akash Ashok wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/3308/

        -----------------------------------------------------------

        (Updated 2011-12-24 04:31:50)

        Review request for hbase.

        Summary

        -------

        Flush by RegionServer

        This addresses bug HBase-4224.

        https://issues.apache.org/jira/browse/HBase-4224

        Diffs

        -----

        /src/main/java/org/apache/hadoop/hbase/ServerName.java 1222902

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1222902

        /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1222902

        /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1222902

        Diff: https://reviews.apache.org/r/3308/diff

        Testing

        -------

        Thanks,

        Akash

        Show
        jiraposter@reviews.apache.org added a comment - On 2011-12-25 17:53:31, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/ServerName.java, line 277 > < https://reviews.apache.org/r/3308/diff/2/?file=65852#file65852line277 > > > I think we should perform stricter checking on hostname, without using DNS. > See http://regexlib.com/DisplayPatterns.aspx?cattabindex=1&categoryId=2&AspxAutoDetectCookieSupport=1 I am assuming hostname can be either a DNS address or IPv4 address. Is IPv6 supported yet ? Akash ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3308/#review4116 ----------------------------------------------------------- On 2011-12-24 04:31:50, Akash Ashok wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3308/ ----------------------------------------------------------- (Updated 2011-12-24 04:31:50) Review request for hbase. Summary ------- Flush by RegionServer This addresses bug HBase-4224. https://issues.apache.org/jira/browse/HBase-4224 Diffs ----- /src/main/java/org/apache/hadoop/hbase/ServerName.java 1222902 /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1222902 /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1222902 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1222902 Diff: https://reviews.apache.org/r/3308/diff Testing ------- Thanks, Akash
        Hide
        Ted Yu added a comment -

        Advice to HBase users is to turn off IPv6. e.g. see thread entitled 'hbase master is not starting @ 60010 on new ubutnu 11.04 system' on user@hbase.

        You can leave a comment or TODO stating the need to revisit when IPv6 support is added.

        Show
        Ted Yu added a comment - Advice to HBase users is to turn off IPv6. e.g. see thread entitled 'hbase master is not starting @ 60010 on new ubutnu 11.04 system' on user@hbase. You can leave a comment or TODO stating the need to revisit when IPv6 support is added.
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3308/
        -----------------------------------------------------------

        (Updated 2012-01-06 18:48:11.359389)

        Review request for hbase.

        Changes
        -------

        Changes:
        1. Stricter hostname checking in ServerName.java
        2. Flushes made completly Async.
        3. WAL Roll moved to HRegionServer.java only for RegionServer flushes
        4. Executor Service with named ThreadFactory
        5. Dependency on Google Guava removed

        Summary
        -------

        Flush by RegionServer

        This addresses bug HBase-4224.
        https://issues.apache.org/jira/browse/HBase-4224

        Diffs (updated)


        /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1226330
        /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1226330
        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1226330
        /src/main/java/org/apache/hadoop/hbase/ServerName.java 1226330

        Diff: https://reviews.apache.org/r/3308/diff

        Testing
        -------

        Thanks,

        Akash

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3308/ ----------------------------------------------------------- (Updated 2012-01-06 18:48:11.359389) Review request for hbase. Changes ------- Changes: 1. Stricter hostname checking in ServerName.java 2. Flushes made completly Async. 3. WAL Roll moved to HRegionServer.java only for RegionServer flushes 4. Executor Service with named ThreadFactory 5. Dependency on Google Guava removed Summary ------- Flush by RegionServer This addresses bug HBase-4224. https://issues.apache.org/jira/browse/HBase-4224 Diffs (updated) /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1226330 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1226330 /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1226330 /src/main/java/org/apache/hadoop/hbase/ServerName.java 1226330 Diff: https://reviews.apache.org/r/3308/diff Testing ------- Thanks, Akash
        Hide
        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3308/#review4236
        -----------------------------------------------------------

        /src/main/java/org/apache/hadoop/hbase/ServerName.java
        <https://reviews.apache.org/r/3308/#comment9579>

        Please enclose the assignment on line 292 in curly braces.

        /src/main/java/org/apache/hadoop/hbase/ServerName.java
        <https://reviews.apache.org/r/3308/#comment9580>

        Since IPv4 support is built in, I suggest naming this method isValidHost.

        /src/main/java/org/apache/hadoop/hbase/ServerName.java
        <https://reviews.apache.org/r/3308/#comment9581>

        Why the extra line here ?

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9587>

        I think threadPool is a better name for this field.

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9582>

        I think Async is unnecessary here - that's what threads provide.

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9584>

        Why not call tableExists() directly ?

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9583>

        Should include the actual name passed in exception message.

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9585>

        Should read 'Exception parsing server name'

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9586>

        Since serverToRegionsMap has been created, you can return serverToRegionsMap here.

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
        <https://reviews.apache.org/r/3308/#comment9588>

        regions could be empty, right ?

        /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
        <https://reviews.apache.org/r/3308/#comment9589>

        This and flushAllRegions() are similar.
        Can we introduce just one new method which checks whether the list is empty to decide what to do ?
        i.e. move the check @ line 1403 of HBaseAdmin to the implementation of the new method.

        /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
        <https://reviews.apache.org/r/3308/#comment9590>

        Curly braces, please.

        /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
        <https://reviews.apache.org/r/3308/#comment9591>

        Do we need to place a try/catch block around line 2795 ?
        Currently the first failure would stop subsequent flushes.

        • Ted

        On 2012-01-06 18:48:11, Akash Ashok wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/3308/

        -----------------------------------------------------------

        (Updated 2012-01-06 18:48:11)

        Review request for hbase.

        Summary

        -------

        Flush by RegionServer

        This addresses bug HBase-4224.

        https://issues.apache.org/jira/browse/HBase-4224

        Diffs

        -----

        /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1226330

        /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1226330

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1226330

        /src/main/java/org/apache/hadoop/hbase/ServerName.java 1226330

        Diff: https://reviews.apache.org/r/3308/diff

        Testing

        -------

        Thanks,

        Akash

        Show
        jiraposter@reviews.apache.org added a comment - ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3308/#review4236 ----------------------------------------------------------- /src/main/java/org/apache/hadoop/hbase/ServerName.java < https://reviews.apache.org/r/3308/#comment9579 > Please enclose the assignment on line 292 in curly braces. /src/main/java/org/apache/hadoop/hbase/ServerName.java < https://reviews.apache.org/r/3308/#comment9580 > Since IPv4 support is built in, I suggest naming this method isValidHost. /src/main/java/org/apache/hadoop/hbase/ServerName.java < https://reviews.apache.org/r/3308/#comment9581 > Why the extra line here ? /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9587 > I think threadPool is a better name for this field. /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9582 > I think Async is unnecessary here - that's what threads provide. /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9584 > Why not call tableExists() directly ? /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9583 > Should include the actual name passed in exception message. /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9585 > Should read 'Exception parsing server name' /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9586 > Since serverToRegionsMap has been created, you can return serverToRegionsMap here. /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java < https://reviews.apache.org/r/3308/#comment9588 > regions could be empty, right ? /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java < https://reviews.apache.org/r/3308/#comment9589 > This and flushAllRegions() are similar. Can we introduce just one new method which checks whether the list is empty to decide what to do ? i.e. move the check @ line 1403 of HBaseAdmin to the implementation of the new method. /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java < https://reviews.apache.org/r/3308/#comment9590 > Curly braces, please. /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java < https://reviews.apache.org/r/3308/#comment9591 > Do we need to place a try/catch block around line 2795 ? Currently the first failure would stop subsequent flushes. Ted On 2012-01-06 18:48:11, Akash Ashok wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3308/ ----------------------------------------------------------- (Updated 2012-01-06 18:48:11) Review request for hbase. Summary ------- Flush by RegionServer This addresses bug HBase-4224. https://issues.apache.org/jira/browse/HBase-4224 Diffs ----- /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1226330 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1226330 /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1226330 /src/main/java/org/apache/hadoop/hbase/ServerName.java 1226330 Diff: https://reviews.apache.org/r/3308/diff Testing ------- Thanks, Akash
        Hide
        jiraposter@reviews.apache.org added a comment -

        On 2012-01-07 17:27:49, Ted Yu wrote:

        > /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java, line 2794

        > <https://reviews.apache.org/r/3308/diff/3/?file=66890#file66890line2794>

        >

        > Do we need to place a try/catch block around line 2795 ?

        > Currently the first failure would stop subsequent flushes.

        Ted this sounds good and we'd want to surround it with a try catch. But then we'd be masking those errors right?

        If I were the user and called flush through the shell, I'd not only wanna know that all flushes didn't go successful but I'd also want to know how many of them were unsuccessful and which region flushes failed due to what reason. So do you think it would be a good idea to return a status object which gives the client more information regarding the flushes ?

        On 2012-01-07 17:27:49, Ted Yu wrote:

        > /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java, line 486

        > <https://reviews.apache.org/r/3308/diff/3/?file=66889#file66889line486>

        >

        > This and flushAllRegions() are similar.

        > Can we introduce just one new method which checks whether the list is empty to decide what to do ?

        > i.e. move the check @ line 1403 of HBaseAdmin to the implementation of the new method.

        This is rather a flag defined on the client which means when the regions == null then u flush the entire RegionServer. IF we move this to the region server side then we'd be making this an API level definition. I think flushAllRegions is more intuitive as a RegionServer API than an API which flushes the Region Server when the regions parameter is null.

        Assume a new person wants to flushAllRegions. flush(List<HRegionInfo> regions) tells him that hey I can flush a list of regions and unless he looks at the implementation or the javadoc he'd miss the fact that this API can also flush all the regions. On the other hand flushAllRegions invites the person with open arms saying that hey i'll flush all the regions for you.

        On 2012-01-07 17:27:49, Ted Yu wrote:

        > /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java, line 1409

        > <https://reviews.apache.org/r/3308/diff/3/?file=66888#file66888line1409>

        >

        > regions could be empty, right ?

        The only condition when regions could be empty is when the name to be flushed is a RegionServer and not a single Region or table.

        This is based on the presumption that MetaReader.getTableRegionsAndLocations doesn't return null regions and isRegionsName is null then it never executes the flush.

        So this is fine ?

        • Akash

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3308/#review4236
        -----------------------------------------------------------

        On 2012-01-06 18:48:11, Akash Ashok wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/3308/

        -----------------------------------------------------------

        (Updated 2012-01-06 18:48:11)

        Review request for hbase.

        Summary

        -------

        Flush by RegionServer

        This addresses bug HBase-4224.

        https://issues.apache.org/jira/browse/HBase-4224

        Diffs

        -----

        /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1226330

        /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1226330

        /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1226330

        /src/main/java/org/apache/hadoop/hbase/ServerName.java 1226330

        Diff: https://reviews.apache.org/r/3308/diff

        Testing

        -------

        Thanks,

        Akash

        Show
        jiraposter@reviews.apache.org added a comment - On 2012-01-07 17:27:49, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java, line 2794 > < https://reviews.apache.org/r/3308/diff/3/?file=66890#file66890line2794 > > > Do we need to place a try/catch block around line 2795 ? > Currently the first failure would stop subsequent flushes. Ted this sounds good and we'd want to surround it with a try catch. But then we'd be masking those errors right? If I were the user and called flush through the shell, I'd not only wanna know that all flushes didn't go successful but I'd also want to know how many of them were unsuccessful and which region flushes failed due to what reason. So do you think it would be a good idea to return a status object which gives the client more information regarding the flushes ? On 2012-01-07 17:27:49, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java, line 486 > < https://reviews.apache.org/r/3308/diff/3/?file=66889#file66889line486 > > > This and flushAllRegions() are similar. > Can we introduce just one new method which checks whether the list is empty to decide what to do ? > i.e. move the check @ line 1403 of HBaseAdmin to the implementation of the new method. This is rather a flag defined on the client which means when the regions == null then u flush the entire RegionServer. IF we move this to the region server side then we'd be making this an API level definition. I think flushAllRegions is more intuitive as a RegionServer API than an API which flushes the Region Server when the regions parameter is null. Assume a new person wants to flushAllRegions. flush(List<HRegionInfo> regions) tells him that hey I can flush a list of regions and unless he looks at the implementation or the javadoc he'd miss the fact that this API can also flush all the regions. On the other hand flushAllRegions invites the person with open arms saying that hey i'll flush all the regions for you. On 2012-01-07 17:27:49, Ted Yu wrote: > /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java, line 1409 > < https://reviews.apache.org/r/3308/diff/3/?file=66888#file66888line1409 > > > regions could be empty, right ? The only condition when regions could be empty is when the name to be flushed is a RegionServer and not a single Region or table. This is based on the presumption that MetaReader.getTableRegionsAndLocations doesn't return null regions and isRegionsName is null then it never executes the flush. So this is fine ? Akash ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3308/#review4236 ----------------------------------------------------------- On 2012-01-06 18:48:11, Akash Ashok wrote: ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3308/ ----------------------------------------------------------- (Updated 2012-01-06 18:48:11) Review request for hbase. Summary ------- Flush by RegionServer This addresses bug HBase-4224. https://issues.apache.org/jira/browse/HBase-4224 Diffs ----- /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1226330 /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1226330 /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1226330 /src/main/java/org/apache/hadoop/hbase/ServerName.java 1226330 Diff: https://reviews.apache.org/r/3308/diff Testing ------- Thanks, Akash
        Hide
        Ted Yu added a comment -

        w.r.t. whether flushAllRegions() should be introduced into HRegionInterface. In your patch HBaseAdmin has this method:

        {ocde}
        private boolean flush(ServerName serverName, List<HRegionInfo> regions){ocde}

        which decides the scope of flushing by checking regions. I think this check can be performed on region server side.

        Your argument about javadoc makes sense. We should clearly document the functionality of the (single) new method added to HRegionInterface.

        But then we'd be masking those errors right?

        Please refer to HRegion.bulkLoadHFiles() for example of buffering multiple IOE's:

              if (ioes.size() != 0) {
                LOG.error("There were IO errors when checking if bulk load is ok.  " +
                    "throwing exception!");
                throw MultipleIOException.createIOException(ioes);
              }
        
        Show
        Ted Yu added a comment - w.r.t. whether flushAllRegions() should be introduced into HRegionInterface. In your patch HBaseAdmin has this method: {ocde} private boolean flush(ServerName serverName, List<HRegionInfo> regions){ocde} which decides the scope of flushing by checking regions. I think this check can be performed on region server side. Your argument about javadoc makes sense. We should clearly document the functionality of the (single) new method added to HRegionInterface. But then we'd be masking those errors right? Please refer to HRegion.bulkLoadHFiles() for example of buffering multiple IOE's: if (ioes.size() != 0) { LOG.error( "There were IO errors when checking if bulk load is ok. " + "throwing exception!" ); throw MultipleIOException.createIOException(ioes); }
        Hide
        Akash Ashok added a comment -

        "Your argument about javadoc makes sense. We should clearly document the functionality of the (single) new method added to HRegionInterface."

        What I meant to say was that it's easy enough to over look this functionality even with good amount of javadoc. I think it would make a lot of difference if we have this extra new method flushAllRegions().

        I am just trying to understand if you see any issue with this new extra API being added on the Region Server ?

        +1 on the MultipleIOException. Thanks . I like that class

        Show
        Akash Ashok added a comment - "Your argument about javadoc makes sense. We should clearly document the functionality of the (single) new method added to HRegionInterface." What I meant to say was that it's easy enough to over look this functionality even with good amount of javadoc. I think it would make a lot of difference if we have this extra new method flushAllRegions(). I am just trying to understand if you see any issue with this new extra API being added on the Region Server ? +1 on the MultipleIOException. Thanks . I like that class
        Hide
        Ted Yu added a comment -

        I am not strongly opposed to adding a new method.
        Comment from other committers is welcome.

        Show
        Ted Yu added a comment - I am not strongly opposed to adding a new method. Comment from other committers is welcome.
        Hide
        Harsh J added a comment -

        [Dropping by from the dev lists…, have not followed otherwise]

        I'd certainly like reading flushAllRegions() over flushRegions(null). Can we not also have it as a utility function in HRServer instead if HRI/f, if the interface changing is much to be worried about?

        Show
        Harsh J added a comment - [Dropping by from the dev lists…, have not followed otherwise] I'd certainly like reading flushAllRegions() over flushRegions(null). Can we not also have it as a utility function in HRServer instead if HRI/f, if the interface changing is much to be worried about?
        Hide
        Jean-Daniel Cryans added a comment -

        Not a fan of the patch, the first problem I see is that flushing a list of regions and keeping the handler open will trigger socket exceptions too easily when you have more than a handful of regions. Also it adds a lot of code.

        Unfortunately I don't have a better option in mind and the original problem seems sketchy to me. There's no easy way to tell when all flushes are done (as shown by the acrobatics of the latest patch) and currently flushes are done in sync with the client call. At this point I'm of the opinion that we should just:

        • add an async call to flush (that does what compact currently does).
        • add the functionality in HBA to request the flushing of all regions on one region server. It would call the RS x times and those calls would be quick since it just queues the flush request.

        You can then ask for the flush queue size through JMX (and I'm sure there are other means) so that when you are close to one you call for the log roll. This last step could also be manual or scripted.

        Show
        Jean-Daniel Cryans added a comment - Not a fan of the patch, the first problem I see is that flushing a list of regions and keeping the handler open will trigger socket exceptions too easily when you have more than a handful of regions. Also it adds a lot of code. Unfortunately I don't have a better option in mind and the original problem seems sketchy to me. There's no easy way to tell when all flushes are done (as shown by the acrobatics of the latest patch) and currently flushes are done in sync with the client call. At this point I'm of the opinion that we should just: add an async call to flush (that does what compact currently does). add the functionality in HBA to request the flushing of all regions on one region server. It would call the RS x times and those calls would be quick since it just queues the flush request. You can then ask for the flush queue size through JMX (and I'm sure there are other means) so that when you are close to one you call for the log roll. This last step could also be manual or scripted.
        Hide
        Cosmin Lehene added a comment -

        Stack, Akash Ashok, Jean-Daniel Cryans is this still valid?

        Show
        Cosmin Lehene added a comment - Stack , Akash Ashok , Jean-Daniel Cryans is this still valid?

          People

          • Assignee:
            Akash Ashok
            Reporter:
            stack
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:

              Development