HBASE-1364

[performance] Distributed splitting of regionserver commit logs

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.92.0
    • Component/s: Coprocessors
    • Labels: None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Adds distributed WAL log splitting in place of single-process, master-orchestrated splitting. The feature is ON by default (to disable, set hbase.master.distributed.log.splitting=false).
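      A minimal sketch of flipping that flag programmatically rather than in hbase-site.xml (this assumes only the standard HBaseConfiguration API; the property name is taken from the release note above):

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.hbase.HBaseConfiguration;

          public class DisableDistributedSplitting {
            public static void main(String[] args) {
              // Distributed log splitting is ON by default in 0.92.0; this
              // flag falls back to single-process master-side splitting.
              Configuration conf = HBaseConfiguration.create();
              conf.setBoolean("hbase.master.distributed.log.splitting", false);
              System.out.println(conf.getBoolean(
                  "hbase.master.distributed.log.splitting", true));
            }
          }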

      Description

      HBASE-1008 has some improvements to our log splitting on regionserver crash; but it needs to run even faster.

      (Below is from HBASE-1008)

      In the Bigtable paper, the split is distributed. If we're going to have 1000 logs, we need to distribute or at least multithread the splitting.

      1. As is, regions starting up expect to find one reconstruction log only. Need to make it so regions pick up a bunch of edit logs, and it should be fine that logs are elsewhere in HDFS, in an output directory written by all split participants, whether multithreaded or a mapreduce-like distributed process (let's write our distributed sort first as an MR job so we learn what's involved; the distributed sort, as much as possible, should use MR framework pieces). On startup, regions go to this directory and pick up the files written by split participants, deleting and clearing the dir when all have been read in. Making it so regions can take multiple logs as input can also make the split process more robust than the current tenuous process, which loses all edits if it doesn't make it to the end without error.
      2. Each column family rereads the reconstruction log to find its edits. Need to fix that. The split can sort the edits by column family so a store only reads its own edits (the sketch below shows the shape of that grouping).
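      A minimal sketch of the grouping idea in point 2, using a hypothetical Edit type as a stand-in for the real WAL classes (this is not the HBase API, only the shape of the sort):

          import java.util.ArrayList;
          import java.util.List;
          import java.util.Map;
          import java.util.TreeMap;

          public class GroupEditsByFamily {
            // Hypothetical stand-in for a WAL entry: region, family, payload.
            static class Edit {
              final String region, family;
              final byte[] kv;
              Edit(String region, String family, byte[] kv) {
                this.region = region; this.family = family; this.kv = kv;
              }
            }

            // One pass over the log buckets edits per (region, family), so a
            // Store replays only its own edits instead of rereading the log.
            static Map<String, Map<String, List<Edit>>> group(List<Edit> log) {
              Map<String, Map<String, List<Edit>>> out = new TreeMap<>();
              for (Edit e : log) {
                out.computeIfAbsent(e.region, r -> new TreeMap<>())
                   .computeIfAbsent(e.family, f -> new ArrayList<>())
                   .add(e);  // insertion order preserves sequence order
              }
              return out;
            }
          }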

      1. HBASE-1364.patch
        88 kB
        Alex Newman
      2. 1364-v5.txt
        162 kB
        stack
      3. org.apache.hadoop.hbase.master.TestDistributedLogSplitting-output.txt
        5.72 MB
        stack

        Issue Links

          Activity

          Jonathan Gray added a comment -

          Target for 0.21 so we make sure to talk about it at the hackathon.

          Jean-Daniel Cryans added a comment -

          The WAL situation currently looks like this:

          A region server has a default maximum of 30 log files.
          A log file is synced every 10 seconds or when it reaches 100 entries.
          A log file is rolled when it gets to 95% of a block size.

          What's happening during a log split is:
          Until all log files are split

          • 3 readers read one log each into memory
          • 3 writers output to different oldlogfile.log files

          That means that in the worst case 10 iterations will be done, a maximum of 182MB of data will be put into memory
          during the reading phase, and a maximum of 1824MB will pass through memory... probably getting us some GC pauses.
          Also, supposing that the failing RS had 100 regions and only 10 of them had edits, you hold off 90 regions for nothing. The only
          thing holding us back is that we don't have any knowledge about which regions are in which logs.
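          (The arithmetic checks out if we assume the default 64MB HDFS block: a log rolls at 95% of a block, roughly 60.8MB; 30 logs read 3 at a time means 10 iterations; 3 × 60.8MB ≈ 182MB resident per iteration; and 30 × 60.8MB ≈ 1824MB streamed through memory in total.)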

          One proposition:
          Make it so every hlog publishes its location in ZK under every region's folder and in its own folder (both new in 0.21).
          When a RS crashes, just reassign the whole thing right away.
          When a RS opens a new region, it should check if there's any hlog in the region's ZK folder.
          If so, it should require other RS to help it split the logs and watch the logs (something wise).
          Log splitting:

          • Take a lock on the hlog folder.
          • Split the log in as many files as there are regions and put the files in the region's folder in HDFS.
          • oldlogfile.log should now have a sequence id.
          • Finally delete the hlog znodes.

          When all logs are split for a region, the RS opening it reads the log files one by one (in order) and replays them.

          When a region server tries to log split and a lock is held, it should watch it.
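          A minimal sketch of the "take a lock on the hlog folder, watch it if held" steps, using the raw ZooKeeper client (the lock-path layout here is hypothetical, not what any patch on this issue uses):

              import org.apache.zookeeper.CreateMode;
              import org.apache.zookeeper.KeeperException;
              import org.apache.zookeeper.Watcher;
              import org.apache.zookeeper.ZooDefs;
              import org.apache.zookeeper.ZooKeeper;

              public class HLogFolderLock {
                // Try to lock a crashed server's hlog folder; if another RS
                // holds it, set a watch so we hear when the lock goes away.
                static boolean tryLock(ZooKeeper zk, String hlogDir, Watcher onRelease)
                    throws KeeperException, InterruptedException {
                  String lockPath = hlogDir + "/lock";  // hypothetical layout
                  try {
                    // Ephemeral: the lock vanishes if the splitting RS dies.
                    zk.create(lockPath, new byte[0],
                        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
                    return true;   // we own the split
                  } catch (KeeperException.NodeExistsException held) {
                    zk.exists(lockPath, onRelease);  // watch for release
                    return false;
                  }
                }
              }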

          Jean-Daniel Cryans added a comment -

          A similar new solution after discussing it at the hackathon.

          The map of regions in each HLog should be a file written in the same folder on HDFS as its hlog.
          When a region server crashes, the master does 2 things:

          • Read every map of regions, see which hlogs have which regions' edits, open
            all the regions, and pass a message to those that have edits to tell them where to look for them.
          • Start a distributed sorting. For every hlog, output a sorted file by region name with the index to seek to for each region.

          When a RS opens a region, it waits for all the sorted files to be available and replays the edits in order.
          The master deletes the hlog folders when all regions that had edits are opened.

          stack added a comment -

          I wonder if we could add the list of regions to the end of the hlog as metadata? There might be other useful metadata we'd want to add to an hlog. Currently the hlog is a SequenceFile. SequenceFile takes metadata but only at its head, which is no good to us. Maybe we can hack around an hfile-like thing to use here?
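          The constraint is visible in the Hadoop API: SequenceFile metadata can only be handed to the writer at create time, i.e. it lives at the head of the file, before the region list is known. A minimal sketch (the path and metadata key are hypothetical):

              import org.apache.hadoop.conf.Configuration;
              import org.apache.hadoop.fs.FileSystem;
              import org.apache.hadoop.fs.Path;
              import org.apache.hadoop.io.SequenceFile;
              import org.apache.hadoop.io.SequenceFile.CompressionType;
              import org.apache.hadoop.io.Text;
              import org.apache.hadoop.io.compress.DefaultCodec;

              public class HlogMetadataAtHead {
                public static void main(String[] args) throws Exception {
                  Configuration conf = new Configuration();
                  FileSystem fs = FileSystem.get(conf);
                  // Metadata must be supplied when the writer is created, so
                  // a region list would have to be known up front -- no good
                  // for a WAL whose region set only settles when it's rolled.
                  SequenceFile.Metadata meta = new SequenceFile.Metadata();
                  meta.set(new Text("regions"), new Text("r1,r2"));  // hypothetical key
                  SequenceFile.Writer w = SequenceFile.createWriter(fs, conf,
                      new Path("/hbase/.logs/example.hlog"), Text.class, Text.class,
                      CompressionType.NONE, new DefaultCodec(), null, meta);
                  w.close();
                }
              }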

          Erik Holstad added a comment -

          Yeah, it would be nice to have our own file format for this, but I thought JD said something about keeping two different files and not writing the affected regions in the same file, since we would append to two files rather than writing it out afterwards?

          stack added a comment -

          Part of the master rewrite would include running the distributed split. Let's coordinate.

          Jean-Daniel Cryans added a comment -

          My proposition optimizes for regions that aren't in any HLog file; mgilbert on IRC was mentioning that a RS failure took a long time to recover because many already-flushed edits were processed, since the files had un-flushed edits from other regions. We should probably also look out for that case.

          Jonathan Gray added a comment -

          JD, Stack. Where do we stand on this? We've started talks on this topic here at FB... seems like there's cross-over with work being done on replication?

          Jean-Daniel Cryans added a comment -

          I think the crossover could be about publishing which HLogs a RS is holding. Other than that I'm not sure about others.

          Jonathan Gray added a comment -

          Yeah, I was just thinking that the use of ZK could be very similar. Will mull back over previous comments and think more on this.

          stack added a comment -

          I was thinking that we could use a coprocessor to do the log sort/split. One thing that needs addressing is that each Store in a Region reads the whole log itself. Seems like reading of the old edits should be done once at the region level (I don't think this would be hard to do). Other things: in the BT paper, they talk of regions asking for edits after they come back online. First, the region is already online again before the log split starts – maybe it's already taking on writes? – and second, somehow the region knew which logs it wants to get edits from (recheck the paper; I'm pulling from rusty memory).

          Lars George added a comment -

          @stack: I had the same suggestion of using coprocessors back when we discussed this initially. Andrew said that it would not make sense because CPs can only access local data, not random DFS files, because of security concerns. I gave up on that point knowing I had much less insight into how things work. But there seems to be more interest now, apparently.

          The presorting of the log into pieces per region was also discussed here (or in a related issue); one idea was to use ZooKeeper to stick in the ranges to be processed, with RSs working on those in distributed fashion until they all report back that sorting has been done. Then the regions can read their edits locally. The other notion here (pretty sure it was JD) was an index file kept with the log that would give us knowledge of which regions are dirty, so unedited regions can go live right away. Starting to accept writes before the region is up seems like a doable technique as well. Not sure about the implications (say, what if they write with no WAL - or even with WAL for that matter, I guess - and you need to flush while the region is still in that "replay" state, etc.)

          I cannot recall the BT approach you refer to, though, i.e. the regions already being online while log splitting. It sounded like they do that distributed split and then have the regions read those edits. I need to read this again, it seems.

          But it seems like there is a lot of room for improvement there.

          Andrew Purtell added a comment -

          ZK 3.3.0 claims to have a worked distributed-queue recipe implementation. Hand out logs to regionservers to split via this primitive.

          Build the logic into core, no need for CPs here (and there are security problems with that anyway).

          Alex Newman added a comment -

          People have asked about my design so I thought I would give a quick status report and what I am doing.

          Design:
          When the HMaster calls splitLog, under the hood it is just enqueuing a znode as SEQUENTIAL PERSISTENT to build a quick queue. It then blocks until that queue is drained.
          RegionServers now have a LogSplitter thread, which is responsible for doing the leg work of splitting a log. They watch the znode queue, so when an update happens and they are not doing work, they will attempt to claim responsibility for working on that log.
          RegionServers take responsibility by writing an ephemeral node under a separate znode, which acts as a bnar

          Status:

          • I need to rejigger some stuff to handle an entire cluster losing power (kill -9)
          • More testing infrastructure is needed
          • I am thinking about incorporating HBASE-2437
          Alex Newman added a comment -

          Sorry, I got cut off before finishing that comment. Could someone please remove it?

          Alex Newman added a comment -

          People have asked about my design so I thought I would give a quick status report and what I am doing.
          Design:
          When the HMaster calls splitLog, under the hood it is just enqueuing a znode as SEQUENTIAL PERSISTENT to build a quick queue. It then blocks until that queue is drained.

          RegionServers now have a LogSplitter thread, which is responsible for doing the leg work of splitting a log. They watch the znode queue, so when an update happens and they are not doing work, they will attempt to claim responsibility for working on that log.

          RegionServers take responsibility by writing an ephemeral node under a separate znode, which acts as a lock, preventing other regionservers from attempting to split that same log. Technically this could still happen if a regionserver GCs for so long that its ephemeral node expires, in which case we rely on the fact that log splits are idempotent.
          The resultant per-region logs are named sn_oldlog.log, where sn is the first sequence number in that log. (Maybe it should be the last.) When it is done, I delete the queue entry and then the lock.
          The main remaining issue is that if the whole cluster goes down, the regionserver threads are not started up until they are associated with a master, so the LogSplitter never runs and the master stays blocked waiting for the queue to drain.
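          Roughly what that enqueue/claim dance looks like against the raw ZooKeeper client; the znode paths here are hypothetical stand-ins for whatever layout the patch's ZooKeeperWrapper actually uses:

              import java.util.List;
              import org.apache.zookeeper.CreateMode;
              import org.apache.zookeeper.KeeperException;
              import org.apache.zookeeper.ZooDefs;
              import org.apache.zookeeper.ZooKeeper;

              public class SplitLogQueue {
                // Master side: one task per hlog as a sequential persistent
                // child, e.g. /hbase/splitlog/task-0000000007.
                static String enqueue(ZooKeeper zk, String hlogPath)
                    throws KeeperException, InterruptedException {
                  return zk.create("/hbase/splitlog/task-", hlogPath.getBytes(),
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);
                }

                // RS side: claim the first unowned task by creating an ephemeral
                // lock node; losing the race just means trying the next task.
                static String claim(ZooKeeper zk)
                    throws KeeperException, InterruptedException {
                  List<String> tasks = zk.getChildren("/hbase/splitlog", true);
                  for (String task : tasks) {
                    try {
                      zk.create("/hbase/splitlog-locks/" + task, new byte[0],
                          ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
                      return task;  // ours; delete task znode, then lock, when done
                    } catch (KeeperException.NodeExistsException taken) {
                      // another regionserver got there first; try the next one
                    }
                  }
                  return null;  // nothing left to claim
                }
              }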
          Status:
          • I need to rejigger some stuff to handle an entire cluster losing power (kill -9)
          • More testing infrastructure is needed
          • I am thinking about incorporating HBASE-2437

          ryan rawson added a comment -

          This generally looks really good; sounds like ZK has really helped us a lot.

          Stack added a comment -

          Looks great


          Alex Newman added a comment -

          So I tested just launching the LogSplitter thread early, and it works just fine. It required putting more information into ZK. Basically, when the LogSplitter is finished splitting, it drops the path of the split it made into ZooKeeper, so the master can pick it up when they are all done.

          Stack, can you remove my earlier comments?

          stack added a comment -

          You want me to delete https://issues.apache.org/jira/browse/HBASE-1364?focusedCommentId=12864989&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12864989? Whats wrong w/ it? Ain't it better to have the painful history all there out in the open rather than have folks get the impression that you some kinda ultra-human who just makes solutions absent supposition, trial and error! (You can't remove it yourself? You are a contrib. Maybe only admin can remove stuff?)

          stack added a comment -

          Moved from 0.21 to 0.22 just after merge of old 0.20 branch into TRUNK.

          Todd Lipcon added a comment -

          Sorry, just finally got time to look at this one

          Some questions about the design:

          • What happens if the regionserver crashes in the middle of splitting a log? Who is monitoring its znode so that the log split gets re-enqueued?
          • You mention that log splitting is idempotent; however, it's not 100% true, since the last step of log splitting is to remove the old logs. How will you address this issue? e.g. RS A starts splitting, gets almost done, goes into a GC pause. RS B starts splitting. RS A wakes up and removes logs. RS B gets missing block errors, etc. (One possible guard is sketched below.)
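          One possible guard against that GC-pause race, sketched as an illustration (not something the patch is known to do): re-check, immediately before the destructive delete, that the ephemeral lock still belongs to our own session.

              import org.apache.zookeeper.KeeperException;
              import org.apache.zookeeper.ZooKeeper;
              import org.apache.zookeeper.data.Stat;

              public class FencedDelete {
                // If a GC pause expired our session, our ephemeral lock is gone
                // (or has been re-created by RS B under its own session), so we
                // must abort instead of deleting the old logs.
                static boolean stillOwnLock(ZooKeeper zk, String lockPath)
                    throws KeeperException, InterruptedException {
                  Stat stat = zk.exists(lockPath, false);
                  return stat != null
                      && stat.getEphemeralOwner() == zk.getSessionId();
                }
              }

          The check narrows the window rather than closing it (the session can still expire between the check and the delete); fully closing it would need fencing on the filesystem side, e.g. renaming logs out of the way before deletion.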
          Alex Newman added a comment -

          I think we need some more tests. Anyone got any good ideas on the whole multiple-simultaneous-splits-writing-to-the-same-region tests?

          Alex Newman added a comment -

          So I posted the first patch. I think we still need to do some magic in regards to multiple regionservers writing logs to the same region.

          Alex Newman added a comment -

          Vis-à-vis Todd's comments:

          • The node isn't deleted until after the log is completed. We use an ephemeral lock to prevent contention.
          • B is testable, so let's test it!

          Any other tests?

          Alex Newman added a comment -

          This allows multiple splitters to split to the same region.

          Alex Newman added a comment -

          So as I understand it, processServerShutdown always runs one split at a time due to locking issues. I would prefer that, if we relax this requirement, it be done in a separate JIRA. Any objections? What else should I do here to move things forward? I think the only things that need to be done are:

          • Add some more tests (suggestions?)
          • Switch all calls from splitLog to distributedSplitLog
          Todd Lipcon added a comment -

          How hard would it be to refactor the log splitting so that the algorithm to use is configurable? E.g. have a LogSplitter interface with MasterLogSplitter, DistributedLogSplitter, and perhaps a MapReduceLogSplitter if someone wants to write that one? (A sketch of that shape follows.)
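          The shape Todd is describing might look like the following stubs (names from his comment; not the committed API):

              import java.io.IOException;
              import java.util.List;
              import org.apache.hadoop.fs.Path;

              // Pluggable splitting strategy, chosen by configuration:
              // master-local, ZK-distributed, or a future MR-based splitter.
              interface LogSplitter {
                List<Path> splitLog(Path serverLogDir) throws IOException;
              }

              class MasterLogSplitter implements LogSplitter {
                public List<Path> splitLog(Path serverLogDir) throws IOException {
                  throw new UnsupportedOperationException("single-process split stub");
                }
              }

              class DistributedLogSplitter implements LogSplitter {
                public List<Path> splitLog(Path serverLogDir) throws IOException {
                  throw new UnsupportedOperationException("ZK-coordinated split stub");
                }
              }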

          ryan rawson added a comment -

          For this JIRA we should do the inline splitting, which means waiting on the distributed split to finish before going to the next ProcessServerShutdown item on the queue.

          It isn't just as simple as saying 'just make it parallel', since there is an order to these TODO items and there are a number of different types of work items on the queue. This should be addressed in the next JIRA.

          Alex Newman added a comment -

          Todd: not hard at all. I exposed it as a different API, but if we wish we could have it be conf-driven. I could see scenarios where the same cluster could use different types of splitting for different things *shrug*

          Ryan: I totally agree.

          Is there any reason why we are targeting 0.22 vs 0.21 for this patch?

          Todd Lipcon added a comment -

          Is there any reason why we are targeting 0.22 vs 0.21 for this patch?

          My personal opinion is that this should not be the default in 0.21, since it doesn't quite fit with the current priorities (durability, stability, correctness) and may actually introduce new complicated bugs in the above. The most carefully programmed distributed operation is usually buggier than the least carefully programmed single-node operation.

          Having it be configurable seems like a great thing for 0.21 so people can try it out, and if it seems bug-free we can make it default in the next release. Of course that's just my opinion, and if others disagree, happy to be outvoted.

          Alex Newman added a comment -

          Todd: sounds about right.

          Alex Newman added a comment -

          Going to put some work in to get this to reapply to trunk.

          HBase Review Board added a comment -

          Message from: "Alex Newman" <newalex@cloudera.com>

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.hbase.org/r/370/
          -----------------------------------------------------------

          Review request for hbase.

          Summary
          -------

          This builds on the previous work. It does some smarter stuff with testing, and splitting is now configurable.

          This addresses bug hbase-1364.
          http://issues.apache.org/jira/browse/hbase-1364

          Diffs


          src/main/java/org/apache/hadoop/hbase/HConstants.java c77ebf5
          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java f251d54
          src/main/java/org/apache/hadoop/hbase/regionserver/LogSplitter.java PRE-CREATION
          src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 5688c03
          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java 8225178
          src/main/resources/hbase-default.xml e3a9669
          src/test/java/org/apache/hadoop/hbase/regionserver/wal/BaseTestHLogSplit.java PRE-CREATION
          src/test/java/org/apache/hadoop/hbase/regionserver/wal/DistributedTestHLog.java PRE-CREATION
          src/test/java/org/apache/hadoop/hbase/regionserver/wal/DistributedTestHLogSplit.java PRE-CREATION
          src/test/java/org/apache/hadoop/hbase/regionserver/wal/DistributedTestHLogSplitSkipErrors.java PRE-CREATION
          src/test/java/org/apache/hadoop/hbase/regionserver/wal/DistributedTestLogRolling.java PRE-CREATION
          src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java ad8f9e5
          src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java 908633e
          src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplitSkipErrors.java PRE-CREATION
          src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogActionsListener.java 776d78c
          src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java 9eae4b4
          src/test/resources/hbase-site.xml 3c0601a

          Diff: http://review.hbase.org/r/370/diff

          Testing
          -------

          Ran on our private Hudson.

          Thanks,

          Alex

          HBase Review Board added a comment -

          Message from: "Jean-Daniel Cryans" <jdcryans@apache.org>

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.hbase.org/r/370/#review464
          -----------------------------------------------------------

          First pass on this patch. Lots of cleanup that needs to be done, and it's a bit hard to follow the flow of events without any clear documentation that gives an overview of distributed splitting. Nothing big, just some use cases that could be put in the class javadoc of LogSplitter?

          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          <http://review.hbase.org/r/370/#comment1903>

          I'm sure you have a good reason of putting that there, but at least one issue I'm seeing is that this code is also in init() (which will be run just after that) and it's almost the same thing.

          Also, fs.automatic.close is handled by the ShutdownHook class, you shouldn't be setting it.

          src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
          <http://review.hbase.org/r/370/#comment1904>

          Fix those long lines.

          src/main/java/org/apache/hadoop/hbase/regionserver/LogSplitter.java
          <http://review.hbase.org/r/370/#comment1905>

          rogue "q"

          src/main/java/org/apache/hadoop/hbase/regionserver/LogSplitter.java
          <http://review.hbase.org/r/370/#comment1914>

          Why are those static?

          src/main/java/org/apache/hadoop/hbase/regionserver/LogSplitter.java
          <http://review.hbase.org/r/370/#comment1906>

          remove that white space and all the others in that class at the same place

          src/main/java/org/apache/hadoop/hbase/regionserver/LogSplitter.java
          <http://review.hbase.org/r/370/#comment1922>

          both process and run call this method, can there be a race?

          src/main/java/org/apache/hadoop/hbase/regionserver/LogSplitter.java
          <http://review.hbase.org/r/370/#comment1923>

          don't need to declare this here

          src/main/java/org/apache/hadoop/hbase/regionserver/LogSplitter.java
          <http://review.hbase.org/r/370/#comment1907>

          What does that mean?

          src/main/java/org/apache/hadoop/hbase/regionserver/LogSplitter.java
          <http://review.hbase.org/r/370/#comment1918>

          why two lines?

          src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
          <http://review.hbase.org/r/370/#comment1908>

          rogue "b"

          src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
          <http://review.hbase.org/r/370/#comment1909>

          ?

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1917>

          confusing name when looking at what's returned, fix that

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1921>

          Why two lines for nodes? Also, if nodes is null for any reason, won't that throw an NPE?

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1924>

          most of that stuff can be removed and put into the

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1928>

          so you create a lock with data=null?

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1930>

          Or you were just disconnected, could mean a lot of things right?

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1915>

          JavaBean convention, don't start parameters' name with upper case

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1916>

          So we log here and we log in LogSplitter, remove one of them?

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1933>

          again, name confusing WRT returned type

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1931>

          same comment

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1934>

          don't start with upper case

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1938>

          Usually ppl check that the other way around

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1936>

          use HConstants.EMPTY_BYTE_ARRAY

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1940>

          third ERROR line if splitPath is null, keep only one

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1943>

          pull the next lines onto this one with a ternary operator

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1941>

          EMPTY_BYTE_ARRAY

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1942>

          Use Bytes.toBytes

          src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
          <http://review.hbase.org/r/370/#comment1944>

          Use Bytes.toString

          src/test/java/org/apache/hadoop/hbase/regionserver/wal/DistributedTestHLog.java
          <http://review.hbase.org/r/370/#comment1911>

          copy pasta, we're in 2010 now!

          src/test/java/org/apache/hadoop/hbase/regionserver/wal/DistributedTestHLog.java
          <http://review.hbase.org/r/370/#comment1910>

          clean

          src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java
          <http://review.hbase.org/r/370/#comment1912>

          white spaces...

          src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
          <http://review.hbase.org/r/370/#comment1913>

          clean

          • Jean-Daniel
          HBase Review Board added a comment -

          Message from: "Jonathan Gray" <jgray@apache.org>

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.hbase.org/r/370/#review467
          -----------------------------------------------------------

          Hey Alex, this is looking good. The master rewrite branch has a refactoring of ZooKeeperWrapper and general ZK usage inside HBase that conflicts with this pretty significantly.

          Do you think you could pull the new methods and classes nested in ZooKeeperWrapper into a separate class of static methods? If you need the instantiated instance of ZKW, pass it in as the first argument to the static methods? That will make my life WAY easier when I have to merge the branch back into trunk.

          Also gives an opportunity to have a class comment in the new class explaining the overall usage of zk.

          Stuff like the names of the nodes can be left in the instantiated ZKW class since it makes sense to pull those in from the confs on instantiation.

          Cool? Let me know if you want an example.

          • Jonathan
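          For illustration, the static-helper pattern Jonathan describes might look like this; ZKConnection is a hypothetical stand-in so as not to guess at ZooKeeperWrapper's real methods:

              import java.util.List;
              import org.apache.zookeeper.KeeperException;
              import org.apache.zookeeper.ZooKeeper;

              // Hypothetical stand-in for the instantiated wrapper; the real
              // ZooKeeperWrapper instance would be passed here instead.
              interface ZKConnection {
                ZooKeeper zk();
                String splitlogZNode();  // node name kept on the instance
              }

              final class ZKSplitLogStatics {
                private ZKSplitLogStatics() {}

                // The wrapper arrives as the first argument, so the helper
                // carries no state and merges cleanly with the master rewrite.
                static List<String> listSplitTasks(ZKConnection c)
                    throws KeeperException, InterruptedException {
                  return c.zk().getChildren(c.splitlogZNode(), false);
                }
              }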
          HBase Review Board added a comment -

          Message from: "Alex Newman" <newalex@cloudera.com>

          On 2010-07-23 13:41:09, Jonathan Gray wrote:

          > Hey Alex, this is looking good. The master rewrite branch has a refactoring of ZooKeeperWrapper and general ZK usage inside HBase that conflicts with this pretty significantly.

          >

          > Do you think you could pull the new methods and classes nested in ZooKeeperWrapper into a separate class of static methods? If you need the instantiated instance of ZKW, pass it in as the first argument to the static methods? That will make my life WAY easier when I have to merge the branch back into trunk.

          >

          > Also gives an opportunity to have a class comment in the new class explaining the overall usage of zk.

          >

          > Stuff like the names of the nodes can be left in the instantiated ZKW class since it makes sense to pull those in from the confs on instantiation.

          >

          > Cool? Let me know if you want an example.

          sounds good

          • Alex

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.hbase.org/r/370/#review467
          -----------------------------------------------------------

          HBase Review Board added a comment -

          Message from: "Alex Newman" <newalex@cloudera.com>

          On 2010-07-23 12:01:53, Jean-Daniel Cryans wrote:

          > First pass on this patch. Lots of cleanup that needs to be done, and it's a bit hard to follow the flow of events without any clear documentation that gives an overview of distributed splitting. Nothing big, just some use cases that could be put in the class javadoc of LogSplitter?

          I'll add the story and get these changes in.

          • Alex

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.hbase.org/r/370/#review464
          -----------------------------------------------------------

          Alex Newman added a comment -

          Updated patch, vis-à-vis https://review.cloudera.org/r/1472/
          Alex Newman added a comment -

          After rereading this JIRA, I realized I should probably add some additional javadoc to the patch. Shouldn't be too bad to find time sometime this week.

          Andrew Purtell added a comment -

          Cancelling patch, stale.

          Jonathan Gray added a comment -

          FYI, Prakash Khemani is working on this right now. Not sure when a patch will be up but it's looking good so far. It is built on top of the new ZK stuff in 0.90 and above.

          Alex Newman added a comment -

          Yeah, sorry about that. Are you taking a different approach?

          Jonathan Gray added a comment -

          The approach is similar, but based on the new master/rs interfaces, ZK classes, etc.

          Prakash Khemani added a comment -

          I just posted a not-yet-fully-done patch for review at https://review.cloudera.org/r/1655/ (For some reason it isn't getting automatically linked)

          stack added a comment -

          @Prakash I posted a review of about 50% of your patch over on rb. It looks great so far.

          stack added a comment -

          I did the top 50% or so again over on rb. Over there I write "...I'd be up for committing this sooner rather than later, especially if it's basically working for you. My thought is this is a big patch and it's critical functionality, so commit and get the rest of the community helping iron out bugs. Let us know when you think it's good to commit."

          FYI Prakash, you need to update here when you post a new patch, at least for the moment, because email notifications are not working (We'll be moving to Apache's review board instance some time soon).

          Ted Yu added a comment -

          I got the following from TestDistributedLogSplitting:

          testOrphanLogCreation(org.apache.hadoop.hbase.master.TestDistributedLogSplitting)  Time elapsed: 33.203 sec  <<< ERROR!
          java.lang.Exception: Unexpected exception, expected<org.apache.hadoop.hbase.regionserver.wal.OrphanHLogAfterSplitException> but was<java.lang.Error>
                  at org.junit.internal.runners.statements.ExpectException.evaluate(ExpectException.java:28)
                  at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
                  at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
                  at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
                  at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
                  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
                  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
                  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
                  at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
                  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
                  at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
                  at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:62)
                  at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:140)
                  at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:165)
                  at org.apache.maven.surefire.Surefire.run(Surefire.java:107)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                  at java.lang.reflect.Method.invoke(Method.java:597)
                  at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:289)
                  at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1005)
          Caused by: java.lang.Error: Unresolved compilation problem:
                  The method appendNoSync(HRegionInfo, byte[], WALEdit, long) is undefined for the type HLog
          
          Ted Yu added a comment -

          TestDistributedLogSplitting.testThreeRSAbort() consistently hangs on my laptop.

          Prakash Khemani added a comment -

          Posted a revised patch at https://review.cloudera.org/r/1655/

          Prakash Khemani added a comment -

          Updated patch at https://review.cloudera.org/r/1655/

          stack added a comment -

          org.apache.hadoop.hbase.master.TestDistributedLogSplitting.testWorkerAbort fails for me w/ latest patch:

          -------------------------------------------------------------------------------
          Test set: org.apache.hadoop.hbase.master.TestDistributedLogSplitting
          -------------------------------------------------------------------------------
          Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 182.303 sec <<< FAILURE!
          testWorkerAbort(org.apache.hadoop.hbase.master.TestDistributedLogSplitting)  Time elapsed: 60.731 sec  <<< FAILURE!
          java.lang.AssertionError: expected:<1> but was:<0>
                  at org.junit.Assert.fail(Assert.java:91)
                  at org.junit.Assert.failNotEquals(Assert.java:645)
                  at org.junit.Assert.assertEquals(Assert.java:126)
                  at org.junit.Assert.assertEquals(Assert.java:470)
                  at org.junit.Assert.assertEquals(Assert.java:454)
                  at org.apache.hadoop.hbase.master.TestDistributedLogSplitting.testWorkerAbort(TestDistributedLogSplitting.java:288)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                  at java.lang.reflect.Method.invoke(Method.java:616)
                  at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
                  at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
                  at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
                  at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
                  at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
                  at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
                  at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
                  at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
                  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
                  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
                  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
                  at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
                  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
                  at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
                  at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:62)
                  at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:140)
                  at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:165)
                  at org.apache.maven.surefire.Surefire.run(Surefire.java:107)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                  at java.lang.reflect.Method.invoke(Method.java:616)
                  at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:289)
                  at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1005)
          
          stack added a comment -

          Version of Prakash's patch that applies to TRUNK (there are a few rejects when I apply what's out on rb).

          stack added a comment -

          Did some review over on rb but not done yet. Will finish this evening.

          Prakash Khemani added a comment -

          TestDistributedLogSplitting.testWorkerAbort failed because the SplitLogWorker was incrementing tot_wkr_task_resigned twice. My test runs were passing because of a race: if the test happens to look at the counter between the two increments, it passes. Fixing this in the latest patch.
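          For illustration, a minimal self-contained sketch of this race (editor's illustration with hypothetical names and structure, not the actual SplitLogWorker code): one logical resignation bumps the counter twice, so an assertion that the counter equals 1 only holds if the checking thread happens to sample between the two increments.

              import java.util.concurrent.atomic.AtomicLong;

              public class DoubleIncrementRaceDemo {
                // counter name borrowed from the comment above
                static final AtomicLong totWkrTaskResigned = new AtomicLong();

                // Buggy worker path: a single resignation increments the counter twice.
                static void resign() {
                  totWkrTaskResigned.incrementAndGet();
                  // ... cleanup work happens here ...
                  totWkrTaskResigned.incrementAndGet();
                }

                public static void main(String[] args) throws InterruptedException {
                  Thread worker = new Thread(DoubleIncrementRaceDemo::resign);
                  worker.start();
                  // A test asserting the counter equals 1 passes only if this read
                  // lands between the two increments; otherwise it sees 0 or 2.
                  System.out.println("observed: " + totWkrTaskResigned.get());
                  worker.join();
                }
              }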

          stack added a comment -

          @Prakash I uploaded more review. See if it helps.

          Prakash Khemani added a comment -

          I uploaded a new diff at the review board https://review.cloudera.org/r/1655/

          I think it takes care of all of Stack's comments.

          Added a new test in TestHLogSplit to verify that when skip-errors is set to true, corrupted log files are ignored and correctly moved to the .corrupted directory.

          Some of the tests - especially in TestDistributedLogSplitting - are somewhat timing-dependent. For example, I will abort a few region servers and wait at most a few seconds for all those servers to go down. Sometimes it takes longer and the test fails. Last night I had to bump up the time limit in one such test (testThreeRSAbort()). I am sure these tests can be made more robust ....

          stack added a comment -

          Latest patch looks good but still fails TestDistributedLogSplitting, it seems. I'm on apache hbase trunk and applied your r6 patch from review board. I get:

          -------------------------------------------------------------------------------
          Test set: org.apache.hadoop.hbase.master.TestDistributedLogSplitting
          -------------------------------------------------------------------------------
          Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 164.652 sec <<< FAILURE!
          testWorkerAbort(org.apache.hadoop.hbase.master.TestDistributedLogSplitting)  Time elapsed: 48.77 sec  <<< FAILURE!
          java.lang.AssertionError: region server completed the split before aborting
                  at org.junit.Assert.fail(Assert.java:91)
                  at org.apache.hadoop.hbase.master.TestDistributedLogSplitting.testWorkerAbort(TestDistributedLogSplitting.java:291)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                  at java.lang.reflect.Method.invoke(Method.java:616)
                  at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
                  at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
                  at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
                  at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
                  at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
                  at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
                  at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
                  at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
                  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
                  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
                  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
                  at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
                  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
                  at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
                  at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:62)
                  at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:140)
                  at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:165)
                  at org.apache.maven.surefire.Surefire.run(Surefire.java:107)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                  at java.lang.reflect.Method.invoke(Method.java:616)
                  at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:289)
                  at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1005)
          

          I'll attach the log.

          If you want me to dig in too Prakash, just say and I'll have a go. Otherwise I'll defer to you since you know best what's in motion here.

          Meantime I'm going to try it out on a cluster. Good stuff Prakash!

          stack added a comment -

          Failed test output.

          stack added a comment -

          I tried it on a small cluster w/ loads of regions under load, > 1k regions each. It worked for me. It's great. Fast.

          Prakash Khemani added a comment -

          This is a problem with the test-case which I will fix.

          You got this error because of (1) not being able to interrupt the split-log-worker thread when it is doing dfs operations (I think the interrupt is swallowed somewhere) and (2) timing issues where, in the aborting region server, the filesystem and the zk session don't close before the split-log-worker thread completes its splitting task ...

          I will fix this by removing fail("region server completed the split before aborting") from the test case.
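          On point (1), the swallowed interrupt is a classic Java pitfall. A minimal sketch (editor's illustration, not the SplitLogWorker code) of how catching InterruptedException without restoring the flag makes a worker loop un-interruptible:

              public class SwallowedInterruptDemo {
                public static void main(String[] args) throws InterruptedException {
                  Thread worker = new Thread(() -> {
                    while (!Thread.currentThread().isInterrupted()) {
                      try {
                        Thread.sleep(100); // stand-in for a blocking dfs operation
                      } catch (InterruptedException e) {
                        // Catching the exception CLEARS the interrupt status, so the
                        // loop condition above never sees it and the worker runs on.
                        // The fix is to restore it: Thread.currentThread().interrupt();
                      }
                    }
                  });
                  worker.start();
                  Thread.sleep(50);      // let the worker block inside sleep()
                  worker.interrupt();    // swallowed by the catch block above
                  Thread.sleep(300);
                  System.out.println("worker still alive: " + worker.isAlive()); // true
                  System.exit(0);        // end the demo; the worker would spin forever
                }
              }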

          Prakash Khemani added a comment -

          'Fixed' TestDistributedLogSplitting.testWorkerAbort() by not letting the test fail if the aborting region server completes the split before it closes the dfs or zk session.

          Uploaded a new patch on rb.

          stack added a comment -

          This fails for me Prakash. Does it pass for you?

          Running org.apache.hadoop.hbase.regionserver.TestSplitLogWorker
          Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.107 sec <<< FAILURE!
          

          Here is the fail:

          -------------------------------------------------------------------------------
          Test set: org.apache.hadoop.hbase.regionserver.TestSplitLogWorker
          -------------------------------------------------------------------------------
          Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.107 sec <<< FAILURE!
          testRaceForTask(org.apache.hadoop.hbase.regionserver.TestSplitLogWorker)  Time elapsed: 0.182 sec  <<< FAILURE!
          java.lang.AssertionError: 
                  at org.junit.Assert.fail(Assert.java:91)
                  at org.junit.Assert.assertTrue(Assert.java:43)
                  at org.junit.Assert.assertTrue(Assert.java:54)
                  at org.apache.hadoop.hbase.regionserver.TestSplitLogWorker.waitForCounter(TestSplitLogWorker.java:75)
                  at org.apache.hadoop.hbase.regionserver.TestSplitLogWorker.testRaceForTask(TestSplitLogWorker.java:165)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                  at java.lang.reflect.Method.invoke(Method.java:616)
                  at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
                  at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
                  at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
                  at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
                  at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
                  at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
                  at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
                  at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
                  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
                  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
                  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
                  at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
                  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
                  at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
                  at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
                  at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
                  at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:62)
                  at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:140)
                  at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:165)
                  at org.apache.maven.surefire.Surefire.run(Surefire.java:107)
                  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
                  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                  at java.lang.reflect.Method.invoke(Method.java:616)
                  at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:289)
                  at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1005)
          

          Let me know if you want me to post the log.

          Good stuff.

          Prakash Khemani added a comment -

          Yes, it passes for me - just ran it again.

          This is another of those timing-related errors.

          164 waitForCounter(tot_wkr_task_acquired, 0, 1, 100);
          165 waitForCounter(tot_wkr_failed_to_grab_task_lost_race, 0, 1, 100);

          In your case the failure occurred when in line 165 the counter tot_wkr_failed_to_grab_task_lost_race did not change value from 0 to 1 in 100ms.

          Can you please increase the timeout in both these lines from 100ms to 1000ms and retry ...

          I will go over all my tests and try to improve them but I won't be able to get to that before the end of this week.
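          For context, a sketch of what a waitForCounter helper of this shape plausibly looks like (reconstructed from the quoted test lines and the assertTrue frames in the stack trace; the real TestSplitLogWorker code may differ). It makes clear why a 100ms bound can flake on a loaded machine:

              import java.util.concurrent.atomic.AtomicLong;

              public class WaitForCounterSketch {
                // Reconstruction of the helper implied by the quoted test lines.
                static void waitForCounter(AtomicLong ctr, long oldval, long newval,
                    long timems) throws InterruptedException {
                  long deadline = System.currentTimeMillis() + timems;
                  while (System.currentTimeMillis() < deadline) {
                    if (ctr.get() == newval) {
                      return;          // the counter moved within the allotted time
                    }
                    Thread.sleep(10);  // poll; oldval is the expected value meanwhile
                  }
                  // On a loaded machine the ZK callback that bumps the counter can
                  // take longer than 100ms, producing the flaky failure shown above.
                  throw new AssertionError("counter " + ctr.get() + " never reached " + newval);
                }

                public static void main(String[] args) throws InterruptedException {
                  AtomicLong c = new AtomicLong(0);
                  new Thread(() -> c.incrementAndGet()).start();
                  waitForCounter(c, 0, 1, 1000); // 1000ms is far less flaky than 100ms
                  System.out.println("counter reached 1");
                }
              }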

          stack added a comment -

          Hmm... I ran the test on mac and linux and now it passes. I'll up the timer on commit anyways.

          stack added a comment -

          Committed to TRUNK. Thanks for the sweet feature Prakash. If there are issues w/ tests subsequent to this commit, let's address them in separate issues (like I say, tests are passing for me on linux and macos).

          Jonathan Gray added a comment -

          Great work Prakash!

          mingjian added a comment -

          It's great work!
          But I can't pass the TestDistributedLogSplitting unit test. I used 1364-v5.txt.

          Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 159.419 sec <<< FAILURE!
          testWorkerAbort(org.apache.hadoop.hbase.master.TestDistributedLogSplitting) Time elapsed: 60.897 sec <<< FAILURE!
          java.lang.AssertionError: region server completed the split before aborting
          at org.junit.Assert.fail(Assert.java:91)
          at org.apache.hadoop.hbase.master.TestDistributedLogSplitting.testWorkerAbort(TestDistributedLogSplitting.java:289)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:616)

          stack added a comment -

          @mingjian Are you running HBase trunk? That test passes for me.

          mingjian added a comment -

          I used 0.90.3 + this patch, because I want to apply it to our system.

          mingjian added a comment -

          I saw the function "public void splitLog(final String serverName)":

                  try {
                    splitLogSize = splitLogManager.splitLogDistributed(logDir);
                  } catch (OrphanHLogAfterSplitException e) {
                  LOG.warn("Retrying distributed splitting for " +
                      serverName + " because of:", e);
                    splitLogManager.splitLogDistributed(logDir);
                  }
          

          The method "splitLogDistributed" blocks until all the log files of the given region server have been rocessed. In this way, it would spend much time if there are many regionserver dirs in the .logs dir.

          Why not put all regionserver log dirs in this method?

          Prakash Khemani added a comment -

          splitLog is called for one region server at a time - it is not called directly on the .logs directory. The servername passed is the name of the regionserver's log directory under .logs. It is called at startup by the function splitLogAfterStartup(). It is splitLogAfterStartup() that calls splitLog() for one regionserver at a time and that can take a long time.

          It is probably better to call splitLog() for all region servers simultaneously. A large number of splitlog tasks will get scheduled - one for each log file. But a splitlog-worker (region server) executes only one task at a time and there shouldn't be a danger of DFS overload.
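          A rough sketch of that idea (editor's illustration; the real change was tracked later, and the list-taking splitLogDistributed overload and method names here are assumptions): collect every regionserver directory under .logs and hand them to the split-log manager in one call, so per-file tasks are all queued in ZK at once and every live worker can grab one.

              import java.io.IOException;
              import java.util.ArrayList;
              import java.util.List;
              import org.apache.hadoop.fs.FileStatus;
              import org.apache.hadoop.fs.FileSystem;
              import org.apache.hadoop.fs.Path;
              import org.apache.hadoop.hbase.master.SplitLogManager;

              public class BulkLogSplitSketch {
                // Hypothetical bulk variant of splitLog(); the actual patch for
                // this may be structured differently.
                public static void splitAllLogs(FileSystem fs, Path rootDir,
                    SplitLogManager manager) throws IOException {
                  Path logsDir = new Path(rootDir, ".logs");
                  List<Path> serverLogDirs = new ArrayList<Path>();
                  for (FileStatus stat : fs.listStatus(logsDir)) {
                    if (stat.isDir()) {               // one dir per (dead) regionserver
                      serverLogDirs.add(stat.getPath());
                    }
                  }
                  // Assumed list-taking overload: queues one ZK task per log file
                  // across all directories, instead of blocking per regionserver.
                  manager.splitLogDistributed(serverLogDirs);
                }
              }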

          stack added a comment -

          @Prakash Should I open a new JIRA to schedule all logs for all regionservers? Good stuff.

          Prakash Khemani added a comment -

          Filed https://issues.apache.org/jira/browse/HBASE-3963. Will try to get this done.

          stack added a comment -

          @mingjian Since you are looking at the distributed code now, maybe you'd be up for having a go at HBASE-3963? Or at least posting a patch that you've tried, for Prakash and/or me to review? Thanks.

          mingjian added a comment -

          @stack & Prakash I will attach a patch in HBASE-3963.

            People

            • Assignee: Prakash Khemani
            • Reporter: stack
            • Votes: 1
            • Watchers: 16

            Time Tracking

            • Original Estimate: 2h
            • Remaining Estimate: 0h
            • Time Spent: 10h