HBASE-69

[hbase] Make cache flush triggering less simplistic

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: 0.2.0
    • Component/s: regionserver
    • Labels:
      None

      Description

      When the flusher runs – it is triggered when the sum of all Store memcaches in a Region exceeds a configurable max size – we flush all Stores, even though a Store's memcache might hold but a few bytes.

      I would think Stores should only dump their memcache to disk if they have some substance.

      The problem becomes more acute the more families you have in a Region.

      Possible behaviors would be to dump only the biggest Store, or only those Stores > 50% of the max memcache size. The behavior would vary depending on the prompt that provoked the flush. We would also log why the flush is running: optional, or > max size.

      This issue comes out of HADOOP-2621.
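The two selective-flush policies proposed above can be sketched as follows. This is an illustrative sketch only; the class and method names (`FlushPolicySketch`, `storesOverHalfMax`, `biggestStore`) are hypothetical, not HBase API.

```java
import java.util.*;

// Hypothetical sketch of the selective-flush policies proposed in the
// description. Given per-store memcache sizes, decide which stores are
// worth flushing instead of flushing every store in the region.
class FlushPolicySketch {

  // Policy 1: flush only stores whose memcache holds more than half
  // of the configured per-store max size.
  static List<String> storesOverHalfMax(Map<String, Long> memcacheSizes, long maxSize) {
    List<String> toFlush = new ArrayList<>();
    for (Map.Entry<String, Long> e : memcacheSizes.entrySet()) {
      if (e.getValue() > maxSize / 2) {
        toFlush.add(e.getKey());
      }
    }
    return toFlush;
  }

  // Policy 2: flush only the single biggest store.
  static String biggestStore(Map<String, Long> memcacheSizes) {
    return Collections.max(memcacheSizes.entrySet(),
        Map.Entry.comparingByValue()).getKey();
  }
}
```

Either policy avoids the pathological case the description complains about: a region with many families flushing near-empty memcaches on every trigger.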

      1. patch.txt
        183 kB
        Jim Kellerman
      2. patch.txt
        183 kB
        Jim Kellerman
      3. patch.txt
        168 kB
        Jim Kellerman
      4. patch.txt
        146 kB
        Jim Kellerman
      5. patch.txt
        144 kB
        Jim Kellerman
      6. patch.txt
        181 kB
        Jim Kellerman
      7. patch.txt
        180 kB
        Jim Kellerman
      8. patch.txt
        180 kB
        Jim Kellerman
      9. patch.txt
        179 kB
        Jim Kellerman
      10. patch.txt
        176 kB
        Jim Kellerman
      11. patch.txt
        171 kB
        Jim Kellerman
      12. patch.txt
        168 kB
        Jim Kellerman
      13. patch.txt
        169 kB
        Jim Kellerman
      14. patch.txt
        145 kB
        Jim Kellerman

          Activity

          stack added a comment -

          I'd suggest making a new patch that has in it the improvements that are contained in this patch, and let's commit that (probably in a new issue). Here are some comments:

          + +1 on FSUtils changes.
          + On migrations, would suggest that you emit logging of what step you are currently in: e.g. "moving to version 2". Will help should a step fail.
          + This is not a change made by you in Migrate, but I wonder whether making the path fully qualified works w/ our new hbase.rootdir format, where you need to specify the filesystem to use:

          -      Path rootdir =
          -        fs.makeQualified(new Path(this.conf.get(HConstants.HBASE_DIR)));
          +      rootdir = fs.makeQualified(new Path(this.conf.get(HConstants.HBASE_DIR)));
          

          + +1 on new names of methods in Migrate.java
          + +1 on IdentityTableReduce.java changes
          + What about the HSK changes? Should we add them? They get rid of TextSequence. Do you think this change is for the better?
          + What's going on here?

          -  final AtomicInteger activeScanners = new AtomicInteger(0);
          +  volatile AtomicInteger activeScanners = new AtomicInteger(0);
          

          Why make the reference to 'activeScanners' volatile rather than final? Once created as part of class creation, it's not going to change. Volatile doesn't affect the content of the AtomicInteger, does it?
          + +1 on change to TestMigrate.java. Fixes currently broken migration scripts (They are failing on hudson).

          Rest of the patch seems like all or nothing.
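stack's point can be illustrated with a minimal sketch (the `ScannerCounter` class below is illustrative, not HBase code): the field holding the AtomicInteger is assigned once at construction and never reassigned, so `final` is the right modifier. Atomicity of the counter's value comes from AtomicInteger itself; `volatile` on the field only affects visibility of the reference, which never changes.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: a final reference to an AtomicInteger is safe for
// concurrent use. The reference never changes; only the wrapped value does,
// and AtomicInteger guarantees the atomicity of those updates.
class ScannerCounter {
  private final AtomicInteger activeScanners = new AtomicInteger(0);

  int scannerOpened() { return activeScanners.incrementAndGet(); }
  int scannerClosed() { return activeScanners.decrementAndGet(); }
  int active()        { return activeScanners.get(); }
}
```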

          Jim Kellerman added a comment -

          Unfortunately this patch did not yield the expected results. It still OOMEs during a sequential write test (which I don't understand), and really isn't an improvement over trunk.

          Closing as "won't fix"

          Jim Kellerman added a comment -

          Latest patch. OOMEs during the PerformanceEvaluation sequential write test, which doesn't make sense, since the sequential write should have at most two regions open and ~128MB of data cached in memory. Since this patch cannot handle this simple case, I don't think it should be committed.

          We can salvage the changes to HMaster and Migrate since those are needed, but otherwise, I think this effort should be shut down and tossed.

          I have spent too much time on it and it does not bring enough to the table to be considered for inclusion.

          Jim Kellerman added a comment -

          Seems to actually work. Your mileage may vary.

          Jim Kellerman added a comment -

          Ok, I finally have a patch for this issue that works!

          It started at:
          2008-02-21 01:17:22,401 DEBUG [regionserver/0:0:0:0:0:0:0:0:8020] hbase.HRegionServer(1070): Done telling master we are up

          and ended with:

          2008-02-21 02:50:45,783 INFO [Thread-8] hbase.HRegionServer$ShutdownThread(158): Starting shutdown thread.
          2008-02-21 02:50:45,783 INFO [Thread-8] hbase.HRegionServer$ShutdownThread(163): Shutdown thread complete

          Billy Pearson added a comment -

          I will have to run the test again with the patch. I remove logs often, as the debug output takes up a lot of GB when running large jobs.

          Jim Kellerman added a comment -

          Billy,

          Clearly there is a problem. Can you post a bit more of that region server's log (before the blocking message)? Maybe that will point me in the right direction.

          Billy Pearson added a comment -

          I get these

          INFO org.apache.hadoop.hbase.HRegion: Blocking updates for 'IPC Server handler 21 on 60020': cache flushes requested 8 is >= max flush request count 8
          

          Then everything stops not long after a job starts, and no more processing happens; everything is blocked. I waited 3 hours to see if the optional flush would unblock it, but it does not seem to be happening.
          Might take a look at the memcache blocker again.

          stack added a comment -

          If memcache(s) are full, we should block updates. Otherwise, this patch is a step backward.

          Jim Kellerman added a comment -

          > stack - 12/Feb/08 09:07 AM
          > Why, when there is only one store, is blocking disabled? (What's to stop the update rate from overwhelming
          > the flusher in the one-store case?)

          If we block based on the number of outstanding cache flush requests (which is currently how it works for
          more than one store), then we cannot take updates while that single store is flushed, which didn't seem
          like a good idea.

          What I wanted to do was to put a policy in place for this patch that was "good enough" until more
          sophisticated policies (which balance the memory in the region server) were done in HBASE-70,
          which is going to address region server memory management.
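The interim rule Jim describes can be sketched as a single predicate (illustrative code only, not the actual patch; `shouldBlockUpdates` is a hypothetical name):

```java
// Illustrative sketch of the interim blocking policy described above:
// block updates when the number of stores with pending flush requests
// reaches storeCount - 1, and never block in the single-store case
// (otherwise updates could never proceed while that one store flushes).
class BlockingPolicySketch {
  static boolean shouldBlockUpdates(int storeCount, int pendingFlushRequests) {
    if (storeCount == 1) {
      return false; // single-store regions are never blocked
    }
    return pendingFlushRequests >= storeCount - 1;
  }
}
```

This also makes stack's objection concrete: in the one-store case nothing throttles the update rate if the flusher falls behind.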

          Jim Kellerman added a comment -

          > Billy Pearson - 12/Feb/08 08:32 AM
          > sorry my last post I am talking about while the cluster is busy. The hlogs have > 4 hours between them
          > that would mean that I am rolling the hlog many times after several optional memcache flushes while
          > jobs are running.

          If the hlogs have > 4 hours between them then you can only expect a garbage collection once every
          4 hours.

          > Might check to make sure the option flush is updating the sequence id even if there
          > is 0 entries and the flush is skipped. The problem could be from the code that removed the code before
          > I used to see 0 logs removed while having debug turned on and hlog rolling but I do not see these
          > messages after this patches.

          I can assure you that the sequence id is getting updated and flushed to the log with an optional cache
          flush, even with no entries.

          In this patch and in trunk, optional flushes are put on the flush queue just like requested flushes.
          (See HRegionServer$Flusher.run)

          When their queue entry triggers,

          • in trunk: HRegion.flushCache() is called.
          • in the patch: HRegion.flushCache(HStore) is called.

          They both end up in HRegion.internalFlushcache, which:

          • first obtains a sequence Id for the log.
          • calls HStore.flushCache(sequenceId)

          Even if the cache is not flushed, in HRegion.internalFlushcache, both trunk and the patch call
          HLog.completeCacheFlush which writes the new sequence id (for the region in the case of trunk
          or for the store in the case of the patch).

          However no log files are removed until the current log is rolled (closed and a new one opened).
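A toy model of the bookkeeping described above (all names illustrative, not the real HLog/HStore API): even an empty flush records a new sequence id for the store via completeCacheFlush, and that recorded id is what makes old log files collectible at the next roll.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of sequence-id bookkeeping: completeCacheFlush records the new
// sequence id for a store even when the flush wrote no entries. A log file
// can only be removed (at roll time) once every store has flushed past it.
class LogSketch {
  private long nextSequenceId = 0;
  private final Map<String, Long> flushedUpTo = new HashMap<>();

  long startCacheFlush() { return ++nextSequenceId; }

  void completeCacheFlush(String storeName, long sequenceId) {
    flushedUpTo.put(storeName, sequenceId); // recorded even for empty flushes
  }

  // A log whose highest edit is logMaxSeqId is removable only when every
  // store's flushed sequence id has reached it.
  boolean canRemoveLogUpTo(long logMaxSeqId) {
    return flushedUpTo.values().stream().allMatch(s -> s >= logMaxSeqId);
  }
}
```

This is why Billy still sees old hlogs pile up on an idle server: the sequence ids advance, but removal itself only happens when the log rolls.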

          Billy Pearson added a comment -

          OK, I am seeing the rolling messages now and old logs are getting removed. Going to update to the latest patch and try again.

          stack added a comment -

          Why, when there is only one store, is blocking disabled? (What's to stop the update rate from overwhelming the flusher in the one-store case?)

          Billy Pearson added a comment -

          Sorry, in my last post I was talking about while the cluster is busy. The hlogs have > 4 hours between them; that would mean that I am rolling the hlog many times after several optional memcache flushes while jobs are running.

          Might check to make sure the optional flush is updating the sequence id even if there are 0 entries and the flush is skipped.
          The problem could be from the code change: before, with debug turned on and hlog rolling, I used to see "0 logs removed" messages, but I do not see these messages after this patch.

          Jim Kellerman added a comment -

          > Billy Pearson - 12/Feb/08 01:56 AM
          > I think we are still missing something; when I talk about hlog build-up I mean there is never a removal of
          > the hlogs unless there is a split or a shutdown.

          As I explained previously, the only time HLogs are garbage collected is when either the log fills or
          the region server shuts down. Since there are not many updates coming in in the situation you
          described, the log will not fill, and consequently old logs will not be garbage collected.

          HBASE-440, will add optional log rolling, and log rolling when a region is closed in addition to the current
          log rolling when the log fills. Both of these new events will cause logs to be garbage collected. Region
          server shutdown will delete all the old logs.

          Billy Pearson added a comment - - edited

          Have not tried the newest patch yet for this problem.
          I think we are still missing something: when I talk about hlog build-up, I mean there is never a removal of the hlogs unless there is a split or a shutdown.

          I have over 106 hlogs on one server
          hbase.regionserver.maxlogentries = 250000
          The modification times of the oldest to the newest log files span much more than the 30 min optional flush time set in hbase-default.xml

          Jim Kellerman added a comment -
          • Update migration tool and change file system version number
          • FSUtils.checkVersion now returns the version string
          • A less simplistic update blocking mechanism, based on the number of stores and how many have requested cache flushes: block if the number of stores that have requested flushes >= number of stores - 1. If there is only one store, no blocking is done. A more sophisticated memory management system will be part of HBASE-70
          • HRegion now implements HStoreListener so it can intercept and count cache flush requests
          Jim Kellerman added a comment -

          > Billy Pearson - 11/Feb/08 08:59 PM
          > I am still seeing a hlog build up problem
          >
          > example I see these a lot after a job is done and the server are idle
          >
          > 2008-02-11 22:46:44,032 INFO org.apache.hadoop.hbase.HStore: Not flushing cache for
          > 519281761/anchor because it has 0 entries
          > 2008-02-11 22:46:44,032 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache
          > flush for store 519281761/anchor in 1ms, sequenceid=131598230
          >
          > I assume this is a optional flush which is good to have but if there is no entries can we update the
          > columns current sequence id for that column to our current max sequence id so we can remove the
          > old logs after the next hlog ?

          Yes, this is an optional cache flush. The column's max sequence id is updated after a cache flush:

              long sequenceId = log.startCacheFlush();
          ...
              this.log.completeCacheFlush(store.storeName, getTableDesc().getName(),
                  sequenceId);
          

          > What I am seeing is low to no updated columns never get a memcache flush so there sequence id
          > never changes (unless there is a split) and the old hlogs never get removed.

          Columns with low to no updates will only get an optional cache flush, which will set their sequence
          number. However, log files are not garbage collected until the log fills up and is rolled, which
          does not happen if the region server is idle. The only other time log files are cleaned up is when
          the region server shuts down.

          I have created issue HBASE-440 to add optional log rolling so that idle region servers will garbage collect old log files.

          Billy Pearson added a comment -

          I am still seeing an hlog build-up problem.

          For example, I see these a lot after a job is done and the servers are idle:

          2008-02-11 22:46:44,032 INFO org.apache.hadoop.hbase.HStore: Not flushing cache for 519281761/anchor because it has 0 entries
          2008-02-11 22:46:44,032 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 519281761/anchor in 1ms, sequenceid=131598230
          

          I assume this is an optional flush, which is good to have, but if there are no entries can we update the column's current sequence id to our current max sequence id, so we can remove the old logs after the next hlog roll?

          What I am seeing is that columns with low to no updates never get a memcache flush, so their sequence id never changes (unless there is a split), and the old hlogs never get removed.

          Jim Kellerman added a comment -

          > Billy Pearson - 11/Feb/08 02:00 PM
          > I get "Blocking updates" messages but no flushing happens, so everything goes on hold and no new
          > transactions happen; all my threads get blocked

          This logic is currently very unsophisticated. I'll try to come up with something better in the next patch. In the meantime, setting hbase.hregion.memcache.block.multiplier to (at least) the number of families should make things work.
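For example, for a table with three column families, the suggested workaround would look like this in hbase-site.xml (the value 3 is an assumed family count, not a recommendation from the patch):

```xml
<!-- Workaround from the comment above: set the block multiplier to at
     least the number of column families (here assumed to be 3). -->
<property>
  <name>hbase.hregion.memcache.block.multiplier</name>
  <value>3</value>
</property>
```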

          Billy Pearson added a comment -

          I get "Blocking updates" messages but no flushing happens, so everything goes on hold and no new transactions happen; all my threads get blocked.

          Jim Kellerman added a comment -

          This patch should address the issue with only one store being flushed.

          It contains fixes for TTMR and TTI

          It speeds up TestLogRolling by a factor of 3

          Cache sizes bubble up from HStore to HRegion, but aside from blocking updates, no other memory accounting is done.

          Please review and test.

          Jim Kellerman added a comment -

          Other things to do for this issue:

          • HStores need to inform region about cache sizes, etc so we can do region level memory usage accounting
          • Update migration tool so that it will check for unrecovered region server logs
          Jim Kellerman added a comment -

          Billy,

          Thanks for the feedback. Clearly there is a bug here.

          Billy Pearson added a comment -

          Shut down the region server before it could crash; the flushes of 2 other columns were in the 200MB range, and the flush size was set at 64MB.

          Billy Pearson added a comment -

          With the latest patch only the first column family is getting flushed. I have 3 heavily inserted columns and only one is getting flushed, and the region's memory usage is rising fast.

          Jim Kellerman added a comment -

          A new version of this patch that builds on, rather than ignores HBASE-138.

          What I'd like to see is:

          • a code review from stack
          • Billy Pearson, Bryan Duxbury and Stack run this patch through their bulk load tests to ensure that this patch is better than the current code.
          Billy Pearson added a comment -

          I think what is happening is: while a compaction is working on a column, any request to add a compaction to the queue is rejected, but after the compaction starts, new mapfiles may exceed the threshold of 3 map files. This leaves extra map files waiting for the next memcache flush to add a new compaction to the queue after the compaction completes on the region.

          I do not think this is much of a problem; there will be a memcache flush some time down the road, starting a new compaction.
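A toy model of the queue behavior described in this comment (illustrative names, not the actual HBase compaction code): requests for a store that is already queued or running are rejected, so files created during a long compaction accumulate until a later flush re-requests compaction.

```java
import java.util.*;

// Toy model: a compaction request for a store that is already queued or
// being compacted is rejected. Map files created during a long compaction
// therefore wait until the next memcache flush re-requests compaction.
class CompactionQueueSketch {
  private final Deque<String> queue = new ArrayDeque<>();
  private final Set<String> pending = new HashSet<>();

  // Returns false when the store is already queued or being compacted.
  boolean requestCompaction(String store) {
    if (!pending.add(store)) {
      return false;
    }
    queue.addLast(store);
    return true;
  }

  // Dequeues the next store; it stays "pending" until complete() is called,
  // so requests arriving mid-compaction are still rejected.
  String startNext() {
    return queue.pollFirst();
  }

  // After completion, a later flush may queue this store again.
  void complete(String store) {
    pending.remove(store);
  }
}
```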

          Billy Pearson added a comment - - edited

          Might check out the compaction queue. I had 1 out of 2 regions on one server split, the splits reloaded on a different server, and my job finished within a few minutes after the split.
          Only two more columns compacted after the split and the job finished. I now have column families that have more than 3 map files and no compaction happening.

          Not sure if the stop of updates coming in or the split mess up the compaction queue or the split cleared it out.
          Before the split happened there was several requested for compaction for the region that did not split.

          Billy Pearson added a comment -

          Example of the hlog problem mentioned above:

          Region server log

          2008-02-04 19:36:50,813 FATAL org.apache.hadoop.hbase.HRegionServer: unable to report to master for 34439 milliseconds - aborting server
          

          Master Server Log

          2008-02-04 19:47:45,456 DEBUG org.apache.hadoop.hbase.HLog: Applied 15820000 edits
          

          The file for the region
          oldlogfile.log = 3.46GB

          Billy Pearson added a comment -

          To answer the question above: yes, it works without the blocking.

          Billy Pearson added a comment -

          Currently flushes and compactions work well from what I can tell on my setup and tests.

          There are two areas I have concerns about and have not gotten a chance to test.

          1. hlogs: If I have a column family that only gets, say, 1 out of 100-250 updates, is that region going to hold up the removal of old hlogs while waiting for a flush from this column? If so, one column family could make a recovery take a long time if the region server fails. This is one of the reasons, besides memory usage, that I think we need to leave/add back the optional flusher to flush every 30-60 mins.

          2. Splits: If I have a large region split in two, compaction starts on the reload of the new splits. But say the columns take 50 mins to compact; if I get updates in those 50 mins that cause another split, will that fail if the region has not finished compacting all the region's reference files from the original split?

          Outside of the above concerns I have not noticed any bugs in the patch while flushing or compacting; all seems ok in that area.

          Jim Kellerman added a comment -

          Billy Pearson - 03/Feb/08 09:56 PM:
          > are we losing the hbase.hregion.memcache.block.multiplier option in this patch? Before I applied the
          > patch I was seeing blocking messages logged while flushes were happening, but not after I applied
          > the patch.

          Yes, in this version of the patch, blocking of updates was removed because all the memcache size accounting is done at the store level rather than the region level.

          What needs to be done is to bubble up store memcache size to the region level so that we can implement blocking based on the largest store memcache and not just the sum of the size of all the updates to the region.

          The question is, does this patch work without the blocking or does it need to be put back?
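If blocking were put back on top of store-level accounting, the bookkeeping Jim describes (bubbling the largest store memcache size up to the region) could look roughly like this. This is only an illustrative sketch, not code from the patch; the class and method names are invented:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical region-level accounting: each store reports its memcache
// size, and updates block when the *largest* store memcache exceeds
// blockMultiplier * flushSize (rather than the sum across all stores).
class RegionMemcacheAccounting {
    private final long flushSize;       // e.g. 16 * 1024 * 1024
    private final int blockMultiplier;  // cf. hbase.hregion.memcache.block.multiplier
    private final Map<String, Long> storeMemcacheSizes = new HashMap<>();

    RegionMemcacheAccounting(long flushSize, int blockMultiplier) {
        this.flushSize = flushSize;
        this.blockMultiplier = blockMultiplier;
    }

    // Called by a store after an update or a flush.
    void reportStoreSize(String store, long bytes) {
        storeMemcacheSizes.put(store, bytes);
    }

    // Block updates only when the largest store memcache is too big.
    boolean shouldBlockUpdates() {
        long largest = storeMemcacheSizes.values().stream()
                .mapToLong(Long::longValue).max().orElse(0L);
        return largest > (long) blockMultiplier * flushSize;
    }
}
```

With a 16MB flush size and a multiplier of 2, updates would block only once some single store's memcache passed 32MB, no matter how many lightly used families the region carries.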

          Billy Pearson added a comment -

          are we losing the hbase.hregion.memcache.block.multiplier option in this patch?

          Before I applied the patch I was seeing blocking messages logged while flushes were happening, but not after I applied the patch.

          Jim Kellerman added a comment -

          Integrates HADOOP-2731 with the changes for this issue.

          Will not submit to hudson until we have feedback from others who have tried this patch.

          Billy Pearson added a comment -

          Copied from HADOOP-2615

          I got a second idea on this to help with hot spots.

          If we could add a way to set a priority for compactions, this would help with hot-spot regions that build up too many map files from flushes.

          For example, if we have a region with 25 map files and one with 10:

          The region with 25 map files would have a priority of 25, and the one with 10 map files a priority of 10, so we would compact the region with 25 before the one with 10.

          If we could add/update the priority when we do a flush, then the compactor could work on the regions that need it the most, in order.
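Billy's priority scheme could be sketched roughly as an ordered compaction queue keyed by map-file count. This is an illustrative sketch only; the class and field names are invented for this example, not taken from any patch here:

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// A compaction request carries the region's current map-file count,
// which doubles as its priority: more files means more urgent.
class CompactionRequest {
    final String regionName;
    final int mapFileCount;

    CompactionRequest(String regionName, int mapFileCount) {
        this.regionName = regionName;
        this.mapFileCount = mapFileCount;
    }
}

// Requests are served highest map-file count first, so a region with
// 25 map files is compacted before one with 10.
class CompactionQueue {
    private final PriorityQueue<CompactionRequest> queue =
        new PriorityQueue<>(Comparator
            .comparingInt((CompactionRequest r) -> r.mapFileCount)
            .reversed());

    void request(CompactionRequest r) { queue.add(r); }

    CompactionRequest next() { return queue.poll(); }
}
```

A flush could re-enqueue the region with its updated map-file count, so the compactor always pulls whichever region currently needs it most.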

          Billy Pearson added a comment -

          Since HADOOP-2731 was committed, hbase has been much more stable in my tests. Flushes are working correctly now. It would be good if we could build off that and set the flusher to only flush columns that need to be flushed (have data to flush).

          I have been thinking about the optional flusher, and we might need to leave it in there to help keep memory usage lower on a region server. If we left it in, we would be able to remove from memory the regions that have gotten updates but have not reached the amount needed for a flush. This would be useful for users with 32-bit CPUs who are limited to 2000MB of memory usage, and users with limited memory. Maybe just leave it in there at 30 or 60 mins so it cleans up memory usage once an hour per region if needed. Setting it to 30-60 mins would not impact the compactor as much if we are only flushing region columns that have gotten updates.

          Jim Kellerman added a comment -

          Fixes NPE in loadHStoreFiles

          Jim Kellerman added a comment -

          This patch adds a compaction request when an HStore starts up if there are too many files or if the store has references.

          Billy Pearson added a comment -

          latest patch looks good on my end
          +1
          all flushes >= max_flush_size!

          Jim Kellerman added a comment -

          Latest revision of patch. Fixes some really dumb bugs.

          stack added a comment -

          On migration, one suggestion would be to up the version number on disk and then have the script check for log files. If any are found, warn the user that they need to revert their software to clean up the log files. If none are found, just up the version number.

          Also, regarding my comment above where I punt the accounting of Store-level memcaches to another issue – i.e. "Hard part is keeping account of all the memcaches in all the Stores on all the Regions on an HRS, but thats another issue." – I'm now thinking that's a mistake. My thinking is that Store-level memcaches will have us reporting a regionserver as overloaded – that is, that its memory is occupied – when in fact it's not.

          When memcaches are at the Region level, a single memcache is used by all Stores, no matter how many families. Making an accounting of allocated memory is just a case of summing Region memcaches.

          When memcaches are at the Store level, accounting becomes summing the memcache size times all the families in the region, only in general usage many of the Store memcaches won't be used at all. (The presumption is that it will be unusual for us to be updating all Stores during an upload, outside of the initial upload.)

          A lesser, related issue is that Store-level caches will inevitably be smaller. A Store that is being hammered will flush lots of small files. Lots of small files instead of a few big files seems to be harder on the compactor.
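The accounting difference stack describes can be made concrete with a toy calculation (illustrative names and numbers only, not from HBase):

```java
// Region-level accounting sums the actual per-region memcache sizes,
// while a worst-case store-level estimate charges flushSize for every
// family in every region, including families that hold almost nothing.
class AccountingExample {
    static long regionLevelUsage(long[] regionMemcacheSizes) {
        long sum = 0;
        for (long s : regionMemcacheSizes) {
            sum += s;
        }
        return sum;
    }

    // Worst case: every store assumed to be near its flush size.
    static long storeLevelWorstCase(long flushSize, int regions,
                                    int familiesPerRegion) {
        return flushSize * regions * familiesPerRegion;
    }
}
```

With a 16MB flush size, 10 regions, and 5 families each, the worst-case store-level estimate is 800MB, even though 10 region memcaches actually holding 16MB apiece only occupy 160MB, which is the overstatement stack is worried about.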

          Billy Pearson added a comment -

          All flushes after the above lines of log are around ~100K in size and happen as fast as the flush in front of them can finish.

          Billy Pearson added a comment -

          No, the table is all new with no data, so the split would not come for quite a while.
          Also, since I have been testing this issue's patches I have not had a successful first split while under load, but I figured I would wait until we get the flushes worked out and then see what was happening.

          Jim Kellerman added a comment -

          Hmmm... the last few cache flushes are real small. Looks like the region was about to split. Did the region split after these log entries?

          Jim Kellerman added a comment -

          The first 3 flushes look right on, but then they go nuts as you noted. Gives me a place to look. Thanks!

          Jim Kellerman added a comment -

          Billy,

          This was really useful input. Thank you! I am not sure why this is happening, but your log input gives me some clues as to where to look.

          -Jim

          Billy Pearson added a comment -

          Maybe adding some logging in all the spots where a flush can be called, recording why it is being called, would also help spot the problem.

          Billy Pearson added a comment -

          latest patch #5

          I get a ton of these per second after the first flush starts:

          2008-01-31 18:01:09,961 DEBUG org.apache.hadoop.hbase.HRegionServer: not queuing flush request because one is already in progress
          2008-01-31 18:01:09,961 DEBUG org.apache.hadoop.hbase.HRegionServer: not queuing compaction request because one is already in progress
          2008-01-31 18:01:09,961 DEBUG org.apache.hadoop.hbase.HRegionServer: not queuing flush request because one is already in queue
          2008-01-31 18:01:09,961 DEBUG org.apache.hadoop.hbase.HRegionServer: not queuing compaction because one is already in the queue
          2008-01-31 18:01:09,961 DEBUG org.apache.hadoop.hbase.HRegionServer: not queuing flush request because one is already in queue
          2008-01-31 18:01:09,962 DEBUG org.apache.hadoop.hbase.HRegionServer: not queuing compaction because one is already in the queue
          

          but it is still doing lots of small flushes back to back.
          Flush size is set to 16MB (16777216).
          I commented out the above logs from the code so I could get this part:

          2008-01-31 18:26:09,314 DEBUG org.apache.hadoop.hbase.HStore: Added 51984898/in_rank/7445623251870888179 with 151286 entries, sequence id 468191, and size 16.0m for 51984898/in_rank
          2008-01-31 18:26:09,315 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 51984898/in_rank in 5918ms, sequenceid=468191
          2008-01-31 18:26:09,368 DEBUG org.apache.hadoop.hbase.HLog: Closing current log writer hdfs://10.0.0.1:9000/gfs_storage/hadoop-root/hbase/log_10.0.0.2_1201825376852_60020/hlog.dat.015 to get a new one
          2008-01-31 18:26:09,374 INFO org.apache.hadoop.hbase.HLog: new log writer created at hdfs://10.0.0.1:9000/gfs_storage/hadoop-root/hbase/log_10.0.0.2_1201825376852_60020/hlog.dat.016
          2008-01-31 18:26:09,374 DEBUG org.apache.hadoop.hbase.HLog: Found 0 logs to remove using oldest outstanding seqnum of 0 from HStore 51984898/size
          2008-01-31 18:26:09,408 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201825394888 store 51984898/anchor
          2008-01-31 18:26:16,574 DEBUG org.apache.hadoop.hbase.HStore: Added 51984898/anchor/5033243542266094285 with 159021 entries, sequence id 492132, and size 16.4m for 51984898/anchor
          2008-01-31 18:26:16,574 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 51984898/anchor in 7166ms, sequenceid=492132
          2008-01-31 18:26:16,611 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201825394888 store 51984898/last_seen
          2008-01-31 18:26:17,223 INFO org.apache.hadoop.hbase.HRegionServer: Rolling hlog. Number of entries: 30018
          2008-01-31 18:26:22,469 DEBUG org.apache.hadoop.hbase.HStore: Added 51984898/last_seen/789487054095285345 with 167871 entries, sequence id 519368, and size 17.2m for 51984898/last_seen
          2008-01-31 18:26:22,470 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 51984898/last_seen in 5859ms, sequenceid=519368
          2008-01-31 18:26:22,520 DEBUG org.apache.hadoop.hbase.HLog: Closing current log writer hdfs://10.0.0.1:9000/gfs_storage/hadoop-root/hbase/log_10.0.0.2_1201825376852_60020/hlog.dat.016 to get a new one
          2008-01-31 18:26:22,526 INFO org.apache.hadoop.hbase.HLog: new log writer created at hdfs://10.0.0.1:9000/gfs_storage/hadoop-root/hbase/log_10.0.0.2_1201825376852_60020/hlog.dat.017
          2008-01-31 18:26:22,526 DEBUG org.apache.hadoop.hbase.HLog: Found 0 logs to remove using oldest outstanding seqnum of 0 from HStore 51984898/size
          2008-01-31 18:26:22,532 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201825394888 store 51984898/in_rank
          2008-01-31 18:26:23,643 DEBUG org.apache.hadoop.hbase.HStore: Added 51984898/in_rank/6607743928392488812 with 24913 entries, sequence id 545013, and size 2.7m for 51984898/in_rank
          2008-01-31 18:26:23,643 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 51984898/in_rank in 1111ms, sequenceid=545013
          2008-01-31 18:26:23,648 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201825394888 store 51984898/anchor
          2008-01-31 18:26:24,707 DEBUG org.apache.hadoop.hbase.HStore: Added 51984898/anchor/3932738492736374231 with 18919 entries, sequence id 550311, and size 1.9m for 51984898/anchor
          2008-01-31 18:26:24,707 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 51984898/anchor in 1059ms, sequenceid=550311
          2008-01-31 18:26:24,710 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201825394888 store 51984898/last_seen
          2008-01-31 18:26:25,352 DEBUG org.apache.hadoop.hbase.HStore: Added 51984898/last_seen/5324440789419095782 with 11724 entries, sequence id 555402, and size 1.1m for 51984898/last_seen
          2008-01-31 18:26:25,352 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 51984898/last_seen in 642ms, sequenceid=555402
          2008-01-31 18:26:25,355 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201825394888 store 51984898/in_rank
          2008-01-31 18:26:25,869 DEBUG org.apache.hadoop.hbase.HStore: Added 51984898/in_rank/2378178173153292440 with 4425 entries, sequence id 558549, and size 491.2k for 51984898/in_rank
          2008-01-31 18:26:25,870 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 51984898/in_rank in 515ms, sequenceid=558549
          

          Flushes after the first one are still way smaller than the set size.

          I think we should be looking at the process that calls the flush; it seems to be the part that goes crazy after the first flush, maybe some loop we are not getting out of or something along those lines.

          Billy Pearson added a comment -

          I checked the 4th patch above just to see if it fixed my problems, but it did not. Here is a little update on what I am seeing:

          The first flush after the server starts happens correctly.
          Then it flushes as fast as it can, over and over.
          But if I stop the job, the flushes stop at the same time, within a second.

          But once a job starts again, the flushes start almost immediately. It's like the flush size has been reset to 0 or something after the first flush.

          Jim Kellerman added a comment -

          Addresses Stack's issues.

          Jim Kellerman added a comment -

          RE: TextSequence. The ultimate solution is to either fix Text so that it is subclassable or to change HStoreKey so that the column family and column member are stored separately. The latter would require a massive upgrade as HStoreKeys are in all the MapFiles.

          I would prefer the former solution but doubt the community would support it because Text is used so heavily.

          Or I suppose a third solution would be to make our own Text that supports what Text does and adds TextSequence capability.

          Jim Kellerman added a comment -

          > stack commented on HADOOP-2636:
          > -------------------------------
          >
          > In HLogKey, was it just a case of a misnamed data member?
          > All along it was a store but we were calling it region? See below:
          >
          >

          > -  Text regionName = new Text();
          > +  Text storeName = new Text();
          > 

          No. Before flushes happened at the region level and we wrote the region name into the log.
          Moving flushes to the store level required writing the store name instead of the region name
          into the log so that log rolling and garbage collection would work properly.

          Note, however that this is an incompatible change, so before upgrading, all region servers
          should be shut down cleanly so that there are no logs to recover.

          > Can this string creation be avoided in HStore; e.g. can
          > storeName be Text?
          >
          >

          > +            || 
          > (key.getStoreName().toString().compareTo(storeName) != 
          > + 0)
          > 

          Done.

          > Logging below at INFO level seems inappropriate:
          >
          >

          > +      LOG.info("Not flushing cache for " + storeName +
          > +          " because it has 0 entries");
          > 

          Changed to debug level.

          > This kind of logging doesn't help (though I think this log is
          > just a line moved from elsewhere):
          >
          >

          > +            LOG.debug("nothing to compact for " + this.storeName);
          > 

          >
          > Should say why there is nothing to compact – e.g. only one
          > file present or holds references.

          Fixed.

          > Just remove rather than comment out?
          >
          >

          > -      HStoreKey rowKey = new HStoreKey(row, timestamp);
          > +/*      HStoreKey rowKey = new HStoreKey(row, timestamp); */
          > 

          No. This line is part of other code that needs to be enabled when a "get closest before" variant of MapFile.Reader.getClosest is implemented.

          > HStoreSize inner class is no longer needed because the check
          > is local to HStore where before it was higher up in HRegion?
          > The info HStoreSize carried is now all available in the
          > context where the check is being done?

          Yes.

          > Nice how you cleaned up lease-making/updating.

          Thanks.

          > Why make RowMap non-private? It's used by inner classes?

          Synthetic accessors showed up in Eclipse.

          > The below no longer makes use of TextSequences? Any reason
          > for that? (TS was a means of cutting down on object creation.
          > Profiling showed using TSs made a big difference).
          >

          > -      Text qualifier = HStoreKey.extractQualifier(col);
          > +      Text member = HStoreKey.extractMember(col);
          > 

          I just ran into too many instances where things broke because TextSequence is not a Text.
          Text should either use accessors when it reads the members of another Text, or it should
          just be declared final, because it cannot be safely subclassed as it stands now.
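          The subclassing problem is easier to see with a toy example. The following is a hypothetical sketch — Buf and BufSequence are invented names standing in for Text and TextSequence, not the real Hadoop code — of why a class that reads another instance's fields directly cannot be subclassed safely:

```java
// Hypothetical sketch (NOT the real org.apache.hadoop.io.Text source) of the
// subclassing problem: when a class reads another instance's fields directly
// instead of calling accessors, a subclass that keeps its bytes elsewhere
// (like a TextSequence-style view) silently breaks.
class Buf {
    protected byte[] bytes;
    protected int length;

    Buf(byte[] b) {
        this.bytes = b;
        this.length = b.length;
    }

    byte byteAt(int i) {
        return bytes[i];
    }

    // Reads other.length and other.bytes directly, bypassing any
    // overridden accessor a subclass might provide.
    boolean sameAs(Buf other) {
        if (this.length != other.length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (this.bytes[i] != other.bytes[i]) {
                return false;
            }
        }
        return true;
    }
}

// A zero-copy view over a slice of a larger buffer, like TextSequence.
class BufSequence extends Buf {
    private final byte[] backing;
    private final int off;

    BufSequence(byte[] backing, int off, int len) {
        super(new byte[len]);  // inherited array has the right size but never holds the data
        this.backing = backing;
        this.off = off;
    }

    @Override
    byte byteAt(int i) {
        return backing[off + i];
    }
}

public class TextSubclassProblem {
    public static void main(String[] args) {
        Buf abc = new Buf(new byte[] {'a', 'b', 'c'});
        Buf view = new BufSequence(new byte[] {'x', 'a', 'b', 'c', 'x'}, 1, 3);
        // Both logically hold "abc", but sameAs compares against the
        // zero-filled inherited array, so it prints false; had sameAs
        // called other.byteAt(i) instead, it would print true.
        System.out.println(abc.sameAs(view));  // prints false
    }
}
```

          Declaring Text final, or making it go through accessors, would each close this hole; the third option mentioned above (a new Text-like class) works around it instead.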

          stack added a comment -

          In HLogKey, was it just a case of a misnamed data member? All along it was a store but we were calling it region? See below:

          -  Text regionName = new Text();
          +  Text storeName = new Text();
          

          Can this string creation be avoided in HStore; e.g. can storeName be Text?

          +            || (key.getStoreName().toString().compareTo(storeName) != 0)
          

          Logging below at INFO level seems inappropriate:

          +      LOG.info("Not flushing cache for " + storeName +
          +          " because it has 0 entries");
          

          This kind of logging doesn't help (though I think this log is just a line moved from elsewhere):

          +            LOG.debug("nothing to compact for " + this.storeName);
          

          Should say why there is nothing to compact – e.g. only one file present or holds references.

          Just remove rather than comment out?

          -      HStoreKey rowKey = new HStoreKey(row, timestamp);
          +/*      HStoreKey rowKey = new HStoreKey(row, timestamp); */
          

          HStoreSize inner class is no longer needed because the check is local to HStore where before it was higher up in HRegion? The info HStoreSize carried is now all available in the context where the check is being done?

          Nice how you cleaned up lease-making/updating.

          Why make RowMap non-private? It's used by inner classes?

          The below no longer makes use of TextSequences? Any reason for that? (TS was a means of cutting down on object creation. Profiling showed using TSs made a big difference).

          -      Text qualifier = HStoreKey.extractQualifier(col);
          +      Text member = HStoreKey.extractMember(col);
          
          Billy Pearson added a comment -

          Works OK until the first flush; then they flush back to back.
          I have my flush size set to 64MB, and as you can see below, the flushes on the columns are in the low KBs and they are flushed several times a minute.

          2008-01-27 13:23:12,428 DEBUG org.apache.hadoop.hbase.HStore: Added 673880687/anchor/3002468228224142295 with 2198 entries, sequence id 2169440, and size 250.1k for 673880687/anchor
          2008-01-27 13:23:12,428 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 673880687/anchor in 503ms, sequenceid=2169440
          2008-01-27 13:23:12,429 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201461077996 store 673880687/in_rank
          2008-01-27 13:23:13,009 DEBUG org.apache.hadoop.hbase.HStore: Added 673880687/in_rank/8574408391914465288 with 1428 entries, sequence id 2171549, and size 158.0k for 673880687/in_rank
          2008-01-27 13:23:13,010 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 673880687/in_rank in 581ms, sequenceid=2171549
          2008-01-27 13:23:13,011 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201461077996 store 673880687/last_seen
          2008-01-27 13:23:13,603 DEBUG org.apache.hadoop.hbase.HStore: Added 673880687/last_seen/1870135185038371518 with 2736 entries, sequence id 2173487, and size 295.4k for 673880687/last_seen
          2008-01-27 13:23:13,603 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 673880687/last_seen in 592ms, sequenceid=2173487
          2008-01-27 13:23:13,609 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201461077996 store 673880687/anchor
          2008-01-27 13:23:14,179 DEBUG org.apache.hadoop.hbase.HStore: Added 673880687/anchor/580140819100995513 with 1989 entries, sequence id 2175620, and size 216.4k for 673880687/anchor
          2008-01-27 13:23:14,180 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 673880687/anchor in 572ms, sequenceid=2175620
          2008-01-27 13:23:14,181 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201461077996 store 673880687/in_rank
          2008-01-27 13:23:14,693 DEBUG org.apache.hadoop.hbase.HStore: Added 673880687/in_rank/5202798596377303202 with 1896 entries, sequence id 2177448, and size 213.7k for 673880687/in_rank
          2008-01-27 13:23:14,693 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 673880687/in_rank in 512ms, sequenceid=2177448
          2008-01-27 13:23:14,696 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201461077996 store 673880687/last_seen
          2008-01-27 13:23:15,260 DEBUG org.apache.hadoop.hbase.HStore: Added 673880687/last_seen/8482135742832457522 with 1940 entries, sequence id 2179528, and size 206.2k for 673880687/last_seen
          2008-01-27 13:23:15,260 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 673880687/last_seen in 564ms, sequenceid=2179528
          2008-01-27 13:23:15,276 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201461077996 store 673880687/anchor
          2008-01-27 13:23:15,839 DEBUG org.apache.hadoop.hbase.HStore: Added 673880687/anchor/7980708509699803761 with 1991 entries, sequence id 2181788, and size 204.6k for 673880687/anchor
          2008-01-27 13:23:15,839 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 673880687/anchor in 564ms, sequenceid=2181788
          2008-01-27 13:23:15,840 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201461077996 store 673880687/in_rank
          2008-01-27 13:23:16,423 DEBUG org.apache.hadoop.hbase.HStore: Added 673880687/in_rank/1305487102801331533 with 2150 entries, sequence id 2184079, and size 221.0k for 673880687/in_rank
          2008-01-27 13:23:16,423 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 673880687/in_rank in 583ms, sequenceid=2184079
          2008-01-27 13:23:16,425 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201461077996 store 673880687/last_seen
          2008-01-27 13:23:17,138 DEBUG org.apache.hadoop.hbase.HStore: Added 673880687/last_seen/1135861004256783910 with 2258 entries, sequence id 2186492, and size 221.2k for 673880687/last_seen
          2008-01-27 13:23:17,138 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 673880687/last_seen in 713ms, sequenceid=2186492
          2008-01-27 13:23:17,139 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201461077996 store 673880687/anchor
          2008-01-27 13:23:17,366 INFO org.apache.hadoop.hbase.HRegionServer: Rolling hlog. Number of entries: 30003
          2008-01-27 13:23:17,738 DEBUG org.apache.hadoop.hbase.HStore: Added 673880687/anchor/5010915044698800730 with 2232 entries, sequence id 2188716, and size 236.5k for 673880687/anchor
          2008-01-27 13:23:17,738 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 673880687/anchor in 599ms, sequenceid=2188716
          
          Jim Kellerman added a comment -

          This patch does not requeue flushes, compactions or splits if they are already in the queue.

          Jim Kellerman added a comment -

          Thanks for trying this out, Billy. As it turned out, the HStore could queue several requests to flush itself before the flusher could act on them. I have modified this so that if an HStore is already in the flush queue, it won't be entered a second time.

          The latest patch contains this fix.

          Additionally, I have rewritten Leases to use an event-driven rather than a polling model.

          If you get a chance, try it out.
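          The deduplication idea can be sketched roughly as follows. This is a hedged illustration, not the actual patch code — DedupFlushQueue, requestFlush, and takeNext are invented names — showing a flush queue that refuses to enqueue a store that is already pending:

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Illustrative sketch (hypothetical names, not the HBase patch itself) of a
// flush queue that drops duplicate requests: a busy store cannot pile up
// flush entries faster than the flusher thread drains them.
public class DedupFlushQueue {
    private final Queue<String> queue = new ArrayDeque<>();
    private final Set<String> queued = new HashSet<>();

    // Returns true only if the store was not already pending a flush.
    public synchronized boolean requestFlush(String storeName) {
        if (!queued.add(storeName)) {
            return false;  // already in the queue: drop the duplicate request
        }
        queue.add(storeName);
        return true;
    }

    // Called by the flusher thread; the store becomes eligible for
    // re-queuing only after it has been taken off the queue.
    public synchronized String takeNext() {
        String store = queue.poll();
        if (store != null) {
            queued.remove(store);
        }
        return store;
    }

    public static void main(String[] args) {
        DedupFlushQueue q = new DedupFlushQueue();
        System.out.println(q.requestFlush("673880687/anchor"));  // true
        System.out.println(q.requestFlush("673880687/anchor"));  // false: duplicate
        q.takeNext();                                            // flusher drains it
        System.out.println(q.requestFlush("673880687/anchor"));  // true again
    }
}
```

          The same pattern would apply to the compaction and split queues mentioned in the later patch note.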

          Billy Pearson added a comment - edited

          I tried your patch out above, and it is only flushing one column for me. I have three getting data, but I only see one flushing, and it is flushing back to back over and over.

          For example, here is a flush on the same column four times within one second:

          2008-01-26 22:40:14,137 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201405676281 store 332212182/in_rank
          2008-01-26 22:40:14,418 DEBUG org.apache.hadoop.hbase.HStore: Added 332212182/in_rank/1595847912559744983 with 70 entries, sequence id 1877650, and size 6.6k for 332212182/in_rank
          2008-01-26 22:40:14,418 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 332212182/in_rank in 281ms, sequenceid=1877650
          2008-01-26 22:40:14,436 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201405676281 store 332212182/in_rank
          2008-01-26 22:40:14,650 DEBUG org.apache.hadoop.hbase.HStore: Added 332212182/in_rank/3253290776281930479 with 6 entries, sequence id 1877667, and size 621.0 for 332212182/in_rank
          2008-01-26 22:40:14,650 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 332212182/in_rank in 214ms, sequenceid=1877667
          2008-01-26 22:40:14,682 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201405676281 store 332212182/in_rank
          2008-01-26 22:40:14,893 DEBUG org.apache.hadoop.hbase.HStore: Added 332212182/in_rank/6244850576092789885 with 5 entries, sequence id 1877683, and size 497.0 for 332212182/in_rank
          2008-01-26 22:40:14,894 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for store 332212182/in_rank in 212ms, sequenceid=1877683
          2008-01-26 22:40:14,941 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region webdata,,1201405676281 store 332212182/in_rank
          
          Jim Kellerman added a comment -

          This is the first patch that works locally. Not submitting, because there will undoubtedly be more updates checked in for the 0.16 release before 0.17 opens up.

          Jim Kellerman added a comment -

          > Why will optional cache flushing be removed? Will anything be done in its place?

          Optional cache flushing will be removed as it will no longer be necessary. Cache flushes will happen when the cache fills or when the region is closed. All the information is in the HLog anyway.

          > For kicks, you might log both memcache size - in bytes and count - and resultant log file size.

          +1

          stack added a comment -

          bq. [Copied from HADOOP-2674] ... size reported is the mapfile size. In HADOOP-2636, it will be the size of the memcache.

          For kicks, you might log both memcache size – in bytes and count – and resultant log file size.

          stack added a comment -

          bq. [Copied from HADOOP-2674] Optional cache flushing and zero length flushes will be removed in HADOOP-2636

          Why will optional cache flushing be removed? Will anything be done in its place?

          Billy Pearson added a comment -

          +1

          stack added a comment -

          +1 on doing things at store level.

          Hard part is keeping account of all the memcaches in all the Stores on all the Regions on an HRS, but that's another issue (smile).

          Jim Kellerman added a comment -

          Better yet, move triggering of cache flush to the store level instead of the region level. Same for compactions.

          Split still has to happen at the region level because it is the region that embodies the concept of row range. However, the split could be triggered by a single store reaching the split threshold.
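          The store-level triggering proposed here can be sketched as follows. This is an illustrative sketch under assumed names (StoreLevelFlushTrigger, recordEdit, storesToFlush are not the actual HBase classes): each store accounts for its own memcache size, and only stores over the threshold flush, instead of every store flushing when the region-wide sum crosses the max.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of per-store flush triggering: track memcache size
// per store and flush only the stores with real substance, rather than
// dumping every store's memcache when the region total crosses the limit.
public class StoreLevelFlushTrigger {
    static final long FLUSH_THRESHOLD = 64L * 1024 * 1024;  // e.g. a 64MB flush size

    private final Map<String, Long> memcacheSizes = new HashMap<>();

    // Called on every edit applied to a store's memcache.
    void recordEdit(String store, long bytes) {
        memcacheSizes.merge(store, bytes, Long::sum);
    }

    // Only stores whose memcache has crossed the threshold are flushed.
    List<String> storesToFlush() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : memcacheSizes.entrySet()) {
            if (e.getValue() >= FLUSH_THRESHOLD) {
                out.add(e.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        StoreLevelFlushTrigger region = new StoreLevelFlushTrigger();
        region.recordEdit("673880687/anchor", 70L * 1024 * 1024);  // 70MB of edits
        region.recordEdit("673880687/in_rank", 5L * 1024);         // only 5KB
        // Only the anchor store qualifies; in_rank keeps its few KB in memory.
        System.out.println(region.storesToFlush());
    }
}
```

          This addresses the acute case in the description: a region with many families no longer dumps near-empty memcaches to disk just because one sibling store filled up.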


            People

            • Assignee: Jim Kellerman
            • Reporter: stack
            • Votes: 0
            • Watchers: 0
