HBase
HBASE-616

" We slept XXXXXX ms, ten times longer than scheduled: 3000" happens frequently.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Just saw the below in a log... all in a row on the one server.

         2008-05-05 18:08:17,512 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 34557ms, ten times longer than scheduled: 3000
         2008-05-05 18:11:08,879 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 30576ms, ten times longer than scheduled: 3000
         2008-05-05 18:30:45,056 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 1091720ms, ten times longer than scheduled: 3000
         2008-05-05 18:30:45,056 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 1094209ms, ten times longer than scheduled: 10000
         2008-05-05 18:30:45,429 FATAL org.apache.hadoop.hbase.HRegionServer: unable to report to master for 1092093 milliseconds - aborting server
      

      We're seeing these kinds of outages pretty frequently. In the case above, it was a small cluster that was using TableReduce to insert. MR, HDFS, and HBase were all running on the same nodes.
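      As a rough illustration of where this warning comes from, here is a minimal, hypothetical sketch (not the actual org.apache.hadoop.hbase.util.Sleeper source) of a sleep loop that measures how long it actually slept and complains when the elapsed time reaches ten times the scheduled period:

         // Hypothetical sketch of the oversleep check described above; not the
         // real org.apache.hadoop.hbase.util.Sleeper implementation.
         public class OversleepCheck {
           static void sleepAndWarn(long periodMs) throws InterruptedException {
             long start = System.currentTimeMillis();
             Thread.sleep(periodMs);
             long slept = System.currentTimeMillis() - start;
             // A long stop-the-world GC pause or heavy swapping can make the
             // measured sleep far longer than the requested period.
             if (slept >= 10 * periodMs) {
               System.err.println("We slept " + slept + "ms, ten times longer than scheduled: " + periodMs);
             }
           }

           public static void main(String[] args) throws InterruptedException {
             sleepAndWarn(3000);
           }
         }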


          Activity

          stack added a comment -

          Resolving invalid, as a symptom of a stalled GC. Wiki has GC tuning steps to help minimize stalls.

          stack added a comment -

          We likely ain't going to fix what is seemingly a GC issue for 0.20.0. Improvements and suggested config. for GC we can stick up on wiki but not a fix I'd say for 0.20.0 timeframe. Moving out.

          Billy Pearson added a comment -

          I use -Xloggc:/tmp/gc.log on the region server, and when I see these messages there is always a Full GC happening at the same time.
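          For reference, a hedged sketch of how that GC logging might be enabled for the region server JVM; the HBASE_REGIONSERVER_OPTS variable in conf/hbase-env.sh is an assumption (older releases may only provide HBASE_OPTS), though the HotSpot flags themselves are standard:

             # Assumed to live in conf/hbase-env.sh; the variable name is a guess
             # (older releases may only have HBASE_OPTS), but the flags are
             # standard HotSpot options for logging GC activity.
             export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
               -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
               -Xloggc:/tmp/gc.log"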

          stack added a comment -

          This is looking to be a symptom of full GC'ing.

          stack added a comment -

          Moving out of 0.19.0. Can bring back in for 0.19.1 if we figure it meantime. Otherwise, 0.20.0.

          stack added a comment -

          Moved to 0.19 because we're starting to get a handle on this issue. Could be root of HBASE-900 too among other things (Lots of long GC'ing provoking OOME).

          stack added a comment -

          jgray ran his uploader. Oddly, all seemed to work fine and the upload seemed to have finished nicely but then the gc log hung like this:

          4702.183: [GC 723285K->644523K(903424K), 0.1020550 secs]
          4741.964: [GC 751595K->674674K(903488K), 0.1030280 secs]
          5386.336: [GC 781746K->698672K(903744K), 299.1547740 secs]
          5685.743: [Full GC
          

          It was hung after the emission of 'Full GC'.

          Eventually it cleared with the GC having been hung for 400 seconds:

          4643.716: [GC 693488K->616512K(897216K), 0.0775520 secs]
          4702.183: [GC 723285K->644523K(903424K), 0.1020550 secs]
          4741.964: [GC 751595K->674674K(903488K), 0.1030280 secs]
          5386.336: [GC 781746K->698672K(903744K), 299.1547740 secs]
          5685.743: [Full GC 698672K->694148K(903744K), 434.7056360 secs]
          6161.052: [Full GC 789962K->630359K(903744K), 23.9435320 secs]
          6197.831: [Full GC 737687K->478895K(903744K), 28.7396540 secs]
          6261.250: [GC 586222K->481646K(903744K), 0.0385590 secs]
          6305.213: [GC 588974K->481846K(905472K), 0.1706230 secs]
          6359.292: [GC 591478K->482681K(906048K), 0.0147330 secs]
          6469.725: [GC 592313K->482763K(907072K), 0.0059020 secs]
          ...
          

          Looking in Ganglia, the whole cluster had gone into heavy swap. Jon, looking at top etc., saw that the machines had loads of free memory. We could not explain the combination of swapping, low CPU, and free memory.

          Whatever the cause of the cluster-wide swapping, it seems to be hanging GC and upsetting HBase.

          Looking in the logs, nothing is really obvious, though HDFS and deletes are suspect (pset has reported deletes being problematic in the past, and so has streamy). I see a bunch of deleted commit files around that time, but looking at block deletes in the datanode, it seems to be deleting bunches of blocks pretty constantly over time. The datanodes at the time were also handling the startup of a new MR job which was opening loads of files.

          One thing to try is a rerun with delete of commit logs disabled.

          stack added a comment -

          It would be interesting to figure out whether a long-running GC was going on at the same time. If so, perhaps as suggested by Rong-en a while back, the concurrent low-pause collector might work better under load: http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html#icms.available_options.
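          For reference, a hedged sketch of what turning on the collector described at that link might look like for the region server JVM; the exact flag set and the HBASE_REGIONSERVER_OPTS variable are assumptions and should be checked against the GC tuning guide and the JVM in use:

             # -XX:+UseConcMarkSweepGC selects the concurrent low-pause (CMS) collector;
             # -XX:+CMSIncrementalMode enables the incremental "i-cms" variant from the
             # linked guide. Both are Java 6-era HotSpot flags; verify for your JVM.
             export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
               -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"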

          stack added a comment -

          Here's a log excerpt:

          2008-11-22 17:12:42,426 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_XX.XX.249.103_1227396990675_60020/hlog.dat.1227402762424
          2008-11-22 17:47:47,792 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 193824ms, ten times longer than scheduled: 3000
          2008-11-22 17:47:47,792 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 193115ms, ten times longer than scheduled: 10000
          2008-11-22 17:47:48,113 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to master for 194203 milliseconds - retrying
          2008-11-22 17:47:48,395 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_CALL_SERVER_STARTUP
          

          When it finally checks in, the master tells it to restart: MSG_CALL_SERVER_STARTUP.

          stack added a comment -

          Seeing this on the streamy test cluster (2-core machines). What's up with our JVM scheduler such that a thread can get no time for periods that are > 10X the sleep time? Is there something better than Thread.sleep that we should be using instead?
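          As an illustration only, not a proposed fix: a periodic task could be driven by a ScheduledExecutorService instead of a bare Thread.sleep loop, as sketched below. Note that neither approach helps when the whole JVM is stopped by a GC pause, since no thread gets to run; the class and task here are hypothetical.

             import java.util.concurrent.Executors;
             import java.util.concurrent.ScheduledExecutorService;
             import java.util.concurrent.TimeUnit;

             // Hypothetical sketch: schedule a periodic "report to master" task.
             // A stop-the-world GC pause still delays execution, because the
             // scheduler thread is frozen along with everything else.
             public class PeriodicReporter {
               public static void main(String[] args) {
                 ScheduledExecutorService scheduler =
                     Executors.newSingleThreadScheduledExecutor();
                 scheduler.scheduleAtFixedRate(new Runnable() {
                   public void run() {
                     System.out.println("report to master at " + System.currentTimeMillis());
                   }
                 }, 0, 3, TimeUnit.SECONDS);
               }
             }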

          Andrew Purtell added a comment -

          I think this issue is related to HBASE-15. DFS transaction timeouts or excessive sleeps are both indications of excessive system load.

          Billy Pearson added a comment -

          I am seeing the same stuff when one of my servers is very loaded, and I see it more often when the server is having to swap for memory. This was causing my region servers to lose their leases, so I had to extend the lease time in case one of these timeouts lasts longer than 30 sec.

          Also note, when this happens on a region server that is hosting the meta table, the master starts to hang up some too when trying to scan the meta table. It seems the only time I see these is when the server is under heavy load and maxed out on resources.

          stack added a comment -

          Watching logs on this 'weaker' machine, I can predict when I'll see the '... ten times longer ...' messages; they come in the middle of a compaction. Compacting sends the load way up, from about 1 to 4 or so, on these machines (they are running datanodes concurrently).

          stack added a comment -

          Looking at a log which has a bunch of these in it – the user has set the timeout to 3 minutes instead of the default one minute – they seem to happen when a compaction is running. It may be that compaction on weaker boxes starves other threads.


            People

            • Assignee: Unassigned
            • Reporter: stack
            • Votes: 1
            • Watchers: 2
