Index: src/docbkx/troubleshooting.xml
===================================================================
--- src/docbkx/troubleshooting.xml (revision 1135734)
+++ src/docbkx/troubleshooting.xml (working copy)
@@ -442,6 +442,21 @@
On your clients, edit $HBASE_HOME/conf/log4j.properties and change this: log4j.logger.org.apache.hadoop.hbase=DEBUG to this: log4j.logger.org.apache.hadoop.hbase=INFO, or even log4j.logger.org.apache.hadoop.hbase=WARN.
+
+ Long Client Pauses With Compression
+ This is a fairly frequent question on the HBase dist-list. The scenario is that a client is typically inserting a lot of data into a
+ relatively un-optimized HBase cluster. Compression can exacerbate the pauses, although it is not the source of the problem.
+ See on the pattern for pre-creating regions and confirm that the table isn't starting with a single region.
+ See for cluster configuration, particularly hbase.hstore.blockingStoreFiles, hbase.hregion.memstore.block.multiplier,
+ MAX_FILESIZE (region size), and MEMSTORE_FLUSHSIZE.
+ A slightly longer explanation of why pauses can happen is as follows: Puts are sometimes blocked on the MemStores, which are blocked by the flusher thread, which in turn is blocked because there are
+ too many files to compact: the compactor is given too many small files and has to compact the same data repeatedly. This situation can occur even with minor compactions.
+ Compounding this situation, HBase doesn't compress data in memory. Thus, the 64MB that lives in the MemStore could become a 6MB file after compression, which results in a smaller StoreFile. The upside is that
+ more data is packed into the same region, but write performance comes from being able to write larger files, which is why HBase waits until the flush size is reached before writing a new StoreFile. Smaller StoreFiles also
+ become targets for compaction. Without compression the files are much bigger and don't need as much compaction; however, this comes at the expense of I/O.
+
+
+
@@ -586,6 +601,12 @@
See for other general information about ZooKeeper troubleshooting.
+
+ NotServingRegionException
+ This exception is "normal" when found in the RegionServer logs at DEBUG level. This exception is returned to the client,
+ and the client then goes back to .META. to find the new location of the moved region.
+ However, if the NotServingRegionException is logged at ERROR level, then the client ran out of retries and something is probably wrong.
+
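A rough worked example for the "Long Client Pauses With Compression" text in the first hunk may help reviewers. It is only a back-of-the-envelope sketch, not HBase code. The 64MB and 6MB figures come from the patch text; the blocking file count of 7 is an assumed value for hbase.hstore.blockingStoreFiles, and the class and method names are hypothetical. It also assumes no compaction completes between flushes.

```java
public class StoreFileEstimate {

    /** Total size in MB of `files` StoreFiles that are `fileMb` MB each. */
    public static long totalMb(long files, long fileMb) {
        return files * fileMb;
    }

    /**
     * Number of flushes that can still happen before a Store reaches the
     * blocking file count, assuming no compaction completes in between.
     */
    public static long flushesUntilBlocked(long blockingStoreFiles, long existingFiles) {
        return Math.max(0, blockingStoreFiles - existingFiles);
    }

    public static void main(String[] args) {
        long flushSizeMb = 64;       // MemStore size at flush time (from the patch text)
        long compressedFileMb = 6;   // resulting StoreFile size (from the patch text)
        long blockingStoreFiles = 7; // ASSUMED value of hbase.hstore.blockingStoreFiles

        long flushes = flushesUntilBlocked(blockingStoreFiles, 0);
        System.out.println("Flushes until blocked: " + flushes);
        System.out.println("Data flushed from memory: " + totalMb(flushes, flushSizeMb) + " MB");
        System.out.println("On disk with compression: " + totalMb(flushes, compressedFileMb) + " MB");
    }
}
```

Under these assumptions the Store reaches the blocking count after 448 MB of puts while holding only 42 MB on disk, because the threshold counts files rather than bytes. That is consistent with the patch's point that compression makes the pauses more visible without being their cause.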