Index: src/docbkx/book.xml
===================================================================
--- src/docbkx/book.xml (revision 1038359)
+++ src/docbkx/book.xml (working copy)
@@ -299,7 +299,10 @@
 in branch-0.20-append to see list of patches involved. HBase bundles the Apache branch-0.20-append Hadoop. Replace the Hadoop jar bundled with HBase with that you have
- installed on your cluster to avoid version mismatch issues.
+ installed on your cluster to avoid version mismatch issues;
+ for example, versions of CDH do not have HDFS-724 whereas
+ Hadoop's branch-0.20-append branch does have HDFS-724. This
+ patch changes the RPC version because the protocol was changed.
ssh @@ -984,6 +987,27 @@
Recommended Configuations +
<varname>zookeeper.session.timeout</varname>
+ The default timeout is three minutes. This means that if a server crashes,
+ it will be three minutes before the Master notices the crash and starts recovery.
+ You might like to tune the timeout down to a minute or even less so the Master
+ notices failures sooner. Before changing this value, be sure you have
+ your JVM garbage collection configuration under control; otherwise, a long
+ garbage collection that lasts beyond the zookeeper session timeout will take out
+ your RegionServer (you might be fine with this -- you probably want recovery to start
+ on the server if a RegionServer has been in GC for a long period of time).
+
+ To change this configuration, edit hbase-site.xml,
+ copy the changed file around the cluster and restart.
+
+ We set this value high to save our having to field noob questions up on the mailing lists asking
+ why a RegionServer went down during a massive import. The usual cause is that their JVM is untuned and
+ they are running into long GC pauses. Our thinking is that
+ while users are getting familiar with HBase, we'd save them having to know all of its
+ intricacies. Later, when they've built some confidence, they can play
+ with configuration such as this.
+
+
Configuration for large memory machines