diff --git a/src/docbkx/book.xml b/src/docbkx/book.xml index 6248b19..1dff02c 100644 --- a/src/docbkx/book.xml +++ b/src/docbkx/book.xml @@ -286,8 +286,13 @@ Usually you'll want to use the latest version available except the problematic u
DNS -Basic name resolving must be working correctly on your cluster. - + HBase uses the local hostname to self-report its IP address. Both forward and reverse DNS resolution should work. + If your machine has multiple interfaces, HBase will use the interface that the primary hostname resolves to. + If this is insufficient, you can set hbase.regionserver.dns.interface to indicate the primary interface. + This only works if your cluster + configuration is consistent and every host has the same network interface configuration. + Another alternative is setting hbase.regionserver.dns.nameserver to choose a different nameserver than the + system-wide default.
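As a minimal sketch of the two settings described above, the hbase-site.xml entries might look like the following. The interface name eth0 and the nameserver ns1.example.com are placeholder assumptions, not values from this change; substitute whatever matches your hosts.

```xml
<?xml version="1.0"?>
<!-- Sketch only: both values below are assumed placeholders. -->
<configuration>
  <property>
    <name>hbase.regionserver.dns.interface</name>
    <!-- Assumed interface name; must exist under this name on every host. -->
    <value>eth0</value>
  </property>
  <property>
    <name>hbase.regionserver.dns.nameserver</name>
    <!-- Assumed nameserver; used instead of the system-wide default. -->
    <value>ns1.example.com</value>
  </property>
</configuration>
```

Setting the interface by name is why the text requires a consistent network configuration: a host where the interface is named differently would not match the value.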
NTP @@ -295,6 +300,7 @@ Usually you'll want to use the latest version available except the problematic u wild skew could generate odd behaviors. Run NTP on your cluster, or an equivalent. + If you are having problems querying data, or are seeing "weird" cluster operations, check the system time!
@@ -313,6 +319,27 @@ Usually you'll want to use the latest version available except the problematic u running the HBase process is an operating system configuration, not an HBase configuration. +
+ <varname>ulimit</varname> on Ubuntu + + If you are on Ubuntu you will need to make the following changes: + + In the file /etc/security/limits.conf add a line like: + hadoop - nofile 32768 + + Replace 'hadoop' with whatever user is running Hadoop and HBase. If you have + separate users, you will need two entries, one for each user. + + + In the file /etc/pam.d/common-session add as the last line in the file: + session required pam_limits.so + + Otherwise the changes in /etc/security/limits.conf won't be applied. + + + Don't forget to log out and back in again for the changes to take effect! + +
@@ -338,17 +365,49 @@ Usually you'll want to use the latest version available except the problematic u daemons run on a single server, and distributed, where each of the daemons runs on different cluster node.
Standalone HBase - TODO + + This is the default mode straight out of the box. HBase stores data on local disk, in + the temp directory. Many OSes will clear that directory on reboot, so this is not + appropriate for extensive testing. The quickstart + has instructions on how to relocate the storage directories. In standalone mode, + HBase does not use HDFS, and runs a local Zookeeper in the same JVM. + Zookeeper binds to a well-known port so clients may talk to HBase. +
Pseudo-distributed - TODO + + In pseudo-distributed mode, a full Hadoop stack is run, including HDFS, + Zookeeper and HBase in separate JVM instances on one node. +
Distributed - TODO + In distributed mode, the different components run on different nodes; + this is the mode to use for an actual deployment.
Client configuration and dependencies connecting to an HBase cluster - TODO + + Since the HBase master may move around, clients bootstrap from Zookeeper. Thus clients + require the Zookeeper quorum information in an hbase-site.xml that + is on their classpath. If you are configuring an IDE to run an HBase client, you should + include the conf/ directory on your classpath. + + + An example basic hbase-site.xml for client only: + + + + + hbase.zookeeper.quorum + example1,example2,example3 + Comma-separated list of servers in the Zookeeper quorum. + + + +]]> + +
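Written out in full, a client-only hbase-site.xml along the lines of the example above might look like the sketch below. The hostnames example1, example2 and example3 come from the example itself; the surrounding element layout is assumed to follow the usual Hadoop-style configuration file format rather than taken from this change.

```xml
<?xml version="1.0"?>
<!-- Sketch of a client-only hbase-site.xml; place it in a directory
     that is on the client's classpath (for example conf/). -->
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <!-- Placeholder hostnames from the example; list your own quorum members. -->
    <value>example1,example2,example3</value>
    <description>Comma-separated list of servers in the Zookeeper quorum.</description>
  </property>
</configuration>
```

Only the quorum is needed here because the client learns the location of the master from Zookeeper.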
Example Configurations