2.4. The Important Configurations

Below we list the important configurations. We've divided this section into required configurations and worth-a-look recommended configurations.

2.4.1. Required Configurations

Here are some configurations you must set to suit your deployment.

2.4.1.1. ulimit

HBase is a database; it uses a lot of files at the same time. The default ulimit -n of 1024 on *nix systems is insufficient. Any significant amount of loading will lead you to the FAQ entry: Why do I see "java.io.IOException...(Too many open files)" in my logs? You will also notice errors like:

      2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
      2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-6935524980745310745_1391901
      

Do yourself a favor and raise the upper bound on the number of file descriptors. Set it to north of 10k. See the above-referenced FAQ for how.

To be clear, upping the file descriptors for the user who is running the HBase process is an operating system configuration, not an HBase configuration.
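
As a quick sanity check, the limit can be inspected from the shell. The sketch below is illustrative only: it must be run as the user that starts the HBase process, the helper name nofile_ok is made up for this example, and the 10240 threshold is just one reading of "north of 10k".

```shell
#!/bin/sh
# Assumed minimum; the text only says "north of 10k".
MIN_NOFILE=10240

# nofile_ok LIMIT [MIN]: succeeds when LIMIT is "unlimited" or at least MIN.
# (Hypothetical helper, not part of HBase.)
nofile_ok() {
  limit="$1"
  min="${2:-$MIN_NOFILE}"
  if [ "$limit" = "unlimited" ]; then
    return 0
  fi
  [ "$limit" -ge "$min" ]
}

# ulimit -n reports the open-file limit of the current shell and its children.
current=$(ulimit -n)
if nofile_ok "$current"; then
  echo "ok: open-file limit is $current"
else
  echo "too low: open-file limit is $current, want at least $MIN_NOFILE"
fi
```

On many Linux distributions the limit itself is raised through /etc/security/limits.conf rather than from the shell; the FAQ covers the details.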

2.4.1.2. dfs.datanode.max.xcievers

Hadoop HDFS has an upper bound on the number of files that it will serve at any one time, called xcievers (yes, this is misspelled). Again, before doing any loading, make sure you have configured Hadoop's conf/hdfs-site.xml, setting the xcievers value to at least the following:

      <property>
        <name>dfs.datanode.max.xcievers</name>
        <value>2047</value>
      </property>
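
To confirm what a given DataNode is actually configured with, something like the following sketch could be used. It assumes the property is laid out with name and value on separate lines, as shown above; the function names xcievers_value and check_xcievers are invented for this example, and the path to hdfs-site.xml is supplied by you.

```shell
#!/bin/sh
# xcievers_value FILE: print the value configured for dfs.datanode.max.xcievers.
# Assumes <name> and <value> are on separate lines, with <value> following <name>.
xcievers_value() {
  awk '
    /<name>dfs\.datanode\.max\.xcievers<\/name>/ { found = 1; next }
    found && /<value>/ {
      gsub(/.*<value>|<\/value>.*/, "")   # strip everything around the number
      print
      exit
    }
  ' "$1"
}

# check_xcievers FILE: succeed when the configured value is at least 2047.
check_xcievers() {
  conf="$1"
  value=""
  if [ -r "$conf" ]; then
    value=$(xcievers_value "$conf")
  fi
  if [ -n "$value" ] && [ "$value" -ge 2047 ]; then
    echo "ok: dfs.datanode.max.xcievers=$value"
  else
    echo "check $conf: dfs.datanode.max.xcievers is '${value:-unset}', want at least 2047"
    return 1
  fi
}

# Run the check only when a file is given on the command line.
if [ "$#" -gt 0 ]; then
  check_xcievers "$1"
fi
```

Save it under any name you like and pass it the path to your conf/hdfs-site.xml. Parsing XML with awk is fragile, so treat a failure as a prompt to look at the file by hand.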
      

2.4.2. Recommended Configurations

2.4.2.1. LZO compression

You should consider enabling LZO compression. It's near-frictionless and in almost all cases boosts performance.

Unfortunately, HBase cannot ship with LZO because of licensing issues; HBase is Apache-licensed, LZO is GPL. Therefore, LZO must be installed after HBase is installed. See the Using LZO Compression wiki page for how to make LZO work with HBase.

A common problem users run into when using LZO is that while the initial setup of the cluster runs smoothly, a month goes by and some sysadmin adds a machine to the cluster, only they'll have forgotten to do the LZO fixup on the new machine. In versions since HBase 0.90.0, we should fail in a way that makes it plain what the problem is, but maybe not. Remember you read this paragraph[1].



[1] See hbase.regionserver.codec for a feature to help protect against a failed LZO install.