Hadoop ======= Selecting a Hadoop version is critical for your HBase deployment. Below table shows some information about what versions of Hadoop are supported by various HBase versions. Based on the version of HBase, you should select the most appropriate version of Hadoop. HBase-0.92.x HBase-0.94.x HBase-0.96 Hadoop-0.20.205 S S X Hadoop-0.22.x S S X Hadoop-1.0 S S S Hadoop-1.1 NT S S Hadoop-2.x X NT S Where S = supported and tested, X = not supported NT = it should run, but not tested enough. Because HBase depends on Hadoop, it bundles an instance of the Hadoop jar under its lib directory. The bundled jar is ONLY for use in standalone mode. In distributed mode, it is critical that the version of Hadoop that is out on your cluster match what is under HBase. Replace the hadoop jar found in the HBase lib directory with the hadoop jar you are running on your cluster to avoid version mismatch issues. Make sure you replace the jar in HBase everywhere on your cluster. Hadoop version mismatch issues have various manifestations but often all looks like its hung up. Apache HBase 0.92 and 0.94 -------------------------- HBase 0.92 and 0.94 versions can work with Hadoop versions, 0.20.205, 0.22.x, 1.0.x and 1.1.x. 0.94 has not been tested much with Hadoop-2.x series although you can compile the code with using the maven profile (see top level pom.xml) Apache HBase 0.96 ------------------ Apache HBase 0.96.0 requires Apache Hadoop 1.x at a minimum, and it can run equally well on hadoop-2.0. As of Apache HBase 0.96.x, Apache Hadoop 1.0.x at least is required. We will no longer run properly on older Hadoops such as 0.20.205 or branch-0.20-append. Do not move to Apache HBase 0.96.x if you cannot upgrade your Hadoop[10]. Hadoop versions 0.20.x - 1.x ---------------------------- HBase will lose data unless it is running on an HDFS that has a durable sync implementation. DO NOT use Hadoop 0.20.2, Hadoop 0.20.203.0, and Hadoop 0.20.204.0 which DO NOT have this attribute. Currently only Hadoop versions 0.20.205.x or any release in excess of this version -- this includes hadoop-1.0.0 -- have a working, durable sync [6]. Sync has to be explicitly enabled by setting dfs.support.append equal to true on both the client side -- in hbase-site.xml -- and on the serverside in hdfs-site.xml (The sync facility HBase needs is a subset of the append code path). dfs.support.append true You will have to restart your cluster after making this edit. Ignore the chicken-little comment you'll find in the hdfs-default.xml in the description for the dfs.support.append configuration. Apache HBase on Secure Hadoop ----------------------------- Apache HBase will run on any Hadoop 0.20.x that incorporates Hadoop security features as long as you do as suggested above and replace the Hadoop jar that ships with HBase with the secure version. If you want to read more about how to setup Secure HBase, see Section 8.1, “Secure Client Access to Apache HBase”. dfs.datanode.max.xcievers ------------------------- same as previous