Index: src/java/overview.html
===================================================================
--- src/java/overview.html	(revision 675794)
+++ src/java/overview.html	(working copy)
@@ -28,7 +28,8 @@
-
-${HBASE_HOME}: Set HBASE_HOME to the location of the HBase root: e.g. /user/local/hbase.
-Edit ${HBASE_HOME}/conf/hbase-env.sh. In this file you can
+Define ${HBASE_HOME} to be the location of the root of your HBase installation, e.g.
+/usr/local/hbase. Edit ${HBASE_HOME}/conf/hbase-env.sh. In this file you can
set the heapsize for HBase, etc. At a minimum, set JAVA_HOME to point at the root of
your Java installation.
+
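+For example, a minimal hbase-env.sh might contain the two settings below (the
+JAVA_HOME path is illustrative; point it at your own java installation):
+
+  # The java implementation to use.
+  export JAVA_HOME=/usr/lib/jvm/java-6-sun
+
+  # The maximum amount of heap to use, in MB. Default is 1000.
+  export HBASE_HEAPSIZE=1000
+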
If you are running a standalone operation, there should be nothing further to configure; proceed to
-Running and Confirming Your Installation. If you are running a distributed
+Running and Confirming Your Installation. If you are running a distributed
operation, continue reading.
-Distributed mode requires an instance of the Hadoop Distributed File System (DFS). See the Hadoop requirements and instructions for how to set up a DFS.
-Once you have confirmed your DFS setup, configuring HBase requires modification of the following two files:
-${HBASE_HOME}/conf/hbase-site.xml and ${HBASE_HOME}/conf/regionservers.
-The former needs to be pointed at the running Hadoop DFS instance. The latter file lists
-all the members of the HBase cluster.
-
-Use hbase-site.xml to override the properties defined in
+
+
+A pseudo-distributed operation is simply a distributed operation run on a single host.
+Once you have confirmed your DFS setup, configuring HBase for use on one host requires modification of
+${HBASE_HOME}/conf/hbase-site.xml, which needs to be pointed at the running Hadoop DFS instance.
+Use hbase-site.xml to override the properties defined in
${HBASE_HOME}/conf/hbase-default.xml (hbase-default.xml itself
-should never be modified). At a minimum the hbase.master and the
-hbase.rootdir properties should be redefined
-in hbase-site.xml to configure the host:port pair on which the
-HMaster runs (read about the
-HBase master, regionservers, etc) and to point HBase at the Hadoop filesystem to use. For
-example, adding the below to your hbase-site.xml says the master is up on port 60000 on the host
-example.org and that HBase should use the /hbase directory in the HDFS whose namenode
-is at port 9000, again on example.org:
+should never be modified). At a minimum the hbase.rootdir property should be redefined
+in hbase-site.xml to point HBase at the Hadoop filesystem to use. For example, adding the property
+below to your hbase-site.xml says that HBase should use the /hbase directory in the
+HDFS whose namenode is at port 9000 on your local machine:
+<configuration>
+  ...
+  <property>
+    <name>hbase.rootdir</name>
+    <value>hdfs://localhost:9000/hbase</value>
+    <description>The directory shared by region servers.
+    </description>
+  </property>
+  ...
+</configuration>
+
+For a fully-distributed operation, in addition to the hbase.rootdir property in
+hbase-site.xml,
+you must also configure hbase.master to the host:port pair on which the HMaster
+runs (read about the HBase master,
+regionservers, etc). For example, adding the below to your hbase-site.xml says the
+master is up on port 60000 on the host example.org:
+
+
+<configuration>
+ ...
<property>
<name>hbase.master</name>
<value>example.org:60000</value>
<description>The host and port that the HBase master runs at.
</description>
</property>
-
- <property>
- <name>hbase.rootdir</name>
- <value>hdfs://example.org:9000/hbase</value>
- <description>The directory shared by region servers.
- </description>
- </property>
-
+ ...
</configuration>
-The regionserver file lists all the hosts running HRegionServers, one
-host per line (This file in HBase is like the hadoop slaves file at
+In addition to hbase-site.xml, a fully-distributed operation requires that you also modify
+${HBASE_HOME}/conf/regionservers. The regionservers file lists all the hosts
+running HRegionServers, one host per line (this file in HBase is like the Hadoop slaves file at
${HADOOP_HOME}/conf/slaves).
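+For example, a three-member cluster might use a regionservers file such as the
+following (hostnames are illustrative):
+
+  example1.org
+  example2.org
+  example3.org
+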
Of note, if you have made HDFS client configuration on your hadoop cluster, hbase will not
@@ -114,8 +124,8 @@
hbase-site.xml. An example of such an HDFS client configuration is dfs.replication. If for example,
-you want to run with a replication factor of 5, hbase will make files will create files with
-the default of 3 unless you do the above to make the configuration available to hbase.
+you want to run with a replication factor of 5, hbase will create files with the default of 3 unless
+you do the above to make the configuration available to hbase.
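+For example (a sketch of one such client configuration), adding the property below
+to your hbase-site.xml would make a replication factor of 5 visible to hbase:
+
+  <property>
+    <name>dfs.replication</name>
+    <value>5</value>
+    <description>The replication factor for the HDFS client to use.</description>
+  </property>
+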
If you are running a distributed cluster you will need to start the Hadoop DFS daemons
before starting HBase and stop the daemons after HBase has shut down. Start the
Hadoop DFS daemons by running ${HADOOP_HOME}/bin/start-dfs.sh and stop them by
running ${HADOOP_HOME}/bin/stop-dfs.sh.
-Ensure it started properly by testing the put and get of files into the Hadoop filesystem.
+You can ensure it started properly by testing the put and get of files into the Hadoop filesystem.
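+For example (illustrative paths; any small file will do), a quick smoke test
+from the command line:
+
+  ${HADOOP_HOME}/bin/hadoop dfs -put /etc/hosts /test
+  ${HADOOP_HOME}/bin/hadoop dfs -get /test /tmp/hosts.copy
+  ${HADOOP_HOME}/bin/hadoop dfs -rm /test
+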
HBase does not normally use the mapreduce daemons. These do not need to be started.
Start HBase with the following command:
@@ -169,14 +179,13 @@
+import java.io.IOException;
import org.apache.hadoop.hbase.client.HTable;
-import org.apache.hadoop.hbase.HBaseConfiguration;
-import org.apache.hadoop.hbase.HStoreKey;
-import org.apache.hadoop.hbase.HScannerInterface;
+import org.apache.hadoop.hbase.client.Scanner;
import org.apache.hadoop.hbase.io.BatchUpdate;
import org.apache.hadoop.hbase.io.Cell;
+import org.apache.hadoop.hbase.io.RowResult;
import org.apache.hadoop.io.Text;
-import java.io.IOException;
public class MyClient {
@@ -218,42 +227,37 @@
// convert it yourself.
Cell cell = table.get(new Text("myRow"),
new Text("myColumnFamily:columnQualifier1"));
- String valueStr = new String(valueBytes.getValue());
+ String valueStr = new String(cell.getValue());
// Sometimes, you won't know the row you're looking for. In this case, you
// use a Scanner. This will give you cursor-like interface to the contents
// of the table.
- HStoreKey row = new HStoreKey();
- SortedMap columns = new TreeMap();
- HScannerInterface scanner =
+ Scanner scanner =
// we want to get back only "myColumnFamily:columnQualifier1" when we iterate
- table.obtainScanner(new Text[]{new Text("myColumnFamily:columnQualifier1")},
- // we want to start scanning from an empty Text, meaning the beginning of
- // the table
- new Text(""));
+ table.getScanner(new Text[]{new Text("myColumnFamily:columnQualifier1")});
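+ // (The no-start-row form above scans from the beginning of the table;
+ // overloads of getScanner that take an explicit start row are assumed
+ // available if you need to begin mid-table.)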
// Scanners in HBase 0.2 return RowResult instances. A RowResult is like the
// row key and the columns all wrapped up in a single interface.
// RowResult#getRow gives you the row key. RowResult also implements
- // Map, so you can get to your column results easily.
+ // Map<byte[], Cell>, so you can get to your column results easily.
// Now, for the actual iteration. One way is to use a while loop like so:
RowResult rowResult = scanner.next();
while(rowResult != null) {
// print out the row we found and the columns we were looking for
- System.out.println("Found row: " + rowResult.getRow() + " with value: " +
- new String(rowResult.get("myColumnFamily:columnQualifier1")));
+ System.out.println("Found row: " + new String(rowResult.getRow()) + " with value: " +
+ new String(rowResult.get(new Text("myColumnFamily:columnQualifier1").getBytes()).getValue()));
rowResult = scanner.next();
}
// The other approach is to use a foreach loop. Scanners are iterable!
- for (RowResult rowResult : scanner) {
+ for (RowResult result : scanner) {
// print out the row we found and the columns we were looking for
- System.out.println("Found row: " + rowResult.getRow() + " with value: " +
- new String(rowResult.get("myColumnFamily:columnQualifier1")));
+ System.out.println("Found row: " + new String(result.getRow()) + " with value: " +
+ new String(result.get(new Text("myColumnFamily:columnQualifier1").getBytes()).getValue()));
}
// Make sure you close your scanners when you are done!
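+ // A sketch of the usual cleanup pattern (assuming Scanner exposes a close()
+ // method, per the advice above): guard iteration with try/finally so the
+ // scanner is closed even if an exception interrupts the loop:
+ //   try { for (RowResult r : scanner) { ... } } finally { scanner.close(); }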