From a3c62286166757927b7f4bd2f6cd637393026deb Mon Sep 17 00:00:00 2001 From: Michael Stack Date: Tue, 3 Apr 2018 10:27:38 -0700 Subject: [PATCH] HBASE-20337 Update the doc on how to setup shortcircuit reads; its stale --- src/main/asciidoc/_chapters/schema_design.adoc | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/src/main/asciidoc/_chapters/schema_design.adoc b/src/main/asciidoc/_chapters/schema_design.adoc index 4cd7656ad6..81b840c9c7 100644 --- a/src/main/asciidoc/_chapters/schema_design.adoc +++ b/src/main/asciidoc/_chapters/schema_design.adoc @@ -1148,16 +1148,30 @@ Detect regionserver failure as fast as reasonable. Set the following parameters: - `dfs.namenode.avoid.read.stale.datanode = true` - `dfs.namenode.avoid.write.stale.datanode = true` +[[shortcircuit.reads]] === Optimize on the Server Side for Low Latency - -* Skip the network for local blocks. In `hbase-site.xml`, set the following parameters: +Skip the network for local blocks when the RegionServer goes to read from HDFS by exploiting HDFS's +link:https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html[Short-Circuit Local Reads] facility. +Note how setup must be done both at the datanode and on the dfsclient ends of the conneciton -- i.e. at the RegionServer +and how both ends need to have loaded the hadoop native `.so` library. +After configuring your hadoop setting _dfs.client.read.shortcircuit_ to _true_ and configuring +the _dfs.domain.socket.path_ path for the datanode and dfsclient to share and restarting, next configure +the regionserver/dfsclient side. + +* In `hbase-site.xml`, set the following parameters: - `dfs.client.read.shortcircuit = true` +- `dfs.domain.socket.path` to match what was set for the datanodes. - `dfs.client.read.shortcircuit.buffer.size = 131072` (Important to avoid OOME) * Ensure data locality. In `hbase-site.xml`, set `hbase.hstore.min.locality.to.skip.major.compact = 0.7` (Meaning that 0.7 \<= n \<= 1) * Make sure DataNodes have enough handlers for block transfers. In `hdfs-site.xml`, set the following parameters: - `dfs.datanode.max.xcievers >= 8192` - `dfs.datanode.handler.count =` number of spindles +Check the RegionServer logs after restart. You should only see complaint if misconfiguration. +Otherwise, shortcircuit read operates quietly in background. It does not provide metrics so +no optics on how effective it is but read latencies should show a marked improvement, especially if +good data locality, lots of random reads, and dataset is larger than available cache. + === JVM Tuning ==== Tune JVM GC for low collection latencies -- 2.16.3