Details
Improvement
Status: Resolved
Major
Resolution: Incomplete
2.2.0, 3.0.0-alpha1
None
None
Description
Currently, BlockReaderLocal's read method has a synchronized modifier:
public synchronized int read(byte[] buf, int off, int len) throws IOException {
In an HBase cluster with heavy physical reads, we observed some hotspots in the dfsclient path. The detailed stack trace can be found at: https://issues.apache.org/jira/browse/HDFS-1605?focusedCommentId=13843241&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13843241
I haven't looked into the details yet, so here are some raw ideas first:
1) replace synchronized with a try-lock-with-timeout pattern, so that a contended read can fail fast, 2) fall back to non-ssr mode if acquiring the local reader lock fails.
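A minimal sketch of the two ideas combined, not the actual HDFS code. The class TryLockReadSketch, the Reader interface, and the timeout value are all hypothetical names introduced for illustration; only the java.util.concurrent calls are real APIs. The local read is guarded by a ReentrantLock acquired with a timeout, and a caller that cannot get the lock in time is routed to a fallback reader instead of blocking.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; this is not BlockReaderLocal.
class TryLockReadSketch {
    // Hypothetical stand-in for a block reader (local or non-local).
    interface Reader {
        int read(byte[] buf, int off, int len) throws IOException;
    }

    private final ReentrantLock lock = new ReentrantLock();
    private final Reader local;     // assumed short-circuit local reader
    private final Reader fallback;  // assumed non-local reader used on contention
    private final long timeoutMs;   // how long to wait for the lock before failing fast

    TryLockReadSketch(Reader local, Reader fallback, long timeoutMs) {
        this.local = local;
        this.fallback = fallback;
        this.timeoutMs = timeoutMs;
    }

    // Not synchronized: contention is bounded by the tryLock timeout.
    int read(byte[] buf, int off, int len) throws IOException {
        boolean locked;
        try {
            locked = lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new InterruptedIOException("interrupted waiting for local reader lock");
        }
        if (!locked) {
            // Idea 2: could not get the local reader lock in time, use the fallback path.
            return fallback.read(buf, off, len);
        }
        try {
            // Idea 1: the lock was acquired within the timeout, do the local read.
            return local.read(buf, off, len);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws IOException {
        Reader localReader = (b, o, l) -> l;   // pretends to read l bytes locally
        Reader remoteReader = (b, o, l) -> l;  // pretends to read l bytes remotely
        TryLockReadSketch r = new TryLockReadSketch(localReader, remoteReader, 10);
        System.out.println("bytes read: " + r.read(new byte[16], 0, 16));
    }
}
```

Whether a fallback read should reuse the same buffer position, and how the timeout should be chosen, are open design questions that this sketch does not address.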
There are at least two scenarios where removing this hotspot would help:
1) Heavy local physical reads, e.g. when the HBase block cache miss ratio is high
2) Slow or bad disks.
This should help achieve a lower 99th percentile HBase read latency.