  1. Hadoop HDFS
  2. HDFS-5664

try to relieve the BlockReaderLocal read() synchronized hotspot


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.2.0, 3.0.0-alpha1
    • Fix Version/s: None
    • Component/s: hdfs-client
    • Labels: None

    Description

      Currently, BlockReaderLocal's read method has a synchronized modifier:

      public synchronized int read(byte[] buf, int off, int len) throws IOException {
      

      In an HBase cluster with heavy physical reads, we observed some hotspots in the DFSClient path. The detailed trace can be found at: https://issues.apache.org/jira/browse/HDFS-1605?focusedCommentId=13843241&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13843241

      I haven't looked into the details yet, but here are some raw ideas to start with:
      1) replace synchronized with a try-lock-with-timeout pattern, so that a blocked read can fail fast;
      2) fall back to non-ssr (non-short-circuit read) mode if acquiring the local reader lock fails.
      There are at least two scenarios where removing this hotspot would help:
      1) Heavy local physical reads, e.g. when the HBase block cache miss ratio is high
      2) A slow or bad disk.
      It would help achieve a somewhat lower 99th percentile HBase read latency.
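
      A minimal sketch of the two raw ideas, not a patch against the real BlockReaderLocal. It assumes a hypothetical SimpleReader interface and a hypothetical TryLockReader wrapper (neither exists in HDFS), and it assumes the fallback reader can serve the same bytes independently. A real change would also have to keep the read position of both readers consistent, which this sketch ignores.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a block reader; NOT the real HDFS BlockReader interface. */
interface SimpleReader {
    int read(byte[] buf, int off, int len) throws IOException;
}

/**
 * Hypothetical wrapper: guards the local (short-circuit) reader with a timed
 * tryLock instead of synchronized, and falls back to another reader on timeout.
 */
class TryLockReader implements SimpleReader {
    private final ReentrantLock lock = new ReentrantLock();
    private final SimpleReader local;     // assumed short-circuit path
    private final SimpleReader fallback;  // assumed non-short-circuit path
    private final long timeoutMillis;     // assumed tunable; no such HDFS config key is implied

    TryLockReader(SimpleReader local, SimpleReader fallback, long timeoutMillis) {
        this.local = local;
        this.fallback = fallback;
        this.timeoutMillis = timeoutMillis;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        boolean acquired;
        try {
            // Idea 1: wait for the lock only up to the timeout, then fail fast.
            acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new InterruptedIOException("interrupted waiting for local reader lock");
        }
        if (!acquired) {
            // Idea 2: the local reader is busy (e.g. stuck on a slow disk), so use the fallback.
            return fallback.read(buf, off, len);
        }
        try {
            return local.read(buf, off, len);
        } finally {
            lock.unlock();
        }
    }
}
```

      With this shape, an uncontended call still goes through the local reader, and only a caller that waits longer than the timeout takes the fallback path. Whether the fallback is actually faster than waiting depends on the cause of the contention.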


          People

            Assignee: Unassigned
            Reporter: Liang Xie (xieliang007)
            Votes: 0
            Watchers: 7
