HBase / HBASE-2180

Bad random read performance from synchronizing hfile.fddatainputstream

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.20.4
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Deep in the HFile read path, there is this code:

      synchronized (in) {
        in.seek(pos);
        ret = in.read(b, off, n);
      }

      Because every reader must hold the lock on the shared stream for the whole seek-and-read, only one read per file can be active at a time, regardless of how many threads are reading. This prevents the OS and hardware from doing I/O scheduling across many concurrent reads.

      We need to either use a reentrant API (pread may be partially reentrant, according to Todd) or use multiple stream objects, one per scanner/thread.
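      To illustrate the first option, here is a minimal sketch of a positional read that needs no caller-side lock. It is not the HBase patch: it uses the JDK's FileChannel.read(ByteBuffer, long) as a stand-in for pread on a local file, and the class name PreadSketch and the helper pread are hypothetical names chosen for this example. In Hadoop, the analogous call would be the positional read(long position, byte[] buffer, int offset, int length) on FSDataInputStream; how concurrent that is in practice depends on the underlying filesystem implementation.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.AtomicInteger;

public class PreadSketch {

    // Positional read (hypothetical helper): the offset is passed as an
    // argument, so the channel's shared position is never modified and
    // callers do not need to synchronize on the channel.
    // Returns the number of bytes read, which is less than n only at end of file.
    public static int pread(FileChannel ch, long pos, byte[] b, int off, int n)
            throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(b, off, n);
        long cur = pos;
        while (buf.hasRemaining()) {
            int r = ch.read(buf, cur);
            if (r <= 0) break; // end of file, or no progress
            cur += r;
        }
        return (int) (cur - pos);
    }

    public static void main(String[] args) throws Exception {
        // Write a small file whose byte at offset i is (byte) i.
        Path file = Files.createTempFile("pread-sketch", ".bin");
        byte[] data = new byte[1 << 16];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        Files.write(file, data);

        AtomicInteger mismatches = new AtomicInteger();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Four threads share one channel and read disjoint regions
            // without any lock around the reads.
            Thread[] readers = new Thread[4];
            for (int t = 0; t < readers.length; t++) {
                final long base = t * 1000L;
                readers[t] = new Thread(() -> {
                    byte[] b = new byte[64];
                    try {
                        for (long pos = base; pos < base + 1000; pos += 64) {
                            int n = pread(ch, pos, b, 0, b.length);
                            for (int i = 0; i < n; i++) {
                                if (b[i] != (byte) (pos + i)) mismatches.incrementAndGet();
                            }
                        }
                    } catch (IOException e) {
                        mismatches.incrementAndGet();
                    }
                });
                readers[t].start();
            }
            for (Thread r : readers) r.join();
        } finally {
            Files.delete(file);
        }
        System.out.println("mismatches: " + mismatches.get());
    }
}
```

      The second option from the description, one stream object per scanner/thread, would instead give each reader its own seek position, at the cost of more open file handles.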

      Attachments

        1. 2180-v2.patch (23 kB, Michael Stack)
        2. 2180.patch (12 kB, Michael Stack)

            People

              Assignee: Michael Stack (stack)
              Reporter: ryan rawson (ryanobjc)
              Votes: 1
              Watchers: 10
