  HBase / HBASE-16393 Improve computeHDFSBlocksDistribution / HBASE-16398

optimize HRegion computeHDFSBlocksDistribution

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0, 2.0.0
    • Fix Version/s: 1.4.0, 2.0.0
    • Component/s: regionserver
    • Labels: None

    Description

      First, I assume there are no reference files or links in a region family's directory.
      Without the patch, computeHDFSBlocksDistribution for a region family makes 1 + 2*N RPC calls, where N is the number of HFiles. The first call is DistributedFileSystem#listStatus, which lists the HFiles; then every HFile costs two more calls: DistributedFileSystem#getFileStatus(path) followed by DistributedFileSystem#getFileBlockLocations(status, start, length).
      With the patch, computeHDFSBlocksDistribution for a region family makes 2 RPC calls: DistributedFileSystem#getFileStatus(path) and DistributedFileSystem#listLocatedStatus(final Path p, final PathFilter filter).
      So if there is at least one HFile, the patch reduces the number of RPC calls.
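
      The call counts above can be checked with a small sketch. It is not HBase code: the class and method names (RpcCallEstimate, withoutPatch, withPatch) are hypothetical, and it only encodes the 1 + 2*N and 2 figures from the description, under the same assumption that the family directory contains no references or links.

```java
// Hypothetical helper, not part of HBase: it only models the RPC counts
// stated in the description for one region family.
final class RpcCallEstimate {

    // Without the patch: one listStatus call, then getFileStatus and
    // getFileBlockLocations for each of the N HFiles.
    static long withoutPatch(long hfileCount) {
        return 1 + 2 * hfileCount;
    }

    // With the patch: one getFileStatus call plus one listLocatedStatus call,
    // independent of the number of HFiles.
    static long withPatch(long hfileCount) {
        return 2;
    }

    public static void main(String[] args) {
        for (long n : new long[] {0, 1, 10, 100}) {
            System.out.printf("N=%d: without patch=%d, with patch=%d%n",
                    n, withoutPatch(n), withPatch(n));
        }
    }
}
```

      For N = 0 the patched path makes one more call (2 versus 1), which is why the comparison is stated for at least one HFile; from N = 1 onward the patched path makes fewer calls, and the gap grows by two calls per additional HFile.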

      Attachments

        1. LocatedBlockStatusComparison.java
          11 kB
          Thiruvel Thirumoolan
        2. HBASE-16398.patch
          11 kB
          Lijin Bin
        3. HBASE-16398.branch-1.v1.patch
          10 kB
          Lijin Bin
        4. HBASE-16398_v5.patch
          12 kB
          Lijin Bin
        5. HBASE-16398_v4.patch
          12 kB
          Lijin Bin
        6. HBASE-16398_v3.patch
          11 kB
          Lijin Bin
        7. HBASE-16398_v2.patch
          5 kB
          Lijin Bin

          People

            Assignee: Lijin Bin (binlijin)
            Reporter: Lijin Bin (binlijin)
            Votes: 0
            Watchers: 8
