  1. HBase
  2. HBASE-16393 Improve computeHDFSBlocksDistribution
  3. HBASE-16398

optimize HRegion computeHDFSBlocksDistribution


Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0, 2.0.0
    • Fix Version/s: 1.4.0, 2.0.0
    • Component/s: regionserver
    • Labels: None

    Description

      First, assume that the region family's directory contains no reference files or links.
      Without the patch, computeHDFSBlocksDistribution for a region family makes 1 + 2*N RPC calls, where N is the number of HFiles. The first call is DistributedFileSystem#listStatus, which lists the HFiles; then every HFile costs two more calls, DistributedFileSystem#getFileStatus(path) followed by DistributedFileSystem#getFileBlockLocations(status, start, length).
      With the patch, computeHDFSBlocksDistribution for a region family makes 2 RPC calls: DistributedFileSystem#getFileStatus(path) and DistributedFileSystem#listLocatedStatus(final Path p, final PathFilter filter).
      So whenever there is at least one HFile, the patch reduces the number of RPC calls (3 versus 2 for a single HFile, and the gap grows linearly with N).
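
      The call counts above can be checked with a small model. This is only a sketch, not the code from the patch: CountingFs is a hypothetical in-memory stand-in for the file system client, its methods merely borrow the DistributedFileSystem method names, and each call is assumed to cost exactly one RPC, as the description does.

```java
import java.util.ArrayList;
import java.util.List;

public class RpcCountSketch {

    /** Hypothetical stand-in for a file system client; every method call counts as one RPC. */
    static class CountingFs {
        private final List<String> hfiles = new ArrayList<>();
        int rpcCalls = 0;

        CountingFs(int hfileCount) {
            for (int i = 0; i < hfileCount; i++) {
                hfiles.add("hfile-" + i);
            }
        }

        // Lists the HFiles in the family directory (one RPC).
        List<String> listStatus() {
            rpcCalls++;
            return new ArrayList<>(hfiles);
        }

        // Fetches the status of a single path (one RPC).
        String getFileStatus(String path) {
            rpcCalls++;
            return "status:" + path;
        }

        // Fetches the block locations of a single file (one RPC).
        String getFileBlockLocations(String status) {
            rpcCalls++;
            return "blocks:" + status;
        }

        // Lists the directory and returns block locations with each entry (modeled as one RPC).
        List<String> listLocatedStatus() {
            rpcCalls++;
            List<String> located = new ArrayList<>();
            for (String hfile : hfiles) {
                located.add("status+blocks:" + hfile);
            }
            return located;
        }
    }

    /** Pre-patch pattern: one listing, then two calls per HFile, giving 1 + 2*N. */
    public static int rpcsBeforePatch(int hfileCount) {
        CountingFs fs = new CountingFs(hfileCount);
        for (String hfile : fs.listStatus()) {
            String status = fs.getFileStatus(hfile);
            fs.getFileBlockLocations(status);
        }
        return fs.rpcCalls;
    }

    /** Post-patch pattern: one status call on the directory, then one located listing, giving 2. */
    public static int rpcsAfterPatch(int hfileCount) {
        CountingFs fs = new CountingFs(hfileCount);
        fs.getFileStatus("family-dir");
        fs.listLocatedStatus();
        return fs.rpcCalls;
    }

    public static void main(String[] args) {
        for (int n : new int[] {0, 1, 10, 100}) {
            System.out.printf("N=%d before=%d after=%d%n", n, rpcsBeforePatch(n), rpcsAfterPatch(n));
        }
    }
}
```

      In this model an empty family directory (N = 0) costs 1 call before the patch and 2 after, which is why the saving only applies once there is at least one HFile; with N = 10 the counts are 21 and 2.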

      Attachments

        1. HBASE-16398_v2.patch
          5 kB
          Lijin Bin
        2. HBASE-16398_v3.patch
          11 kB
          Lijin Bin
        3. HBASE-16398_v4.patch
          12 kB
          Lijin Bin
        4. HBASE-16398_v5.patch
          12 kB
          Lijin Bin
        5. HBASE-16398.branch-1.v1.patch
          10 kB
          Lijin Bin
        6. HBASE-16398.patch
          11 kB
          Lijin Bin
        7. LocatedBlockStatusComparison.java
          11 kB
          Thiruvel Thirumoolan


          People

            Assignee: Lijin Bin (binlijin)
            Reporter: Lijin Bin (binlijin)
            Votes: 0
            Watchers: 8
