HBase / HBASE-5712

Parallelize load of .regioninfo files in diagnostic/repair portion of hbck.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.90.7, 0.92.2, 0.94.0, 0.95.2
    • Fix Version/s: 0.94.0, 0.95.0
    • Component/s: hbck
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      On heavily loaded HDFS clusters, some datanodes may not respond quickly, and the client backs off for 60s before attempting to read the data from another datanode. Portions of the information gathered from HDFS (the .regioninfo files) are loaded serially. On HBase clusters with hundreds, thousands, or tens of thousands of regions, hitting these 60s delays blocks progress and can be very painful.

      Portions of the HDFS information load are already parallelized; the goal here is to move the reading of .regioninfo files into those parallelized sections.
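      The idea can be illustrated with a minimal sketch. This is not the code from the attached patches: the class name `ParallelRegionInfoLoad`, the `loadAll` method, and the `reader` function standing in for the HDFS read of a .regioninfo file are all hypothetical. It only shows the general technique of submitting one read per region directory to a fixed-size thread pool, so that a single slow read delays one worker rather than the whole scan.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical illustration; not part of hbck or the HBASE-5712 patches.
class ParallelRegionInfoLoad {

    // Reads every region directory's info concurrently.
    // `reader` is an assumed stand-in for "open and parse <regionDir>/.regioninfo".
    // A failed read is recorded as null so one bad region does not abort the rest.
    public static Map<String, String> loadAll(List<String> regionDirs,
                                              Function<String, String> reader,
                                              int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String dir : regionDirs) {
                tasks.add(() -> reader.apply(dir));
            }
            // invokeAll blocks until all tasks finish and returns the
            // futures in the same order as the submitted tasks.
            List<Future<String>> futures = pool.invokeAll(tasks);
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < futures.size(); i++) {
                try {
                    result.put(regionDirs.get(i), futures.get(i).get());
                } catch (ExecutionException e) {
                    result.put(regionDirs.get(i), null);
                }
            }
            return result;
        } catch (InterruptedException e) {
            // Restore the interrupt flag and surface the failure unchecked.
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> dirs = List.of("region-a", "region-b", "region-c");
        // The lambda simulates a read; a real reader would go to HDFS.
        Map<String, String> infos = loadAll(dirs, d -> "info for " + d, 2);
        infos.forEach((dir, info) -> System.out.println(dir + " -> " + info));
    }
}
```

      With a serial loop, the worst case is roughly the number of slow reads times 60s; with a pool of N threads, up to N slow reads overlap. The pool size here is an arbitrary choice for the example.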

        Attachments

        1. hbase-5712-90.patch
          9 kB
          Jonathan Hsieh
        2. hbase-5712.patch
          10 kB
          Jonathan Hsieh
        3. hbase-5712-v2.patch
          10 kB
          Jonathan Hsieh
        4. hbase-5712-90-v2.patch
          9 kB
          Jonathan Hsieh

              People

              • Assignee:
                jmhsieh Jonathan Hsieh
                Reporter:
                jmhsieh Jonathan Hsieh