HBase / HBASE-17130

Add support to specify an arbitrary number of reducers when writing HFiles for bulk load

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: mapreduce
    • Labels: None

    Description

      Following the discussion in HBASE-16894, there is a set of use cases where writing to multiple regions from a single reducer helps reduce the overhead of MR jobs. This applies when an HBase cluster has a large number of regions and the data is skewed across them, e.g. hundreds or thousands of regions with a very small number of rows alongside regions with 10s of millions of rows in the same job, and merging regions is not an option.
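      To make the request concrete, here is a minimal sketch of one way a fixed number of reducers could each cover a contiguous group of regions. It is not taken from HBase or from any patch on this issue: the class `RegionGroupingSketch`, the methods `regionIndex` and `reducerFor`, and the sample start keys are all hypothetical, and the sketch only covers the case of fewer reducers than regions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in HBase.
class RegionGroupingSketch {

    /** Index of the region whose [startKey, nextStartKey) range contains rowKey. */
    static int regionIndex(byte[][] sortedStartKeys, byte[] rowKey) {
        int lo = 0, hi = sortedStartKeys.length - 1;
        // Binary search for the last start key that is <= rowKey
        // (unsigned lexicographic byte order, as used for HBase row keys).
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (Arrays.compareUnsigned(sortedStartKeys[mid], rowKey) <= 0) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    /** Maps a region to one of numReducers contiguous, roughly equal groups of regions. */
    static int reducerFor(int regionIndex, int numRegions, int numReducers) {
        return (int) ((long) regionIndex * numReducers / numRegions);
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumed sample table: six regions; the first start key is empty.
        byte[][] startKeys = {
            bytes(""), bytes("d"), bytes("h"), bytes("m"), bytes("r"), bytes("w")
        };
        int numReducers = 2;
        for (String row : new String[] {"apple", "dog", "kiwi", "mango", "zebra"}) {
            int region = regionIndex(startKeys, bytes(row));
            int reducer = reducerFor(region, startKeys.length, numReducers);
            System.out.println(row + " -> region " + region + ", reducer " + reducer);
        }
    }
}
```

      Because each group is contiguous in key order, a reducer would still see its rows sorted and would only need to start a new HFile whenever the region index changes. Grouping by region count, as done here, ignores the skew described above; a real implementation would more likely weight the groups by estimated region size.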

            People

              Assignee: Esteban Gutierrez
              Reporter: Esteban Gutierrez
              Votes: 0
              Watchers: 7
