Apache NiFi / NIFI-12825

Implement processor to get row key ranges for HBase regions

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0-M3
    • Component/s: None
    • Labels: None

    Description

      A common way to parallelize scan operations against HBase is to scan by row key range. HBase splits each table into regions, and each region holds a contiguous range of row keys. These ranges are mutually exclusive and together cover every row key in the table.
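
      The partitioning described above can be sketched in a few lines of Java. This is only an illustration, not the processor's implementation: the class and method names are hypothetical, row keys are modeled as strings rather than the byte arrays HBase actually uses, and an empty key is assumed to mean "unbounded" (as for the first region's start key and the last region's end key).

```java
import java.util.ArrayList;
import java.util.List;

public class RegionRanges {

    /** Half-open row key range [startKey, endKey). An empty key means "unbounded". */
    record KeyRange(String startKey, String endKey) {

        /** True if rowKey falls inside this range. */
        boolean contains(String rowKey) {
            // The empty string sorts before every other string, so an
            // unbounded start key needs no special case here.
            boolean afterStart = rowKey.compareTo(startKey) >= 0;
            boolean beforeEnd = endKey.isEmpty() || rowKey.compareTo(endKey) < 0;
            return afterStart && beforeEnd;
        }
    }

    /**
     * Builds contiguous ranges from region start keys sorted in ascending order.
     * Each range ends where the next one starts; the last range is unbounded.
     */
    static List<KeyRange> fromStartKeys(List<String> sortedStartKeys) {
        List<KeyRange> ranges = new ArrayList<>();
        for (int i = 0; i < sortedStartKeys.size(); i++) {
            String start = sortedStartKeys.get(i);
            String end = (i + 1 < sortedStartKeys.size()) ? sortedStartKeys.get(i + 1) : "";
            ranges.add(new KeyRange(start, end));
        }
        return ranges;
    }

    /** Counts how many ranges contain rowKey; a proper partition yields exactly one. */
    static long countContaining(List<KeyRange> ranges, String rowKey) {
        return ranges.stream().filter(r -> r.contains(rowKey)).count();
    }

    public static void main(String[] args) {
        // Three hypothetical regions: ["", "g"), ["g", "p"), ["p", "")
        List<KeyRange> ranges = fromStartKeys(List.of("", "g", "p"));
        for (String key : List.of("apple", "g", "melon", "zebra")) {
            System.out.println(key + " is covered by " + countContaining(ranges, key) + " range(s)");
        }
    }
}
```

      Because every key lands in exactly one range, each range can be scanned independently without missing or duplicating rows.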

      Currently, parallelizing scans by row key range is a manual process: open the HBase shell and run the "list_regions" command to obtain the ranges. The main downside is that row key ranges are not static. A region may split, producing two regions that divide the original row key range between them.
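
      To illustrate why a manually copied list of ranges goes stale, the following sketch models a region split. It is a simplified model under stated assumptions: string keys instead of byte arrays, hypothetical class and method names, and no check that the split key lies inside the parent range.

```java
import java.util.ArrayList;
import java.util.List;

public class RegionSplitDemo {

    /** Half-open row key range [startKey, endKey). An empty key means "unbounded". */
    record KeyRange(String startKey, String endKey) {}

    /**
     * Returns a new list in which the range at the given index is replaced
     * by two ranges divided at splitKey. The input list is left unchanged.
     */
    static List<KeyRange> split(List<KeyRange> ranges, int index, String splitKey) {
        KeyRange parent = ranges.get(index);
        List<KeyRange> result = new ArrayList<>(ranges);
        result.set(index, new KeyRange(parent.startKey(), splitKey));
        result.add(index + 1, new KeyRange(splitKey, parent.endKey()));
        return result;
    }

    public static void main(String[] args) {
        // Ranges as they were when someone copied them by hand.
        List<KeyRange> snapshot = List.of(new KeyRange("", "m"), new KeyRange("m", ""));
        // Later, the second region splits at the hypothetical key "t".
        List<KeyRange> current = split(snapshot, 1, "t");

        System.out.println("snapshot taken earlier: " + snapshot);
        System.out.println("after region split:     " + current);
        System.out.println("snapshot is stale:      " + !snapshot.equals(current));
    }
}
```

      In this model the old snapshot still covers every key, but it no longer lines up one-to-one with the regions, so the ranges have to be fetched again to keep the parallelism aligned with the table layout.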

      Letting NiFi obtain the row key range of each HBase region would make it easier to build a flow that parallelizes HBase scans by row key range. Once the ranges are known, they can be fed into a scanning processor (e.g., ScanHBase).
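
      One way to picture the hand-off to a scanning processor is to emit one set of attributes per region, each carrying that region's start and end row key. The sketch below is only an assumption about how that could look: the attribute names "hbase.start.rowkey" and "hbase.end.rowkey" are hypothetical, keys are modeled as strings, and plain maps stand in for FlowFile attributes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RangeAttributes {

    /** Half-open row key range [startKey, endKey). An empty key means "unbounded". */
    record KeyRange(String startKey, String endKey) {}

    // Hypothetical attribute names, chosen for this illustration only.
    static final String START_ATTR = "hbase.start.rowkey";
    static final String END_ATTR = "hbase.end.rowkey";

    /** Produces one attribute map per range, in the same order as the input. */
    static List<Map<String, String>> toAttributeMaps(List<KeyRange> ranges) {
        List<Map<String, String>> result = new ArrayList<>();
        for (KeyRange range : ranges) {
            Map<String, String> attributes = new LinkedHashMap<>();
            attributes.put(START_ATTR, range.startKey());
            attributes.put(END_ATTR, range.endKey());
            result.add(attributes);
        }
        return result;
    }

    public static void main(String[] args) {
        List<KeyRange> ranges = List.of(new KeyRange("", "m"), new KeyRange("m", ""));
        // Each map would accompany one unit of work sent to the scanner.
        toAttributeMaps(ranges).forEach(System.out::println);
    }
}
```

      A downstream scanning processor could then read the two values for each unit of work and scan only that range, so the regions are scanned in parallel.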

       

            People

              Assignee: Emilio Setiadarma (emilio.setiadarma)
              Reporter: Emilio Setiadarma (emilio.setiadarma)
              Votes: 0
              Watchers: 3

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h