Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Sub-regions provide lightweight management below the region level.
      Matt Corgan has a nice summary of the relationship between region size and the number of regions on a region server:
      https://issues.apache.org/jira/browse/HBASE-7667?focusedCommentId=13575024&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13575024

      HBASE-7667 proposed stripe compaction. However, to fully achieve fine-grained management, more components should participate:

      • memstore flush should ideally have knowledge about what makes stripe compaction work efficiently
      • we need to figure out where to store sub-region boundary information so that components have easy access. Boundary information should persist after a region moves from one server to another.

      Since HBASE-7667 focuses on the compaction aspect, this JIRA discusses sub-region management in other components so that we better understand the benefits and complexities.

          Activity

          stack added a comment -

          This issue looks totally speculative and a waste of a JIRA number. Are you going to work on this Ted? Is the basis a comment by Matt Corgan in another issue? What is the fit criteria for closing this issue? Subregions showing in the UI? How would that even work? We have lists now w/ thousands of regions and you would add a new layer of subregions?

          Please close unless you intend to work on this and unless you provide more substance on what subregions are about.

          Ted Yu added a comment -

          I do intend to work on this after wrapping up the snapshots feature (in a satisfactory way) and other high-priority 0.96 tasks.

          From Sergey (https://issues.apache.org/jira/browse/HBASE-7667?focusedCommentId=13578090&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13578090):

          My reasoning was - too many really tiny files, plus scope creep into memstore.

          From Matt (https://issues.apache.org/jira/browse/HBASE-7667?focusedCommentId=13578090&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13578090):

          Gotcha. Agree about limiting scope.

          So HBASE-7667 wouldn't touch non-compaction components (for the time being).

          "We have lists now w/ thousands of regions and you would add a new layer of subregions?"

          From Jimmy's comment in HBASE-7667 @ 10/Feb/13 23:56:

          Stripes don't have overlapping keyrange with other stripes. So each stripe is just like a sub-region.

          We can discuss whether the concept of sub-region makes sense.

          stack added a comment -

          Ted Yu Please resolve this issue as invalid until you come up w/ better notion of what a subregion is in the first place, what its scope is in the second place, and then once this is done, then you could have an issue on their 'management'.

          You do not answer my questions. Instead you provoke more... this issue is for discussing what subregions are? Instead you quote others from another issue on compaction stripes (I click through to discussion that makes allusion but the notion is fuzzy at best and an analogy at its most hard – the jimmy quote). You end that this issue is about discussing whether or not subregions make sense when you start w/ adding them to the UI and their management.

          This issue does not help. It confuses. Please clean it up or close.

          Ted Yu added a comment -

          Subregion is hardly a new idea. I am open to other terms (arena, section, etc).

          Subregions divide the key space of a region into (potentially variable-width) non-overlapping segments.
          In terms of compaction, subregions map to stripes.

          In terms of memstore, there can be a counterpart to stripes.
          Matt Corgan proposed (see HBASE-3484) that memstore be represented as Set<Set<KeyValue>>.
          Another possibility is to use List<Set<KeyValue>> for memstore. The goal is the same: flushing doesn't produce L0 files (that have all the keys in the region). Each subregion flushes into the corresponding stripe of store files.
          Some index would facilitate quick lookup of a subregion in the collection of Set<KeyValue>.
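
          To make the idea above concrete, here is a minimal sketch of a memstore partitioned by subregion start keys. It is an illustration only, not HBase code: the class name SubregionMemstore and its methods are hypothetical, String row keys stand in for KeyValue, and a TreeMap plays the role of the "index" that maps a key to its subregion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: an in-memory store split into non-overlapping subregions. */
final class SubregionMemstore {
    // Index: inclusive start key of each subregion -> sorted keys held in it.
    // String is an assumed stand-in for HBase's KeyValue.
    private final NavigableMap<String, NavigableSet<String>> subregions = new TreeMap<>();

    /** startKeys are the boundaries between subregions (assumed, caller-supplied). */
    SubregionMemstore(List<String> startKeys) {
        // The empty key sorts before every other string, so the first
        // subregion always covers the start of the region's key space.
        subregions.put("", new TreeSet<>());
        for (String startKey : startKeys) {
            subregions.put(startKey, new TreeSet<>());
        }
    }

    /** Returns the start key of the subregion that owns rowKey. */
    String subregionFor(String rowKey) {
        return subregions.floorKey(rowKey);
    }

    /** Adds rowKey to the set of the subregion that owns it. */
    void add(String rowKey) {
        subregions.floorEntry(rowKey).getValue().add(rowKey);
    }

    /**
     * Flushes every non-empty subregion into its own sorted batch, keyed by
     * subregion start key, and clears the in-memory sets. No batch spans the
     * whole region, which is the stated goal of avoiding L0-style files.
     */
    Map<String, List<String>> flush() {
        Map<String, List<String>> batches = new TreeMap<>();
        for (Map.Entry<String, NavigableSet<String>> e : subregions.entrySet()) {
            if (!e.getValue().isEmpty()) {
                batches.put(e.getKey(), new ArrayList<>(e.getValue()));
                e.getValue().clear();
            }
        }
        return batches;
    }
}
```

          With boundaries "g" and "p", keys such as "apple", "kiwi" and "zebra" would land in three separate batches on flush, each of which could in principle be written to the matching stripe. How boundaries are chosen and kept in sync with stripes is left open here, as it is in the discussion.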

          Refactoring of memstore would be done first to make subregion pluggable.

          Will continue to think through this topic.

          stack added a comment -

          Resolving speculative issue pulled from comments made in other issues w/ no fit criteria for judging when this issue is done.


            People

            • Assignee:
              Unassigned
            • Reporter:
              Ted Yu
            • Votes:
              0
            • Watchers:
              9
