Details

    • Type: Umbrella
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None

      Description

      Currently, HBase treats all tables, users, and workloads the same way.
      This is fine until multiple users and workloads share the same cluster or table. Some workloads/users must be prioritized over others, and some workloads must not be allowed to impact others.

      We can separate the problem into three components.

      • Isolation/Partitioning (Physically split on different machines)
      • Scheduling (Prioritize small/interactive workloads vs long/batch workloads)
      • Quotas (Limit a user/table requests/sec or size)

      This is the umbrella jira tracking the multi-tenancy related tasks.
      An initial design document is up for comments here: https://docs.google.com/document/d/1ygIwZpDWQuMPdfcryckic6ODi5DHQkrzXKjmOJodfs0

          Activity

          Andrew Purtell added a comment -

          On namespaces, the document says:

          Namespaces: Will not be mentioned much in this document, since we are considering the smallest unity of scalability to make the examples, and namespaces are just a group of tables, so everytime you see “Table” you can replace it with “Namespace”. A rule applied to a namespace will be applied to all the tables in it.

          Elsewhere, on quotas:

          From an implementation point of view quotas will be part of scheduling, since we have to block/limit/throttle incoming requests based on the usage. The simplest way to implement quotas is by using a “token bucket” algorithm (but we will have an interface to make it pluggable).

          What happens if there is a quota on a namespace, and quotas on tables within?
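For readers unfamiliar with the "token bucket" algorithm named in the quoted passage, here is a minimal sketch in Java. It is only an illustration of the general technique, not the design document's or HBase's implementation: the class name `TokenBucket`, its parameters, and the choice to pass the current time in explicitly are all assumptions made for this example.

```java
/**
 * Minimal token-bucket limiter (hypothetical sketch, not HBase code).
 * The bucket holds at most `capacity` tokens and gains `tokensPerSecond`
 * tokens per second. A request is admitted only if enough tokens remain.
 */
final class TokenBucket {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    private final long capacity;        // maximum burst size, in tokens
    private final long tokensPerSecond; // steady-state refill rate
    private long available;             // tokens currently in the bucket
    private long lastRefillNanos;       // time up to which refill was accounted

    TokenBucket(long capacity, long tokensPerSecond, long nowNanos) {
        if (capacity <= 0 || tokensPerSecond <= 0) {
            throw new IllegalArgumentException("capacity and rate must be positive");
        }
        this.capacity = capacity;
        this.tokensPerSecond = tokensPerSecond;
        this.available = capacity; // start with a full bucket
        this.lastRefillNanos = nowNanos;
    }

    /** Returns true and deducts the tokens if the request fits, false otherwise. */
    synchronized boolean tryConsume(long tokens, long nowNanos) {
        refill(nowNanos);
        if (tokens <= available) {
            available -= tokens;
            return true;
        }
        return false; // caller may block, delay, or reject the request
    }

    private void refill(long nowNanos) {
        long elapsed = nowNanos - lastRefillNanos;
        if (elapsed <= 0) {
            return;
        }
        // Assumes elapsed * tokensPerSecond does not overflow a long.
        long newTokens = elapsed * tokensPerSecond / NANOS_PER_SECOND;
        if (newTokens > 0) {
            available = Math.min(capacity, available + newTokens);
            // Advance only by the time that produced whole tokens,
            // so fractional progress toward the next token is kept.
            lastRefillNanos += newTokens * NANOS_PER_SECOND / tokensPerSecond;
        }
    }
}
```

In real use the caller would pass `System.nanoTime()` as `nowNanos`; taking the time as a parameter just keeps the sketch deterministic. A pluggable interface, as the quoted passage suggests, could wrap `tryConsume` so that other limiting strategies can be swapped in.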

          Matteo Bertozzi added a comment -

          What happens if there is a quota on a namespace, and quotas on tables within?

          The quota on the table has the priority, but is part of the namespace.
          E.g., let's say you limit the namespace to 10 GB: you can enforce a limit on a table between 1 and 10 GB, and the other tables will get what remains.
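The precedence described in this comment could be sketched roughly as follows. This is a hypothetical illustration, not the proposed HBase design: the class `NamespaceSizeQuota` and its methods are invented for the example, and it assumes that "what remains" means the namespace limit minus current usage (a plain quota, with no space reserved for the limited table).

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of a namespace size quota with optional per-table limits.
 * A table limit is checked first, but every table also counts against the
 * namespace limit.
 */
final class NamespaceSizeQuota {
    private final long namespaceLimitBytes;
    private final Map<String, Long> tableLimits = new HashMap<>();
    private final Map<String, Long> tableUsage = new HashMap<>();
    private long namespaceUsageBytes;

    NamespaceSizeQuota(long namespaceLimitBytes) {
        if (namespaceLimitBytes <= 0) {
            throw new IllegalArgumentException("namespace limit must be positive");
        }
        this.namespaceLimitBytes = namespaceLimitBytes;
    }

    /** A table limit must fit inside the namespace limit. */
    void setTableLimit(String table, long limitBytes) {
        if (limitBytes <= 0 || limitBytes > namespaceLimitBytes) {
            throw new IllegalArgumentException("table limit must be within the namespace limit");
        }
        tableLimits.put(table, limitBytes);
    }

    /** Records the growth and returns true only if both limits still hold. */
    boolean tryGrow(String table, long bytes) {
        long used = tableUsage.getOrDefault(table, 0L);
        Long tableLimit = tableLimits.get(table);
        if (tableLimit != null && used + bytes > tableLimit) {
            return false; // the table's own quota is checked first
        }
        if (namespaceUsageBytes + bytes > namespaceLimitBytes) {
            return false; // unlimited tables share whatever the namespace has left
        }
        tableUsage.put(table, used + bytes);
        namespaceUsageBytes += bytes;
        return true;
    }
}
```

With a 10 GB namespace and a 4 GB limit on one table, that table can never exceed 4 GB, while the remaining tables compete for whatever part of the 10 GB is still unused.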

          Andrew Purtell added a comment -

          I assume this only tackles quotas, not also reservations.

          James Taylor added a comment -

          For existing support of multi-tenancy in Apache Phoenix, see here: http://phoenix.incubator.apache.org/multi-tenancy.html. I think to do multi-tenancy in a good way, you need to hide a lot of details behind a good client API. This is what Apache Phoenix provides.

          stack added a comment -

          James Taylor You are not suggesting that apache phoenix 'solves' multi-tenancy, are you? Instead, I think you meant to write encouraging words about the great work Matteo Bertozzi is doing here adding primitives to hbase that projects like phoenix can make use of and publish nice apis against – right? (smile).

          James Taylor added a comment -

          Correct, stack. I should have been more verbose. I just wanted to point out some similar, complementary work that is also going on, in the hope that all of this great work can converge (rather than be duplicated). FWIW, a number of folks in the PhD program at Duke are interested as well and working in this area. Perhaps some collaboration is in order?

          stack added a comment -

          FWIW, a number of folks in the PhD program at Duke are interested as well and working in this area. Perhaps some collaboration is in order?

          Sure. How? Where? When? (smile) We're game.

          Elliott Clark added a comment -

          Can we put this on the schedule for the post-HBaseCon hackathon?

          stack added a comment -

          Elliott Clark I added it: http://www.meetup.com/hackathon/events/176659262/

            People

            • Assignee: Matteo Bertozzi
            • Reporter: Matteo Bertozzi
            • Votes: 0
            • Watchers: 29
