HBase
  1. HBase
  2. HBASE-4047

[Coprocessors] Generic external process host

    Details

    • Type: New Feature New Feature
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Coprocessors
    • Labels:
      None

      Description

      Where HBase coprocessors deviate substantially from the design (as I understand it) of Google's BigTable coprocessors is we've reimagined it as a framework for internal extension. In contrast BigTable coprocessors run as separate processes colocated with tablet servers. The essential trade off is between performance, flexibility and possibility, and the ability to control and enforce resource usage.

      Since the initial design of HBase coprocessors some additional considerations are in play:

      • Developing computational frameworks sitting directly on top of HBase hosted in coprocessor(s);
      • Introduction of the map reduce next generation (mrng) resource management model, and the probability that limits will be enforced via cgroups at the OS level after this is generally available, e.g. when RHEL 6 deployments are common;
      • The possibility of deployment of HBase onto mrng-enabled Hadoop clusters via the mrng resource manager and a HBase-specific application controller.

      Therefore we should consider developing a coprocessor that is a generic host for another coprocessor, but one that forks a child process, loads the target coprocessor into the child, establishes a bidirectional pipe and uses an eventing model and umbilical protocol to provide for the coprocessor loaded into the child the same semantics as if it was loaded internally to the parent, and (eventually) use available resource management capabilities on the platform – perhaps via the mrng resource controller or directly with cgroups – to limit the child as desired by system administrators or the application designer.

        Issue Links

          Activity

          Hide
          stack added a comment -

          wow

          Show
          stack added a comment - wow
          Hide
          Arun C Murthy added a comment -

          Andrew, sounds exciting! Glad to help in any way possible.

          Some questions:

          1. Is 'generic host co-processor' a system process i.e. managed by HBase itself?
          2. Does the 'generic host' co-processor live forever on each region server? Or is it launched on demand?
          Show
          Arun C Murthy added a comment - Andrew, sounds exciting! Glad to help in any way possible. Some questions: Is 'generic host co-processor' a system process i.e. managed by HBase itself? Does the 'generic host' co-processor live forever on each region server? Or is it launched on demand?
          Hide
          eric baldeschwieler added a comment -

          Interesting! I've been very concerned about the implications for multi-teneted use cases of implementing co-processors hosted inside HBase. This seems like a very good idea. Once 0.23 is real, I'll see what I can do to help with this. I've also been thinking about HBase inside MR as you propose. Is there a jira for that?

          Show
          eric baldeschwieler added a comment - Interesting! I've been very concerned about the implications for multi-teneted use cases of implementing co-processors hosted inside HBase. This seems like a very good idea. Once 0.23 is real, I'll see what I can do to help with this. I've also been thinking about HBase inside MR as you propose. Is there a jira for that?
          Hide
          Andrew Purtell added a comment -

          @Arun:

          1. A child JVM managed by HBase itself.

          2. (Re)Launched on demand.

          Show
          Andrew Purtell added a comment - @Arun: 1. A child JVM managed by HBase itself. 2. (Re)Launched on demand.
          Hide
          Andrew Purtell added a comment -

          @Eric Ideas for frameworks for computation on top of HBase can be found in HBASE-3131 and HBASE-3220. Also, the proof of concept patch for HBASE-2000 included a toy mapreduce framework.

          Show
          Andrew Purtell added a comment - @Eric Ideas for frameworks for computation on top of HBase can be found in HBASE-3131 and HBASE-3220 . Also, the proof of concept patch for HBASE-2000 included a toy mapreduce framework.
          Hide
          Andrew Purtell added a comment -

          Assigning to me as I will be starting implementation of this middle of September.

          If someone wants to work on this sooner, no problem, just reassign.

          Show
          Andrew Purtell added a comment - Assigning to me as I will be starting implementation of this middle of September. If someone wants to work on this sooner, no problem, just reassign.
          Hide
          Arun C Murthy added a comment -

          Andrew, there are several non MR frameworks being built on NextGen MR right now - happy to help more if you are planning on using it:

          1. Spark - https://github.com/mesos/spark-yarn/
          2. MPI - MAPREDUCE-2911
          Show
          Arun C Murthy added a comment - Andrew, there are several non MR frameworks being built on NextGen MR right now - happy to help more if you are planning on using it: Spark - https://github.com/mesos/spark-yarn/ MPI - MAPREDUCE-2911
          Hide
          Andrew Purtell added a comment -

          Getting to this a bit late, thinking about design.

          Here are some possible motivating cases:

          • A hot value cache implemented in C/C++
          • Indexing and search with Lucene indexes hosted on a colocated (impl bundled/linked with the external coprocessor and private to it) R+W distributed FS like Gluster
          • Support something we are building internally that requires efficient hand off of HFiles between processes for compaction strategy override.

          Suggestions welcome, preferably useful to real activities you may be undertaking.

          Show
          Andrew Purtell added a comment - Getting to this a bit late, thinking about design. Here are some possible motivating cases: A hot value cache implemented in C/C++ Indexing and search with Lucene indexes hosted on a colocated (impl bundled/linked with the external coprocessor and private to it) R+W distributed FS like Gluster Support something we are building internally that requires efficient hand off of HFiles between processes for compaction strategy override. Suggestions welcome, preferably useful to real activities you may be undertaking.
          Hide
          Andrew Purtell added a comment -

          Start with testcases, the first a test that confirms a stuck child process via SIGSTOP doesn't take down the regionserver. Thinking there should be three selectable strategies:

          1. Close and reopen the region, triggering force termination of the stuck child on close, and fork/initialization of a new child on open, along with reinit of all region related resources, other coprocessors, etc.

          2. Unload/reload the malfunctioning coprocessor. Will require some work in the coprocessor framework to actually support unloading in a reasonable way. The JVM may make this complicated for integrated CPs, so perhaps just for those hosted in external processes.

          3. Unload/terminate the malfunctioning coprocessor and continue operation. Consider changes in the CP framework for temporary blacklisting, will need that to avoid loading the suspect CP after a split.

          Show
          Andrew Purtell added a comment - Start with testcases, the first a test that confirms a stuck child process via SIGSTOP doesn't take down the regionserver. Thinking there should be three selectable strategies: 1. Close and reopen the region, triggering force termination of the stuck child on close, and fork/initialization of a new child on open, along with reinit of all region related resources, other coprocessors, etc. 2. Unload/reload the malfunctioning coprocessor. Will require some work in the coprocessor framework to actually support unloading in a reasonable way. The JVM may make this complicated for integrated CPs, so perhaps just for those hosted in external processes. 3. Unload/terminate the malfunctioning coprocessor and continue operation. Consider changes in the CP framework for temporary blacklisting, will need that to avoid loading the suspect CP after a split.
          Hide
          Asaf Mesika added a comment -

          Was this idea abandoned?

          Show
          Asaf Mesika added a comment - Was this idea abandoned?
          Hide
          Andrew Purtell added a comment -

          Asaf Mesika Not at all, but I had to move to more immediate priorities. Hoping to circle back and do this in 2013.

          Show
          Andrew Purtell added a comment - Asaf Mesika Not at all, but I had to move to more immediate priorities. Hoping to circle back and do this in 2013.
          Hide
          Asaf Mesika added a comment -

          This truly sounds like a great feature. I was wondering:

          • Did you find any performance penalties for shifting data back and forth between the processes? Did you this using the loopback interface?
          • What method did you choose to communicate between those processes? TCP? Output stream piping?
          Show
          Asaf Mesika added a comment - This truly sounds like a great feature. I was wondering: Did you find any performance penalties for shifting data back and forth between the processes? Did you this using the loopback interface? What method did you choose to communicate between those processes? TCP? Output stream piping?
          Hide
          Andrew Purtell added a comment -

          Asaf Mesika I didn't get beyond some early high level thoughts. Therefore there is no data, but sure there will be some performance penalty, we must introduce an RPC mechanism between the RegionServer and the child external coprocessor host.

          It seems reasonable that the external coprocessor host should handle all IPC issues, use Process/ProcessBuilder to launch a child process for hosting the user coprocessor code and get access to its stdin and stdout.

          We will need to introduce a new type of Observer to the coprocessor framework that can be a singleton watching all regions in the RS. Currently we allocate a coprocessor environment for each region and an Observer can only see what goes on in that environment (for only that region). Otherwise you can imagine for a RS hosting 1000 regions there might be 1000 threads just for IPC between the external coprocessor host in the RS and not one child but 1000. That's a nonstarter. So we want one coprocessor in the RS managing communication to one child, and both parent+child handle all Observer (and Endpoint) actions on all regions, using NIO to multiplex communication among the input and output streams set up by Process/ProcessBuilder. How efficiently this can be done and how low latency it can be kept will determine the performance penalty for external coprocessors.

          Show
          Andrew Purtell added a comment - Asaf Mesika I didn't get beyond some early high level thoughts. Therefore there is no data, but sure there will be some performance penalty, we must introduce an RPC mechanism between the RegionServer and the child external coprocessor host. It seems reasonable that the external coprocessor host should handle all IPC issues, use Process/ProcessBuilder to launch a child process for hosting the user coprocessor code and get access to its stdin and stdout. We will need to introduce a new type of Observer to the coprocessor framework that can be a singleton watching all regions in the RS. Currently we allocate a coprocessor environment for each region and an Observer can only see what goes on in that environment (for only that region). Otherwise you can imagine for a RS hosting 1000 regions there might be 1000 threads just for IPC between the external coprocessor host in the RS and not one child but 1000. That's a nonstarter. So we want one coprocessor in the RS managing communication to one child, and both parent+child handle all Observer (and Endpoint) actions on all regions, using NIO to multiplex communication among the input and output streams set up by Process/ProcessBuilder. How efficiently this can be done and how low latency it can be kept will determine the performance penalty for external coprocessors.
          Hide
          Michael Segel added a comment -

          Is this still an open issue?

          Any progress?

          Show
          Michael Segel added a comment - Is this still an open issue? Any progress?
          Hide
          Andrew Purtell added a comment -

          I have this on my to do list for this year but as it is optional work there are other things preempting it. Do you need something like this soon Michael Segel or have a patch or related work?

          Show
          Andrew Purtell added a comment - I have this on my to do list for this year but as it is optional work there are other things preempting it. Do you need something like this soon Michael Segel or have a patch or related work?
          Hide
          Adela Maznikar added a comment -

          I am curious if there is any additional progress here. Really exciting idea!

          Show
          Adela Maznikar added a comment - I am curious if there is any additional progress here. Really exciting idea!
          Hide
          Andrew Purtell added a comment -

          This is just waiting for a strong requirement to do all the work required Adela Maznikar, or a contribution from elsewhere where the same requirement came up. I'd say it could be a post-1.0 feature.

          Show
          Andrew Purtell added a comment - This is just waiting for a strong requirement to do all the work required Adela Maznikar , or a contribution from elsewhere where the same requirement came up. I'd say it could be a post-1.0 feature.
          Hide
          Michael Segel added a comment -

          Sorry I don't always follow Jiras.

          To answer your question... in terms of patches, it would be a massive rewrite and would probably break the existing code base using coprocessors today.
          In terms of me providing a patch... will Apache indemnify me if I get sued for introducing IP that I may have used or learned at a former company / client?
          (Didn't think so.)

          I can tell you what you need and I can pencil out a design. But that's as far as I can go.

          In terms of a strong requirement. By creating a flag that will stop the loading of coprocessor code after the system coprocessors are loaded, the security issue is reduced to a point that the requirement goes away. There is a large enough client who could make that request from one of the vendors, however they are not using HBase at a level where they are implementing coprocessors.

          Outside of a requirement. The issue is that using coprocessors adds risk to the system. Risk in terms of performance, stability, and security. It also causes issues when it comes to maintenance. You want to remove (not shut off) a coprocessor you can't without restarting the RS and reloading the coprocessors that you want loaded. (e.g. class collision)

          Coprocessors is necessary for extending HBase beyond a simple object store. Security (XASecure / Ranger) require it. Adding OLTP and RDBMs like features are also important to many. (Transactions / Isolation levels) Fixing issues with compactions...

          But I digress.

          Show
          Michael Segel added a comment - Sorry I don't always follow Jiras. To answer your question... in terms of patches, it would be a massive rewrite and would probably break the existing code base using coprocessors today. In terms of me providing a patch... will Apache indemnify me if I get sued for introducing IP that I may have used or learned at a former company / client? (Didn't think so.) I can tell you what you need and I can pencil out a design. But that's as far as I can go. In terms of a strong requirement. By creating a flag that will stop the loading of coprocessor code after the system coprocessors are loaded, the security issue is reduced to a point that the requirement goes away. There is a large enough client who could make that request from one of the vendors, however they are not using HBase at a level where they are implementing coprocessors. Outside of a requirement. The issue is that using coprocessors adds risk to the system. Risk in terms of performance, stability, and security. It also causes issues when it comes to maintenance. You want to remove (not shut off) a coprocessor you can't without restarting the RS and reloading the coprocessors that you want loaded. (e.g. class collision) Coprocessors is necessary for extending HBase beyond a simple object store. Security (XASecure / Ranger) require it. Adding OLTP and RDBMs like features are also important to many. (Transactions / Isolation levels) Fixing issues with compactions... But I digress.

            People

            • Assignee:
              Unassigned
              Reporter:
              Andrew Purtell
            • Votes:
              0 Vote for this issue
              Watchers:
              53 Start watching this issue

              Dates

              • Created:
                Updated:

                Development