Hadoop Common / HADOOP-6483

Provide Hadoop as a Service based on standards

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Hadoop as a Service provides a standards-based web services interface that layers on top of Hadoop on Demand and allows Hadoop jobs to be submitted via popular schedulers, such as Sun Grid Engine (SGE), Platform LSF and Microsoft HPC Server 2008, to local or remote Hadoop clusters. This allows multiple Hadoop clusters within an organization to be efficiently shared and provides flexibility, allowing remote Hadoop clusters, offered as Cloud services, to be used for experimentation and burst capacity. HaaS hides complexity, allowing users to submit many types of compute- or data-intensive work via a single scheduler without actually knowing where it will be run. Additionally, providing a standards-based front end to Hadoop means that users would be able to easily choose HaaS providers without being locked in via proprietary interfaces such as Amazon's MapReduce service.

      Our HaaS implementation uses the OGF High Performance Computing Basic Profile (HPC-BP) standard to define interoperable job submission descriptions and management interfaces for Hadoop. It uses Hadoop on Demand to provision capacity. Our HaaS implementation also supports file stage-in/out with protocols such as FTP, SCP and GridFTP.

      Our HaaS implementation also provides a suite of RESTful interfaces compliant with HPC-BP.
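      As a rough illustration only, the sketch below assembles a minimal JSDL-style job description with an FTP stage-in in Java. The element values and the Hadoop-specific application details are assumptions made for illustration, not the exact job description format used by this implementation.

      // Illustrative sketch only: a minimal JSDL-style job description with an FTP
      // stage-in, assembled as a string. All values here are assumptions; the real
      // Hadoop job description used by the implementation may differ.
      public class JsdlSketch {
          public static void main(String[] args) {
              String jsdl =
                  "<jsdl:JobDefinition xmlns:jsdl='http://schemas.ggf.org/jsdl/2005/11/jsdl'>\n"
                + "  <jsdl:JobDescription>\n"
                + "    <jsdl:Application>\n"
                + "      <!-- Hadoop jar, main class, and job arguments would go here -->\n"
                + "    </jsdl:Application>\n"
                + "    <jsdl:DataStaging>\n"
                + "      <jsdl:FileName>input/part-00000</jsdl:FileName>\n"
                + "      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>\n"
                + "      <jsdl:Source><jsdl:URI>ftp://data.example.org/part-00000</jsdl:URI></jsdl:Source>\n"
                + "    </jsdl:DataStaging>\n"
                + "  </jsdl:JobDescription>\n"
                + "</jsdl:JobDefinition>";
              System.out.println(jsdl); // this document would be handed to the HPC-BP submission interface
          }
      }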

      Attachments

      1. SC08-HPCBPforHadoop.ppt (1.31 MB) - Yang Zhou
      2. OGF27-HPCBPforHadoop.ppt (937 kB) - Yang Zhou

        Activity

        steve_l added a comment -

        Yes, I would argue for opening up to all interested parties. We could talk next week, Tuesday or Wednesday?

        Paul Strong added a comment -

        Hi Steve, It's been a while :o) Would it be worth a quick Skype chat on this? I agree with regards to separation of provisioning etc., although I think it needs to be part of the flow. I am thinking we could do this via a plug-in.

        Yang Zhou added a comment -

        The HPC-BP implementation for Hadoop we presented at SC'08 is the one without resource provisioning ability.
        The HPC-BP implementation for Hadoop we presented at OGF27 is the one with resource provisioning ability.
        Of course, there are some other differences between these two implementations, e.g. the Hadoop job description.
        I have attached both of them for comparison.

        steve_l added a comment -

        I hear you have some SC08 slides on this topic. Could you attach them to this issue?

        Yang Zhou added a comment -

        Actually, at the start we implemented BES only for a single Hadoop cluster. With that implementation, multiple users can submit their jobs to a single cluster; each user has their own workspace in both the local file system and HDFS. I think that's what you want.

        The system we have now is much like Amazon Elastic MapReduce. I think the two systems are at different levels, but we can provide both to Apache Hadoop.

        As to the RESTy BES, that part is not included in the standards I mentioned; sorry for that. BES only defines SOAP interfaces, which is one of the "less good" things about BES. So we defined our own REST interface based on BES, which is different from the besrest_talk.odp you mentioned:

          Resource           GET                 POST         PUT    DELETE
          /bes/attributes    BES Factory Doc     -            -      -
          /bes/start         -                   Start BES    -      -
          /bes/stop          -                   Stop BES     -      -
          /bes/jobs          Job List            Create Job   -      -
          /bes/job/{jobid}   Job Status & Doc    -            -      Terminate Job

        And what we post to /bes/jobs is not JSDL but some parameters.
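        For a concrete feel of how a client might drive this interface, here is a rough Java sketch against the endpoints in the table above. Only the paths come from the table; the host name, the form parameters posted to /bes/jobs, and the job id are assumptions.

        // Rough client sketch for the REST interface tabulated above. The paths come from
        // the table; the host, the form parameters and the job id are assumptions.
        import java.io.OutputStream;
        import java.net.HttpURLConnection;
        import java.net.URL;
        import java.nio.charset.StandardCharsets;

        public class BesRestClientSketch {
            static final String BASE = "http://haas.example.org";       // assumed host

            static HttpURLConnection open(String path, String method) throws Exception {
                HttpURLConnection c = (HttpURLConnection) new URL(BASE + path).openConnection();
                c.setRequestMethod(method);
                return c;
            }

            public static void main(String[] args) throws Exception {
                // GET /bes/attributes : fetch the BES factory document
                System.out.println(open("/bes/attributes", "GET").getResponseCode());

                // POST /bes/jobs : create a job. The comment above says this takes
                // "some parameters" rather than JSDL, so form parameters are assumed here.
                HttpURLConnection create = open("/bes/jobs", "POST");
                create.setDoOutput(true);
                create.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
                try (OutputStream out = create.getOutputStream()) {
                    out.write("jar=wordcount.jar&input=/data/in&output=/data/out"
                            .getBytes(StandardCharsets.UTF_8));
                }
                System.out.println(create.getResponseCode());            // assumed to return a job id

                // GET /bes/job/{jobid} : job status and document
                System.out.println(open("/bes/job/1", "GET").getResponseCode());

                // DELETE /bes/job/{jobid} : terminate the job
                System.out.println(open("/bes/job/1", "DELETE").getResponseCode());
            }
        }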

        steve_l added a comment -
        1. You should decouple provisioning/decommissioning from the workflow stuff where possible. That said, it helps your scheduler to have a lot of awareness of node cost and locality. A good interface is needed there.
        2. I wasn't aware of a standard RESTy BES. Can you provide links?
        3. This is a good opportunity to talk to the OGF people and get their input and engineering support.
        Yang Zhou added a comment -

        I think our Hadoop as a Service is more than just providing a service interface for a Hadoop cluster. What we want to solve is how to provide Hadoop computing power, which may be located in a local data center or in a remote cloud, to ordinary users or grid schedulers. Those users or schedulers shouldn't need to care about how to bring up a Hadoop cluster and how to run Hadoop jobs on it. They only need to tell our HaaS what the Hadoop work is and how large a Hadoop cluster they want. That's why we use JSDL to describe their requirements, BES to provide a unified submission interface, and HOD to provision the Hadoop cluster.

        Another benefit of standards (JSDL, BES) is that they enable a meta-scheduler to be used in job submission, so that both compute-intensive and data-intensive (Hadoop) jobs can be sent to the same meta-scheduler, which then finds the right resource management system for those jobs.

        The reason we use HOD to provision capacity is that HOD is the easiest way to do it. Currently our implementation supports only HOD, but we may provide a mechanism to support other provisioners in the future.
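        As a sketch of what the provisioning step might look like, the HaaS back end could shell out to HOD once it knows the requested cluster size. The class name and cluster directory below are assumptions; the "hod allocate -d <cluster_dir> -n <nodes>" form follows the HOD command-line guide.

        // Sketch only: one way the HaaS back end might provision capacity with HOD before
        // submitting a job. The class and the cluster directory are assumptions; the
        // "hod allocate -d <dir> -n <nodes>" form follows the HOD command-line guide.
        import java.io.IOException;

        public class HodProvisionSketch {
            /** Bring up a Hadoop cluster of the requested size via Hadoop on Demand. */
            static void allocate(String clusterDir, int nodes) throws IOException, InterruptedException {
                Process p = new ProcessBuilder(
                        "hod", "allocate",
                        "-d", clusterDir,              // directory where HOD writes the cluster's hadoop-site.xml
                        "-n", Integer.toString(nodes)) // cluster size taken from the job request
                    .inheritIO()
                    .start();
                if (p.waitFor() != 0) {
                    throw new IOException("hod allocate failed for " + clusterDir);
                }
                // The job would then run against the configuration in clusterDir, and
                // "hod deallocate -d <dir>" would release the nodes afterwards.
            }

            public static void main(String[] args) throws Exception {
                allocate("/tmp/haas-cluster", 10);
            }
        }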

        steve_l added a comment -

        This can and should be independent of HOD. Indeed, it can be independent of the JobTracker, Pig, Cascading, etc. A general-purpose RESTful front end for job management that works with long-haul clients (web, Ant tasks, command-line tools) would be nice. JSDL is OK; it's not one of the scary bits of the OGF specification space, the bits I used to work on.

        I have my own way of bringing up clusters on demand; the mechanism for submitting work to a cluster should not have to care how a cluster comes up, only that you can bring up a cluster somewhere, and submit work to it from your roaming laptop without having to be 100% in sync with the libraries at the back end.

        See also http://www.moreno.marzolla.name/publications/talks/besrest_talk.odp

        Hemanth Yamijala added a comment -

        Depending on how you intend to use Hadoop on Demand, please be aware of some of its main limitations. If multiple users of a cluster are using HOD to provision and bring up multiple Hadoop clusters, they might realize that the resources are not being efficiently utilized. But if you want to use it as a convenient provisioning tool to bring up a single Hadoop cluster that can then be shared among multiple users as a 'static' cluster, then it might still be useful.

        steve_l added a comment -

        This could be good, though the term "standards" is a vague one that could include things like WS-RF, WS-Notification and WS-BaseFaults if you aren't careful; those are best left alone. I write as someone who has implemented all of these and worked on three different SOAP stacks.

        1. Clearly, I need to know more. Are you just planning POSTed JSDL or something else?
        2. This should be a contrib/ module in mapreduce, or something on its own.
        3. See also my slides on "Mombasa", the long-haul way to meet elephants: http://www.slideshare.net/steve_l/long-haul-hadoop . The slides note that job submission is independent of the work; JSDL can be used for this, and its batch-ness does match the way the JobTracker works.

          People

          • Assignee:
            Unassigned
            Reporter:
            Yang Zhou
          • Votes:
            1
            Watchers:
            19
