Hadoop YARN / YARN-153

PaaS on YARN: a YARN application to demonstrate that YARN can be used as a PaaS

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      This application demonstrates that YARN can be used for non-MapReduce applications. Because Hadoop is already widely adopted and deployed, and its deployment is expected to keep growing, we think it has good potential to be used as a PaaS.
      I have implemented a proof of concept to demonstrate that YARN can be used as a PaaS (Platform as a Service). I have done a gap analysis against VMware's Cloud Foundry and tried to achieve as many PaaS functionalities as possible on YARN.

      I'd like to check in this POC as a YARN example application.

      1. MAPREDUCE4393.patch
        132 kB
        Jacob Jaigak Song
      2. MAPREDUCE4393.patch
        132 kB
        Jacob Jaigak Song
      3. MAPREDUCE-4393.patch
        132 kB
        Jacob Jaigak Song
      4. MAPREDUCE-4393.patch
        128 kB
        Jacob Jaigak Song
      5. HADOOPasPAAS_Architecture.pdf
        1.66 MB
        Jacob Jaigak Song
      6. MAPREDUCE-4393.patch
        135 kB
        Jacob Jaigak Song

        Activity

        Jacob Jaigak Song created issue -
        Jacob Jaigak Song logged work - 05/Jul/12 00:09
        • Time Spent:
          336h
           
          <No comment>
        Arun C Murthy added a comment - - edited

        Jaigak, this sounds like a very useful exercise to validate the YARN APIs and to help identify gaps.

        Please provide a patch (details here: http://wiki.apache.org/hadoop/HowToContribute) and make it an example module under hadoop-trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/ (it should be a peer of hadoop-trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell). Thanks!

        Jacob Jaigak Song added a comment -

        Please review the changes (actually they are all new files). Thanks!

        Jacob Jaigak Song made changes -
        Field Original Value New Value
        Attachment MAPREDUCE-4393.patch [ 12535731 ]
        Arun C Murthy added a comment -

        Jaigak - thanks for the patch, I'll look into it.

        Typically we mark the jira as 'Patch Available' when you want it reviewed, I've done it for you here.

        Arun C Murthy made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Assignee Jaigak Song [ jaigak.song ]
        Bikas Saha added a comment -

        Jaigak - It's nice to see another application on top of YARN.
        Since it's a lot of new code, could you please add some notes on the high-level components and suggest some pointers on navigating the code? It will help with reviewing the changes. Thanks!

        Jacob Jaigak Song added a comment -

        Hi Bikas, thanks for your suggestion. I already have some documentation but need to update it before I share it. I will attach the updated doc as soon as possible once modification is done.

        Jacob Jaigak Song added a comment -

        Architecture document for the application

        Jacob Jaigak Song made changes -
        Attachment HADOOPasPAAS_Architecture.pdf [ 12535750 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12535750/HADOOPasPAAS_Architecture.pdf
        against trunk revision .

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2562//console

        This message is automatically generated.

        Jacob Jaigak Song added a comment -

        Included the doc in the patch to avoid build failures. Also, I applied some svn ignores before creating the patch.

        But the actual application is the same as before.

        Jacob Jaigak Song made changes -
        Attachment MAPREDUCE-4393.patch [ 12535921 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12535921/MAPREDUCE-4393.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 javadoc. The javadoc tool appears to have generated 25 warning messages.

        -1 eclipse:eclipse. The patch failed to build with eclipse:eclipse.

        -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

        -1 release audit. The applied patch generated 1 release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-paas/hadoop-yarn-applications-paas-client hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-paas/hadoop-yarn-applications-paas-container hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-paas/hadoop-yarn-applications-paas-master hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-paas/hadoop-yarn-applications-paas-zkclient.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2568//testReport/
        Release audit warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2568//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2568//console

        This message is automatically generated.

        Jacob Jaigak Song added a comment -

        Fixed most of the errors reported from QA automatic build: javadoc, findBugs, audit ...

        Jacob Jaigak Song made changes -
        Attachment MAPREDUCE-4393.patch [ 12536078 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12536078/MAPREDUCE-4393.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        -1 eclipse:eclipse. The patch failed to build with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-paas/hadoop-yarn-applications-paas-client hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-paas/hadoop-yarn-applications-paas-container hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-paas/hadoop-yarn-applications-paas-master hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-paas/hadoop-yarn-applications-paas-zkclient.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2573//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2573//console

        This message is automatically generated.

        Kihwal Lee added a comment -

        You have probably done it already, but the first thing to make sure of is that everything builds okay for all targets and profiles, e.g. build and run the tests with clover (-Pclover). The test-patch process is most useful when existing code is modified, so in your case it would be nice if you could report more testing results.

        People would also like to hear about your experience of writing a new YARN app. There is ongoing work to make it easier to develop and debug apps. I am sure these efforts will benefit from your input.

        Jacob Jaigak Song added a comment -

        Finally I could successfully run 'dec-support.sh' with a positive overall result.

        Jacob Jaigak Song made changes -
        Attachment MAPREDUCE4393.patch [ 12536236 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12536236/MAPREDUCE4393.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified test files.

        -1 javac. The patch appears to cause the build to fail.

        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2580//console

        This message is automatically generated.

        Jacob Jaigak Song added a comment -

        Please bear with me as I'm new to this Hadoop development environment. The attached patch works fine (i.e. test-patch.sh produced a +1 overall result) on my Ubuntu machine. Let's see how it goes this time.

        Jacob Jaigak Song made changes -
        Attachment MAPREDUCE4393.patch [ 12536269 ]
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12536269/MAPREDUCE4393.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-paas/hadoop-yarn-applications-paas-client hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-paas/hadoop-yarn-applications-paas-container hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-paas/hadoop-yarn-applications-paas-master hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-paas/hadoop-yarn-applications-paas-zkclient.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2582//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2582//console

        This message is automatically generated.

        Bikas Saha added a comment -

        I took a pass at the changes. I have some comments:
        1) The Client and AppMaster look very similar to DistributedShell. It might be useful to see if some of the common portions could be abstracted out.
        2) How about using the AM itself as the information repo about active PAAS containers instead of storing information in ZK? The AM knows exactly what is running. If there is some information that the containers need to post, it can be posted to the AM itself. Thereafter, the AM can be queried for the same information that ZK is giving.
        3) The AM could open a port to listen for new commands from the PAAS client, so starting new instances can be done via the currently running AM instead of starting new AMs.
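
        Suggestion 2 can be sketched roughly as follows. This is a minimal illustration and is not taken from the patch: the class names ContainerRegistry and ContainerRegistryDemo, the container IDs, and the "host:port" endpoint format are all assumptions. It only shows the AM holding an in-memory view that containers post to and that a client or router could query.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

/** Hypothetical in-memory registry held by the AM (illustration only, not from the patch). */
class ContainerRegistry {
    /** Immutable record of what one container reported about itself. */
    private static final class Entry {
        final String appType;
        final String endpoint;

        Entry(String appType, String endpoint) {
            this.appType = appType;
            this.endpoint = endpoint;
        }
    }

    // containerId -> reported entry; a single map keeps each update atomic
    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    /** Called when a container posts its application type and endpoint to the AM. */
    void register(String containerId, String appType, String endpoint) {
        entries.put(containerId, new Entry(appType, endpoint));
    }

    /** Called when the AM learns that a container has finished or failed. */
    void unregister(String containerId) {
        entries.remove(containerId);
    }

    /** Returns the sorted endpoints of all live containers of the given application type. */
    List<String> endpointsFor(String appType) {
        return entries.values().stream()
                .filter(e -> e.appType.equals(appType))
                .map(e -> e.endpoint)
                .sorted()
                .collect(Collectors.toList());
    }
}

class ContainerRegistryDemo {
    public static void main(String[] args) {
        ContainerRegistry registry = new ContainerRegistry();
        registry.register("container_01", "webapp", "host-a:8081");
        registry.register("container_02", "webapp", "host-b:8082");
        registry.register("container_03", "worker", "host-a:9000");
        System.out.println(registry.endpointsFor("webapp"));
    }
}
```

        Note that a registry like this lives and dies with the AM process, which is the trade-off raised later in this thread.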

        Jacob Jaigak Song added a comment -

        Bikas, thanks for your comments!
        Regarding #1, some shell-related portions and a bit more could be abstracted out, but I don't see much value in that.

        For #2, I agree that the AM can be used, but #3 should be implemented first so that there is one place to maintain the available containers of the same application type. Besides, ZooKeeper seems a better choice at this point because the PaaS implementation has Routers (which are not part of the patch due to some dependencies), which are supposed to distribute incoming requests using the information about which containers are available for which application type. If there are multiple AMs (e.g. hundreds or thousands of AMs) for different application types, ZooKeeper is much simpler to use and can perform better because of its asynchronous characteristics.

        For #3, I received the same suggestion from Arun Murthy a couple of weeks ago, and I have listed it as an enhancement in the documentation I will distribute soon. Certainly we can improve the application later. This implementation was a POC done within a couple of weeks.

        Kihwal Lee added a comment -

        I think use of ZK is fine since it won't be pretty for routers to poll status from RM (to get the list of AMs) and AM (to get updates on app instances). Multiple AMs can run on the same node, so a predefined port number cannot be used. Then there has to be a way to discover the port number. Having ZK in the picture certainly helps.

        But depending on the router's requirements, all external dependencies (router & ZK) can be substituted with another YARN app! PaaS System App? If we do this, the PaaS app can be made to talk to either of the two types of management system.
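
        The port-discovery point above can be illustrated with a small sketch. It is an assumption-laden example and not code from the patch: the class EphemeralPortDemo and the AM ids are hypothetical, and a plain in-process map stands in for ZooKeeper. Each AM binds to port 0 so the operating system picks a free port, then publishes the chosen port so routers can look it up.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class EphemeralPortDemo {
    /** Stand-in for a shared registry such as ZooKeeper: AM id -> listening port. */
    static final Map<String, Integer> REGISTRY = new ConcurrentHashMap<>();

    /** Binds to any free port and publishes that port under the given AM id. */
    static ServerSocket bindAndPublish(String amId) throws IOException {
        ServerSocket socket = new ServerSocket(0); // 0 asks the OS for a free port
        REGISTRY.put(amId, socket.getLocalPort());
        return socket;
    }

    public static void main(String[] args) throws IOException {
        // Two AMs on the same node get different ports without any pre-assignment.
        try (ServerSocket first = bindAndPublish("am-1");
             ServerSocket second = bindAndPublish("am-2")) {
            System.out.println(REGISTRY);
        }
    }
}
```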

        Jacob Jaigak Song added a comment -

        One of the requirements for PaaS (at least one that I have) is that even if the AM crashes, all the application containers should keep running if possible. In this sense, ZK or a more reliable component is better for tracking available instances than having the AM do it.

        Kihwal Lee added a comment -

        I didn't mean that the manager AM is responsible for launching app AMs. I think it can be a separate YARN app. There doesn't even have to be any start-up dependency among them if we design the communication protocol well. This also makes restart easy.

        If we can (re)launch the manager AM on one of a predefined set of hosts, most of the requirements can be met. By storing system state in HDFS and reading it back on restart, it can get back in sync fast and offer service again. Routers can be provisioned similarly, but they will acquire state information from the manager AM. Service discovery is simplified by the fact that they will be on specific hosts. If a VIP is used to deal with service up/down or migration among the given set of hosts, service discovery is further simplified. Since they are independent app instances or independent YARN apps, a crash/restart of one thing won't force termination of the others.

        The one thing I am not sure about is the ability to specify a specific set of candidate hosts for launching the AM. If that is not supported already, we can launch the AM on a random host and then launch containers on a specific set of hosts, but that lowers the reliability. Or maybe the AM can be anywhere and the container launched from it will only be used for service discovery.

        I am not insisting on doing this now, but it will be nice if everything is contained in YARN so that setting up is simpler and it is easily demoable.
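
        The idea of persisting manager state and reading it back on restart can be sketched as below. This is only an illustration under stated assumptions: the class ManagerStateStore is hypothetical, the state is modeled as simple key/value properties, and a local file accessed through java.nio stands in for HDFS so the example stays self-contained.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

/** Hypothetical state store for a manager AM; a local file stands in for HDFS. */
class ManagerStateStore {
    private final Path file;

    ManagerStateStore(Path file) {
        this.file = file;
    }

    /** Writes the state to a temporary file, then moves it over the previous snapshot. */
    void save(Properties state) throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try (Writer writer = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
            state.store(writer, "manager AM state");
        }
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Reads the last snapshot, or returns empty state when none exists yet. */
    Properties load() throws IOException {
        Properties state = new Properties();
        if (Files.exists(file)) {
            try (Reader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                state.load(reader);
            }
        }
        return state;
    }
}

class ManagerStateStoreDemo {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("paas-state");
        Path file = dir.resolve("state.properties");

        Properties state = new Properties();
        state.setProperty("container_01", "host-a:8081");
        new ManagerStateStore(file).save(state);

        // A restarted manager would build a fresh store and read the snapshot back.
        Properties restored = new ManagerStateStore(file).load();
        System.out.println(restored.getProperty("container_01"));
    }
}
```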

        Jacob Jaigak Song added a comment -

        Personally, I don't like everything being contained in YARN when considering enterprise environments. Certainly it can be one of the options Hadoop YARN may provide, but overly tight integration (though I don't think you mean this) may be disliked in enterprise environments.

        Jacob Jaigak Song added a comment -

        I just published a document about the prototype and findings which you might already know. If you are interested, here is the url: http://jaigak.blogspot.com/2012/07/paas-on-hadoop-yarn-idea-and-prototype.html

        Jacob Jaigak Song made changes -
        Remaining Estimate 336h [ 1209600 ] 0h [ 0 ]
        Time Spent 336h [ 1209600 ]
        Worklog Id 13734 [ 13734 ]
        Arun C Murthy added a comment -

        Jaigak - I spent some more time thinking about this in light of MAPREDUCE-4495.

        Unfortunately, it seems that we are running the risk of turning YARN into an 'umbrella' project by accepting applications built on top of YARN into the project itself...

        Essentially, as folks like Chris Mattman have pointed out in MAPREDUCE-4495, the PaaS prototype is better off being a standalone project in Apache Incubator since the Apache Software Foundation frowns upon one 'umbrella' project housing several smaller projects i.e. YARN vis-a-vis PaaS, Workflow AM etc.

        If you are interested, I'm more than happy to help you through the Apache Incubator process so that we can collaborate via the Incubator. Do you mind doing that? Thanks!

        Arun C Murthy made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Arun C Murthy added a comment -

        Here is more information about proposing this via the incubator: http://incubator.apache.org/guides/proposal.html

        I do apologize for not seeing the danger of this (i.e. turning YARN into an umbrella project) earlier - I'm willing to make up for it by helping you through the Incubator. However, it is something the ASF cares deeply about and is something I have to follow as part of the responsibility of the Hadoop PMC.

        Again, apologies - but I do hope we can collaborate through the Incubator and my offer of help stands. Thanks!

        Brock Noland added a comment -

        Jaigak, if you take this project to the incubator I would be interested in being part of the project.

        Arun C Murthy made changes -
        Project Hadoop Map/Reduce [ 12310941 ] Hadoop YARN [ 12313722 ]
        Key MAPREDUCE-4393 YARN-153
        Issue Type Task [ 3 ] New Feature [ 2 ]
        Affects Version/s 0.23.1 [ 12318883 ]
        Fix Version/s 3.0.0 [ 12323268 ]
        Fix Version/s 3.0.0 [ 12320355 ]
        Component/s examples [ 12312911 ]
        Arun C Murthy made changes -
        Fix Version/s 2.0.3-alpha [ 12323272 ]
        Fix Version/s 3.0.0 [ 12323268 ]
        Arun C Murthy made changes -
        Fix Version/s 2.0.4-beta [ 12324029 ]
        Fix Version/s 2.0.3-alpha [ 12323272 ]
        Arun C Murthy made changes -
        Fix Version/s 2.3.0 [ 12324589 ]
        Fix Version/s 2.1.0-beta [ 12324029 ]
        Arun C Murthy made changes -
        Fix Version/s 2.3.0 [ 12325256 ]
        Fix Version/s 2.4.0 [ 12324589 ]
        Arun C Murthy made changes -
        Fix Version/s 2.4.0 [ 12326142 ]
        Fix Version/s 2.3.0 [ 12325256 ]
        Junping Du added a comment -

        Hi Jacob Jaigak Song, any update on this JIRA? I happen to have some experience with Cloud Foundry and have some thoughts too. Would you mind having a discussion?

        Arun C Murthy made changes -
        Fix Version/s 2.5.0 [ 12326262 ]
        Fix Version/s 2.4.0 [ 12326142 ]
        Karthik Kambatla (Inactive) made changes -
        Fix Version/s 2.6.0 [ 12327197 ]
        Fix Version/s 2.5.0 [ 12326262 ]
        Arun C Murthy made changes -
        Fix Version/s 2.7.0 [ 12327585 ]
        Fix Version/s 2.6.0 [ 12327197 ]
        Allen Wittenauer made changes -
        Fix Version/s 2.7.0 [ 12327585 ]
        Transition                 Time In Source Status   Execution Times   Last Executer    Last Execution Date
        Open -> Patch Available    5d 19h 39m              1                 Arun C Murthy    09/Jul/12 21:39
        Patch Available -> Open    24d 9h 24m              1                 Arun C Murthy    03/Aug/12 07:04

          People

          • Assignee:
            Jacob Jaigak Song
            Reporter:
            Jacob Jaigak Song
          • Votes:
            2
            Watchers:
            44

            Dates

            • Created:
              Updated:

              Time Tracking

              Estimated:
              Original Estimate - 336h
              336h
              Remaining:
              Remaining Estimate - 0h
              0h
              Logged:
              Time Spent - 336h
              336h

                Development