Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Now that YARN (aka MR2 aka MAPREDUCE-279) has been merged into the Hadoop trunk, we should think about what it would take to separate out the graph processing bits of Giraph from the MR1-specific code so as to take advantage of the less MR-centric aspects of YARN, while still supporting both over the medium term.

      Review Board link (ready for review now): https://reviews.apache.org/r/9811/

      1. GIRAPH-13-1.patch
        109 kB
        Eli Reisman
      2. GIRAPH-13-2.patch
        118 kB
        Eli Reisman
      3. GIRAPH-13-3.patch
        65 kB
        Eli Reisman
      4. GIRAPH-13-4.patch
        72 kB
        Eli Reisman
      5. GIRAPH-13-5.patch
        72 kB
        Eli Reisman
      6. GIRAPH-13-6.patch
        77 kB
        Eli Reisman
      7. GIRAPH-13-7.patch
        90 kB
        Eli Reisman
      8. GIRAPH-13-8.patch
        94 kB
        Eli Reisman
      9. GIRAPH-13-9.patch
        115 kB
        Eli Reisman
      10. GIRAPH-13-9-r1.patch
        116 kB
        Eli Reisman
      11. GIRAPH-13-9-r2.patch
        117 kB
        Eli Reisman
      12. GIRAPH-13-9-r3.patch
        120 kB
        Eli Reisman
      13. GIRAPH-13-9-r4.patch
        121 kB
        Eli Reisman
      14. GIRAPH-13-9-r5.patch
        121 kB
        Eli Reisman
      15. GIRAPH-13-9-r6.patch
        123 kB
        Eli Reisman

          Activity

          Avery Ching added a comment -

          Agreed.

          Hyunsik Choi added a comment -

          I totally agree with you.

          Avery Ching added a comment -

          This is going to be a fun one. =) Thanks for taking it on.

          Hyunsik Choi added a comment -

          Jakob,

          How is this issue progressing? I have little experience developing YARN apps.
          If you share your progress or split this issue into subtasks, I can help you a bit.

          Thank you

          Jakob Homan added a comment -

          This is coming along, but there are a lot of pre-req issues that need to be done first. I'll start creating issues and linking them here. Grab any you'd like. GIRAPH-37 and its related issues are definitely blockers for this effort.

          Hyunsik Choi added a comment -

          Probably, there are many prerequisite and difficult issues.
          I'm willing to wait for your update

          David Capwell added a comment -

          I would also like to help out. I have forked the project on github and have some code already to start the AM/container

          Ed Kohlwey added a comment -

          I started some (maybe duplicate?) work on this under GIRAPH-108 - if anyone has feedback or suggestions let me know. I'll be happy to help on this in whatever way seems appropriate. Presently I'm planning on porting Giraph to Mesos and would be interested in participating in the YARN work where possible.

          Ravi Prakash added a comment -

          Any updates? I'm sorry I have not been in the loop at all. Can users run Giraph jobs on YARN?

          Eli Reisman added a comment -

          Jakob has handed this off to me for the time being, and I'm going to be logging some hours on this very soon. The idea is a "Pure YARN" implementation so that we can submit our resource needs and Giraph code to the cluster without doing anything MR specific. The idea is to make this behavior a pluggable option so we can run on MRv2, YARN, or eventually any cluster framework we like rather than lumping all the MRv1 behavior into GiraphMapper/GiraphJob/GiraphRunner with the more generic Giraph behaviors.

          As it stands, if you set up an MRv2 cluster on YARN and compile Giraph with 'mvn -Phadoop_2.0.2 package' (for instance), you can run Giraph on an MRv2-enabled YARN cluster such as Hadoop-2.0.2-alpha right now. This still uses the existing Hadoop/MR mapper-centric API but works just fine for me so far. So yes, Giraph is fully functional on YARN clusters but still depends on MapReduce as it stands today.

          Eli Reisman added a comment -

          Just a placeholder and request for criticism as I launch into this endeavor, and to stoke the fire so folks will start opining on the approach where it affects (or offends) them. Thanks!

          This patch will (hopefully) include the following as time goes on:

          1. POM adjustments to add a munge flag and to exclude org.apache.giraph.yarn package from the build when Hadoop MRv1 profiles are compiled.

          2. Minimal munge surgery to inject my job running code into GiraphRunner. Alternately, I will attempt a GiraphJob conf setting and factory of some sort to handle this w/o munging, if I can.

          3. GiraphJob for YARN (prototype included here)

          4. Application Master for Giraph (prototype mostly finished and included here)

          5. Unit tests for the YARN job launching classes (for everything that happens before the BSP work starts, we have tests for that)

          Eli Reisman added a comment -

          btw, if this looks funny it's because I'm basing it off my "GIRAPH-503" branch in anticipation of the refactor being committed. I don't mean to confuse any more than I usually do!

          Hyunsik Choi added a comment -

          The patch looks great! I'll take a look at this over the weekend.

          Eli Reisman added a comment -

          Coming along really nicely, AppMaster and Client are done, utils are done, Maven profile is configured and working, only compiles for versions of Hadoop that have the YARN deps I need. Munge flag added but not used yet, should (if I get lucky) only need this to stitch us into GiraphRunner (one or two lines) and run the client instead of GiraphJob.

          Still need to handle a few issues concerning serializing GiraphConf and getting it to the AppMaster and then our task containers, and telling the RM we're done with the job, but these are just easy details.

          The only real scary challenge yet will be pondering how to allow YARN to fire up the IO formats and some of our early worker/master setup without Hadoop handing them a JobContext, TaskAttemptContext etc.
          Might have to factor out some interfaces for this to happen without gutting our existing IO code (which I plan to avoid if I can.)

          Hyunsik Choi added a comment -

          The patch is looking great. In particular, GiraphYarnClient is quite neat. I like it.
          In addition to the few issues that you mentioned, the following should be handled (maybe you have already planned for them):

          1. Are you planning to implement a history system? When a Giraph job runs on YARN, Giraph needs its own history system. Probably, Giraph needs a specialized history system with a variety of counters/metrics.

          2. We should provide a way for users to specify the memory size of the ApplicationMaster or task containers. It would be good if both the API and the CLI provided it.

          Because the existing I/O code depends on MR1, the I/O code looks hard to handle in the YARN port. If you plan to make Giraph more platform-independent, how about this? First, we refactor the existing I/O code to be generic and platform-independent. Then, we implement some wrapper classes for MR1. If we do that, it will be easier to port Giraph to another platform.

          Eli Reisman added a comment -

          Thanks for your great review Hyunsik, great to hear from you!

          I really appreciate your input! You successfully named ALL of my concerns! My biggest is the IO formats which, as you said, are completely dependent on MRv1. Your idea was exactly the approach I was planning on.

          As for your 1. concern, yes this is a draft version and the new one (don't even have a patch up yet but I will soon to show you) will be completely configurable from the GiraphRunner CLI options.

          For your 2. concern: There is a need for history and a number of other basic systems we get from MRv1 right now. Because of the timing (I am trying to finish this phase before the end of March) I may attempt to make GIRAPH-13 just cover the following upgrade: a YARN profile for Giraph, including the ability to run examples/ applications from the Giraph jar-with-dependencies, on YARN. I hope to do all other "fleshing out" of the features in separate JIRAs or subissues. This bounds the difficulty for this first stage, and enables others to start working the feature-add JIRAs without having to know all about YARN.

          The exciting thing is that the YARN API allows much finer-grained control of a lot of our BSP process than Hadoop ever did. And I too was thinking, after this a port to Mesos (or wherever) is going to be really easy! We might as time passes consider moving the launch of our zookeeper instance into the ApplicationMaster, doing more fine-grained resource allocation control (assign input splits right at the beginning of the job run, assign hosts to the workers as we choose for data locality, allot memory and/or cores depending on the size of the splits we assign etc.). The options really open some doors.

          BUT, even to just make the examples run, the IO problem must be solved. I do think wrapping the MRv1 related functions (stuff that needs a TaskAttemptContext or Job-type classes from Hadoop and more) is the way to go, but I'd sure appreciate any ideas you might have.

          Anyway, I will put up another patch hopefully tonight or tomorrow that is another significant upgrade from what you saw here so far. All input and ideas are appreciated, thanks again!

          Eli Reisman added a comment -

          Hey one more idea to throw out there regarding all the IO format issues with YARN, what do you think of this:

          Since some of our internals are pretty bound up in some MRv1 classes, we can do the refactor and wrapping already spoken about above to hide this dependency. Another approach I might explore is to simply have a generic task runner (that owns GraphTaskManager, and replaces GraphMapper in our YARN impl) that just instantiates the TaskAttemptContext and other Hadoop MRv1 classes, populates them with the info they need to run the job (taken from the giraphConfiguration and/or the YARN classes that report some of the same data to the running job), and just hands those off to our Giraph code that expects these objects. Since this activity is self-contained in the runner class, no platform-dependent setup code (for YARN, Mesos, whoever) has to know anything about the runner: just create it and hand it the data it needs, set it to running on the right compute nodes, etc.

          This is a tiny bit hacky, but gets the job done with minimal changes to existing code, allows for future JIRAs to do more extensive refactors, and does not hide from the fact that we will still carry dependencies on the Hadoop JARs for as long as we support MRv1 too, so we will have access to these classes to instantiate even on Mesos or YARN. I am not entirely sure this approach is possible but it's one I have toyed with as an alternative to doing the full "wrap all MRv1 IO objects" approach.

          Any opinions? I will be exploring the options for the IO dilemma in great detail later in the week and will post my findings/opinions as I survey the landscape. Just need to get the rest of the Yarn job setup code done today and post that patch first...
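
          A minimal sketch of the generic task runner idea described above, under stated assumptions. It uses no Hadoop classes: `TaskContext` and `GraphEngine` are hypothetical stand-ins for Mapper#Context and the existing Giraph BSP code, and the `job.name` key is invented for illustration. What it tries to show is that only the runner knows how the context gets built, so platform-specific launch code just hands it a configuration and a task id.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: none of these types are Giraph or Hadoop classes. */
public final class ContextShimSketch {
  private ContextShimSketch() { }

  /** Hypothetical stand-in for the slice of Mapper#Context the engine reads. */
  public interface TaskContext {
    String get(String key);
    int taskId();
    void progress();
  }

  /** Hypothetical stand-in for existing code that expects a context object. */
  public static final class GraphEngine {
    public String run(TaskContext context) {
      context.progress();
      return "task " + context.taskId() + " of job " + context.get("job.name");
    }
  }

  /** The runner is the only place that fabricates the context. */
  public static final class TaskRunner {
    private final Map<String, String> conf;
    private final int taskId;
    private int progressCalls;

    public TaskRunner(Map<String, String> conf, int taskId) {
      this.conf = new HashMap<>(conf); // defensive copy of the job settings
      this.taskId = taskId;
    }

    public String runTask() {
      // Build the context from plain configuration, then hand it to the engine.
      TaskContext context = new TaskContext() {
        @Override public String get(String key) { return conf.get(key); }
        @Override public int taskId() { return taskId; }
        @Override public void progress() { progressCalls++; }
      };
      return new GraphEngine().run(context);
    }

    public int progressCalls() { return progressCalls; }
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("job.name", "pagerank");
    System.out.println(new TaskRunner(conf, 3).runTask());
  }
}
```

          In a real port the anonymous `TaskContext` would presumably be replaced by whatever Hadoop context object the existing code expects; that part is not shown here.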

          Eli Reisman added a comment -

          This is another placeholder, but with a lot of improvements.

          • Client and AppMaster are much closer to "real life" but will need a couple more changes (some of the APIs I use have fancy wrappers now apparently, and I need to do a quick upgrade)
          • Munge flag used to stitch us into GiraphRunner with minimal pain. If all goes well, this will be the only place it is used.
          • mvn profile set up and building correctly. Using default of 2.0.2-alpha for now because I have a local machine set up for that, but will change when the patch is done. Should be able to config this (among YARN-friendly Hadoops anyway) with -Dhadoop.version=... at the mvn command line at this point anyway.
          Eli Reisman added a comment -

          What do you think of this approach? I have added GiraphYarnTask.java to replace GraphMapper. I think I may have stumbled on a method to get Giraph up and running on YARN right away, and save the IO refactor for a future JIRA. Please let me know if some form of this approach (illustrated in GiraphYarnTask) seems reasonable. If successful, it will allow us to stitch in pure YARN (at least at first) with one use of the munge flag, in only one file!

          If not, I'll put up a separate JIRA to refactor Mapper#Context soon and continue to hack away on this. When I get a chance tomorrow I will run this on test cluster and see what happens as well.

          Hyunsik Choi added a comment -

          I'll take a look at your patch tonight.

          Hyunsik Choi added a comment -

          That's nice progress. But is GraphTaskManager.java just an interface? I cannot find GraphTaskManager.java in the patch.

          Anyway, I vote +1 for the approach using GiraphYarnTask. I think this is reasonable and the simplest way at the current stage. It makes the porting work easier while minimizing changes to existing code.

          In addition, I agree that Yarn allows a much finer grained control throughout the whole processing. Many interesting issues that we have not needed to consider occur if we completely port Giraph to Yarn.

          Eli Reisman added a comment -

          GraphTaskManager is probably in the o.a.g.graph package and is the encapsulation of all the BSP stuff that used to happen directly in GraphMapper. So it's not part of the patch. What's nice about faking out the Mapper#Context this way, as you said, is it allows us to do fine-grained JIRAs to build on this, and makes this a single, discrete, simple patch for enabling YARN and nothing else. Let me clean it up and make sure it works (might need to populate the MapContextImpl I create in GiraphYarnTask with a few more goodies, etc.) but I'm feeling good about this approach. Will post another patch soon.

          Eli Reisman added a comment -

          Here's the latest placeholder. Things are going very well so far, almost done (if my "faking out Mapper#Context" approach works). Right now I'm trying to get to a point where I can test that part (might need a fake OutputCommitter also) but I need a better way to locate my JARs on the local machine. In order to distribute them w/o a distributed cache, we use the YARN LocalResources, and that means finding them locally and putting them on HDFS to be downloaded by all the worker tasks and the AppMaster. The code below is failing on the "URL is null" case. Any ideas? I want to avoid assumptions about where the jars are. In this case, I have my jars in $HADOOP_HOME/share/hadoop/... but they are not showing up using my YarnGiraphClient's class loader to find them. I'll keep playing with it, but if there's an obvious solution let me know.

          Of course, I will be removing the hardcoded "giraph-examples" jar (and other cruft) when I know all this works as well, and we can pick up whatever jars we desire using the new -yj argument.

          /**
           * Utility function to locate local JAR files for transmission to the
           * remote app container where our Application Master will run.
           * @param clazz the calling class whose classpath we want to hunt along.
           * @param name the file name of the jar, without path information.
           * @return the named jar, as a File object.
           */
          public static File getLocalJarFile(final Class<?> clazz, final String name) {
            java.net.URL jarUrl = clazz.getClassLoader().getResource(name);
            if (jarUrl == null) {
              throw new IllegalStateException("Could not locate local JAR: " + name);
            }
            LOG.info("[*] Local JAR path for " + name + " found: " + jarUrl.getPath());
            return new File(jarUrl.getPath());
          }
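
          One possible explanation for the null URL, offered as a sketch rather than a diagnosis of this patch: ClassLoader#getResource looks for a resource inside the class path entries, so asking it for a JAR's own file name generally returns null even when that JAR is on the class path. Two stdlib-only alternatives are shown below. The class name `JarLocator` and both method names are hypothetical, and the second method matches by file name only without checking that the file exists.

```java
import java.io.File;
import java.net.URISyntaxException;
import java.security.CodeSource;
import java.util.Optional;

/** Illustrative helper; not part of Giraph. */
public final class JarLocator {
  private JarLocator() { }

  /** Returns the JAR (or class directory) that clazz was loaded from, if known. */
  public static Optional<File> findContainingJar(Class<?> clazz) {
    CodeSource source = clazz.getProtectionDomain().getCodeSource();
    if (source == null || source.getLocation() == null) {
      return Optional.empty(); // the class loader did not record a location
    }
    try {
      return Optional.of(new File(source.getLocation().toURI()));
    } catch (URISyntaxException | IllegalArgumentException e) {
      return Optional.empty(); // location is not a plain file: URI
    }
  }

  /** Scans a class path string for an entry whose file name equals jarName. */
  public static Optional<File> findOnClassPath(String classPath, String jarName) {
    for (String entry : classPath.split(File.pathSeparator)) {
      File candidate = new File(entry);
      if (candidate.getName().equals(jarName)) {
        return Optional.of(candidate); // caller should still verify it exists
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    System.out.println(findContainingJar(JarLocator.class));
    System.out.println(
        findOnClassPath(System.getProperty("java.class.path"), "giraph-examples.jar"));
  }
}
```

          Neither alternative helps when the JAR is not on the client's class path at all; in that case an explicit path argument seems hard to avoid.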
          
          Eli Reisman added a comment -

          I think I can steal a more robust version of the above method from GiraphTaskManager, might want to move it to FileUtils and make the method public so as not to duplicate the code. I knew we had something like this!

          Eli Reisman added a comment -

          Lots of updates in this patch. The YARN and Giraph components are going swimmingly. However, I have discovered a strange Maven behavior and I thought I would put up another placeholder patch in case any of you know what's going on here.

          The way I have this version of hadoop_yarn profile set up, the compile plugin (by default on all profiles) does not compile anything in the o.a.g.yarn directory tree (or package) and this seems to work as expected. But, when I compile using -Phadoop_yarn, I reinclude the yarn dir from the profile definition in the pom where I declare the profile configuration of the compiler plugin. This only partially works.

          For some reason I cannot fathon as of yet, only a couple of the files in my yarn package compile and make it into the final build products. This is even though mvn verify (etc) all passes and classes themselves seem to compile. I thought at first I forgot to declare the package for all my files, or maybe a problem with my package-info.java file or something. But nothing obvious. Just several of my YARN files don't end up in the jar with their companions. Anyone know what I could be doing wrong here? All the changes to pom files that set this up are included in the patch for viewing.

          If anyone has an insight (or good Maven tricks to try) let me know. Thanks!

          Eli Reisman added a comment -

          All the YARN plumbing works now. I am (for working purposes) hardcoding our examples jar as my artifact to bring into YARN/HDFS for the job run, but I already have a working mechanism to include jars of your choosing that I will polish up when the "review-worthy" patch is up.

          Also made lots of progress getting my dummy Mapper#Context into shape to fool Giraph. That and some tests are the last (foreseeable) blockers here. Will post another patch (hopefully ready to review) very soon.

          Any ideas/comments are appreciated. I ended up having to hand-exclude the o.a.g.yarn package from all profiles but hadoop_yarn, since the classes I use are only available by depending on "hadoop-yarn-common" and I didn't want to import that for all the profiles.

          My attempt to simply exclude the yarn package by default and then re-include it only in my profile, while pretty on the screen, does not make Maven happy for some reason. Will play with it some more, but I think it's a "feature" of the path filtering in Maven that I have not seen documented or found a proper fix for yet. Will keep looking, but this works fine as-is for now.

          OK, comments/ideas/criticisms welcome. Thanks.

          Eli Reisman added a comment -

          This version runs to completion (as in output gets written) when running examples on the pure YARN profile. If you have 2.0.3-alpha installed, build Giraph like:

          mvn -Phadoop_yarn clean package
          

          then use the giraph-examples jar with deps to run using a command line such as this:

          bin/hadoop --config etc/hadoop jar share/hadoop/giraph/giraph-examples-0.2-SNAPSHOT-for-hadoop-2.0.3-alpha-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.ConnectedComponentsVertex -w 1 -vif org.apache.giraph.io.formats.IntIntNullIntTextInputFormat -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat -vip /user/ereisman/graph -op /user/ereisman/output
          

          A lot of the functionality we need (CLI opts etc.) is already there: you can run your own jars or include whatever you want in the job, and it will run on a cluster too.

          It still needs a lot of cleanup and tests. I have to get the YARN setup code to end the job nicely (fail or success), and I need the output committed to the right place. I will also remove the hardcoded dep on giraph-examples. All that will be in the next patch.

          But this is working and will commit Giraph output to HDFS. All setStatus msgs end up in the logs for now. You may have to CTRL-C out of the YARN client when it's done (for now, but not for long...)

          More to follow...

          Hyunsik Choi added a comment -

          Great work!
          During this weekend, I'll take a look at your patch.

          Hyunsik Choi added a comment -

          Eli,

          This patch looks like well-written reference code. Actually, I have learned good usage of YARN from your patch. I'm looking forward to the complete work.

          What is the plan for unit tests? Are you planning to use MiniYarnCluster for integration tests?

          Eli Reisman added a comment -

          This is an update; sorry I waited so long to post it. Works like a charm, only 3 issues left:

          1. finish writing the integration test (partly included in patch here)

          2. debug strange latency in launching the job during YARN setup. This is why I have posted on Review Board a bit early (I'll ping you guys when it's ready for review.) My non-Giraphing YARN expert colleagues are going to take a peek on RB and tell me what I did to slow the plumbing down.

          3. Remove the hardcoded convenience dependency on giraph-examples jar with deps. The idea will be to include your own CSV list of jars to include in the job under the new -yj command line opt.

          Thanks! The RB link is:
          https://reviews.apache.org/r/9811/diff/#index_header

          Eli Reisman added a comment -

          Sorry, didn't see that Alessandro had committed 528, had to rebase.

          Eli Reisman added a comment -

          No more latency issues. I'm working on the last touches of the unit test. I tried to one-liner myself back to 2.0.2-alpha but hit protobuf issues arising from the new/old YARN API; I will look closer at this soon. Worst case, we support 2.0.3-alpha and up for starters and I backport all the way to 2.0.0 soon.

          Not quite ready for careful review, but I think the next patch will be. Let me know if you see anything you hate! Thanks!

          Eli Reisman added a comment -

          OK, this is ready to go. It passes mvn verify (with and without -Phadoop_yarn) and passes its new integration tests with MiniYARNCluster.

          In order to make the test cluster work, we will initially have to support Hadoop versions 2.0.3-alpha and up only. I can attempt further backports on a future JIRA.

          No more hardcoded includes, so you need the -yj option on GiraphRunner, giving it a comma-separated list of jar filenames (no path), to make your job run. For instance:

          mvn -Phadoop_yarn clean package
          
          cp giraph-examples/target/giraph*-jar-with*.jar ~/hadoop/share/hadoop/giraph/
          
          hstart # start your Hadoop-2.0.3-alpha cluster
                 # AND your OWN instance of ZK on some port
                 # put this in -ca giraph.zkList=... in the launch commands below if you don't use giraph-site for this!
          
          bin/hadoop --config etc/hadoop jar share/hadoop/giraph/giraph-examples-0.2-SNAPSHOT-for-hadoop-2.0.3-alpha-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.ConnectedComponentsVertex -w 3 -yh 1024 -yj giraph-examples-0.2-SNAPSHOT-for-hadoop-2.0.3-alpha-jar-with-dependencies.jar -vif org.apache.giraph.io.formats.IntIntNullIntTextInputFormat -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat -vip /user/ereisman/graph3milVerts -op /user/ereisman/output
          

          The above will build the project, then copy the giraph-examples jar with deps to a folder we assume is in or under a directory on the CLASSPATH, under HADOOP_HOME, or at least under your working dir. Last, we run a connected components job (assuming we have some sample data in our HDFS input dir, and a 2.0.3 cluster up and running).
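
          As a rough illustration of the lookup described above (not the code in the patch), a comma-separated -yj value could be split into bare file names, and each name then searched for under a set of candidate directories. The class name JarListResolver, its method names, and the depth limit are all hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not taken from the Giraph patch. */
final class JarListResolver {
  /** Assumed recursion limit so that symlink loops cannot recurse forever. */
  private static final int MAX_DEPTH = 8;

  private JarListResolver() { }

  /** Splits a comma-separated list into bare jar names; entries with a path are rejected. */
  static List<String> parseJarList(String csv) {
    List<String> names = new ArrayList<>();
    for (String raw : csv.split(",")) {
      String name = raw.trim();
      if (name.isEmpty()) {
        continue; // tolerate stray commas and whitespace
      }
      if (name.indexOf('/') >= 0 || name.indexOf('\\') >= 0) {
        throw new IllegalArgumentException("Expected a bare file name, got: " + name);
      }
      names.add(name);
    }
    return names;
  }

  /** Returns the first file called name in or under any search root, or null if none is found. */
  static File findJar(String name, List<File> searchRoots) {
    for (File root : searchRoots) {
      File hit = search(root, name, 0);
      if (hit != null) {
        return hit;
      }
    }
    return null;
  }

  private static File search(File dir, String name, int depth) {
    File[] children = dir.listFiles(); // null when dir is not a readable directory
    if (children == null || depth > MAX_DEPTH) {
      return null;
    }
    for (File child : children) {
      if (child.isFile() && child.getName().equals(name)) {
        return child;
      }
    }
    for (File child : children) {
      if (child.isDirectory()) {
        File hit = search(child, name, depth + 1);
        if (hit != null) {
          return hit;
        }
      }
    }
    return null;
  }
}
```

          A caller might pass the working directory and the directory named by HADOOP_HOME as search roots, then fail fast if findJar returns null for any requested name.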

          Right now all setStatus() calls go right into the task logs. So we didn't lose them, but they are not aggregated in a web UI for us yet. Logs are prefixed by task number (numbered 2 higher than the corresponding Giraph task #'s); task 1 is always our GiraphApplicationMaster.
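
          The numbering described above can be sketched as simple arithmetic. This assumes Giraph task numbers start at 0 and that the offset of 2 holds for every task; the class and method names are made up for illustration and do not come from the patch.

```java
/** Hypothetical illustration of the log-prefix numbering; not part of the patch. */
final class TaskLogNumbers {
  /** Log prefix that always belongs to the application master. */
  static final int APP_MASTER_LOG_NUMBER = 1;
  /** Log prefixes are 2 higher than the corresponding Giraph task number. */
  static final int LOG_NUMBER_OFFSET = 2;

  private TaskLogNumbers() { }

  static boolean isApplicationMaster(int logNumber) {
    return logNumber == APP_MASTER_LOG_NUMBER;
  }

  /** Maps a log prefix back to the Giraph task number it belongs to. */
  static int toGiraphTaskId(int logNumber) {
    if (logNumber < LOG_NUMBER_OFFSET) {
      throw new IllegalArgumentException("Not a Giraph task log: " + logNumber);
    }
    return logNumber - LOG_NUMBER_OFFSET;
  }
}
```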

          JIRAs I will put up related to this:

          • create WebUI for Giraph
          • add process launch to GiraphApplicationMaster for our local ZK if we chose one, put host:port into zkList so Giraph-BSP doesn't take over and do it.
          • backport to 2.0.2-alpha, or even 2.0.0 Hadoop
          • lots of strange and wonderful new things are possible, we'll see about the rest as we go along.
          Jan van der Lugt added a comment -

          That's what I call a patch! Don't have a YARN cluster to try it out (still on 1.0.3), but great work! Thanks!

          Eli Reisman added a comment -

          I needed a team of horses to drag it onto the JIRA. Cut down my own tree to print it up too. And the ink...my tears.

          Eli Reisman added a comment -

          I found one small problem that sometimes crops up when running mvn verify: test directory name collisions with InternalVertexRunner when using MiniYARNCluster. MiniYARNCluster can easily end up using the same dirs or ZK ports (or the wrong ZK instance) during Maven testing. I have fixed it but will withhold the patch spam for now in case I:

          1. find more little things to fix

          2. or folks put up some review issues

          Anyway, if you're feeling brave enough to install 2.0.3-alpha and test this, let me know and I'll update the patch sooner. Also ping me about YARN install issues; there are a couple of little things that need to happen that are not well documented but matter a lot for us.
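
          One common way to avoid this kind of collision in tests (not necessarily the fix in the patch) is to give every test its own temporary directory and to ask the OS for a free port. A minimal sketch using only the JDK follows; TestIsolation and its method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical test helper for illustration; not taken from the Giraph patch. */
final class TestIsolation {
  private TestIsolation() { }

  /** Creates a fresh directory whose name is unique per call, so tests cannot share it. */
  static Path uniqueTestDir(String prefix) {
    try {
      return Files.createTempDirectory(prefix);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /**
   * Asks the OS for a currently unused port by binding to port 0.
   * The port is released before returning, so another process could
   * still grab it before the caller binds; this narrows the window
   * for collisions but does not eliminate it.
   */
  static int freePort() {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

          Each test would then point its working directory and its ZooKeeper port at these values rather than at fixed defaults.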

          Eugene Koontz added a comment -

          Hi Eli,
          Sorry to come in so late. Very impressive! I'm going to try your patch with Hadoop 2.0.3-alpha today, and I appreciate your offer to help with configuration. mvn -Phadoop_yarn clean install succeeds for me with your latest patch, so I'm off to a good start.

          -Eugene

          Eli Reisman added a comment -

          Thanks Eugene! This will be a bear to review, so take your time. But make sure to use this copy: the integration tests would occasionally fail on the last one because tests that run InternalVertexRunner were occasionally stealing each other's test dirs and ports. All fixed here. I have run a bunch of jobs on this today and it's running well now (I hope!)

          I'll put this on RB too.

          Eli Reisman added a comment -

          Hey Eugene, a better command line is in the explanation on the current revision of this patch on RB (marked r5 there; it's r4 in the patch here...sorry). I forgot to post it here. And yes, there are several yarn-site.xml values you need to set that are not well documented and that make the cluster happy; I can pass them along if you run into trouble.

          So far, this version works well for me.

          Eli Reisman added a comment -

          Just a rebase. Also available on RB (see link here in Description)

          Eli Reisman added a comment -

          Just another rebase. Not to hurry anyone, I know everyone's busy, but starting in a week or two I will have a lot less time to fix issues that reviewers put up.

          So...if anyone has a chance to peek at it over the next few days, I will be available to respond quickly to reviews, for now. If not...I understand! Thanks again!

          I will update this on RB too, where comments on the last couple iterations of the patch contain good command lines for building and running it on the cluster.

          Hyunsik Choi added a comment -

          Eli,

          +1
          Great job! Since you started this work, you have improved many things and addressed review comments over a long time. In my opinion, there is nothing left to improve. I think this is the best-written YARN client code that I have seen so far.

          Gianmarco De Francisci Morales added a comment -

          Looking forward to using this

          Eli Reisman added a comment -

          Thanks! I learned a ton doing it! I'll give this a day or two for folks to play with it if they want or ask for changes, I'll be checking review board for any such requests, and commit in a few days if not.

          I am hoping it's clear (and the low-hanging fruit ripe for improvement well marked) so others can dive in and play with it and get comfortable extending it. There are a lot of fun new possibilities if we choose to flesh this out.

          Roman Shaposhnik added a comment -

          I'd be very interested in taking it for integration testing in Bigtop. Looking forward to a commit.

          Eli Reisman added a comment -

          I'll wait a few days for folks to point out problems (and maybe see what
          happens with GIRAPH-601) and then commit if no other review issues crop up.
          Thanks!

          Eli Reisman added a comment -

          Resolved, Fixed. Amen!

          Just to reiterate, this is the smallest possible self-contained unit of YARN integration I could manage. It runs jobs using GiraphRunner or bin/giraph only (no raw "hadoop jar" calls) on a cluster or local YARN setup.

          It is only compatible with hadoop-2.0.3-alpha right now, but again there's time for the rest. It may or may not support features that I did not have the time, data, or resources to stress test in an exhaustive manner. I do think it's set up in such a way that it will be very easy to fix small problems that crop up or to add some potent new upgrades past what Giraph-on-MRv1 can do for us.

          Some upgrades that need to happen soon:

          • Eliminate any internal Giraph dependencies on hardcoded taskId and upgrade our GiraphApplicationMaster not to hardcode a solution.
          • WebUI, or move more info from the logs to the Client/AppMaster for job status etc. There are several ways to do this.

          I will put up a detailed wiki about how this works and how to run it soon. The review board link at the top of this JIRA can also supply details on that for now.

          Avery Ching added a comment -

          This is awesome Eli. I wonder if this is one of the first applications ported to YARN.

          Hudson added a comment -

          Integrated in Giraph-trunk-Commit #862 (See https://builds.apache.org/job/Giraph-trunk-Commit/862/)
          GIRAPH-13: Port Giraph to YARN (Revision b2dff2751d8d3d768f788b39089688c18f6c1750)

          Result = SUCCESS
          ereisman : http://git-wip-us.apache.org/repos/asf?p=giraph.git&a=commit&h=b2dff2751d8d3d768f788b39089688c18f6c1750
          Files :

          • giraph-core/src/main/java/org/apache/giraph/yarn/GiraphYarnClient.java
          • giraph-core/src/main/java/org/apache/giraph/yarn/package-info.java
          • pom.xml
          • giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java
          • giraph-core/src/main/java/org/apache/giraph/graph/GraphTaskManager.java
          • giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java
          • giraph-core/src/test/resources/capacity-scheduler.xml
          • giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java
          • giraph-core/pom.xml
          • giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java
          • giraph-core/src/test/java/org/apache/giraph/yarn/TestYarnJob.java
          • giraph-core/src/main/java/org/apache/giraph/yarn/GiraphApplicationMaster.java
          • giraph-core/src/main/java/org/apache/giraph/yarn/GiraphYarnTask.java
          • giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java
          • giraph-core/src/main/java/org/apache/giraph/yarn/YarnUtils.java
          • CHANGELOG
          • giraph-examples/pom.xml
          • checkstyle.xml
          • giraph-core/src/main/java/org/apache/giraph/GiraphRunner.java
          • giraph-core/src/main/java/org/apache/giraph/bsp/BspInputFormat.java

            People

            • Assignee:
              Eli Reisman
              Reporter:
              Jakob Homan
            • Votes:
              0
              Watchers:
              33


                Development