Thanks for the great review, Hyunsik. Good to hear from you!
I really appreciate your input! You successfully named ALL of my concerns! My biggest one is the IO formats which, as you said, are completely dependent on MRv1. Your idea is exactly the approach I was planning on.
As for your first concern: yes, this is a draft version, and the new one (I don't even have a patch up yet, but I will soon, to show you) will be completely configurable from the GiraphRunner CLI options.
For your second concern: there is a need for history and a number of other basic systems that we get from MRv1 right now. Because of the timing (I am trying to finish this phase before the end of March), I may attempt to make GIRAPH-13 cover just the following upgrade: a YARN profile for Giraph, including the ability to run examples/ applications from the Giraph jar-with-dependencies on YARN. I hope to do all other "fleshing out" of the features in separate JIRAs or sub-issues. This bounds the difficulty of this first stage, and enables others to start working on the feature-add JIRAs without having to know all about YARN.
The exciting thing is that the YARN API allows much finer-grained control over a lot of our BSP process than Hadoop ever did. And I too was thinking that, after this, a port to Mesos (or wherever) is going to be really easy! Over time, we might consider moving the launch of our ZooKeeper instance into the ApplicationMaster and doing more fine-grained resource allocation control (assign input splits right at the beginning of the job run, assign hosts to the workers as we choose for data locality, allot memory and/or cores depending on the size of the splits we assign, etc.). The options really open some doors.
BUT, even just to make the examples run, the IO problem must be solved. I do think wrapping the MRv1-related functions (stuff that needs a TaskAttemptContext or Job-type classes from Hadoop, and more) is the way to go, but I would sure appreciate any ideas you might have.
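To make the wrapping idea concrete, here is a minimal sketch, not the actual patch. Every name in it (GiraphIoContext, Mrv1IoContextAdapter, FakeMrv1Context, WrapperSketch) is hypothetical, and FakeMrv1Context is only a stand-in for an MRv1 class such as TaskAttemptContext so that the sketch compiles without Hadoop on the classpath. The assumption is that IO code would talk to a small Giraph-owned interface, and an adapter would forward those calls to whatever MRv1 object is available.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical Giraph-owned view of the per-task context that IO code needs. */
interface GiraphIoContext {
    String getConfValue(String key, String defaultValue);
    int getTaskId();
    void progress();
}

/** Stand-in for an MRv1 context object; this is NOT a real Hadoop class. */
class FakeMrv1Context {
    private final Map<String, String> conf = new HashMap<>();
    private final int taskId;
    int progressCalls = 0;

    FakeMrv1Context(int taskId) { this.taskId = taskId; }
    void set(String key, String value) { conf.put(key, value); }
    String get(String key) { return conf.get(key); }
    int taskId() { return taskId; }
    void progress() { progressCalls++; }
}

/** Adapter: forwards GiraphIoContext calls to the MRv1-style object. */
class Mrv1IoContextAdapter implements GiraphIoContext {
    private final FakeMrv1Context delegate;

    Mrv1IoContextAdapter(FakeMrv1Context delegate) { this.delegate = delegate; }

    @Override
    public String getConfValue(String key, String defaultValue) {
        String value = delegate.get(key);
        return value != null ? value : defaultValue;
    }

    @Override
    public int getTaskId() { return delegate.taskId(); }

    @Override
    public void progress() { delegate.progress(); }
}

public class WrapperSketch {
    /** IO-side code that only knows about the Giraph-owned interface. */
    static String describeSplit(GiraphIoContext ctx) {
        ctx.progress(); // report liveness through the wrapper
        return "task " + ctx.getTaskId() + " reads "
            + ctx.getConfValue("input.path", "<unset>");
    }

    public static void main(String[] args) {
        FakeMrv1Context mr = new FakeMrv1Context(3);
        mr.set("input.path", "/tmp/graph");
        System.out.println(describeSplit(new Mrv1IoContextAdapter(mr)));
    }
}
```

With something shaped like this, a YARN-side implementation of the same interface could later replace the adapter without touching the IO code, which is also what would keep a later port to another resource manager cheap.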
Anyway, I will put up another patch, hopefully tonight or tomorrow, that is another significant upgrade from what you have seen here so far. All input and ideas are appreciated. Thanks again!