Attached are a couple of email threads pertinent to this issue. In summary, there is strong interest in committing both the FailMon and Chukwa projects and then awaiting user feedback.
Ariel Rabkin <asrabkin@EECS.Berkeley.EDU> wrote on 08/04/2008 03:23:04 PM:
> As near as I could gather from the failmon code –
> Ideally, the failmon data collection plugins ("monitors") would be
> Chukwa adaptors. The abstractions are fairly close. Provided that
> failmon isn't going to be patching away too intensively in the next
> month, probably the best thing to do would be commit both, and merge later.
> ----- Original Message -----
> From: Dhruba Borthakur <email@example.com>
> Date: Monday, August 4, 2008 2:32 pm
> Subject: Re: support for FailMon commit...
> To: Prasenjit Sarkar <firstname.lastname@example.org>
> Cc: email@example.com, andyk@EECS.Berkeley.EDU,
> asrabkin@EECS.Berkeley.EDU, firstname.lastname@example.org, email@example.com,
> firstname.lastname@example.org, Ioannis Koltsidas <email@example.com>,
> Karan Gupta <firstname.lastname@example.org>
> > Hi Prasenjit,
> > All thanks to you and Ioannis for developing FailMon.
> > It would be really nice if somebody from the Chukwa team can provide
> > feedback on the FailMon package, especially whether it is compatible
> > with Chukwa. It would be good to hear Mac's comments on whether these
> > two approaches solve the same problem or how they can be complementary
> > to one another.
> > thanks
> > dhruba
> > On Fri, Aug 1, 2008 at 4:10 PM, Prasenjit Sarkar
> > <email@example.com> wrote:
> > >
> > > Hi,
> > >
> > > As we discussed in our last meeting, we have uploaded the latest version of
> > > FailMon (and some documentation) to JIRA (HADOOP-3585). If you have some
> > > time to review it, we would be very interested to hear your comments and
> > > suggestions before it gets committed. Dhruba has agreed to commit the patch
> > > as soon as your team gives it a positive review. In the short term,
> > > however, we would like different people/companies to start deploying
> > > FailMon as soon as possible; to that end we need to commit it to the
> > > repository as soon as possible.
> > >
> > > We also believe that you should commit the Chukwa code and together we can
> > > get valuable feedback that can determine the direction of Chukwa and
> > > FailMon. In the interim, we await your support for the commit process for
> > > FailMon.
> > >
> > > Regards,
> > >
> > > Prasenjit Sarkar
> > > RSM and Manager, Storage Analytics and Resiliency
> > > Master Inventor
> > > IBM Almaden Storage Systems Research
> > >
> > >
Prasenjit Sarkar/Almaden/IBM wrote on 08/04/2008 03:19:45 PM:
> I appreciate your analysis of the integration scenarios. Taking a
> step back, we think that both Chukwa and FailMon provide interesting
> value propositions independent of each other. For example, we have
> had requests from a few groups wanting to use FailMon independently
> as a quick cluster health post-processor. I'm sure that Chukwa has a
> similar user community. In that vein, I would not like the value
> proposition of these two complementary projects to be diluted by the
> integration discussion.
> So, I would vote for a quick committal for both projects followed by
> integration discussions moderated by Hadoop committers using feedback
> from Chukwa/FailMon users.
> I hope this is reasonable,
> Prasenjit Sarkar
> RSM and Manager, Storage Analytics and Resiliency
> Master Inventor
> IBM Almaden Storage Systems Research
> Jerome Boulon <firstname.lastname@example.org>
> 08/04/2008 10:11 AM
> Prasenjit Sarkar <email@example.com>, <firstname.lastname@example.org>,
> Ioannis Koltsidas/Almaden/IBM@IBMUS, Karan Gupta/Almaden/IBM@IBMUS,
> <email@example.com>, <firstname.lastname@example.org>, Runping Qi
> <email@example.com>, <firstname.lastname@example.org>, Mac Yang
> FailMon - Chukwa integration
> I have taken a look at FailMon, and here is how we can integrate it with Chukwa.
> Basically there are 3 entry points in Chukwa:
> 1- At the adaptor level (inject data)
> 2- At the Demux level (Data analysis)
> 3- Using the archive.
> 1- Running FailMon at the adaptor level will prevent anyone from using the real
> data. So this should not be used in the general case.
> 2- It's possible to run FailMon as a Demux processor and output exactly what
> we want, and that would have been my suggestion. But FailMon is not intended
> to be used directly by the company that produces the output (at least for
> now), so I would prefer not to use FailMon there: we're planning to run
> critical processors, and adding any latency here may become an issue.
> 3- So my recommendation is to use all of Chukwa's archives as input for
> FailMon. The main advantage is that all the data is grouped together in one or
> more big Sequence files that can be easily processed using M/R, and since
> it's offline post-processing, the impact on the production cluster can
> be easily controlled.