|
Wow, that's just what I needed.
When can we expect the patch? Do the "sink files" need to be in HDFS, or is this pluggable as well, so I could write to other filesystems, e.g., NFS?
Pete –
Yes, the sink file writers are pluggable. In fact, our current writer uses the Hadoop FileSystem class, so I believe that if you pass a local path that points at NFS, it'll "just work". We haven't tested that, though.
How will this be integrated with Hadoop? As a contrib module?
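As a rough illustration of what "pluggable" sink writers could look like, here is a minimal sketch. It is not Chukwa's actual API: the `SinkWriter` interface and the `LocalPathSinkWriter` class are hypothetical names, and the sketch uses plain `java.nio.file` (Java 7+) rather than the Hadoop FileSystem class mentioned in the reply to Pete. The point is only that a writer appending to an ordinary local path would also work on an NFS mount, since the operating system presents the mount as a normal directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical interface: the name and method are illustrative, not Chukwa's API.
interface SinkWriter {
    void append(byte[] chunk) throws IOException;
}

// Hypothetical implementation that appends chunks to any path the OS can reach,
// which includes a directory backed by an NFS mount.
final class LocalPathSinkWriter implements SinkWriter {
    private final Path sinkFile;

    LocalPathSinkWriter(Path sinkFile) throws IOException {
        this.sinkFile = sinkFile;
        // Create missing parent directories so the first append succeeds.
        Path parent = sinkFile.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
    }

    @Override
    public void append(byte[] chunk) throws IOException {
        // CREATE makes the file on first use; APPEND adds to the end afterwards.
        Files.write(sinkFile, chunk, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```

A collector written against the interface would not need to know which storage backs the sink; swapping in an HDFS-backed implementation would be a matter of providing another class that implements `SinkWriter`.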
Yes, we're planning to add Chukwa to the Hadoop tree as a contrib module within the next few days. /Jerome
Please include license files parallel to the included jars that are not Apache projects.
Please make the tarball relative to $HADOOP_HOME. Make sure you don't have any empty directories (or others that shouldn't be checked in). Please run the release audit tool over the submission to make sure that your source files all have copyright notices. Please remove the source code for org.openflashcart.
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12387671/chukwa-patch-0.0.1.tgz against trunk revision 683448.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
-1 patch. The patch command could not apply the patch.
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3026/console
This message is automatically generated.
The current patch is still missing license files for a lot of the jar files, and it includes LGPL libraries, which can't be included. It would probably help to have a README in the lib directory that lists the jar files, which project each is from, and its license.
I just committed this. Thanks, guys!
The jar files in .../chukwa/lib lead to some javadoc warnings. See
Integrated in Hadoop-trunk #581 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/581/)
Ari, you and Jerome, in an email thread a week or so back, mentioned that you were planning on releasing a second Chukwa patch. Any updates here?
Hi Alex,
If you search for "chukwa" against the Jira website you'll see a list of patches that we want to commit, but we're depending on external Apache committers to get them committed. /Jerome. |
Map-reduce jobs run periodically to analyze these sink files, and to drain their contents into structured storage.
Chukwa provides a natural solution to the log collection problem posed in HADOOP-2206. Once we have Chukwa working at scale, we intend to produce some patches to Hadoop to trigger log collection appropriately.
We expect this work to ultimately be complementary to HADOOP-3585, the failure analysis system. We want to collect similar data, and our framework is flexible enough to accommodate the structure proposed there, with only modest code changes on each side.
The attached document introduces Chukwa and describes the data collection architecture. We do not present our analytics and visualization in detail in this document. We intend to describe them in a second document in the near future.