For web dashboards, it's useful to have Flume just fire an HTTP POST with event data to a waiting URL.
I've got a simple implementation up at http://github.com/hammer/flume that works for me, but could use parametrization and some tests.
Migrating some comments over from GitHub:
Also I need to refactor this to allow for parametrized GET parameters and cookies.
Also needs some test cases.
I'm adding my general notes to http://www.quora.com/How-can-I-add-a-new-source-or-sink-to-Flume.
One thing I notice about Jon's RabbitMQ plugin: it doesn't use a logger. Should we figure out a way for plugins to use the Flume logger?
Another difference: in the builder, the plugin throws an IllegalArgumentException rather than a FlumeSpecException.
FLUME-83 and FLUME-84 would have been helpful here.
FLUME-85 too (we really need to allow links in this JIRA).
It's also a bit odd that we use the same logic to load plugins in SinkFactoryImpl.java and SourceFactoryImpl.java; we should do that only once.
Lastly, I noticed that Flume doesn't pick up $FLUME_HOME/conf/flume-site.xml; it goes directly to /etc/flume/conf/flume-site.xml. Seems like that should be configurable.
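To illustrate the point above about loading plugins only once: a minimal sketch of a single helper that both factories could call, instead of each carrying its own copy of the logic. The class name, the method name, and the comma-separated input format are all assumptions made for illustration; none of them are taken from SinkFactoryImpl.java or SourceFactoryImpl.java.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical shared helper; the names here are illustrative, not Flume's API. */
final class PluginClassLoaderSketch {
    private PluginClassLoaderSketch() {}

    /**
     * Loads every class named in a comma-separated list (an assumed format),
     * skipping blank entries. Returns an empty list for null input.
     */
    static List<Class<?>> loadAll(String commaSeparatedClassNames) {
        List<Class<?>> loaded = new ArrayList<>();
        if (commaSeparatedClassNames == null) {
            return loaded;
        }
        for (String name : commaSeparatedClassNames.split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            try {
                loaded.add(Class.forName(trimmed));
            } catch (ClassNotFoundException e) {
                // The exception type is this sketch's choice, not a project convention.
                throw new IllegalArgumentException("Plugin class not found: " + trimmed, e);
            }
        }
        return loaded;
    }
}
```

Both factories would then filter the returned classes for the builders they care about, so the lookup and its error handling live in one place.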
Okay, I figured out why Flume is automagically reading configuration information from /etc:
It happens in bin/flume. I'm not sure that is a sane thing to do.
Moved to the plugin architecture and made the POST parameter and cookie content configurable. Should be usable for anyone who wants it at http://github.com/hammer/flume/blob/master/plugins/http/src/java/com/cloudera/flume/handlers/http/HttpPostSink.java.
Note that I didn't add tests because the current example tests for plugins use a deprecated API. Waiting on FLUME-86 before cleaning up for commit.
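For anyone who wants the general shape without reading the plugin: a rough sketch of posting an event body to a URL with a configurable POST parameter name and an optional cookie. This is not the code in HttpPostSink.java. The class name, method names, and the use of java.net.http (Java 11+) are assumptions for illustration, and the wiring into Flume's sink API is left out.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; not the HttpPostSink from the repository. */
final class HttpPostSketch {
    private final URI target;
    private final String paramName; // name of the POST parameter carrying the event
    private final String cookie;    // raw Cookie header value, or null for none

    HttpPostSketch(String url, String paramName, String cookie) {
        this.target = URI.create(url);
        this.paramName = paramName;
        this.cookie = cookie;
    }

    /** Form-encodes the event body as paramName=url-encoded-body. */
    String formBody(byte[] eventBody) {
        String text = new String(eventBody, StandardCharsets.UTF_8);
        return URLEncoder.encode(paramName, StandardCharsets.UTF_8)
            + "=" + URLEncoder.encode(text, StandardCharsets.UTF_8);
    }

    /** Builds the POST request; the Cookie header is added only if configured. */
    HttpRequest buildRequest(byte[] eventBody) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(target)
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(formBody(eventBody)));
        if (cookie != null) {
            builder.header("Cookie", cookie);
        }
        return builder.build();
    }

    /** Sends one event and returns the HTTP status code; the response body is discarded. */
    int send(HttpClient client, byte[] eventBody) throws IOException, InterruptedException {
        return client.send(buildRequest(eventBody), HttpResponse.BodyHandlers.discarding())
            .statusCode();
    }
}
```

Keeping request construction separate from sending means the encoding and headers can be checked in a test without a live endpoint.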
Re FLUME_CONF_DIR: we do log which directory we are using.
Also notice in this pending review that bin/flume has been updated to check for a dev environment (local conf) and to prefer that if available.
Well, I just think we have too many places where we might set the path: the bin/flume script sets the conf path if the environment variable is not set, but we also have logic for doing that in FlumeConfiguration.java. We should pick one place to handle an unset environment variable.
"but we also have logic for doing that in FlumeConfiguration.java"
Ah, OK, I see. Perhaps we should just exit gracefully, with an actionable error message, if FLUME_CONF_DIR is not set once we get to that code in FlumeConfiguration.java?
(if you agree could you open a new jira to address?)
I actually think handling the missing env variable in the code rather than in the bin scripts is a better idea. Some day we'd like people to manage their cluster in a more scalable fashion than bash scripts so keeping those scripts stupid seems like a good idea to me.
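To make that concrete, here is a small sketch of handling the unset variable in exactly one place in code and failing with an actionable message. The class and method names are hypothetical, and this is not what FlumeConfiguration.java currently does. The environment is passed in as a map so the behavior can be tested without touching the real process environment.

```java
import java.util.Map;

/** Hypothetical single point of truth for the conf directory; not Flume's actual code. */
final class ConfDirResolverSketch {
    static final String ENV_VAR = "FLUME_CONF_DIR";

    private ConfDirResolverSketch() {}

    /**
     * Returns the configured conf directory, or throws with a message that
     * tells the operator what to set. The caller decides how to exit.
     */
    static String resolve(Map<String, String> env) {
        String dir = env.get(ENV_VAR);
        if (dir == null || dir.trim().isEmpty()) {
            throw new IllegalStateException(
                ENV_VAR + " is not set. Set it to the directory that contains "
                + "flume-site.xml, for example: export " + ENV_VAR + "=/etc/flume/conf");
        }
        return dir.trim();
    }
}
```

At startup this would be called as `ConfDirResolverSketch.resolve(System.getenv())`, with the exception caught at the top level, logged, and turned into a non-zero exit.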
Deprecated in favor of Flume NG (1.0x).
Won't fix. The 0.X branch is no longer maintained.