Here's another revision of the patch that reuses the Avro encoder and supports schema IDs rather than sending the full schema in a Flume header.
The idea behind schema IDs is that if you set the schema's ID in the Log4j MDC, it will be used in the Flume header. (If you don't set it, everything still works; the appender just has to put the full schema in a header for every message.) The HDFS sink then retrieves the schema by looking the ID up in its configuration file, which has to include the ID -> schema mapping.
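A minimal sketch of the ID -> schema lookup on the sink side. The "schema.&lt;id&gt;" key prefix is hypothetical, made up for illustration; it is not necessarily the configuration key the patch uses.

```java
import java.io.StringReader;
import java.util.Properties;

public class SchemaCatalogSketch {

    // Resolve a schema ID against the sink's ID -> schema mapping.
    // Returns null for an unknown ID, in which case the sink would fall
    // back to the full schema carried in the event header.
    static String lookup(Properties catalog, String schemaId) {
        return catalog.getProperty("schema." + schemaId);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the sink's configuration file holding the mapping.
        Properties catalog = new Properties();
        catalog.load(new StringReader(
            "schema.1={\"type\":\"record\",\"name\":\"Event\",\"fields\":[]}\n"));

        // The ID would come from the Flume event header that the
        // appender populated from the Log4j MDC.
        System.out.println(lookup(catalog, "1"));
    }
}
```

The null fallback mirrors the behavior described above: when no ID is set, the full schema travels in a per-message header instead.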
When AVRO-1124 is done we could use that for the schema repository.
An alternative way of doing this now would be to have a schema catalog properties file with the ID -> schema mapping, and have both the Log4jAppender and the HDFS sink use it - that way we could avoid the MDC part.
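The shared catalog could be as simple as a properties file like the following; the key format and file name here are only an illustration, not part of the patch.

```
# schema-catalog.properties (hypothetical) - shared by the
# Log4jAppender and the HDFS sink so neither needs the MDC.
schema.1={"type":"record","name":"Event","fields":[]}
schema.2={"type":"record","name":"AccessLog","fields":[]}
```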