I agree that Avro should not require MapReduce – specifically, the Maven POM should not cause consumers to pull in MapReduce by default.
But I think we already prevent that. The POM generated by the build marks hadoop-core as "optional", so downstream projects that consume Avro won't automatically pull in the Hadoop jar. Another option with a similar effect is to set the dependency scope to "provided" instead of "compile", which makes the jar available for build and test but does not propagate it to consumers. That is probably preferable for MapReduce. Either way, a user who wants those APIs has to supply their own hadoop-core jar or declare the dependency explicitly.
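A minimal sketch of the two approaches in the POM (the version property is illustrative; the real build may resolve it differently):

```xml
<!-- Option 1: mark the dependency optional; it is used to build Avro,
     but consumers must declare hadoop-core themselves to get it. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>${hadoop.version}</version>
  <optional>true</optional>
</dependency>

<!-- Option 2: "provided" scope; on the compile and test classpaths,
     but neither bundled nor passed on transitively. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>${hadoop.version}</version>
  <scope>provided</scope>
</dependency>
```

The practical difference: "optional" still puts the jar in Avro's own runtime scope, while "provided" assumes the runtime environment (e.g. a Hadoop cluster) supplies it.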
Putting the code in Hadoop is probably a problem, unless we want to release new versions of 0.18, 0.19, 0.20, etc. Placing it in Hadoop means that changes to Avro's lower-level APIs will break compatibility with the version in Hadoop. Honestly, some of those APIs are going to keep evolving, and dot releases of Avro can break these APIs (though not the encoded formats). Until these APIs are more locked down, it is better to keep packages like this in the Avro project.
Going slightly off topic now:
A few other libraries Avro bundles have similar issues – optional side features should be flagged as either "provided" or "optional" in the Maven POM. Or, the project needs to be split into a few jars, which probably covers the main dependency chunks. Avro-core could get away with only Jackson, SLF4J, and commons-lang, I think – meaning the generic and specific APIs, file formats, etc. would still work.
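A hypothetical avro-core POM fragment under that split might look like the following (the coordinates match the Jackson, SLF4J, and commons-lang artifacts of this era; the version numbers are placeholders, not a proposal):

```xml
<!-- Sketch only: the minimal dependency set a standalone avro-core
     would need for the generic/specific APIs and file formats. -->
<dependencies>
  <dependency>
    <groupId>org.codehaus.jackson</groupId>
    <artifactId>jackson-mapper-asl</artifactId>
    <version>${jackson.version}</version>
  </dependency>
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>${slf4j.version}</version>
  </dependency>
  <dependency>
    <groupId>commons-lang</groupId>
    <artifactId>commons-lang</artifactId>
    <version>${commons-lang.version}</version>
  </dependency>
</dependencies>
```

Everything MapReduce-related would then live in a separate jar that depends on avro-core plus hadoop-core.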