Daniel, you've hit the nail on the head.
This patch is specifically written to let us compile against all the versions of hadoop, and let the user pick which one he wants at runtime (by virtue of including the right hadoop on the path; no flags needed). In fact, the default ant task in the shims directory compiles all the shims at once.
The version string hack is safe as long as hadoop is built correctly (the hadoop bundled with zebra is not: it reports its version as "Unknown", hence the last-resort hack of defaulting to 20).
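To make the "version string hack" concrete, here is a minimal sketch of the kind of mapping described above, in plain Java. The method name `shimFor` is hypothetical, and the raw string is assumed to come from Hadoop's `VersionInfo.getVersion()`; only the "Unknown" fallback to 20 is taken from the comment itself.

```java
// Hypothetical sketch: map a raw Hadoop version string to a shim id.
public class ShimVersionSketch {
    /** Map a raw version string (e.g. "0.20.2") to a shim id ("20").
     *  Falls back to "20" when the build reports "Unknown". */
    public static String shimFor(String rawVersion) {
        if (rawVersion == null || rawVersion.equals("Unknown")) {
            return "20"; // last-resort default described above
        }
        String[] parts = rawVersion.split("\\.");
        // "0.18.3" -> the minor component "18" selects the shim
        return parts.length > 1 ? parts[1] : rawVersion;
    }

    public static void main(String[] args) {
        System.out.println(shimFor("0.20.2"));  // 20
        System.out.println(shimFor("0.18.3"));  // 18
        System.out.println(shimFor("Unknown")); // 20
    }
}
```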
If hadoop came from its own jar, I could use reflection to get the jar name and use that as a fallback when the version is "Unknown"; but in pig, hadoop comes from pig.jar!
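The jar-name fallback could look roughly like the sketch below. The jar naming pattern (`hadoop-0.20.2-core.jar`) and the class used to locate the jar are assumptions for illustration; as noted above, the whole idea breaks down when hadoop classes are loaded from pig.jar, since the path then carries no hadoop version at all.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: recover a Hadoop version from the name of the
// jar a Hadoop class was loaded from, as a fallback for "Unknown".
public class JarVersionSketch {
    // Assumed jar naming convention, e.g. "hadoop-0.20.2-core.jar"
    static final Pattern JAR_VERSION =
        Pattern.compile("hadoop-(\\d+)\\.(\\d+)[^/]*\\.jar");

    /** Extract the minor version from a jar path, or null if absent. */
    public static String versionFromJarPath(String jarPath) {
        Matcher m = JAR_VERSION.matcher(jarPath);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        // In real code the path would come from reflection, e.g.:
        // Class.forName("org.apache.hadoop.mapred.JobConf")
        //      .getProtectionDomain().getCodeSource().getLocation().getPath();
        System.out.println(versionFromJarPath("/lib/hadoop-0.20.2-core.jar")); // 20
        System.out.println(versionFromJarPath("/lib/pig.jar")); // null: hadoop is inside pig.jar
    }
}
```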
Ideally, Pig would compile all the versions of shims into its jars, and the pig jar would not include hadoop. Then the user would include the right hadoop on the path (or bin/pig would do it for him), and everything would happen automagically.
By bundling hadoop into the jar, however, switching hadoop versions on the fly is next to impossible (or at least I don't know how): with multiple jars on the classpath, the classloader resolves a class from the first matching entry, so finding the right resource becomes fraught with peril.
If existing deployments need a single pig.jar without a hadoop dependency, it might be possible to create a new target (pig-all) that would create a statically bundled jar; but I think the default behavior should be to not bundle, build all the shims, and use whatever hadoop is on the path.
The current patch is written as-is so that it can be applied to trunk, enabling people to compile statically; switching to a dynamic compile later on (after 0.4, probably) would only require a change to the ant build files.