Just to clarify, the current build.xml behavior seems to be incorrect, or at least inconsistent with what you described. If I put a third-party jar in lib/, it is copied into the release tar, and if I put a third-party native library with "hadoop" in its name (like libhadoopgplcompression.so) under lib/native, it is included as well.
So I would argue that it is reasonable to include all jars and native libraries under lib/ in the tarball, for two reasons:
- it is the user's conscious decision to copy files under lib/, so the inclusion of these files is "by choice".
- the hadoop script currently includes all jars under lib/ in the classpath, and all native libraries under lib/native/<arch>/ in the java.library.path system property. It is reasonable for a user to expect that if he/she runs "ant tar" and untars the tarball somewhere else, it behaves exactly the same as in the original location.
I am fine with adding a test to verify the behavior. How about just running md5sum over the set of files under lib/ and over the same files at the package's final destination, and comparing the results?
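
To make the proposed check concrete, here is a minimal sketch in Python of the same idea as running md5sum over both trees. It is not an existing test in the build; the function names md5_of_tree and diff_trees are hypothetical, and the two directories are whatever paths the source lib/ and the packaged lib/ end up at.

```python
import hashlib
from pathlib import Path


def md5_of_tree(root):
    """Map each file's path (relative to root) to its MD5 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.md5(path.read_bytes()).hexdigest()
    return digests


def diff_trees(source_lib, packaged_lib):
    """Compare a source lib/ tree against the packaged copy.

    Returns (missing, changed): files present in the source but absent
    from the package, and files present in both whose contents differ.
    """
    src = md5_of_tree(source_lib)
    dst = md5_of_tree(packaged_lib)
    missing = sorted(p for p in src if p not in dst)
    changed = sorted(p for p in src if p in dst and src[p] != dst[p])
    return missing, changed
```

A test would call diff_trees on the source lib/ and on the lib/ directory of the unpacked tarball and fail if either returned list is non-empty. Extra files that exist only in the package are deliberately ignored here, since the build may add its own jars.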