I'd guess there are many users who do not want Hadoop to limit tasks (be they Java or streaming). When a cluster exists to run specific tasks, it seems reasonable that those tasks be allowed to use all of its resources.
On this issue: a default ulimit -v will cause some pretty strange failures while still failing to prevent resource exhaustion in other cases. For example, some tasks mmap multi-GB files but touch only a few pages. Others link libraries that require hundreds of MB of address space for code that's never executed (and thus never read). Still others fork off lots of sub-processes and so ultimately consume more RAM than any single process's virtual address space. (Btw, these examples are all taken from our deployed Hadoop apps.)
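The mmap case is easy to demonstrate. Here's a small sketch (not from any Hadoop code; the file sizes and the 512 MB cap are arbitrary illustrations): a child process maps a 1 GB sparse file read-only and touches a single page, so its resident memory stays tiny, yet an address-space limit of the kind ulimit -v imposes (RLIMIT_AS) refuses the mapping outright.

```python
import subprocess, sys, tempfile, textwrap

# The child maps a 1 GB sparse file read-only and touches a single page,
# so resident memory stays tiny even though address space usage is large.
child = textwrap.dedent("""
    import mmap, sys
    size = 1 << 30
    with open(sys.argv[1], 'rb') as f:
        try:
            m = mmap.mmap(f.fileno(), size, prot=mmap.PROT_READ)
            _ = m[0]          # touch one page only
            m.close()
            print('ok')
        except (OSError, MemoryError):
            print('mmap failed')
""")

with tempfile.NamedTemporaryFile() as f:
    f.truncate(1 << 30)   # sparse: allocates (almost) no disk blocks

    # No limit: the mapping succeeds even though only one page is read.
    unlimited = subprocess.run([sys.executable, '-c', child, f.name],
                               capture_output=True, text=True).stdout.strip()

    # Under a 512 MB address-space cap (what ulimit -v enforces), the very
    # same command fails up front, in a way few apps handle gracefully.
    def cap_as():
        import resource
        lim = 512 << 20
        resource.setrlimit(resource.RLIMIT_AS, (lim, lim))

    limited = subprocess.run([sys.executable, '-c', child, f.name],
                             preexec_fn=cap_as,
                             capture_output=True, text=True).stdout.strip()

print(unlimited, limited)
```

A limit on resident memory (ulimit -m) would not trip on this task at all, which is part of why no single limit fits every case.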
Further, when these tasks hit the virtual address space limit, they'll likely fail in confusing, difficult-to-debug ways, since few apps are written to handle that case gracefully. Worse, the same commands work fine when run outside of Hadoop, so a user would never suspect the limit without reading the streaming code and noticing that it imposes one. (Contrast this with the -Xmx limit, which can actually nudge the garbage collector to be more aggressive, is a commonly used Java option, and produces a relatively clear OutOfMemoryError on failure.)
This is why I don't think ulimit -v is the right approach in general. That doesn't mean it's wrong for specific situations, which is why the original proposal for a wrapper script (possibly one mandated by the cluster admin) is attractive. In other situations, ulimit -m might be more effective than ulimit -v, some jail-like mechanism might be employed, and of course Windows users will need something else entirely. Adding support to streaming for every way resources might be limited does not seem practical.
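To make the wrapper idea concrete, here is a hypothetical sketch (the function name, default, and choice of RLIMIT_AS are all illustrative, not part of any proposed patch): the admin's wrapper picks whatever mechanism suits the site, and streaming just runs the wrapped command, knowing nothing about limits.

```python
import os, resource, sys

# Hypothetical admin-supplied wrapper: fork, apply the site's chosen
# resource limit, then exec the real task command. RLIMIT_AS corresponds
# to ulimit -v; a site could substitute RLIMIT_RSS (ulimit -m) or a
# jail-like mechanism here without streaming knowing the difference.

def run_limited(argv, vmem_mb=2048):
    """Run argv with its virtual address space capped; return exit code."""
    pid = os.fork()
    if pid == 0:
        try:
            cap = vmem_mb * 1024 * 1024
            resource.setrlimit(resource.RLIMIT_AS, (cap, cap))
            os.execvp(argv[0], argv)      # replaced by the real task
        finally:
            os._exit(127)                 # only reached if exec fails
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)
```

For example, `run_limited([sys.executable, '-c', 'pass'], 512)` exits cleanly, while a command that tries to allocate a gigabyte under the same 512 MB cap fails with a nonzero status.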
(I realize this would all have been much more useful to bring up in the original issue, and apologize for not following that one more closely. As one path forward, we could reopen 2765 and continue this discussion there.)