I'm deliberately avoiding permission checks in this code path. In terms of security, I feel that this is no worse than what we have right now.
A shared cache that anyone can write to is indeed worse. Today jars are uploaded to HDFS into a private staging directory where no other normal user can interfere. If the staging directory were to become publicly writeable, it would become trivial to compromise all users trying to run the same pig jar, using a scheme like the one Koji Noguchi pointed out. I don't see how one can accomplish the same level of havoc today. Even if there's a window in the local filesystem where one can hijack a jar, that requires access to the same node where the user is launching the job. In the publicly-writeable shared cache scheme, one only needs access to HDFS from any node, and clients on all nodes using the shared cache can be compromised.
Besides malicious users, clients can also accidentally make the shared cache ineffective. For example, a user with a restrictive umask (e.g., 077) uploads a jar to the shared cache, and all the directories and files are created such that others can't read them. Because the permissions are incorrect, no other user can share that file, and no other user's file whose hash happens to have the same initial digit(s) can be uploaded to the shared cache. And then there's the client that deletes files in use by other clients, breaking their jobs.
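The umask problem can be seen with plain POSIX-style mode arithmetic. This is only a rough sketch: it assumes the client requests mode 777 for cache directories and 666 for files before the umask is applied, and the function names are made up for illustration rather than taken from Hadoop.

```python
def apply_umask(requested: int, umask: int) -> int:
    """Return the effective permission bits after clearing the umask bits."""
    return requested & ~umask & 0o777


def others_can_use_dir(mode: int) -> bool:
    """True if 'other' users may both list (r) and traverse (x) the directory."""
    return mode & 0o005 == 0o005


def others_can_read_file(mode: int) -> bool:
    """True if 'other' users may read the file."""
    return mode & 0o004 == 0o004


if __name__ == "__main__":
    # Assumed requested modes: 777 for directories, 666 for files.
    for umask in (0o022, 0o077):
        dir_mode = apply_umask(0o777, umask)
        file_mode = apply_umask(0o666, umask)
        print(
            f"umask {umask:03o}: dir {dir_mode:03o} "
            f"(usable by others: {others_can_use_dir(dir_mode)}), "
            f"file {file_mode:03o} "
            f"(readable by others: {others_can_read_file(file_mode)})"
        )
```

With umask 077 the directory comes out as 700, so every other user is locked out of that hash-prefix directory, not just the one jar inside it.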
In short, publicly writeable shared caches are going to be problematic, especially in secure setups. As such, I think there should at least be some documentation describing the risks of enabling it and how it could be used in a read-only manner for sharing securely, i.e., the shared cache is publicly readable but only writeable by admins who manually maintain its entries.
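As a rough illustration of what the read-only setup would mean in permission terms, here is a sketch of a mode check. It assumes only the owning admin account may write (so group write is rejected too, which a real deployment might relax), and the function name is hypothetical.

```python
def is_safe_read_only_mode(mode: int, is_dir: bool) -> bool:
    """Check that an entry is publicly readable but writeable only by its owner.

    Assumption: group write counts as unsafe, since only the admin owner
    is supposed to maintain entries in the shared cache.
    """
    # Directories need r and x for others; files only need r.
    needed = 0o005 if is_dir else 0o004
    publicly_readable = mode & needed == needed
    # Neither group nor other may have the write bit set.
    no_shared_write = mode & 0o022 == 0
    return publicly_readable and no_shared_write


if __name__ == "__main__":
    for mode, is_dir in ((0o755, True), (0o777, True), (0o700, True),
                         (0o644, False), (0o666, False)):
        kind = "dir" if is_dir else "file"
        print(f"{kind} {mode:03o}: safe={is_safe_read_only_mode(mode, is_dir)}")
```

Under these assumptions, directories at 755 and files at 644 pass, while 777 and 666 (anyone can write) and 700 (nobody else can read) are rejected.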