Affects Version/s: None
Fix Version/s: 2.2.0
With FileSystem symlink support incoming in HADOOP-8040, some clients will wish to not transparently resolve symlinks. This is somewhat similar to O_NOFOLLOW in open(2).
The rationale is a security model in which a user can invoke a third-party service, running as a service user, to operate on the user's data. For instance, users might want to use Hive to query data in their homedirs, where Hive runs as the Hive user and the data is readable by the Hive user. This leads to a security issue with symlinks:
- User Mallory invokes Hive to process data files in /user/mallory/hive/
- Hive checks permissions on the files in /user/mallory/hive/ and allows the query to proceed.
- RACE: Mallory replaces the files in /user/mallory/hive with symlinks that point to user Ann's Hive files in /user/ann/hive. These files aren't readable by Mallory, but she can create whatever symlinks she wants in her own scratch directory.
- Hive's MR jobs happily resolve the symlinks and access Ann's private data.
This is also potentially useful for clients using FileContext, so let's add it there too.
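For comparison, the O_NOFOLLOW-style behavior described above can be sketched on a local filesystem with the JDK's java.nio.file API. This is only an illustration of the semantics, not the proposed FileSystem/FileContext API; the class and method names (NoFollowOpen, openNoFollow) are hypothetical.

```java
import java.io.IOException;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: illustrates "open without following symlinks"
// on a local filesystem. Not part of Hadoop.
final class NoFollowOpen {
    private NoFollowOpen() {}

    /**
     * Opens the file for reading, but fails with an IOException if the
     * final path component is a symbolic link (similar to O_NOFOLLOW).
     * Symlinks in parent directories are still followed.
     */
    static SeekableByteChannel openNoFollow(Path path) throws IOException {
        return Files.newByteChannel(
                path, StandardOpenOption.READ, LinkOption.NOFOLLOW_LINKS);
    }
}
```

With this kind of open, a service in Hive's position would get an error in the last step of the scenario above instead of silently reading Ann's file, because the check and the refusal happen in the same open call rather than in a separate permission check beforehand. As with O_NOFOLLOW, only the final path component is covered.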