Found and fixed several bugs involving Hadoop archives:
- In makeQualified(), the careless conversion from Path to URI and back mangles the path if it contains a character that requires escaping.
- fileStatusInIndex() may have to read more than one segment of the index. The LineReader and the count of bytes read need to be reset for each one.
- har:// connections cannot be indexed by (scheme, authority, username) alone; the path is significant as well. Caching them this way limits a Hadoop client to opening one archive per filesystem. It seems safe not to cache them, since they wrap another connection that does the actual networking.
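
To illustrate the first bug, here is a minimal sketch of how a path can be mangled by a round trip through an escaped URI string. It uses only java.net.URI rather than Hadoop's Path class, and the class name, method names, and the "host" authority are hypothetical, so it shows the general failure mode and not necessarily the exact code path in makeQualified().

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriRoundTrip {
    // Builds a har:// URI from an unescaped path. The multi-argument URI
    // constructors quote illegal characters, and always quote '%'.
    // "host" is a placeholder authority used only for this illustration.
    static URI toUri(String path) {
        try {
            return new URI("har", "host", path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Lossy round trip: the already-escaped form is fed back in as if it
    // were an unescaped path, so every '%' gets escaped a second time.
    static String sloppyRoundTrip(String path) {
        String escaped = toUri(path).getRawPath();
        return toUri(escaped).getPath();
    }

    // Safe round trip: getPath() returns the decoded path, which can be
    // escaped again without changing its meaning.
    static String carefulRoundTrip(String path) {
        String decoded = toUri(path).getPath();
        return toUri(decoded).getPath();
    }

    public static void main(String[] args) {
        String path = "/dir/a b";
        System.out.println("original: " + path);
        System.out.println("sloppy:   " + sloppyRoundTrip(path));
        System.out.println("careful:  " + carefulRoundTrip(path));
    }
}
```

Paths made only of characters that need no escaping survive both round trips, which is likely why this kind of bug goes unnoticed until a file name contains a space or a percent sign.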
MAPREDUCE-1010 Adding tests for changes in archives.
- is cloned by
HADOOP-6231 Allow caching of filesystem instances to be disabled on a per-instance basis