Description
Found and fixed several bugs involving Hadoop archives:
- In makeQualified(), the sloppy conversion from Path to URI and back mangles the path when it contains a character that needs escaping (see the sketch after this list).
- fileStatusInIndex() may have to read more than one segment of the index; the LineReader and the count of bytes read need to be reset for each block.
- har:// connections cannot be keyed by (scheme, authority, username) alone; the path is significant as well. Caching them that way limits a Hadoop client to opening one archive per filesystem. It appears safe not to cache them, since they wrap another connection that does the actual networking.
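The path-mangling in the first item comes from round-tripping a Path through a raw URI string. The following is a minimal, self-contained illustration of the failure mode, not the HarFileSystem patch itself; the class name, the "namenode" authority, and the sample path are made up for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class HarPathEscapeDemo {
    public static void main(String[] args) throws URISyntaxException {
        // A file name containing a character that must be escaped inside a URI.
        String rawPath = "/user/archives/report 2009.har";

        // Sloppy round trip: pasting scheme, authority and the raw path into one
        // string either fails outright (a space is illegal in a URI) or, for
        // already-escaped sequences such as "%2D", later decodes them and so
        // mangles the original path.
        try {
            URI broken = new URI("har://namenode" + rawPath);
            System.out.println("unexpected: " + broken);
        } catch (URISyntaxException e) {
            System.out.println("single-string form rejected: " + e.getReason());
        }

        // Building the URI from its components lets java.net.URI apply the
        // escaping, so the original path survives the round trip intact.
        URI qualified = new URI("har", "namenode", rawPath, null, null);
        System.out.println(qualified);           // har://namenode/user/archives/report%202009.har
        System.out.println(qualified.getPath()); // /user/archives/report 2009.har
    }
}
```

The multi-argument java.net.URI constructors quote illegal characters themselves, whereas the single-string constructor expects an already-escaped URI, which is why qualifying a path by string concatenation is fragile.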
Attachments
Issue Links
- incorporates: MAPREDUCE-1010 Adding tests for changes in archives. (Resolved)
- is cloned by: HADOOP-6231 Allow caching of filesystem instances to be disabled on a per-instance basis (Resolved)