Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.18.0, 0.18.1, 0.18.2, 0.18.3, 0.19.0, 0.19.1, 0.19.2, 0.20.0, 0.20.1
    • Fix Version/s: 0.20.2
    • Component/s: fs
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Bugs fixed for Hadoop archives: character escaping in paths, LineReader and file system caching.

Description

      Found and fixed several bugs involving Hadoop archives:

    • In makeQualified(), the sloppy conversion from Path to URI and back mangles the path if it contains an escape-worthy character (see the sketch after this list).
      • It's possible that fileStatusInIndex() may have to read more than one segment of the index. The LineReader and count of bytes read need to be reset for each block.
      • har:// connections cannot be indexed by (scheme, authority, username) – the path is significant as well. Caching them in this way limits a hadoop client to opening one archive per filesystem. It seems to be safe not to cache them, since they wrap another connection that does the actual networking.
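
To make the first bullet concrete: below is a minimal, hypothetical sketch using plain java.net.URI (not the actual Hadoop Path code) showing how a sloppy string round trip double-escapes a path. The class name is invented for illustration.

    import java.net.URI;

    public class PathManglingSketch {
      public static void main(String[] args) throws Exception {
        // A path containing a character that needs URI escaping.
        URI first = new URI("file", null, "/user/test/file with spaces", null);
        System.out.println(first); // file:/user/test/file%20with%20spaces

        // Sloppy round trip: feed the already-escaped string form back in
        // as a raw path. The multi-argument URI constructor always quotes
        // '%', so the path gets escaped a second time and the original
        // name is lost.
        URI second = new URI("file", null, first.toString().substring(5), null);
        System.out.println(second); // file:/user/test/file%2520with%2520spaces
      }
    }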
Attachments

      1. HADOOP-6097.patch
        2 kB
        Ben Slusky
      2. HADOOP-6097-0.20.patch
        4 kB
        Mahadev konar
      3. HADOOP-6097-0.20.patch
        4 kB
        Mahadev konar
      4. HADOOP-6097-0.20.patch
        1 kB
        Mahadev konar
      5. HADOOP-6097-v2.patch
        1 kB
        Tom White

        Issue Links

          Activity

          Ben Slusky created issue -
          Ben Slusky made changes -
          Field Original Value New Value
          Attachment HADOOP-6097.patch [ 12411434 ]
          Ben Slusky made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Vladimir Klimontovich added a comment -

          The patch contains:

          +    /** Hadoop Archive connections cannot be cached by (scheme, authority,
          +     * username) – the path is significant as well. Come to think of it, they
          +     * probably don't need to be cached at all, since they wrap another
          +     * connection that does the actual networking.
          +     */
          +    if (scheme.equals("har")) {
          +      return createFileSystem(uri, conf);
          +    }
          +
               return CACHE.get(uri, conf);
             }

          I don't think this is a good approach.

          I'd suggest introducing a property fs.[fs-name].impl.disable.cache, and not using the cache when this property is present and set to true.
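
          For reference, a minimal sketch of how the suggested check might replace the har special case in the quoted patch (scheme, uri, conf, createFileSystem, and CACHE are the identifiers from the patch above; Configuration.getBoolean() is the existing Hadoop API):

              // Sketch only, not the committed code: consult a per-scheme
              // property before touching the cache.
              String disableCacheName = "fs." + scheme + ".impl.disable.cache";
              if (conf.getBoolean(disableCacheName, false)) {
                // Caching disabled for this scheme; always create a fresh instance.
                return createFileSystem(uri, conf);
              }
              return CACHE.get(uri, conf);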

          Koji Noguchi added a comment -

          Ben, for the caching part,
          HarFileSystem.java has a comment

              the uri of Har is
              har://underlyingfsscheme-host:port/archivepath.
              or
              har:///archivepath.

          So it already creates a cache for each harpath?

          Ben Slusky added a comment -

          Koji, I'm not entirely sure what you're saying. When using the har filesystem, you create a connection for each harpath. This is not a network connection itself, but it wraps a network connection to the underlying filesystem, which is cached. I think this caching is sufficient, and it is not necessary to cache the wrapping har filesystem "connection".

          Ben Slusky added a comment -

          Vladimir, I will implement that and submit a new patch.

          Mahadev konar added a comment -

          Ben,
          Koji is right. The caching here is just filesystem caching. The filesystem cache keeps an entry per scheme, so for the har filesystem it caches on the scheme and harpath, meaning that a har filesystem is uniquely identified by har:///archivepath. The connection caching has nothing to do with this filesystem cache; connection caching is done in the RPC layer, and you would not be able to cache connections at the har filesystem layer.

          Ben Slusky added a comment -

          Mahadev,

          Yes, we are talking about the filesystem cache, not the connection cache. Sorry for the confusion. As you say, har filesystems are identified by har://authority/archivepath, but the filesystem cache does not index filesystems by path – only by scheme, authority, and username. This leads directly to the misbehavior I described above. I worked around it by not caching har filesystems. Another option is to cache by scheme, authority, username, AND path. I could submit a patch to do that if anyone has a strong preference for it.
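
          A hypothetical, self-contained illustration of the collision Ben describes (names invented; the real cache is FileSystem.CACHE, keyed on exactly these fields):

              import java.net.URI;
              import java.util.HashMap;
              import java.util.Map;

              public class CacheKeyCollisionSketch {
                // The existing key shape: (scheme, authority, username) -- no path.
                record Key(String scheme, String authority, String user) {}

                public static void main(String[] args) {
                  Map<Key, String> cache = new HashMap<>();
                  URI a = URI.create("har:///user/knoguchi/test.har");
                  URI b = URI.create("har:///user/knoguchi/test2.har");
                  cache.put(new Key(a.getScheme(), a.getAuthority(), "knoguchi"), "fs for test.har");
                  cache.put(new Key(b.getScheme(), b.getAuthority(), "knoguchi"), "fs for test2.har");
                  // Both archives collapse to one key, so the second put evicts
                  // the first: a client can effectively open only one archive
                  // per filesystem cache entry.
                  System.out.println(cache.size()); // prints 1
                }
              }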

          Koji Noguchi added a comment -

          Ben, sorry about that. I was wrong.

          I got confused since HarFileSystem.getUri() returns har://archivepath,
          I mistakenly thought FileSystem.CACHE would use the path as part of the hash key.

          $ hadoop dfs -ls har:///user/knoguchi/test.har har:///user/knoguchi/test2.har                                
          Found 1 items
          drw-r--r--   - knoguchi users          0 2009-08-18 18:52 /user/knoguchi/test.har/user
          ls: Invalid file name: /user/knoguchi/test2.har in har:///user/knoguchi/test.har
          
          
          $ hadoop dfs -ls har:///user/knoguchi/test2.har har:///user/knoguchi/test.har
          Found 1 items
          drw-------   - knoguchi users          0 2009-08-17 19:15 /user/knoguchi/test2.har/user
          ls: Invalid file name: /user/knoguchi/test.har in har:///user/knoguchi/test2.har
          $ 
          
          Chris Douglas added a comment -

          I agree with Vladimir; checking the scheme for "har" and dodging the cache is not a sufficiently general solution for FileSystem. As noted in the patch comment, the har filesystem isn't a compelling enough use case to justify a four-level cache, either.

          introduce a property fs.[fs-name].impl.disable.cache, and don't use the cache when this property is present and set to true

          +1 However, I'd suggest that it be part of a separate issue.

          Chris Douglas made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Assignee Ben Slusky [ sluskyb ]
          Tom White added a comment -

          I'd suggest that it be part of a separate issue.

          I've opened HADOOP-6097 to add support for disabling the cache on a per-filesystem basis.

          I've also regenerated the patch for this issue. It could really do with a unit test.

          Tom White made changes -
          Attachment HADOOP-6097-v2.patch [ 12418348 ]
          Tom White added a comment -

          I meant HADOOP-6231 for the cache issue.

          Vladimir Klimontovich made changes -
          Link This issue is cloned as HADOOP-6231 [ HADOOP-6231 ]
          Owen O'Malley made changes -
          Fix Version/s 0.20.2 [ 12314203 ]
          Fix Version/s 0.20.1 [ 12313866 ]
          Mahadev konar added a comment -

          Tom, Ben,
          this jira has been marked for a fix in the 0.20 branch. HADOOP-6231 has been committed only to trunk (i.e. 0.21)... should this jira be moved to 0.21?

          Tom White added a comment -

          The latest patch that I generated was actually for trunk, so it would need a 0.20 version if folks want a fix in that version.

          Mahadev konar added a comment -

          Tom, the issue is that we would also need a backport of HADOOP-6231, which I don't think qualifies to get into the 0.20 branch, since it's not a bug fix. We can attach a patch on HADOOP-6231 for the 0.20 branch (it wouldn't be committed to the Apache branch, but others can download it if they want to include it). I can do that for now...

          Ben Slusky added a comment -

          Mahadev, HADOOP-6231 is a bug fix – see Koji's last comment above. I attached a patch for the 0.20 branch there.

          Ben Slusky made changes -
          Affects Version/s 0.20.1 [ 12313866 ]
          Affects Version/s 0.19.2 [ 12313650 ]
          Mahadev konar added a comment -

          Ben, I did read through the comments... I think you are right... HADOOP-6231 can be treated as a bug fix for 0.20.*. I will try committing it to the 0.20 branch after running the tests and such...

          I'll generate a 0.20 patch for this jira as well... it probably needs a test case...

          Mahadev konar added a comment -

          ant test passes with this patch, except for TestHdfsProxy in contrib, which fails for the following unrelated reason:

          Testcase: testHdfsProxyInterface took 6.078 sec
                  Caused an ERROR
          org/apache/commons/cli/ParseException
          java.lang.NoClassDefFoundError: org/apache/commons/cli/ParseException
                  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:59)
                  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
                  at org.apache.hadoop.hdfsproxy.TestHdfsProxy.testHdfsProxyInterface(TestHdfsProxy.java:222)
          
          Mahadev konar added a comment -

          Filed MAPREDUCE-1010 for adding the testcase.

          Mahadev konar made changes -
          Link This issue incorporates MAPREDUCE-1010 [ MAPREDUCE-1010 ]
          Mahadev konar made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12418348/HADOOP-6097-v2.patch
          against trunk revision 816794.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/14/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/14/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/14/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h1.grid.sp2.yahoo.net/14/console

          This message is automatically generated.

          Mahadev konar added a comment -

          patch for the 0.20 branch..

          Mahadev konar made changes -
          Attachment HADOOP-6097-0.20.patch [ 12420209 ]
          Mahadev konar added a comment -

          this patch includes the test in MAPREDUCE-1010.

          Mahadev konar made changes -
          Attachment HADOOP-6097-0.20.patch [ 12420224 ]
          Koji Noguchi added a comment -

          Do we want a test case of handling two har files at once?
          Other than that, +1.

          Mahadev konar added a comment -

          since the core problem was fixed in HADOOP-6231, which already adds a test for it, I don't think we need specific tests for har...

          Koji Noguchi added a comment -

          since the core problem was fixed in HADOOP-6231, which already adds a test for it, I don't think we need specific tests for har...

          I wanted a testcase so that it would catch it if someone takes out

          <name>fs.har.impl.disable.cache</name>
          <value>true</value>
          

          from core-default.xml
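
          A minimal sketch of such a guard test (hypothetical class and method names; Hadoop's Configuration loads core-default.xml by default, so the assertion fails if the property is removed or flipped to false):

              import junit.framework.TestCase;
              import org.apache.hadoop.conf.Configuration;

              public class TestHarCacheDisabledSketch extends TestCase {
                public void testHarCacheDisabledInDefaults() {
                  Configuration conf = new Configuration();
                  // getBoolean falls back to false, so this fails if someone
                  // drops fs.har.impl.disable.cache from core-default.xml.
                  assertTrue(conf.getBoolean("fs.har.impl.disable.cache", false));
                }
              }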

          Mahadev konar added a comment -

          not a bad test to add at all... will add it ....

          Mahadev konar added a comment -

          Patch with the changes corresponding to MAPREDUCE-1010 for the hadoop-0.20 branch.

          Mahadev konar made changes -
          Attachment HADOOP-6097-0.20.patch [ 12422737 ]
          Chris Douglas added a comment -

          +1

          I committed this. Thanks, Ben, Tom, and Mahadev!

          As part of the commit, I created a new 0.20.2 section and moved issues committed to 0.20.2 from trunk into that section.

          Chris Douglas made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Resolution Fixed [ 1 ]
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #64 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/64/)
          HADOOP-6097. Fix Path conversion in makeQualified and reset LineReader byte
          count at the start of each block in Hadoop archives. Contributed by Ben Slusky,
          Tom White, and Mahadev Konar

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #88 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/88/)
          HADOOP-6097. Fix Path conversion in makeQualified and reset LineReader byte
          count at the start of each block in Hadoop archives. Contributed by Ben Slusky,
          Tom White, and Mahadev Konar

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #134 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/134/)
          HADOOP-6097. Fix Path conversion in makeQualified and reset LineReader byte
          count at the start of each block in Hadoop archives. Contributed by Ben Slusky,
          Tom White, and Mahadev Konar

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #120 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/120/)

          Robert Chansler made changes -
          Release Note Bugs fixed for Hadoop archives: character escaping in paths, LineReader and file system caching.

            People

             • Assignee:
               Ben Slusky
             • Reporter:
               Ben Slusky
             • Votes:
               0
             • Watchers:
               8

              Dates

               • Created:
               • Updated:
               • Resolved:
