Hadoop Map/Reduce
MAPREDUCE-1795

add error option if file-based record-readers fail to consume all input (e.g., concatenated gzip, bzip2)

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      When running MapReduce with concatenated gzip files as input, only the first part ("member" in gzip spec parlance, http://www.ietf.org/rfc/rfc1952.txt) is read; the remainder is silently ignored. As a first step toward fixing that, this issue will add a configurable option to throw an error in such cases.

      MAPREDUCE-469 is the tracker for the more complete fix/feature, whenever that occurs.
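      The symptom is easy to reproduce outside Hadoop with plain java.util.zip (a minimal, hypothetical sketch; the class name and sample strings are made up, the APIs are standard). On a JDK affected by Sun bug 4691425, GZIPInputStream decodes only the first member below; a concatenation-aware decoder emits both lines.

      import java.io.ByteArrayInputStream;
      import java.io.ByteArrayOutputStream;
      import java.io.IOException;
      import java.util.zip.GZIPInputStream;
      import java.util.zip.GZIPOutputStream;

      public class ConcatGzipDemo {
          // Compress a string into one complete gzip member.
          static byte[] gzip(String s) throws IOException {
              ByteArrayOutputStream bos = new ByteArrayOutputStream();
              GZIPOutputStream gz = new GZIPOutputStream(bos);
              gz.write(s.getBytes("UTF-8"));
              gz.close();
              return bos.toByteArray();
          }

          public static void main(String[] args) throws IOException {
              // Two independent members, byte-concatenated like `cat a.gz b.gz`.
              ByteArrayOutputStream concat = new ByteArrayOutputStream();
              concat.write(gzip("first member\n"));
              concat.write(gzip("second member\n"));

              GZIPInputStream in = new GZIPInputStream(
                  new ByteArrayInputStream(concat.toByteArray()));
              ByteArrayOutputStream out = new ByteArrayOutputStream();
              byte[] buf = new byte[4096];
              for (int n; (n = in.read(buf)) != -1; ) {
                  out.write(buf, 0, n);
              }

              // With the 4691425 bug, this prints only "first member";
              // with a concatenation-aware decoder it prints both lines.
              System.out.print(out.toString("UTF-8"));
          }
      }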

          Activity

          Greg Roelofs added a comment -

          It appears that the initial target location for the fix, in LineRecordReader's next() method (0.20.x) or nextKeyValue() (trunk), isn't actually workable due to buffering. Ideally one would be able to check getFilePosition() after hitting the end of the first member/zlib-stream, notice that it's not equal to the end of file, and optionally throw an error. However, the file position, in general, is beyond the end of the zlib-stream, and for small concatenated inputs it may actually be at the end of file even though the logical offset isn't. There doesn't appear to be a way to get at the logical "stream offset" at this level, though if anyone is aware of a way, please let me know.
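          For illustration, the buffering effect described above is reproducible with a byte counter standing in for getFilePosition() (a hypothetical sketch, not Hadoop code; the class names are made up): on a small concatenated input, the decompressor's read-ahead drains the underlying stream as soon as the first member is decoded, so a position-versus-length check learns nothing.

          import java.io.ByteArrayInputStream;
          import java.io.ByteArrayOutputStream;
          import java.io.FilterInputStream;
          import java.io.IOException;
          import java.io.InputStream;
          import java.util.zip.GZIPInputStream;
          import java.util.zip.GZIPOutputStream;

          public class ReadAheadDemo {
              // Counts bytes consumed from the underlying stream; a stand-in
              // for the file position a record reader could observe.
              static class CountingInputStream extends FilterInputStream {
                  long pos;
                  CountingInputStream(InputStream in) { super(in); }
                  @Override public int read() throws IOException {
                      int b = in.read(); if (b != -1) pos++; return b;
                  }
                  @Override public int read(byte[] b, int off, int len) throws IOException {
                      int n = in.read(b, off, len); if (n > 0) pos += n; return n;
                  }
              }

              static byte[] gzip(String s) throws IOException {
                  ByteArrayOutputStream bos = new ByteArrayOutputStream();
                  GZIPOutputStream gz = new GZIPOutputStream(bos);
                  gz.write(s.getBytes("UTF-8"));
                  gz.close();
                  return bos.toByteArray();
              }

              public static void main(String[] args) throws IOException {
                  ByteArrayOutputStream concat = new ByteArrayOutputStream();
                  concat.write(gzip("first member\n"));
                  concat.write(gzip("second member\n"));
                  byte[] file = concat.toByteArray();

                  CountingInputStream raw =
                      new CountingInputStream(new ByteArrayInputStream(file));
                  GZIPInputStream gz = new GZIPInputStream(raw);
                  while (gz.read() != -1) { }   // decode to logical end-of-stream

                  // Even on a JDK that decodes only the first member, raw.pos is
                  // typically already file.length here: the codec's internal
                  // read-ahead buffer (512 bytes by default) has swallowed the
                  // second member's bytes, so comparing the file position to the
                  // file length cannot flag the unconsumed input.
                  System.out.println("underlying bytes consumed: "
                      + raw.pos + " / " + file.length);
              }
          }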

          In the meantime, we're planning to simply fix the bug (i.e., MAPREDUCE-469), at least for the native-zlib codec. A workaround for the Java-zlib alternative is in the 30-AUG-2006 comment on Sun's bug 4691425 (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4691425), but without any explicit license that would allow us to redistribute it as part of Hadoop. And bzip2 reportedly is already fixed on the trunk (HADOOP-4012).

          Barring any new information, I plan to resolve this issue as invalid.

          Greg Roelofs added a comment -

          Per previous comment, we're going to fix the underlying issue instead (i.e., make decompressors support concatenated streams). See MAPREDUCE-469.
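          For the flavor of that fix, here is a minimal sketch at the java.util.zip.Inflater level over raw zlib streams (an illustration under assumed names, not the MAPREDUCE-469 patch, which works at the codec layer): when one stream ends with input still unconsumed, reset the inflater and keep decoding instead of silently stopping.

          import java.io.ByteArrayOutputStream;
          import java.util.Arrays;
          import java.util.zip.DataFormatException;
          import java.util.zip.DeflaterOutputStream;
          import java.util.zip.Inflater;

          public class ConcatZlibInflate {
              // Decompress a buffer holding one or more back-to-back zlib streams.
              public static byte[] inflateAll(byte[] input) throws DataFormatException {
                  Inflater inf = new Inflater();        // zlib framing (RFC 1950)
                  inf.setInput(input);
                  ByteArrayOutputStream out = new ByteArrayOutputStream();
                  byte[] buf = new byte[4096];
                  while (!inf.finished() || inf.getRemaining() > 0) {
                      if (inf.finished()) {
                          // One stream ended but bytes remain: restart on the
                          // leftovers instead of silently dropping them.
                          input = Arrays.copyOfRange(
                              input, input.length - inf.getRemaining(), input.length);
                          inf.reset();
                          inf.setInput(input);
                      }
                      int n = inf.inflate(buf);
                      if (n == 0 && inf.needsInput()) break;   // truncated input
                      out.write(buf, 0, n);
                  }
                  inf.end();
                  return out.toByteArray();
              }

              public static void main(String[] args) throws Exception {
                  ByteArrayOutputStream two = new ByteArrayOutputStream();
                  for (String s : new String[] { "first stream\n", "second stream\n" }) {
                      DeflaterOutputStream d = new DeflaterOutputStream(two);
                      d.write(s.getBytes("UTF-8"));
                      d.finish();           // end this zlib stream, keep `two` open
                  }
                  // Prints both lines; a reader that stops at the first
                  // end-of-stream would print only the first.
                  System.out.print(new String(inflateAll(two.toByteArray()), "UTF-8"));
              }
          }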


    People

    • Assignee: Greg Roelofs
    • Reporter: Greg Roelofs
    • Votes: 0
    • Watchers: 3

    Dates

    • Created:
    • Updated:
    • Resolved:
