[HADOOP-4640] Add ability to split text files compressed with lzo - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Trivial
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.20.0
Component/s: io
Labels: None
Hadoop Flags: Reviewed

Description

Currently, any file compressed with lzop is processed by a single mapper. This is a shame, since the LZO algorithm is well suited to large log files and similar common Hadoop data sets: its compression ratio is not the best available, but decompression is very fast. Because LZO writes compressed data in blocks, it should be possible to write an input format that splits these files at block boundaries.
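
To illustrate the idea, here is a minimal sketch of block-aligned splitting. It assumes the byte offsets at which compressed blocks start are already known (for example, from scanning the file once); how those offsets are obtained is not shown. The class `BlockAlignedSplits`, the method `compute`, and its parameters are hypothetical names for illustration and are not taken from the attached patch or from the Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: groups compressed blocks into splits of roughly targetSize bytes. */
final class BlockAlignedSplits {

    /**
     * @param blockStarts ascending byte offsets where each compressed block begins (assumed known)
     * @param fileLength  total length of the compressed file in bytes
     * @param targetSize  desired minimum split size in bytes
     * @return list of {start, length} pairs; every start is a block boundary
     */
    static List<long[]> compute(long[] blockStarts, long fileLength, long targetSize) {
        List<long[]> splits = new ArrayList<>();
        if (blockStarts.length == 0) {
            return splits;
        }
        long splitStart = blockStarts[0];
        for (int i = 1; i < blockStarts.length; i++) {
            // Close the current split at the first block boundary that reaches the target size.
            if (blockStarts[i] - splitStart >= targetSize) {
                splits.add(new long[] {splitStart, blockStarts[i] - splitStart});
                splitStart = blockStarts[i];
            }
        }
        // The last split runs from its starting block to the end of the file.
        splits.add(new long[] {splitStart, fileLength - splitStart});
        return splits;
    }
}
```

Because every split starts on a block boundary, each mapper could begin decompressing at its own offset without reading the preceding data. With blocks starting at 0, 100, 250 and 400 in a 500-byte file and a target size of 200, this sketch yields two splits: bytes 0 to 249 and bytes 250 to 499.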

Attachments

- HADOOP-4640.patch, 12/Nov/08 17:22, 17 kB, Johan Oskarsson
- HADOOP-4640.patch, 14/Nov/08 14:22, 19 kB, Johan Oskarsson
- HADOOP-4640.patch, 18/Nov/08 14:27, 20 kB, Johan Oskarsson
- HADOOP-4640.patch, 19/Nov/08 11:31, 21 kB, Johan Oskarsson

People

Assignee: Johan Oskarsson

Reporter: Johan Oskarsson

Votes: 0

Watchers: 5

Dates

Created: 12/Nov/08 13:56

Updated: 08/Jul/09 16:53

Resolved: 24/Nov/08 11:13