[MAPREDUCE-64] Map-side sort is hampered by io.sort.record.percent - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.21.0
Component/s: performance, task
Labels: None
Hadoop Flags: Reviewed

Description

Currently, io.sort.record.percent is a fairly obscure, per-job configurable, expert-level parameter that controls how much accounting space is available for records in the map-side sort buffer (io.sort.mb). Typical values for io.sort.mb (100) and io.sort.record.percent (0.05) imply that we can store ~350,000 records in the buffer before necessitating a sort/combine/spill.
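The record limit can be approximated with a little arithmetic. The sketch below assumes 16 bytes of accounting space per record, a figure not stated in this issue; the function name and the constant are illustrative only. Under that assumption the limit comes out somewhat below the rough ~350,000 quoted above.

```python
# Assumption (not from this issue): each record costs 16 bytes of accounting space.
ACCOUNTING_BYTES_PER_RECORD = 16


def max_records(io_sort_mb: int, record_percent: float) -> int:
    """Estimate how many records fit in the accounting part of the sort buffer."""
    buffer_bytes = io_sort_mb * 1024 * 1024
    # Share of the buffer reserved for per-record accounting.
    accounting_bytes = int(buffer_bytes * record_percent)
    return accounting_bytes // ACCOUNTING_BYTES_PER_RECORD


if __name__ == "__main__":
    # io.sort.mb = 100, io.sort.record.percent = 0.05
    print(max_records(100, 0.05))  # 327680
```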

However, for many applications that deal with small records, e.g. the world-famous wordcount and its family, this implies we can use only 5-10% of io.sort.mb (i.e. 5-10 MB) before we spill, in spite of having much more memory available in the sort buffer. Wordcount, for example, results in ~12 spills (given an HDFS block size of 64 MB). The presence of a combiner exacerbates the problem by piling on serialization/deserialization of records too...
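To see where the 5-10% figure could come from, the following sketch estimates how much of the buffer holds data when the accounting space runs out. It assumes 16 bytes of accounting per record and ignores the spill threshold; both are simplifications, and the average record sizes are hypothetical.

```python
# Assumption (not from this issue): each record costs 16 bytes of accounting space.
ACCOUNTING_BYTES_PER_RECORD = 16


def data_used_before_spill(io_sort_mb, record_percent, avg_record_bytes):
    """Return (bytes of data collected, fraction of io.sort.mb used for data)
    at the point where the accounting space is exhausted."""
    buffer_bytes = io_sort_mb * 1024 * 1024
    accounting_bytes = int(buffer_bytes * record_percent)
    data_bytes = buffer_bytes - accounting_bytes
    record_limit = accounting_bytes // ACCOUNTING_BYTES_PER_RECORD
    # Small records hit the record limit first; large records fill the data region first.
    used = min(record_limit * avg_record_bytes, data_bytes)
    return used, used / buffer_bytes


if __name__ == "__main__":
    for size in (16, 32, 1000):  # hypothetical average serialized record sizes
        used, fraction = data_used_before_spill(100, 0.05, size)
        print(f"{size:>5} B/record: {used / 2**20:6.1f} MB of data, {fraction:.0%} of io.sort.mb")
```

With records of 16 to 32 bytes, only about 5 to 10 MB of data is collected before a spill is forced, while records of around 1000 bytes fill the data region first.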

Sure, jobs can configure io.sort.record.percent, but it's tedious and obscure; we really can do better by getting the framework to automagically pick it by using all available memory (up to io.sort.mb) for either the data or accounting.
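As a rough illustration of the potential gain, the sketch below computes the split at which data and accounting space would run out together, and how many records the buffer could then hold. This is not the approach taken in the attached patches; it is only a back-of-the-envelope model, assuming 16 bytes of accounting per record and a known average record size. All names are hypothetical.

```python
def balanced_record_percent(avg_record_bytes, accounting_bytes_per_record=16):
    """Fraction of the buffer to give to accounting so that data and
    accounting space are exhausted at the same time."""
    return accounting_bytes_per_record / (accounting_bytes_per_record + avg_record_bytes)


def records_per_fill(io_sort_mb, avg_record_bytes, accounting_bytes_per_record=16):
    """Records that fit when the whole buffer is shared between data and accounting."""
    buffer_bytes = io_sort_mb * 1024 * 1024
    return buffer_bytes // (accounting_bytes_per_record + avg_record_bytes)


if __name__ == "__main__":
    # Hypothetical 16-byte records with io.sort.mb = 100
    print(balanced_record_percent(16))  # 0.5
    print(records_per_fill(100, 16))    # 3276800
```

In this model, 16-byte records would call for half the buffer as accounting space, allowing roughly ten times as many records per spill as a fixed 0.05 split.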

Attachments

File          Uploaded         Size    Author
M64-10.patch  01/Feb/10 22:48  114 kB  Christopher Douglas
M64-9.patch   25/Jan/10 05:51  106 kB  Christopher Douglas
M64-8.patch   23/Jan/10 03:25  108 kB  Christopher Douglas
M64-7.patch   11/Jan/10 22:33  107 kB  Christopher Douglas
M64-6.patch   11/Jan/10 07:08  108 kB  Christopher Douglas
M64-5.patch   22/Dec/09 09:26  108 kB  Christopher Douglas
M64-4.patch   09/Dec/09 08:18  106 kB  Christopher Douglas
M64-0i.png    18/Oct/09 07:57  30 kB   Christopher Douglas
M64-1i.png    18/Oct/09 07:57  32 kB   Christopher Douglas
M64-2i.png    18/Oct/09 07:57  29 kB   Christopher Douglas
M64-3.patch   18/Oct/09 01:51  93 kB   Christopher Douglas
M64-2.patch   15/Oct/09 09:33  89 kB   Christopher Douglas
M64-1.patch   10/Oct/09 06:43  86 kB   Christopher Douglas
M64-0.patch   02/Oct/09 23:20  80 kB   Christopher Douglas

People

Assignee: Christopher Douglas

Reporter: Arun Murthy

Votes: 0

Watchers: 24

Dates

Created: 22/Jan/09 22:02

Updated: 30/Mar/13 00:22

Resolved: 05/Feb/10 05:43