Hadoop Common
HADOOP-5589

TupleWritable: Lift implicit limit on the number of values that can be stored

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.19.1
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      TupleWritable uses an instance field of the primitive type long, presumably so that it can quickly determine whether a position in its array of Writables has been written (by using bit-shifting operations on the long field). The problem is that this implies a maximum of 64 values that can be stored in a TupleWritable.

      An example of a use-case where I think this would be a problem: if you had two MR jobs with over 64 reduce tasks and you wanted to join the outputs with CompositeInputFormat, this would probably cause unexpected results under the current scheme.

      At the very least, the 64-value limit should be documented in TupleWritable.
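      The wrap-around behind the limit is easy to demonstrate in isolation. This is an illustrative snippet, not code from TupleWritable: in Java, shift counts on a long are taken modulo 64, so marking "position 64" in a long bitmask silently sets position 0 instead.

      ```java
      public class ShiftWrapDemo {
        public static void main(String[] args) {
          long written = 0L;
          written |= (1L << 64);        // intended: mark position 64
          System.out.println(written);  // prints 1 -- position 0 was set instead
          System.out.println((1L << 64) == (1L << 0)); // prints true
        }
      }
      ```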

      1. HADOOP-5589-1.patch
        9 kB
        Jingkei Ly
      2. HADOOP-5589-2.patch
        10 kB
        Jingkei Ly
      3. HADOOP-5589-3.patch
        10 kB
        Jingkei Ly
      4. HADOOP-5589-4.patch
        14 kB
        Jingkei Ly
      5. HADOOP-5589-4.patch
        14 kB
        Jingkei Ly

        Activity

        Jingkei Ly added a comment -

        The example below demonstrates the current 64-value limit in TupleWritable:

        Text emptyText = new Text("Should not be set written");

        // 64 values: position 63 is the last that the long bitmask can represent.
        public void testTupleBoundarySuccess() throws Exception {
            Writable[] values = new Writable[64];
            Arrays.fill(values, emptyText);
            values[63] = new Text("Should be the only value set written");

            TupleWritable tuple = new TupleWritable(values);
            tuple.setWritten(63);

            for (int pos = 0; pos < tuple.size(); pos++) {
              boolean has = tuple.has(pos);
              if (pos == 63) {
                assertTrue(has);
              } else {
                assertFalse("Tuple position is incorrectly labelled as set: " + pos, has);
              }
            }
        }

        // 65 values: position 64 overflows the long bitmask, so this test fails.
        public void testTupleBoundaryFailure() throws Exception {
            Writable[] values = new Writable[65];
            Arrays.fill(values, emptyText);
            values[64] = new Text("Should be the only value set written");

            TupleWritable tuple = new TupleWritable(values);
            tuple.setWritten(64);

            for (int pos = 0; pos < tuple.size(); pos++) {
              boolean has = tuple.has(pos);
              if (pos == 64) {
                assertTrue(has);
              } else {
                assertFalse("Tuple position is incorrectly labelled as set: " + pos, has);
              }
            }
        }
        
        Jingkei Ly added a comment -

        My suggestion would be to replace the long field, "written", with a java.util.BitSet. This should provide more than enough space to do the job of the written field in a more scalable fashion, and is relatively compact. I'm not sure of what the performance implications are though, if any.
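        A minimal sketch of the suggestion above (the class and field names mirror TupleWritable's existing "written"/"setWritten"/"has" members, but this is not the attached patch): java.util.BitSet grows on demand, so there is no fixed 64-position cap.

        ```java
        import java.util.BitSet;

        class TupleSketch {
          // Replaces the primitive long bitmask; grows as positions are set.
          private final BitSet written = new BitSet();

          void setWritten(int i)   { written.set(i); }
          void clearWritten(int i) { written.clear(i); }
          boolean has(int i)       { return written.get(i); }
        }
        ```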

        Jingkei Ly added a comment -

        I've attached my shot at implementing this with BitSets.

        Jingkei Ly added a comment -

        I've noticed a bug in my original patch where writeBitSet() would incorrectly detect when to write out a new long to the stream - this new patch should fix it.

        Jingkei Ly added a comment -

        This is another change to the patch - this time to make the serialized bitset more compact (by writing the bitset to the stream as bytes instead of variable-length longs). Also addressed a problem when writing a sparse bitset.
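        To illustrate the byte-packing idea (an illustrative sketch only, not the attached patch; the helper name "toBytes" is hypothetical): serializing the bitset one byte at a time lets the on-disk size grow with the number of values rather than in whole 8-byte longs.

        ```java
        import java.util.BitSet;

        class BitSetBytes {
          // Pack the first nbits positions of a BitSet into a compact byte array.
          static byte[] toBytes(BitSet bits, int nbits) {
            byte[] out = new byte[(nbits + 7) / 8];
            for (int i = bits.nextSetBit(0); i >= 0 && i < nbits;
                 i = bits.nextSetBit(i + 1)) {
              out[i / 8] |= (byte) (1 << (i % 8)); // set bit i in its byte
            }
            return out;
          }
        }
        ```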

        Jingkei Ly added a comment -

        I think I'm probably going to be hitting this problem soon. Has anybody got any comments on the proposed solution and the patch I submitted?

        Chris Douglas added a comment -

        Unfortunately, this would be an incompatible change to TupleWritable; one could no longer read tuples written using an older version of Hadoop, right?

        I see two possible approaches:

        1. Define backwards-compatible semantics for TupleWritable, so readFields will read both old and new
        2. Subclass or create a new TupleWritable2 (or whatever) and modify the join framework to use that instead
        Jingkei Ly added a comment -

        The patch could be backward-compatible if the bitset were written to the stream as VLongs (essentially what was implemented in HADOOP-5589-2.patch, minus the bug with sparse bitsets), as the bytes written to the stream would be exactly the same in both implementations as long as there were fewer than 64 values.

        However, because we can't read an old TupleWritable containing over 64 values without throwing an EOFException, it won't be "fully" backward-compatible.

        While I would be tempted to argue that TupleWritable never supported more than 64 values in a tuple anyway, is there still a need to support users who were storing tuples of over 64 values, albeit with incorrect results?
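        One way to picture the compatibility argument (an illustrative sketch, not the attached patch; it assumes, as the comment implies, that the legacy "written" long was serialized as a VLong): for tuples of fewer than 64 values, the new BitSet reduces to exactly the legacy long, so the VLong bytes on the stream are identical.

        ```java
        import java.util.BitSet;

        class Compat {
          // Collapse the low 64 positions of a BitSet into the legacy long field.
          static long toLong(BitSet bits) {
            long written = 0L;
            for (int i = bits.nextSetBit(0); i >= 0 && i < 64;
                 i = bits.nextSetBit(i + 1)) {
              written |= 1L << i;
            }
            return written; // byte-identical serialization for < 64 values
          }
        }
        ```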

        Chris Douglas added a comment -

        The patch could be backwards-compatible if the bitset was written to the stream as VLongs (essentially what had been implemented in HADOOP-5589-2.patch

        This seems like a reasonable approach to me. As long as this can read old data, it's OK if it writes records that cannot be read by the old implementation.

        is there still a need to support users who were storing tuples over 64 values but with incorrect results?

        Such data is already corrupt, so I don't think anything can be done for it.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12404067/HADOOP-5589-3.patch
        against trunk revision 765427.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 1 new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/200/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/200/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/200/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/200/console

        This message is automatically generated.

        Jingkei Ly added a comment -

        I've attached a new patch to support reading of TupleWritables written by previous versions of the class. As discussed, it will fail to read older TupleWritables if they held > 64 values.

        The new patch contains additional tests to test backwards-compatibility and attempts to fix the findbugs warning.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12405766/HADOOP-5589-4.patch
        against trunk revision 767331.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/223/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/223/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/223/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/223/console

        This message is automatically generated.

        Jingkei Ly added a comment -

        The test that failed appears to be unrelated to this patch, and passes when run locally on my box.

        Chris Douglas added a comment -

        Just a few minor nits:

        • Instead of using BitSet::set(i, false), BitSet::clear(i) should accomplish the same thing in TupleWritable::clearWritten; in the iterator, using clear is probably more readable than flip though they're obviously equivalent.
        • The size check in equals can be dropped, since BitSet also performs it in its equals implementation
        • The loop in readBitSet would be easier to read as:
              for (int offset = Long.SIZE; offset < nbits; offset += Byte.SIZE) {
          
        • I'd replace setBit with the equivalent, single line of code in writeBitSet

        The rest of the changes look good.

        Jingkei Ly added a comment -

        Applied changes to the patch as per Chris's comments (I particularly like the simplification of my overcomplicated for-loop in #readBitSet).

        Chris Douglas added a comment -

        +1

             [exec] +1 overall.  
             [exec] 
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec] 
             [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
             [exec] 
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec] 
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec] 
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec] 
             [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
             [exec] 
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
        

        Relevant tests, TestDatamerge and TestTupleWritable both pass

        Chris Douglas added a comment -

        I committed this. Thanks, Jingkei

        Hudson added a comment -

        Integrated in Hadoop-trunk #817 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/817/)
        Eliminate source limit of 64 for map-side joins imposed by
        TupleWritable encoding. Contributed by Jingkei Ly

        Jason Venner (www.prohadoop.com) added a comment -

        How does the CompositeRecordReader change with this patch. I have been backporting to 18.2 and it looks like CompositeRecordReader uses a long to hold information also.

        Jingkei Ly added a comment -

        It looks like you would probably need to backport HADOOP-3721 as well, as this incorporated a change to CompositeRecordReader.JoinCollector that removed the limitation from using longs.


          People

          • Assignee:
            Jingkei Ly
            Reporter:
            Jingkei Ly
          • Votes:
            0
            Watchers:
            2
