[CASSANDRA-1473] Implement a Cassandra aware Hadoop mapreduce.Partitioner - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Normal
Resolution: Incomplete
Fix Version/s: None
Component/s: None
Labels: None

Description

When an IPartitioner that does not sort data in byte order (RandomPartitioner, for example) is used with Cassandra's Hadoop integration, Hadoop is unaware of the output order of the data.

We can make Hadoop aware of the proper order of the output data by implementing Hadoop's mapreduce.Partitioner interface. Hadoop will then handle sorting all of the data according to Cassandra's IPartitioner, and each writing client will be able to connect to a smaller number of Cassandra nodes.
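As a rough illustration of the idea (not an implementation attached to this issue), the sketch below maps a row key to a reducer index by its position in the RandomPartitioner token space. It assumes that the token is the absolute value of the key's MD5 digest read as a signed BigInteger, and that the token space [0, 2^127] is split into equal-width ranges, one per reducer. The class name TokenRangePartitioner and its methods are hypothetical. A real version would extend org.apache.hadoop.mapreduce.Partitioner and would take its range boundaries from the actual ring rather than assuming an even split.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: routes keys to reducers by token range.
// Assumes RandomPartitioner-style tokens and evenly split ranges.
final class TokenRangePartitioner {
    // Assumed upper bound of the token space (2^127).
    private static final BigInteger TOKEN_SPACE = BigInteger.ONE.shiftLeft(127);

    // Token = |MD5(key)| interpreted as a signed big-endian integer (assumption).
    static BigInteger token(byte[] key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return new BigInteger(md5.digest(key)).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Same shape as Hadoop's getPartition(key, value, numPartitions),
    // minus the value, which does not influence the routing here.
    static int getPartition(byte[] key, int numPartitions) {
        int p = token(key)
                .multiply(BigInteger.valueOf(numPartitions))
                .divide(TOKEN_SPACE)
                .intValue();
        // A token of exactly 2^127 would land one past the end, so clamp it.
        return Math.min(p, numPartitions - 1);
    }
}
```

Because the mapping is monotonic in the token, reducer i only receives keys whose tokens fall in the i-th slice of the token space, which is what would let each writing client talk to the few nodes that own that slice.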

People

Assignee: Unassigned
Reporter: Stu Hood
Votes: 4
Watchers: 3

Dates

Created: 06/Sep/10 20:02
Updated: 16/Apr/19 09:33
Resolved: 19/Aug/11 19:49