[CASSANDRA-7776] Allow multiple MR jobs to concurrently write to the same column family from the same node using CqlBulkOutputFormat - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Low
Resolution: Fixed
Fix Version/s: 2.1.1
Component/s: None
Labels:
- cql3
- hadoop

Description

After the sstable files are written, every file in the specified output directory is loaded (transferred) to the remote Cassandra cluster. If multiple MR jobs on a node write to the same table (i.e. the same directory), each job's load process transfers every sstable file in that directory, so the same files are sent multiple times. Furthermore, if cleanup of successful outputs is enabled (CASSANDRA-7777), contention between writes and loads could cause errors.

A simple remedy is to use a unique output directory for each MR job.
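As a rough illustration of the idea, the sketch below derives a distinct directory per job by putting a random UUID in the path. The class name `UniqueOutputDir`, the method `forJob`, and the `<base>/<uuid>/<keyspace>/<table>` layout are assumptions made for this example; they are not taken from the attached patch or from the CqlBulkOutputFormat API.

```java
import java.io.File;
import java.util.UUID;

// Hypothetical helper, not part of Cassandra: shows one way to give each
// MR job its own sstable output directory on a shared node.
final class UniqueOutputDir {

    // Returns <baseDir>/<random-uuid>/<keyspace>/<table>.
    // The layout is an assumption for this sketch; the UUID segment is what
    // keeps two concurrent jobs for the same table from sharing a directory.
    static File forJob(File baseDir, String keyspace, String table) {
        String jobId = UUID.randomUUID().toString();
        return new File(baseDir,
                jobId + File.separator + keyspace + File.separator + table);
    }

    public static void main(String[] args) {
        File base = new File(System.getProperty("java.io.tmpdir"));
        // Two jobs writing to the same table get different directories.
        System.out.println(forJob(base, "ks", "events"));
        System.out.println(forJob(base, "ks", "events"));
    }
}
```

With separate directories, each job's load step only sees the sstable files that job wrote, and cleaning up one job's output does not touch files another job is still writing.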

Attachments

- trunk-7776-v1.txt (1 kB), attached by Paul Pak, 15/Aug/14 15:19

People

Assignee: Paul Pak

Reporter: Paul Pak

Authors: Paul Pak

Reviewers: Piotr Kolaczkowski

Votes: 0

Watchers: 2

Dates

Created: 15/Aug/14 15:12

Updated: 16/Apr/19 09:31

Resolved: 09/Oct/14 16:10