[ACCUMULO-2915] Avoid copying all Mutations when using a TabletServerBatchWriter - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.5.0, 1.5.1, 1.6.0, 1.6.1, 1.7.0
Fix Version/s: None
Component/s: client
Labels: None

Description

Currently, the TabletServerBatchWriter does the following when a mutation is added:

    // create a copy of mutation so that after this method returns the user
    // is free to reuse the mutation object, like calling readFields... this
    // is important for the case where a mutation is passed from map to reduce
    // to batch writer... the map reduce code will keep passing the same mutation
    // object into the reduce method
    m = new Mutation(m);
    
    totalMemUsed += m.estimatedMemoryUsed();
    mutations.addMutation(table, m);
    totalAdded++;

This means all data is copied twice when writing. The justification for the copy is somewhat dubious, since not every client is subject to MapReduce's reuse of object references.
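To make the hazard behind the defensive copy concrete, here is a minimal, self-contained sketch. `ReusableRecord` and `ReuseHazardDemo` are hypothetical stand-ins invented for illustration; they are not Accumulo classes, and `refill` only imitates the effect of a `readFields`-style call overwriting an object in place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a reusable, Writable-style record (not Accumulo's Mutation).
final class ReusableRecord {
    private String payload = "";

    ReusableRecord() {}

    // Copy constructor, analogous in spirit to new Mutation(m).
    ReusableRecord(ReusableRecord other) {
        this.payload = other.payload;
    }

    // Overwrites this object's contents in place, as a readFields-style call would.
    void refill(String newPayload) {
        this.payload = newPayload;
    }

    String payload() {
        return payload;
    }
}

final class ReuseHazardDemo {
    // Queues the same reused object twice, either by reference or as a copy,
    // and returns the payloads that the queue ends up holding.
    static List<String> drain(boolean copy) {
        List<ReusableRecord> queue = new ArrayList<>();
        ReusableRecord shared = new ReusableRecord();
        for (String value : Arrays.asList("row1", "row2")) {
            shared.refill(value); // the caller reuses one object
            queue.add(copy ? new ReusableRecord(shared) : shared);
        }
        List<String> out = new ArrayList<>();
        for (ReusableRecord r : queue) {
            out.add(r.payload());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("by reference: " + drain(false));
        System.out.println("with copy:    " + drain(true));
    }
}
```

Queuing by reference leaves both entries pointing at the last value written, which is the situation the copy guards against. A client that never reuses its objects pays for the copy without getting anything in return.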

It'd be good if we provided users with a way of signaling that there's no need to copy the mutation payload. elserj suggested creating something akin to an ImmutableMutation, which would help address some of the concerns the batch writer attempts to defend against.
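One way the suggestion could look, as a rough sketch only: the writer skips the copy when the caller hands it a type that cannot be modified afterwards. `Payload`, `ImmutablePayload`, and `CopyAvoidingBuffer` are hypothetical names made up for this example. They do not exist in Accumulo, and the ticket does not specify an API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable payload; a stand-in for Mutation, not the Accumulo class.
class Payload {
    protected byte[] data;

    Payload(byte[] data) {
        this.data = data.clone();
    }

    // Copy constructor, analogous in spirit to new Mutation(m).
    Payload(Payload other) {
        this.data = other.data.clone();
    }

    // In-place overwrite, the kind of reuse the defensive copy protects against.
    void overwrite(byte[] newData) {
        this.data = newData.clone();
    }

    byte[] data() {
        return data.clone();
    }
}

// Hypothetical immutable variant: contents are fixed at construction time.
final class ImmutablePayload extends Payload {
    ImmutablePayload(byte[] data) {
        super(data);
    }

    @Override
    void overwrite(byte[] newData) {
        throw new UnsupportedOperationException("immutable payload");
    }
}

// Hypothetical buffer that copies only when the caller could still modify the object.
final class CopyAvoidingBuffer {
    private final List<Payload> queued = new ArrayList<>();
    private int copies = 0;

    void add(Payload p) {
        if (p instanceof ImmutablePayload) {
            queued.add(p); // safe to keep the caller's reference
        } else {
            queued.add(new Payload(p)); // defensive copy, as the writer does today
            copies++;
        }
    }

    int copies() {
        return copies;
    }

    Payload get(int index) {
        return queued.get(index);
    }

    public static void main(String[] args) {
        CopyAvoidingBuffer buffer = new CopyAvoidingBuffer();
        buffer.add(new Payload(new byte[] {1}));
        buffer.add(new ImmutablePayload(new byte[] {2}));
        System.out.println("copies made: " + buffer.copies());
    }
}
```

The type check keeps existing callers on the copying path, so only clients that opt in by constructing the immutable variant skip the copy. A flag on the writer would be another option, but it would put the burden of never reusing objects on every caller of that writer.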

Issue Links

is related to
ACCUMULO-2945 New mutations should allow for hinting of the final buffer size (Resolved)

relates to
ACCUMULO-2925 Timestamp is not propagated to peer (Resolved)
ACCUMULO-2589 Create new client API (Open)

People

Assignee: William Slacum
Reporter: William Slacum
Votes: 0
Watchers: 6

Dates

Created: 16/Jun/14 22:30
Updated: 11/Jun/19 06:06