Details
- Type: Improvement
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
Description
Currently HTable is marked as NOT thread safe. This JIRA aims to improve that by making better use of the thread-safe BufferedMutator.
Some findings/work done:
Calling put on the same HTable instance from multiple threads in parallel is a problem, because HTable#getBufferedMutator currently looks like this:
    BufferedMutator getBufferedMutator() throws IOException {
      if (mutator == null) {
        this.mutator = (BufferedMutatorImpl) connection.getBufferedMutator(
            new BufferedMutatorParams(tableName)
                .pool(pool)
                .writeBufferSize(connConfiguration.getWriteBufferSize())
                .maxKeyValueSize(connConfiguration.getMaxKeyValueSize())
        );
      }
      mutator.setRpcTimeout(writeRpcTimeout);
      mutator.setOperationTimeout(operationTimeout);
      return mutator;
    }
And HTable#flushCommits:
    void flushCommits() throws IOException {
      if (mutator == null) {
        // nothing to flush if there's no mutator; don't bother creating one.
        return;
      }
      getBufferedMutator().flush();
    }
And HTable#put:
    public void put(final Put put) throws IOException {
      getBufferedMutator().mutate(put);
      flushCommits();
    }
If multiple threads put in parallel, the following sequence can occur because HTable#getBufferedMutator is not thread safe:
1. ThreadA enters getBufferedMutator and finds mutator==null
2. ThreadB enters getBufferedMutator and finds mutator==null
3. ThreadA initializes mutator to instanceA, then calls mutator#mutate, adding one put (putA) to writeAsyncBuffer
4. ThreadB initializes mutator to instanceB
5. ThreadA reaches flushCommits; mutator is now instanceB, so it calls instanceB's flush method and putA is lost
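The interleaving above can be replayed deterministically in a single thread. The sketch below is only an illustration: ToyMutator, ToyTable and LostPutReplay are hypothetical stand-ins, not HBase classes, and they model just the shared mutator field and a write buffer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for BufferedMutatorImpl: buffers puts until flush().
class ToyMutator {
    final List<String> buffer = new ArrayList<>();   // plays the role of writeAsyncBuffer
    final List<String> flushed = new ArrayList<>();

    void mutate(String put) {
        buffer.add(put);
    }

    void flush() {
        flushed.addAll(buffer);
        buffer.clear();
    }
}

// Hypothetical stand-in for HTable, keeping only the shared mutator field.
class ToyTable {
    ToyMutator mutator;
}

class LostPutReplay {
    // Replays steps 1-5 in program order and returns the puts that were
    // buffered but never flushed.
    static List<String> replay() {
        ToyTable table = new ToyTable();

        boolean threadASawNull = (table.mutator == null);   // step 1
        boolean threadBSawNull = (table.mutator == null);   // step 2

        ToyMutator instanceA = new ToyMutator();             // step 3
        if (threadASawNull) {
            table.mutator = instanceA;
        }
        table.mutator.mutate("putA");

        ToyMutator instanceB = new ToyMutator();             // step 4
        if (threadBSawNull) {
            table.mutator = instanceB;                       // overwrites instanceA
        }

        table.mutator.flush();                               // step 5: flushes instanceB only

        return new ArrayList<>(instanceA.buffer);            // putA is stranded here
    }

    public static void main(String[] args) {
        System.out.println("stranded puts: " + replay());    // prints "stranded puts: [putA]"
    }
}
```

The put is not corrupted; it simply sits in the buffer of an instance that no thread will ever flush again.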
Once this is fixed, there is considerable contention on BufferedMutatorImpl#flush, so more work is required to make HTable thread safe while keeping good performance.
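One way the initialization race could be closed is double-checked locking on a volatile field. This is a minimal sketch under that assumption, not the patch attached to this issue: CountingMutator, SafeLazyHolder and SafeLazyDemo are hypothetical names, the timeout setters are omitted, and nothing here reduces contention on flush.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for BufferedMutatorImpl; counts how many were created.
class CountingMutator {
    static final AtomicInteger CREATED = new AtomicInteger();

    CountingMutator() {
        CREATED.incrementAndGet();
    }
}

// Hypothetical holder modelling only the lazy initialization of the mutator.
class SafeLazyHolder {
    // volatile is required for double-checked locking to be correct in Java.
    private volatile CountingMutator mutator;

    CountingMutator getMutator() {
        CountingMutator local = mutator;          // single volatile read on the fast path
        if (local == null) {
            synchronized (this) {
                local = mutator;                  // re-check under the lock
                if (local == null) {
                    local = new CountingMutator();
                    mutator = local;
                }
            }
        }
        return local;
    }
}

class SafeLazyDemo {
    // Calls getMutator() from several threads and returns how many
    // CountingMutator instances were constructed in total.
    static int createdUnderContention(int threads) {
        CountingMutator.CREATED.set(0);
        SafeLazyHolder holder = new SafeLazyHolder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> holder.getMutator());
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return CountingMutator.CREATED.get();
    }
}
```

With this shape every caller sees the same instance, so a put buffered by one thread is flushed by whichever thread flushes next. The lock is taken only while the field is still null, which keeps the common path cheap.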
Attachments
Issue Links
- relates to: HBASE-17372 Make AsyncTable thread safe (Resolved)