Description
This issue comes from profiling our application. We have a MultiTableBatchWriter created in the usual way, and multiple threads write to it with calls like the following:
batchWriter.getBatchWriter(table).addMutations(mutations);
In my test with 4 threads writing to one table, this call is quite inefficient and causes a large performance degradation compared with using a single BatchWriter.
I believe the culprit is that getBatchWriter is synchronized. The ZooKeeper call to Tables.getTableState on every invocation may also be hurting performance:
@Override
public synchronized BatchWriter getBatchWriter(String tableName) throws AccumuloException, AccumuloSecurityException, TableNotFoundException {
  ArgumentChecker.notNull(tableName);
  String tableId = Tables.getNameToIdMap(instance).get(tableName);
  if (tableId == null)
    throw new TableNotFoundException(tableId, tableName, null);
  if (Tables.getTableState(instance, tableId) == TableState.OFFLINE)
    throw new TableOfflineException(instance, tableId);
  BatchWriter tbw = tableWriters.get(tableId);
  if (tbw == null) {
    tbw = new TableBatchWriter(tableId);
    tableWriters.put(tableId, tbw);
  }
  return tbw;
}
I recommend synchronizing only when the BatchWriter is not yet present, and checking whether the table is online only at that time:
@Override
public BatchWriter getBatchWriter(String tableName) throws AccumuloException, AccumuloSecurityException, TableNotFoundException {
  ArgumentChecker.notNull(tableName);
  String tableId = Tables.getNameToIdMap(instance).get(tableName);
  if (tableId == null)
    throw new TableNotFoundException(tableId, tableName, null);
  // Note: this unlocked read assumes tableWriters is safe for concurrent reads.
  BatchWriter tbw = tableWriters.get(tableId);
  if (tbw == null) {
    if (Tables.getTableState(instance, tableId) == TableState.OFFLINE)
      throw new TableOfflineException(instance, tableId);
    synchronized (tableWriters) {
      // Only create a new table writer if we haven't been beaten to it;
      // otherwise return the writer the other thread registered.
      tbw = tableWriters.get(tableId);
      if (tbw == null) {
        tbw = new TableBatchWriter(tableId);
        tableWriters.put(tableId, tbw);
      }
    }
  }
  return tbw;
}
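As a rough illustration of the same idea outside Accumulo, the sketch below caches one writer per table id in a ConcurrentHashMap, so the common case (writer already present) takes no lock. It assumes Java 8 or later; WriterCache, TableWriter, getWriter, and the created counter are hypothetical names for this example, not Accumulo classes, and the table-state check is only indicated by a comment.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the writer cache; not the Accumulo implementation.
class WriterCache {
    // Hypothetical per-table writer, used only to make the sketch self-contained.
    static final class TableWriter {
        final String tableId;
        TableWriter(String tableId) { this.tableId = tableId; }
    }

    private final ConcurrentMap<String, TableWriter> tableWriters = new ConcurrentHashMap<>();

    // Counts how many writers were actually constructed (for demonstration only).
    final AtomicInteger created = new AtomicInteger();

    TableWriter getWriter(String tableId) {
        // Fast path: a plain get on a ConcurrentHashMap does not block.
        TableWriter writer = tableWriters.get(tableId);
        if (writer != null) {
            return writer;
        }
        // Slow path: computeIfAbsent runs the factory at most once per key,
        // so every thread ends up with the same writer instance.
        return tableWriters.computeIfAbsent(tableId, id -> {
            // An "is the table online?" check would belong here, so that it
            // runs only when a writer is first created (assumption).
            created.incrementAndGet();
            return new TableWriter(id);
        });
    }
}
```

With this shape, losing a creation race cannot hand a caller a writer that is missing from the map, which is the main hazard of hand-written check-then-put code.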
Attachments
Issue Links
- is related to
  - ACCUMULO-1859 Conditional Mutation with 1000 conditions is slow. (Resolved)
  - ACCUMULO-4778 Resolving table name to table id is expensive (Resolved)