When we profiled our HBase application under a put-heavy workload, we found that Configuration methods consume a lot of CPU:
As you can see, getTable().put() calls Configuration methods that trigger regex matching or synchronization on a Hashtable.
This should not happen as of 0.99.2, because https://issues.apache.org/jira/browse/HBASE-12128 addressed this kind of issue.
However, it reproduces again today, due to bugs or regressions introduced as the code evolved between 0.9x and 1.x.
- finishSetup is called for every new HTable(), i.e. on every con.getTable()
- So getInt is called every time, and it runs a regex
- BufferedMutatorImpl is created on the first put for each HTable, i.e. on every con.getTable().put()
- A ConnectionConf is created every time in the BufferedMutatorImpl constructor
- ConnectionConf reads config values in its constructor
- AsyncProcess is created in the BufferedMutatorImpl constructor, so a new AsyncProcess is created by every con.getTable().put()
- AsyncProcess parses many configuration values
So con.getTable().put() is a CPU-heavy operation because it reads and parses config values again on every call.
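To illustrate the cost pattern described above, here is a minimal, self-contained sketch in plain Java. It does not use HBase classes: CountingConf, CachedSettings, TableHandle, and the "example.client.*" keys are all hypothetical stand-ins, and the counter only models how many config lookups happen. It is not the actual client code or any particular patch, just one way to show the difference between parsing config per table and parsing it once per connection.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfCachingSketch {

    /** Hypothetical stand-in for a configuration object; counts every lookup. */
    static final class CountingConf {
        private final Map<String, String> props = new HashMap<>();
        int lookups = 0; // number of getInt calls so far

        void set(String key, String value) {
            props.put(key, value);
        }

        int getInt(String key, int defaultValue) {
            lookups++;
            String raw = props.get(key);
            return raw == null ? defaultValue : Integer.parseInt(raw.trim());
        }
    }

    /** Hypothetical snapshot of parsed values, meant to be built once per connection. */
    static final class CachedSettings {
        final int retries;
        final int writeBufferSize;

        CachedSettings(CountingConf conf) {
            // Hypothetical keys, chosen only for this illustration.
            this.retries = conf.getInt("example.client.retries", 3);
            this.writeBufferSize = conf.getInt("example.client.write.buffer", 2097152);
        }
    }

    /** Hypothetical lightweight table handle. */
    static final class TableHandle {
        final int retries;
        final int writeBufferSize;

        // Slow path: re-reads the configuration on every construction.
        TableHandle(CountingConf conf) {
            this(new CachedSettings(conf));
        }

        // Fast path: copies values that were already parsed.
        TableHandle(CachedSettings settings) {
            this.retries = settings.retries;
            this.writeBufferSize = settings.writeBufferSize;
        }
    }

    /** Creates the given number of handles, each parsing the config itself. */
    static int lookupsWithoutCache(int tables) {
        CountingConf conf = new CountingConf();
        for (int i = 0; i < tables; i++) {
            new TableHandle(conf);
        }
        return conf.lookups;
    }

    /** Creates the given number of handles from one shared, pre-parsed snapshot. */
    static int lookupsWithCache(int tables) {
        CountingConf conf = new CountingConf();
        CachedSettings shared = new CachedSettings(conf);
        for (int i = 0; i < tables; i++) {
            new TableHandle(shared);
        }
        return conf.lookups;
    }

    public static void main(String[] args) {
        System.out.println("lookups without cache: " + lookupsWithoutCache(1000));
        System.out.println("lookups with cache:    " + lookupsWithCache(1000));
    }
}
```

In this sketch the number of lookups grows with the number of handles in the first case and stays constant in the second, which is the property a fix would want for con.getTable().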
With an in-house patch for this issue, we observed about a 10% improvement in max throughput (i.e. CPU usage) on the client side:
branch-2 does not seem to be affected, because the client implementation has changed dramatically there.