The execution-time tracking for coprocessor methods introduced in HBASE-11516 adds 2 calls to System.nanoTime() per coprocessor method per coprocessor. This causes a serious performance bottleneck in certain scenarios.
For example, consider a scenario where many rows are being ingested (PUT) into a table that has multiple coprocessors (we have up to 20 coprocessors). This results in 8 extra calls to System.nanoTime() per coprocessor (2 for each of prePut, postPut, postStartRegionOperation and postCloseRegionOperation), which in total (i.e. times 20, so 160 calls) has been seen to result in a 50% increase in execution time.
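To make the arithmetic concrete, here is a minimal, self-contained sketch of the timing pattern described above. It is not the actual HBase code: the class name, the counting wrapper and the no-op hook bodies are hypothetical and only model "two clock reads around every hook, for every coprocessor".

```java
// Illustrative sketch only; not taken from HBase. All names here are hypothetical.
class CoprocessorTimingSketch {

    // The four hooks named in this report as being timed during a put.
    static final String[] HOOKS = {
        "postStartRegionOperation", "prePut", "postPut", "postCloseRegionOperation"
    };

    // Counts how many times the clock is read.
    static long clockCalls = 0;

    // Wrapper around System.nanoTime() that records each call.
    static long countedNanoTime() {
        clockCalls++;
        return System.nanoTime();
    }

    // Simulates one pass through the four hooks for the given number of
    // coprocessors and returns the number of clock reads it caused.
    static long clockCallsPerPass(int coprocessors) {
        clockCalls = 0;
        long totalElapsed = 0;
        for (String hook : HOOKS) {
            for (int i = 0; i < coprocessors; i++) {
                long start = countedNanoTime();
                // Hook body would run here; it is empty in this sketch, which
                // mirrors a coprocessor that does not implement the hook.
                totalElapsed += countedNanoTime() - start;
            }
        }
        return clockCalls;
    }

    public static void main(String[] args) {
        // 4 hooks x 2 clock reads x 20 coprocessors
        System.out.println(clockCallsPerPass(20)); // prints 160
    }
}
```

Even with empty hook bodies, the clock is still read twice per hook per coprocessor, so the measurement cost is paid regardless of whether any work is done.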
I think it is generally considered bad practice to measure execution times at such a fine granularity (per single operation). Also note that measurements are taken even for coprocessors that have no actual implementation for a given operation, which makes the problem worse.