Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
As titled, I observed a perf regression during the final stress testing before upgrading our online cluster to 1.x. Details follow:
1. HBase version in the comparison test:
- 0.98: based on 0.98.12 with some backports, among which HBASE-11297 is the most important perf-related one (especially under high stress)
- 1.x: checked 3 releases in total
1) 1.1.2 with important perf fixes/improvements including HBASE-15031 and HBASE-14465
2) 1.1.4 release
3) 1.2.1RC1
2. Test environment
- YCSB: 0.7.0 with YCSB-651 applied
- Client: 4 physical nodes, each with 8 YCSB instances, each instance with 100 threads
- Server: 1 Master with 3 RS, each RS with 256 handlers and 64G heap
- Hardware: 64-core CPU, 256GB Mem, 10Gb Net, 1 PCIe-SSD and 11 HDD, same hardware for client and server
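For scale, the figures above imply the following aggregate client concurrency and server handler capacity. This is only arithmetic on the numbers listed; the variable names are illustrative and not part of the original report.

```python
# Aggregate load implied by the test environment (numbers taken from the list above).
client_nodes = 4
ycsb_instances_per_node = 8
threads_per_instance = 100

region_servers = 3
handlers_per_rs = 256

# Total concurrent YCSB client threads across all client nodes.
total_client_threads = client_nodes * ycsb_instances_per_node * threads_per_instance
# Total RPC handler threads across all RegionServers.
total_handlers = region_servers * handlers_per_rs

print("total client threads:", total_client_threads)
print("total RS handlers:", total_handlers)
print("client threads per handler: %.2f" % (total_client_threads / total_handlers))
```

With roughly four client threads per handler, the RegionServers are kept saturated, which is consistent with this being described as a stress test.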
3. Test cases
- -p fieldcount=1 -p fieldlength=128 -p readproportion=1
- case #1: read against empty table
- case #2: lrucache 100% hit
- case #3: BLOCKCACHE=>false
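A minimal sketch of how one YCSB instance might be launched with the properties above. The binding name (`hbase10`) and the workload file (`workloads/workloadc`) are assumptions, as the report does not state them; only the three `-p` properties and the 100 threads per instance come from the report. The three cases appear to differ in table state rather than in YCSB parameters, so the same command line would presumably apply to each.

```python
import shlex

# Assumptions (NOT stated in the report): binding name and workload file.
YCSB_BINDING = "hbase10"               # assumed YCSB binding
WORKLOAD_FILE = "workloads/workloadc"  # assumed read-only base workload

def build_ycsb_command(threads=100):
    """Build the argument list for one YCSB instance (hypothetical helper)."""
    # Properties copied from the test-case description.
    props = {"fieldcount": 1, "fieldlength": 128, "readproportion": 1}
    cmd = ["bin/ycsb", "run", YCSB_BINDING, "-P", WORKLOAD_FILE]
    for key, value in props.items():
        cmd += ["-p", "%s=%s" % (key, value)]
    cmd += ["-threads", str(threads)]
    return cmd

# Print a shell-safe rendering of the command.
print(" ".join(shlex.quote(arg) for arg in build_ycsb_command()))
```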
4. Test result
- 1.1.4 and 1.2.1 perform similarly to 1.1.2+ (less than 2% deviation), so only the comparison data for 0.98.12+ and 1.1.2+ is pasted here
- per-RS Throughput (ops/s)
HBaseVersion  case#1  case#2  case#3
0.98.12+      383562  257493  47594
1.1.2+        363050  232757  35872
- AverageLatency (us)
HBaseVersion  case#1  case#2  case#3
0.98.12+      2774    4134    22371
1.1.2+        2930    4572    29690
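To put the regression in relative terms, the sketch below computes the percentage change per case from the results above. It also cross-checks the reported latencies against Little's law (latency ≈ concurrent threads / total throughput). That check assumes closed-loop clients with one outstanding request per thread and load spread evenly across the 3 RS; both are assumptions, not statements from the report.

```python
# Per-RS throughput (ops/s) and average latency (us) from the results above.
throughput = {
    "0.98.12+": [383562, 257493, 47594],
    "1.1.2+": [363050, 232757, 35872],
}
latency_us = {
    "0.98.12+": [2774, 4134, 22371],
    "1.1.2+": [2930, 4572, 29690],
}

CLIENT_THREADS = 4 * 8 * 100  # nodes * YCSB instances * threads per instance
REGION_SERVERS = 3

def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0

def littles_law_latency_us(per_rs_ops):
    """Estimated latency (us), assuming closed-loop clients and even load."""
    return CLIENT_THREADS / (per_rs_ops * REGION_SERVERS) * 1e6

for i in range(3):
    tput = pct_change(throughput["0.98.12+"][i], throughput["1.1.2+"][i])
    lat = pct_change(latency_us["0.98.12+"][i], latency_us["1.1.2+"][i])
    est = littles_law_latency_us(throughput["1.1.2+"][i])
    print("case#%d: throughput %+.1f%%, latency %+.1f%%, "
          "estimated 1.1.2+ latency %.0f us" % (i + 1, tput, lat, est))
```

The throughput drop grows from about 5% in case #1 to about 25% in case #3, and the Little's law estimates land within about 1% of the reported latencies.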
It seems there's a perf regression in RPCServer (we tried a 0.98 client against a 1.x server and observed perf similar to that of the 1.x client).
Attachments
Issue Links
- is superseded by: HBASE-15971 Regression: Random Read/WorkloadC slower in 1.x than 0.98 (Resolved)
- relates to: HDFS-10690 Optimize insertion/removal of replica in ShortCircuitCache (Resolved)