[CASSANDRA-19429] Remove lock contention generated by getCapacity function in SSTableReader - ASF JIRA

Details

Type: Bug
Status: Changes Suggested
Priority: Normal
Resolution: Unresolved
Fix Version/s: 4.0.x, 4.1.x
Component/s: Local/SSTable
Labels: None
Bug Category: Degradation - Performance Bug/Regression
Severity: Normal
Complexity: Normal
Discovered By: User Report
Platform: All
Impacts: None
Test and Documentation Plan: ci

Description

Profiling Cassandra 4.1.3 on large AWS instances shows a high number of lock acquisitions in the `getCapacity` function of `org/apache/cassandra/cache/InstrumentingCache` (1.9M lock acquires per 60 seconds). In our tests on r8g.24xlarge instances (Ubuntu 22.04), this contention keeps CPU utilization below 50% at full load and therefore limits the achievable throughput.

Removing the lock contention by replacing the call to `getCapacity` with `size` in SSTableReader.java yields up to a 2.95x increase in throughput on r8g.24xlarge and roughly 2x on r7i.24xlarge:

| Instance type | Cassandra 4.1.3 | Cassandra 4.1.3 patched |
|---|---|---|
| r8g.24xlarge | 168k ops | 496k ops (2.95x) |
| r7i.24xlarge | 153k ops | 304k ops (1.98x) |
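
The idea behind the change (guarding the cache lookup with `size` instead of `getCapacity`) can be illustrated with a minimal, self-contained sketch. It does not use Cassandra's classes: `SketchCache`, `CapacityGuardSketch`, and both `lookupWith...` methods are hypothetical names, and the assumption that reading the capacity takes a shared lock while reading the size does not is made only for illustration, based on the profile described above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a key cache. This is NOT Cassandra's InstrumentingCache.
class SketchCache
{
    private final ConcurrentHashMap<String, Long> map = new ConcurrentHashMap<>();
    // Assumption for illustration: reading the capacity goes through a shared lock.
    private final ReentrantLock lock = new ReentrantLock();
    private final AtomicLong lockAcquires = new AtomicLong();
    private final long capacity;

    SketchCache(long capacity) { this.capacity = capacity; }

    // Every caller serializes on the same lock, even though it only reads a value.
    long getCapacity()
    {
        lock.lock();
        try
        {
            lockAcquires.incrementAndGet();
            return capacity;
        }
        finally
        {
            lock.unlock();
        }
    }

    // ConcurrentHashMap.size() does not take the lock above.
    int size() { return map.size(); }

    void put(String key, long position) { map.put(key, position); }

    Long get(String key) { return map.get(key); }

    long lockAcquires() { return lockAcquires.get(); }
}

class CapacityGuardSketch
{
    // Guard in the style of the unpatched read path: one lock acquire per lookup.
    static Long lookupWithCapacityGuard(SketchCache cache, String key)
    {
        return cache.getCapacity() > 0 ? cache.get(key) : null;
    }

    // Guard in the style of the patch: no lock acquire on the lookup path.
    static Long lookupWithSizeGuard(SketchCache cache, String key)
    {
        return cache.size() > 0 ? cache.get(key) : null;
    }

    public static void main(String[] args)
    {
        SketchCache cache = new SketchCache(1024);
        cache.put("k", 42L);

        for (int i = 0; i < 1000; i++)
            lookupWithCapacityGuard(cache, "k");
        long afterCapacityGuard = cache.lockAcquires();

        for (int i = 0; i < 1000; i++)
            lookupWithSizeGuard(cache, "k");
        long afterSizeGuard = cache.lockAcquires();

        System.out.println("capacity guard: " + afterCapacityGuard + " lock acquires");
        System.out.println("size guard: " + (afterSizeGuard - afterCapacityGuard) + " additional lock acquires");
    }
}
```

In this sketch the two guards are not equivalent: a cache with non-zero capacity that is still empty passes the capacity check but not the size check. Whether that difference matters for the real read path is a question for review and is not settled by the sketch.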

Instructions to reproduce:

## Requirements for Ubuntu 22.04
sudo apt install -y ant git openjdk-11-jdk

## Build and start Cassandra (-f keeps it in the foreground, so run the next step from a second terminal)
CASSANDRA_USE_JDK11=true ant realclean && \
CASSANDRA_USE_JDK11=true ant jar && \
CASSANDRA_USE_JDK11=true ant stress-build && \
rm -rf data && \
bin/cassandra -f -R

## Load data, compact, then run the mixed read/write workload
bin/cqlsh -e 'drop table if exists keyspace1.standard1;' && \
bin/cqlsh -e 'drop keyspace if exists keyspace1;' && \
bin/nodetool clearsnapshot --all && \
tools/bin/cassandra-stress write n=10000000 cl=ONE -rate threads=384 -node 127.0.0.1 -log file=cload.log -graph file=cload.html && \
bin/nodetool compact keyspace1 && sleep 30s && \
tools/bin/cassandra-stress mixed ratio\(write=10,read=90\) duration=10m cl=ONE -rate threads=406 -node localhost -log file=result.log -graph file=graph.html

Attachments

- asprof_cass4.1.3__lock_20240216052912lock.html (23/Feb/24 22:18, 10 kB, Dipietro Salvatore)
- Screenshot 2024-02-26 at 10.27.10.png (26/Feb/24 18:49, 191 kB, Dipietro Salvatore)
- Screenshot 2024-02-27 at 11.29.41.png (27/Feb/24 19:30, 193 kB, Dipietro Salvatore)
- image-2024-03-08-15-51-30-439.png (08/Mar/24 23:51, 624 kB, Jon Haddad)
- image-2024-03-08-15-52-07-902.png (08/Mar/24 23:52, 657 kB, Jon Haddad)
- Screenshot 2024-03-19 at 15.22.50.png (19/Mar/24 23:30, 207 kB, Dipietro Salvatore)

Issue Links

links to

GitHub Pull Request #3133

GitHub Pull Request #3505

GitHub Pull Request #3644

People

Assignee: Dipietro Salvatore

Reporter: Dipietro Salvatore

Authors: Dipietro Salvatore

Reviewers: Maxim Muzafarov, Stefan Miklosovic

Votes: 0

Watchers: 7

Dates

Created: 23/Feb/24 22:22

Updated: 3 hours ago

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 3h 50m