Details
Bug
Status: Open
Normal
Resolution: Unresolved
None
None
Windows
Normal
Description
I recently upgraded from Cassandra 1.2.18 to Cassandra 3.10 and was surprised to see the performance of my server application degrade.
I dug down through my application stack and found that the cause is the slower response time of get_slice on Cassandra 3.10 compared to Cassandra 1.2.18 (roughly 3 to 4 times slower on average; about 3.7 times in the run shown below).
I am attaching a Python script (attack.py) that can be used to reproduce the issue on a Windows platform. The script uses the pycassa Python library, which can be installed with pip.
REPRODUCTION STEPS:
1. Install Cassandra 1.2.18 from https://archive.apache.org/dist/cassandra/1.2.18/apache-cassandra-1.2.18-bin.tar.gz
2. Run Cassandra 1.2.18 from a cmd console using cassandra.bat
3. Create a test keyspace and an empty CF using the attack.py script
python attack.py create
4. Run some get_slice queries against the empty CF and note down the average response time (in seconds)
python attack.py
get_slice count: 788
get_slice total response time: 0.31299996376
get_slice average response time: 0.000397208075838
5. Stop Cassandra 1.2.18 and install Cassandra 3.10 from https://archive.apache.org/dist/cassandra/3.10/apache-cassandra-3.10-bin.tar.gz
6. Edit cassandra.yaml to enable the Thrift service (start_rpc: true) and run Cassandra from an elevated cmd console using cassandra.bat
7. Create a test keyspace and an empty CF using the attack.py script
python attack.py create
8. Run some get_slice queries against the empty CF using attack.py and note down the average response time (in seconds)
python attack.py
get_slice count: 788
get_slice total response time: 1.16499996185
get_slice average response time: 0.00147842634753
9. Compare the average response times
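attack.py is attached to the ticket and is not reproduced here. The following is only a minimal sketch of the kind of measurement that steps 4 and 8 describe: time a query repeatedly and report the count, total, and average. The names measure and fake_get_slice are hypothetical, and the stand-in query does not contact Cassandra.

```python
from timeit import default_timer


def measure(query, count):
    """Call query() `count` times and return (count, total, average) in seconds."""
    total = 0.0
    for _ in range(count):
        start = default_timer()
        query()
        total += default_timer() - start
    average = total / count if count else 0.0
    return count, total, average


def fake_get_slice():
    # Hypothetical stand-in for one get_slice round trip; it does no I/O.
    return {}


if __name__ == "__main__":
    count, total, average = measure(fake_get_slice, 788)
    print("get_slice count: %d" % count)
    print("get_slice total response time: %r" % total)
    print("get_slice average response time: %r" % average)
```

In a real run, query would presumably wrap a pycassa read of a key that does not exist (for example ColumnFamily.get with the resulting NotFoundException caught); that wiring is an assumption about what attack.py does.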
EXPECTED:
get_slice response time of Cassandra 3.10 is not worse than that of Cassandra 1.2.18
ACTUAL:
get_slice response time of Cassandra 3.10 is about 3.7 times worse than that of Cassandra 1.2.18 in the run above (roughly 1.48 ms vs. 0.40 ms on average)
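For reference, the slowdown factor can be computed directly from the two average response times reported in steps 4 and 8. The variable names are illustrative only.

```python
# Average response times (seconds) copied from steps 4 and 8 above.
avg_1_2_18 = 0.000397208075838
avg_3_10 = 0.00147842634753

# Ratio of the 3.10 average to the 1.2.18 average.
slowdown = avg_3_10 / avg_1_2_18
print("slowdown factor: %.2f" % slowdown)  # prints "slowdown factor: 3.72"
```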
REMARKS:
- this seems to happen only on the Windows platform (tested on Windows 10 and Windows Server 2008 R2)
- running the very same procedure on Linux (Ubuntu) yields roughly the same response times for both Cassandra versions
- I sniffed the traffic to/from Cassandra 1.2.18 and Cassandra 3.10 and it can be seen that Cassandra 3.10 responds slower (Wireshark dumps attached)
- when attacking the server with concurrent get_slice queries I can see lower CPU usage for Cassandra 3.10 than for Cassandra 1.2.18
- get_slice in attack.py queries the column family for a non-existing key (the column family is empty)
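The concurrent attack mentioned in the remarks is not described in detail. One possible way to generate such load, sketched here under the assumption that several client threads each issue the same query in a loop, is shown below. attack_concurrently and fake_get_slice are hypothetical names, the stand-in query does no I/O, and CPU usage would have to be observed separately on the server (for example in Task Manager).

```python
import threading
from timeit import default_timer


def attack_concurrently(query, workers, calls_per_worker):
    """Run query() from several threads; return the list of per-call latencies."""
    latencies = []
    lock = threading.Lock()

    def worker():
        local = []
        for _ in range(calls_per_worker):
            start = default_timer()
            query()
            local.append(default_timer() - start)
        # Merge this thread's samples into the shared list under the lock.
        with lock:
            latencies.extend(local)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return latencies


def fake_get_slice():
    # Hypothetical stand-in for one get_slice round trip; it does no I/O.
    return {}


if __name__ == "__main__":
    samples = attack_concurrently(fake_get_slice, workers=8, calls_per_worker=100)
    print("calls: %d" % len(samples))
    print("average response time: %r" % (sum(samples) / len(samples)))
```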
I am willing to work on this on my own if you can give me some tips on where to look. I am also aware that this might be more Windows/Java related; nevertheless, any help from your side would be much appreciated.