Cassandra / CASSANDRA-15717

Benchmark performance difference between Docker and Kubernetes when running Cassandra:2.2.16 official Docker image


Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Not A Problem
    • None
    • Test/benchmark
    • None
    • All
    • None

    Description

      Sorry for the slightly irrelevant post. This is not an issue with Cassandra but possibly with the interaction between Cassandra and Kubernetes.

      We observed a performance degradation when running a single Cassandra instance inside Kubernetes (kubeadm 1.14) compared with running the same Docker container stand-alone.
      We ran a write-only workload (YCSB benchmark, workload A, load phase) against the following user table:

       

      cqlsh> CREATE KEYSPACE ycsb
             WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1};
      cqlsh> USE ycsb;
      cqlsh> CREATE TABLE usertable (
                 y_id varchar PRIMARY KEY,
                 field0 varchar,
                 field1 varchar,
                 field2 varchar,
                 field3 varchar,
                 field4 varchar,
                 field5 varchar,
                 field6 varchar,
                 field7 varchar,
                 field8 varchar,
                 field9 varchar
             );

      And using the following script:

       

      python ./bin/ycsb load cassandra2-cql -P workloads/workloada \
          -p recordcount=1500000 -p operationcount=1500000 \
          -p measurementtype=raw \
          -p cassandra.connecttimeoutmillis=60000 -p cassandra.readtimeoutmillis=60000 \
          -target 1500 -threads 20 -p hosts=localhost \
          > results/cassandra-docker/cassandra-docker-load-workloada-1-records-1500000-rnd-1762034446.txt
      sleep 15
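
      Because the script sets measurementtype=raw, YCSB records one line per operation rather than only a summary. The following is a minimal sketch of how such a results file could be reduced to a mean latency. It assumes each raw line has the form INSERT,<timestamp_ms>,<latency_us> and skips anything else; the function name mean_latency_ms is hypothetical and not part of YCSB.

```python
import csv
from statistics import mean


def mean_latency_ms(lines, operation="INSERT"):
    """Return the mean latency in milliseconds for one operation type.

    Assumption: raw lines look like 'INSERT,<timestamp_ms>,<latency_us>'.
    Lines that do not match that shape (headers, summaries) are skipped.
    """
    latencies_us = []
    for row in csv.reader(lines):
        if len(row) != 3 or row[0].strip() != operation:
            continue
        try:
            latencies_us.append(int(row[2]))
        except ValueError:
            continue  # third field is not a numeric latency
    if not latencies_us:
        raise ValueError(f"no {operation} samples found")
    return mean(latencies_us) / 1000.0
```

      An open results file can be passed directly as lines, since the function only needs an iterable of strings. Running it on the Docker and Kubernetes result files gives two directly comparable means.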

      We used the following image: decomads/cassandra:2.2.16, which uses the official cassandra:2.2.16 as base image and adds a readinessProbe to it.

      We kept the Docker configuration parameters identical by making sure the output of docker inspect matched as closely as possible. For the Kubernetes runs, the YCSB benchmark ran in a container co-located with the cassandra container in a single pod. Kubernetes then starts these containers with network mode net=container:..., which refers to a separate container that places the ycsb and cassandra containers in the same network namespace so that they can talk via localhost. We hoped this would avoid interference from the CNI network plugin.

      We ran the Docker-only container on the Kubernetes node using the default bridge network.

      We first performed the experiment on an OpenStack VM running Ubuntu 16.04 (4 GB, 4 CPU cores, 50 GB), hosted on a physical node with 16 CPU cores. Storage there is Ceph, however, and therefore distributed.

      To rule out Ceph's distributed storage as a factor, we repeated the experiment on minikube with VirtualBox (12 GB, 4 CPU cores, 30 GB) on a Windows 10 laptop with 4 cores/8 logical processors and 16 GB RAM. We measured the same performance degradation there.

      Observations (on Ubuntu/OpenStack)

      • Docker:
        • Mean response latency in the YCSB benchmark: 1.5 ms to 1.7 ms
      • Kubernetes:
        • Mean response latency in the YCSB benchmark: 2.7 ms to 3 ms
      • CPU usage of the Cassandra daemon JVM is much lower in Docker than in Kubernetes (see my position paper: https://lirias.kuleuven.be/2788169?limo=0).

      Possible causes:

      • Network overhead of the virtual bridge in Kubernetes: in our opinion this is not the cause of the problem.
        • We repeated the experiment with the Docker-only containers running inside a Kubernetes node, linked using the --net=container: mode in a way as similar as possible to the Kubernetes setup. The YCSB latency stayed the same.
      • Disk I/O bottleneck: the nodetool tablestats output is very similar in both setups. The Cassandra containers are configured to write data to a filesystem that is mounted from the host into the container, and exactly the same Docker mount type is used.
        • Write latency is very stable over multiple runs:
          • Kubernetes, ycsb usertable: 0.0167 ms
          • Docker, ycsb usertable: 0.0150 ms
        • compaction_history and compaction_in_progress are also very similar (see attached files).
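
      The numbers above can be put side by side with a little arithmetic. The sketch below uses only the figures reported in this description; taking the midpoint of each YCSB range is a simplification of mine, and the variable names are illustrative.

```python
# Figures from this report. YCSB values are midpoints of the reported ranges
# (a simplifying assumption, not a measured mean).
ycsb_ms = {"docker": (1.5 + 1.7) / 2, "kubernetes": (2.7 + 3.0) / 2}

# Write latency reported by nodetool tablestats for ycsb.usertable.
tablestats_write_ms = {"docker": 0.0150, "kubernetes": 0.0167}

# Difference in client-observed latency between the two setups.
client_gap_ms = ycsb_ms["kubernetes"] - ycsb_ms["docker"]

# Difference in the write latency that Cassandra itself reports.
server_gap_ms = tablestats_write_ms["kubernetes"] - tablestats_write_ms["docker"]

# Fraction of the client-side gap that the tablestats difference can explain.
explained_share = server_gap_ms / client_gap_ms

print(f"client-side gap:  {client_gap_ms:.4f} ms")
print(f"tablestats gap:   {server_gap_ms:.4f} ms")
print(f"explained share:  {explained_share:.2%}")
```

      Under these assumptions the tablestats difference accounts for well under 1% of the client-side gap, which suggests the extra latency arises outside the write path that tablestats measures.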

      Do you know of any other causes that might explain the difference in reported YCSB response latency? Could it be that the Cassandra session is closed by Kubernetes after each request? How can I diagnose this?
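
      One possible way to check the session hypothesis is to look at TCP socket states on the CQL port while the benchmark runs. The sketch below parses text in the format of Linux's /proc/net/tcp and counts socket states for port 9042, Cassandra's default native-transport port. The function name is hypothetical, and the approach assumes the file is read from inside the pod's network namespace.

```python
from collections import Counter

CQL_PORT = 9042  # Cassandra's default native-transport (CQL) port

# Subset of the hexadecimal TCP state codes used in /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "0A": "LISTEN"}


def count_cql_socket_states(proc_net_tcp_text, port=CQL_PORT):
    """Count socket states for connections that involve the given port.

    Expects text in /proc/net/tcp format: whitespace-separated columns where
    column 1 is local 'ADDR:PORT', column 2 is remote 'ADDR:PORT' (both hex),
    and column 3 is the state code.
    """
    counts = Counter()
    for line in proc_net_tcp_text.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a socket entry.
        if len(fields) < 4 or ":" not in fields[1] or ":" not in fields[2]:
            continue
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        if port in (local_port, remote_port):
            # Unknown state codes are kept as their raw hex value.
            counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts
```

      Sampling this a few times during a run, for example on the contents of /proc/net/tcp, gives a rough picture: a small, stable number of ESTABLISHED sockets suggests the session stays open, while a steadily growing TIME_WAIT count would point to connections being closed and reopened.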

       


          People

            Assignee: Unassigned
            Reporter: Eddy Truyen (eddytruyen)