Details

    • Bug
    • Status: Resolved
    • Urgent
    • Resolution: Duplicate
    • Critical

    Description

      We're pretty heavy users of CQL3 and CQL3 collection types. Occasionally, some nodes of the cluster become extremely sluggish and the cluster as a whole becomes unresponsive: reads time out and nodes drop mutation messages. This happens when nodes flush memtables to disk (based on tailing system.log on each node).

      I'm a curious guy, so I attached jvisualvm (v1.3.3) to the JVMs that were having this problem. These nodes are spending up to 98% of CPU in org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78). I will attach a thread dump.

      This is causing us quite a headache, because we're unable to figure out what is causing it. We tried tuning several configuration settings (column cache size, row key cache size), but the cluster exhibits the same issues even with the default configuration (except for a modified num_tokens and listen_address).

          People

            Unassigned
            ceineke Chris Eineke
