CASSANDRA-6829

nodes sporadically shutting down

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Normal
    • Resolution: Not A Problem
    • Environment: Windows Azure VMs.
      The VMs' OS is SUSE Enterprise. I striped 2 logical volumes for each VM, one for data and one for commitlog, and formatted them as XFS.
      Oracle Java 1.7_45
      Datastax Enterprise 4.0 (Cassandra version 2.0.5.22)

    • Severity: Normal

    Description

      I deployed a DataStax Enterprise 4.0 Cassandra cluster in Windows Azure and started load tests. After a while, some of the nodes announce a shutdown and stop responding to client requests.
      The error preceding the shutdown is "FSWriteError in /mnt/dsedata/lib/cassandra/poc/event_log/poc-event_log-tmp-jb-581-Data.db", with "Caused by: java.io.IOException: Input/output error".

      The storage I'm using in my VMs is Azure Blob storage. The VMs' OS is SUSE Enterprise. I striped 2 logical volumes for each VM, one for data and one for commitlog, and formatted them as XFS.

      I am using Oracle Java 1.7_45

      Restarting the Cassandra process resolves the problem only briefly; within minutes the problem occurs again.

      I noticed that it happens only in tmp files of a specific table. See the errors from 3 random nodes:

      (1) ERROR [CompactionExecutor:48] 2014-03-09 11:38:45,188 CassandraDaemon.java (line 192) Exception in thread Thread[CompactionExecutor:48,1,main]
      FSWriteError in /mnt/dsedata/lib/cassandra/poc/event_log/poc-event_log-tmp-jb-409-Data.db

      (2) ERROR [CompactionExecutor:37] 2014-03-10 10:04:30,828 CassandraDaemon.java (line 196) Exception in thread Thread[CompactionExecutor:37,1,main]
      FSWriteError in /mnt/dsedata/lib/cassandra/poc/event_log/poc-event_log-tmp-jb-946-Data.db

      (3) ERROR [CompactionExecutor:48] 2014-03-10 10:23:39,248 CassandraDaemon.java (line 196) Exception in thread Thread[CompactionExecutor:48,1,main]
      FSWriteError in /mnt/dsedata/lib/cassandra/poc/event_log/poc-event_log-tmp-jb-874-Data.db
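
      To check across many nodes that the failures really are confined to one table, the log excerpts above could be scanned with a small script. This is a minimal sketch, assuming each ERROR line is immediately followed by its "FSWriteError in <path>" line as in the excerpts; the function names `fswrite_errors` and `table_dir` are hypothetical helpers, not part of Cassandra.

```python
import re

# Matches the thread name and timestamp on an ERROR line of the form shown above.
ERROR_RE = re.compile(
    r"ERROR \[(?P<thread>[^\]]+)\] "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
)
# Matches the file path reported on the following FSWriteError line.
PATH_RE = re.compile(r"FSWriteError in (?P<path>\S+)")


def fswrite_errors(lines):
    """Yield (timestamp, thread, path) for each ERROR line followed by an FSWriteError line."""
    last = None
    for line in lines:
        m = ERROR_RE.search(line)
        if m:
            last = (m.group("ts"), m.group("thread"))
            continue
        p = PATH_RE.search(line)
        if p and last is not None:
            yield last + (p.group("path"),)
            last = None


def table_dir(path):
    """Return the directory part of an SSTable path (everything before the last '/')."""
    return path.rsplit("/", 1)[0]
```

      Collecting `table_dir(path)` for every yielded tuple and counting the distinct values shows whether more than one table directory is affected.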

      The table is a wide-row table created as:
      CREATE TABLE event_log (
      time_slice bigint,
      distribution_key int,
      event_id text,
      ... 300 columns ...
      PRIMARY KEY ((time_slice, distribution_key), event_id)
      ) WITH compaction = {'class': 'SizeTieredCompactionStrategy'}
      AND compression = {'sstable_compression': 'LZ4Compressor'};

      CREATE INDEX EVENT_LOG_2IX ON event_log (event_id);

      'time_slice' represents a 5 minute time-period such as yyyyMMddHHmm where 'mm' is between 00 and 55 with increments of 5.
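
      For illustration, the 'time_slice' encoding described above could be computed as follows. This is a sketch under one assumption that the report does not state: that an event's timestamp is rounded down to the start of its 5 minute period. The function name `time_slice` is chosen here to match the column name.

```python
from datetime import datetime


def time_slice(ts: datetime) -> int:
    """Encode ts as yyyyMMddHHmm, with the minute rounded down to a multiple of 5.

    Rounding down is an assumption; the report only says 'mm' is one of 00, 05, ..., 55.
    """
    floored = ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)
    return int(floored.strftime("%Y%m%d%H%M"))
```

      With this encoding, every event in the same 5 minute window shares one 'time_slice' value, and 'distribution_key' spreads that window across several partitions.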

      The Data files under the 'data' directory grew very large shortly after the test started.
      For example:
      1.5G Mar 10 10:50 /mnt/dsedata/lib/cassandra/poc/event_log/poc-event_log-jb-968-Data.db
      3.0G Mar 10 11:41 /mnt/dsedata/lib/cassandra/poc/event_log/poc-event_log-tmp-jb-970-Data.db

      Full stack trace:

      ERROR [CompactionExecutor:37] 2014-03-10 10:04:30,828 CassandraDaemon.java (line 196) Exception in thread Thread[CompactionExecutor:37,1,main]
      FSWriteError in /mnt/dsedata/lib/cassandra/poc/event_log/poc-event_log-tmp-jb-946-Data.db
      at org.apache.cassandra.io.compress.CompressedSequentialWriter.close(CompressedSequentialWriter.java:270)
      at org.apache.cassandra.io.sstable.SSTableWriter.close(SSTableWriter.java:356)
      at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:324)
      at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:204)
      at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
      at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
      at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
      at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
      at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:197)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:744)
      Caused by: java.io.IOException: Input/output error
      at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
      at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
      at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:376)
      at org.apache.cassandra.io.compress.CompressionMetadata$Writer.close(CompressionMetadata.java:366)
      at org.apache.cassandra.io.compress.CompressedSequentialWriter.close(CompressedSequentialWriter.java:266)
      ... 13 more
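
      The "Caused by" frames show the IOException raised from FileChannelImpl.force, that is, while forcing written data out to the storage device. One way to check whether the same volume also fails such a flush outside Cassandra is a small write-and-fsync probe. This is only a sketch: the function name `fsync_probe` and the default size are hypothetical, and a successful run does not rule out an intermittent fault.

```python
import os
import tempfile


def fsync_probe(directory: str, size: int = 1 << 20) -> bool:
    """Write `size` zero bytes to a temporary file in `directory` and fsync it.

    Returns True if the write and fsync succeed, False if the OS reports an error
    (for example EIO, which Java surfaces as "Input/output error").
    """
    try:
        # The temporary file is deleted automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=directory) as f:
            f.write(b"\0" * size)
            f.flush()              # push Python's buffer to the OS
            os.fsync(f.fileno())   # ask the OS to flush the file to the device
        return True
    except OSError:
        return False
```

      Running the probe repeatedly against the data directory (here /mnt/dsedata) during the load test would indicate whether the error can be reproduced without Cassandra involved.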

          People

            Assignee: Unassigned
            Reporter: Oded Peer (odpeer)