CASSANDRA-16214

Improve Logging Downstream of DataOutputBuffer Overflows


Details

    • Change Category: Operability
    • Complexity: Low Hanging Fruit
    • Test and Documentation Plan:
      • Changes to DataOutputBuffer should be more or less covered by the associated changes in DataOutputTest.
      • The changes to the read response creation and compaction logic I've tested manually via a simple test that looks roughly like this, coupled with hard-coding the max array size in DataOutputTest:
      @Test
      public void testLargeRowSerialization() throws Throwable
      {
          try (ICluster<?> cluster = init(builder().withNodes(2).start(), 1))
          {
              cluster.schemaChange("CREATE TABLE " + KEYSPACE + ".tbl (pk int, ck int, a text, b text, PRIMARY KEY (pk, ck)) WITH compaction = { 'class' : 'LeveledCompactionStrategy', 'enabled' : false }");
              
              StringBuilder builder = new StringBuilder();
              for (int i = 0; i < LARGE_MESSAGE_THRESHOLD ; i++)
                  builder.append('a');
              String bigString = builder.toString();
              
              cluster.coordinator(1).execute("INSERT INTO " + KEYSPACE + ".tbl (pk, ck, a) VALUES (1, 1, ?)", ConsistencyLevel.ALL, bigString);
              cluster.get(1).flush(KEYSPACE); // just comment this out to see compaction fail
              
              cluster.coordinator(1).execute("INSERT INTO " + KEYSPACE + ".tbl (pk, ck, b) VALUES (1, 1, ?)", ConsistencyLevel.ALL, bigString);
              cluster.get(1).flush(KEYSPACE);
              
              assertRows(cluster.coordinator(1).execute("SELECT * FROM " + KEYSPACE + ".tbl WHERE pk = ?", ConsistencyLevel.ALL, 1), 
                         row(1, 1, bigString, bigString));
          }
      }
      

    Description

      There are a number of codepaths that buffer intermediate representations of a partition in memory in a DataOutputBuffer. Compactions and reads (involving multiple SSTables) can both produce partitions whose serialized size is larger than MAX_ARRAY_SIZE, in which case a RuntimeException, and really not much else, makes it to the logs. We should be able to improve this so that at least the offending keyspace, table, and partition are logged.
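
      As a rough sketch of the direction (not the actual patch; the class, method, and messages below are hypothetical stand-ins for what the read response and compaction codepaths would do), the idea is to pre-empt or catch the overflow and re-throw with the identifying context attached:

      import java.nio.ByteBuffer;
      import java.util.function.Consumer;

      public final class PartitionSerializationSketch
      {
          // The JVM's practical upper bound on array allocation, mirroring the limit
          // a DataOutputBuffer hits when a serialized partition grows too large.
          private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

          /**
           * Serializes a partition into a heap buffer, attaching keyspace, table, and
           * partition key context to any failure instead of letting a bare
           * RuntimeException escape to the logs.
           */
          public static ByteBuffer serializeWithContext(String keyspace, String table, Object partitionKey,
                                                        long expectedSize, Consumer<ByteBuffer> writer)
          {
              if (expectedSize > MAX_ARRAY_SIZE)
                  throw new RuntimeException(String.format("Partition of ~%d bytes exceeds the maximum buffer size of %d bytes (keyspace=%s, table=%s, key=%s)",
                                                           expectedSize, MAX_ARRAY_SIZE, keyspace, table, partitionKey));

              ByteBuffer buffer = ByteBuffer.allocate((int) expectedSize);
              try
              {
                  writer.accept(buffer);
              }
              catch (RuntimeException e)
              {
                  // Re-wrap so the offending keyspace, table, and partition key make it to the logs.
                  throw new RuntimeException(String.format("Failed to serialize partition (keyspace=%s, table=%s, key=%s)",
                                                           keyspace, table, partitionKey), e);
              }
              buffer.flip();
              return buffer;
          }
      }

      In the real codepaths the keyspace, table, and partition key would come from the read command or the compaction task rather than being passed in explicitly.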


          People

            Assignee: Caleb Rackliffe (maedhroz)
            Reporter: Caleb Rackliffe (maedhroz)
            Authors: Caleb Rackliffe
            Reviewers: Jordan West, Yifan Cai
            Votes: 0
            Watchers: 3


              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 1h