  1. Cassandra
  2. CASSANDRA-16214

Improve Logging Downstream of DataOutputBuffer Overflows

    Details

    • Change Category:
      Operability
    • Complexity:
      Low Hanging Fruit
    • Platform:
      All
    • Impacts:
      None
    • Test and Documentation Plan:
      • changes to DataOutputBuffer should be more or less covered by the associated changes in DataOutputTest
      • the changes to the read response creation and compaction logic I've tested manually via a simple test that looks roughly like this, coupled with hard-coding the max array size in DataOutputTest:
      @Test
      public void testLargeRowSerialization() throws Throwable
      {
          try (ICluster<?> cluster = init(builder().withNodes(2).start(), 1))
          {
              cluster.schemaChange("CREATE TABLE " + KEYSPACE + ".tbl (pk int, ck int, a text, b text, PRIMARY KEY (pk, ck)) WITH compaction = { 'class' : 'LeveledCompactionStrategy', 'enabled' : false }");
              
              StringBuilder builder = new StringBuilder();
              for (int i = 0; i < LARGE_MESSAGE_THRESHOLD ; i++)
                  builder.append('a');
              String bigString = builder.toString();
              
              cluster.coordinator(1).execute("INSERT INTO " + KEYSPACE + ".tbl (pk, ck, a) VALUES (1, 1, ?)", ConsistencyLevel.ALL, bigString);
              cluster.get(1).flush(KEYSPACE); // just comment this out to see compaction fail
              
              cluster.coordinator(1).execute("INSERT INTO " + KEYSPACE + ".tbl (pk, ck, b) VALUES (1, 1, ?)", ConsistencyLevel.ALL, bigString);
              cluster.get(1).flush(KEYSPACE);
              
              assertRows(cluster.coordinator(1).execute("SELECT * FROM " + KEYSPACE + ".tbl WHERE pk = ?", ConsistencyLevel.ALL, 1), 
                         row(1, 1, bigString, bigString));
          }
      }

      Description

      A number of code paths buffer intermediate representations of a partition in memory in a DataOutputBuffer. Compactions and reads (involving multiple SSTables) can both produce partitions whose serialized size is larger than MAX_ARRAY_SIZE, in which case little more than a RuntimeException makes it to the logs. We should be able to improve this so that at least the offending keyspace, table, and partition are logged.
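
      As a rough illustration of the idea (not the actual patch), the sketch below wraps a size-capped buffer and rethrows an overflow with the keyspace, table, and partition key attached. Every name in it (OverflowLoggingSketch, BoundedBuffer, serializePartition, the maxSize parameter) is hypothetical and stands in for the real DataOutputBuffer and MAX_ARRAY_SIZE; it only assumes that the overflow surfaces as a RuntimeException, as described above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class OverflowLoggingSketch
{
    /** Hypothetical stand-in for a buffer with a hard size cap (not Cassandra's DataOutputBuffer). */
    static final class BoundedBuffer
    {
        private final int maxSize;
        private final ByteArrayOutputStream out = new ByteArrayOutputStream();

        BoundedBuffer(int maxSize)
        {
            this.maxSize = maxSize;
        }

        void write(byte[] bytes)
        {
            // Refuse to grow past the cap; the message alone says nothing about what was being written.
            if ((long) out.size() + bytes.length > maxSize)
                throw new RuntimeException("Buffer would exceed " + maxSize + " bytes");
            out.write(bytes, 0, bytes.length);
        }

        int size()
        {
            return out.size();
        }
    }

    /**
     * Serializes the given cells into a capped buffer and returns the serialized size.
     * On overflow, the original exception is rethrown with keyspace, table, and partition key added.
     */
    static int serializePartition(String keyspace, String table, String partitionKey, String[] cells, int maxSize)
    {
        BoundedBuffer buffer = new BoundedBuffer(maxSize);
        try
        {
            for (String cell : cells)
                buffer.write(cell.getBytes(StandardCharsets.UTF_8));
            return buffer.size();
        }
        catch (RuntimeException e)
        {
            String message = String.format("Failed to serialize partition %s in %s.%s: %s",
                                           partitionKey, keyspace, table, e.getMessage());
            throw new RuntimeException(message, e); // keep the original failure as the cause
        }
    }

    public static void main(String[] args)
    {
        try
        {
            serializePartition("ks", "tbl", "1", new String[]{ "aaaa", "bbbb", "cccc" }, 8);
        }
        catch (RuntimeException e)
        {
            System.err.println(e.getMessage());
        }
    }
}
```

      Wrapping at the call site rather than inside the buffer keeps the buffer unaware of schema details, at the cost of repeating the wrapping wherever a partition is buffered.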

            People

            • Assignee:
              maedhroz Caleb Rackliffe
              Reporter:
              maedhroz Caleb Rackliffe
              Authors:
              Caleb Rackliffe
              Reviewers:
              Jordan West, Yifan Cai

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 50m