  1. Cassandra
  2. CASSANDRA-18283

Enhance diagnostic nodetool tablestats output

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 5.0-alpha1, 5.0
    • Component/s: Tool/nodetool
    • None

    Description

      The nodetool tablestats command omits some details that are already available and would be very useful to report. These are especially helpful in database-as-a-service environments, where users cannot directly observe the servers or their disk files.

      1. Currently, for LCS, tablestats reports useful details about the number of sstables in each level:

                SSTable count: 6635

                SSTables in each level: [1, 9, 98, 805, 5722, 0, 0, 0, 0]

      This kind of additional detail about the sstables is absent for STCS and TWCS, where tablestats reports only the sstable count.

      1a) For STCS, tablestats should report the max sstable file size on disk. This helps determine whether compaction has failed for lack of disk space or whether a forced compaction created a jumbo sstable.

      1b) For TWCS, tablestats should report the min and max timestamps and the duration of the sstables representing windows. This helps determine whether out-of-window writes or rows without a TTL have led to many more sstables on disk than the time window configuration would suggest.

      STCS example:

                SSTable count: 6635

                SSTable STCS max size: 122,000,000,000
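
To illustrate what the proposed STCS figure represents, here is a minimal Python sketch that finds the largest sstable data file under a table's data directory. It assumes direct file access and that data components are named with a "-Data.db" suffix; the function names max_sstable_size and format_size are hypothetical and are not part of nodetool.

```python
from pathlib import Path


def max_sstable_size(data_dir: str) -> int:
    """Return the size in bytes of the largest *-Data.db file under data_dir.

    Returns 0 when no data files are found.
    """
    sizes = [
        p.stat().st_size
        for p in Path(data_dir).rglob("*-Data.db")
        if p.is_file()
    ]
    return max(sizes, default=0)


def format_size(size_bytes: int) -> str:
    """Format a byte count with thousands separators, as in the example above."""
    return f"{size_bytes:,}"
```

Calling format_size(max_sstable_size(path)) on a table directory would yield a value in the same style as the example line, which is the number a user without file access currently cannot see.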

      TWCS example:

                SSTable count: 6635

                SSTables Time Window 15 DAYS, max duration : 362d 7h 16m 49s
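
The "max duration" value in the TWCS example could be derived from each sstable's min and max timestamps. The sketch below is only an illustration under assumptions: timestamps are taken to be epoch seconds (Cassandra write timestamps are usually finer-grained), and the helper names are hypothetical.

```python
from typing import Iterable, Tuple


def max_sstable_duration(spans: Iterable[Tuple[int, int]]) -> int:
    """Return the longest (max_ts - min_ts) span in seconds, or 0 if empty.

    Each element of spans is a (min_timestamp, max_timestamp) pair for one sstable.
    """
    return max((hi - lo for lo, hi in spans), default=0)


def format_duration(total_seconds: int) -> str:
    """Render a duration in seconds as 'Dd Hh Mm Ss'."""
    days, rem = divmod(total_seconds, 86_400)
    hours, rem = divmod(rem, 3_600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}d {hours}h {minutes}m {seconds}s"
```

A maximum duration far larger than the configured window (15 days in the example) is the signal that out-of-window writes or rows without a TTL are keeping sstables around.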

      2. tablestats reports both memtable and on-disk sstable statistics. It is useful to have both in the same command, but the output would be clearer if memory and disk statistics were separated into two sections, for example:

           -- File statistics

           SSTable count: 6635

           SSTables in each level: [1, 9, 98, 805, 5722, 0, 0, 0, 0] 

           -- Memtable statistics

           Bloom filter false positives: 12184123

           Bloom filter false ratio: 0.07203

           Bloom filter space used: 16874424

           Bloom filter off heap memory used: 16821344

           Index summary off heap memory used: 7525546

           Space used (live): 1324067896238

      3. Read and write counts should also be reported as a ratio, such as:

           Local read count: 202961459

           Local write count: 40554481

           Local read/write ratio: 5:1    <new>

           Local read latency: 1.957 ms

           Local write count: 40554481

           Local write latency: 0.040 ms
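
One way the "N:1" ratio in the example could be computed from the two counters is sketched below. This is an assumption about the intended rounding, not the format nodetool actually emits, and read_write_ratio is a hypothetical name.

```python
def read_write_ratio(reads: int, writes: int) -> str:
    """Express read and write counts as a rounded 'N:1' or '1:N' ratio.

    When one side is zero the ratio is reported against zero rather than divided.
    """
    if reads == 0 and writes == 0:
        return "0:0"
    if writes == 0:
        return "1:0"
    if reads == 0:
        return "0:1"
    if reads >= writes:
        return f"{round(reads / writes)}:1"
    return f"1:{round(writes / reads)}"
```

With the counts from the example, 202961459 reads against 40554481 writes is about 5.005, which rounds to the 5:1 shown above.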

          People

            smiklosovic Stefan Miklosovic
            bschoeni Brad Schoening
            Stefan Miklosovic
            Brandon Williams

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 0.5h