  1. Apache Cassandra
  2. CASSANDRA-17222

Add new metrics to track the number of requests performed by GROUP BY and Aggregation queries

Details

    • Operability
    • Low Hanging Fruit
    • All
    • None

    Description

      When a user performs a GROUP BY query or an aggregate query (e.g. SELECT count(*) FROM my_table), C* internally sends multiple requests to avoid running out of memory. The page size used for those internal queries is the same as the external page size.

      Having some visibility into the number of internal requests issued for a GROUP BY or aggregate query is important, as it can help administrators debug performance issues.

      We should add separate metrics for GROUP BY queries and aggregate queries.

      Additional information for newcomers:

      • A new metric class called AggregationMetrics should be created with a Histogram called internalPagesPerGroupByQuerie and another called internalPagesPerAggregateQuerie (see BatchMetrics for an example).
      • High-level query paging is managed by AggregationQueryPager. The number of queries performed should be incremented within fetchSubPage, and the metrics should be updated on close.
      • To test that the numbers are reliable, you need to create a new unit test, AggregationMetricsTest. For an example of how to test GROUP BY queries with paging, see SelectGroupByTest.testGroupByWithPaging(). To see how to clear the histograms between tests, see BatchMetricsTests.clearHistogram().
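
      The counting pattern described in the points above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Cassandra implementation: SimpleHistogram is a hypothetical stand-in for the project's histogram type, CountingPager is a hypothetical simplification of AggregationQueryPager, and the fetchSubPage signature shown here is invented for the example. Only the names AggregationMetrics, internalPagesPerGroupByQuerie and internalPagesPerAggregateQuerie come from the ticket.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a metrics histogram; real code would use
// the histogram type that BatchMetrics uses.
final class SimpleHistogram {
    private final List<Long> values = new ArrayList<>();

    synchronized void update(long value) { values.add(value); }

    synchronized long count() { return values.size(); }

    synchronized long max() {
        long m = 0;
        for (long v : values) m = Math.max(m, v);
        return m;
    }

    // Lets a test reset state between cases.
    synchronized void clear() { values.clear(); }
}

// Class and field names follow the ticket; the structure is an assumption.
final class AggregationMetrics {
    final SimpleHistogram internalPagesPerGroupByQuerie = new SimpleHistogram();
    final SimpleHistogram internalPagesPerAggregateQuerie = new SimpleHistogram();
}

// Hypothetical, simplified pager: counts each internal sub-page request
// and records the total in the matching histogram when closed.
final class CountingPager implements AutoCloseable {
    private final AggregationMetrics metrics;
    private final boolean groupBy;
    private int remainingRows;
    private long internalPages;

    CountingPager(AggregationMetrics metrics, boolean groupBy, int totalRows) {
        this.metrics = metrics;
        this.groupBy = groupBy;
        this.remainingRows = totalRows;
    }

    // Simulates one internal request returning at most pageSize rows.
    int fetchSubPage(int pageSize) {
        internalPages++;
        int fetched = Math.min(pageSize, remainingRows);
        remainingRows -= fetched;
        return fetched;
    }

    boolean isExhausted() { return remainingRows == 0; }

    @Override
    public void close() {
        SimpleHistogram histogram = groupBy
            ? metrics.internalPagesPerGroupByQuerie
            : metrics.internalPagesPerAggregateQuerie;
        histogram.update(internalPages);
    }
}
```

      Under these assumptions, a test in the spirit of AggregationMetricsTest would run a query through the pager, close it, assert on the recorded page count, and clear the histograms before the next case.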

          People

            n.v.harikrishna n.v.harikrishna
            blerer Benjamin Lerer
            n.v.harikrishna
            Votes: 0
            Watchers: 4

            Dates

              Created:
              Updated: