Ignite / IGNITE-12183

Rebalancing process metrics for cache groups


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Docs Required

    Description

      Intro

      Currently, some of the Apache Ignite rebalance process metrics do not work reliably. For instance, `EstimatedRebalancingKeys` returns `-1` from time to time due to a bug, and the `rebalanceKeysReceived` metric, although exposed as a CacheMetric, is in fact calculated for the whole cache group (e.g. during historical rebalance; see IGNITE-11330 and the code comment below).

      All the rebalance process metrics must be re-worked.

          /**
           * Update rebalancing metrics.
           */
          private void updateGroupMetrics() {
              // TODO: IGNITE-11330: Update metrics for touched cache only.
              // Due to historical rebalancing "EstimatedRebalancingKeys" metric is currently calculated for the whole cache
              // group (by partition counters), so "RebalancedKeys" and "RebalancingKeysRate" is calculated in the same way.
              for (GridCacheContext cctx0 : grp.caches()) {
                  if (cctx0.statisticsEnabled())
                      cctx0.cache().metrics0().onRebalanceKeyReceived();
              }
          }
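
      To make the effect of the loop above concrete, here is a minimal, self-contained sketch. It does not use any Ignite classes: `GroupWideUpdateDemo`, `CacheCounter`, `onKeyReceived` and `perCacheSum` are hypothetical names that only model the behaviour, assuming one call per received key regardless of which cache in the group owns it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical model of the loop in updateGroupMetrics(); not Ignite code. */
class GroupWideUpdateDemo {
    /** Hypothetical stand-in for a per-cache metrics holder. */
    static final class CacheCounter {
        final String name;
        final LongAdder rebalancedKeys = new LongAdder();

        CacheCounter(String name) {
            this.name = name;
        }
    }

    final List<CacheCounter> caches = new ArrayList<>();

    /** What a group-level metric would report: each received key counted once. */
    final LongAdder groupReceivedKeys = new LongAdder();

    GroupWideUpdateDemo(String... cacheNames) {
        for (String n : cacheNames)
            caches.add(new CacheCounter(n));
    }

    /** Called once per received key; bumps every cache in the group, as the loop above does. */
    void onKeyReceived() {
        groupReceivedKeys.increment();

        for (CacheCounter c : caches)
            c.rebalancedKeys.increment();
    }

    /** Sum of the per-cache counters, i.e. what a naive aggregation over caches would show. */
    long perCacheSum() {
        long sum = 0;

        for (CacheCounter c : caches)
            sum += c.rebalancedKeys.sum();

        return sum;
    }

    public static void main(String[] args) {
        GroupWideUpdateDemo grp = new GroupWideUpdateDemo("cacheA", "cacheB");

        for (int i = 0; i < 3; i++)
            grp.onKeyReceived();

        // Prints: group=3, perCacheSum=6
        System.out.println("group=" + grp.groupReceivedKeys.sum() + ", perCacheSum=" + grp.perCacheSum());
    }
}
```

      In this model every cache in a two-cache group reports all three keys, so the per-cache values cannot be attributed to a single cache, while a single group-level counter reports the keys once.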
      

      What we have

      CacheMetrics (cache statistics must be enabled for these metrics to be reported):

      • getRebalancedKeys
      • getKeysToRebalanceLeft
      • getEstimatedRebalancingKeys
      • getEstimatedRebalancingFinishTime
      • getRebalancingStartTime
      • getRebalanceClearingPartitionsLeft
      • getRebalancingKeysRate
      • getRebalancingBytesRate

      What to do

      All such metrics (or their analogues) must be available in CacheGroupMetrics. I suggest doing the following:

      1. Phase-1
        • rebalancingPartitionsLeft long metric
        • rebalancingReceivedKeys long metric
        • rebalancingReceivedBytes long metric
        • rebalancingStartTime long metric
        • rebalancingFinishTime long metric
        • rebalancingLastCancelledTime long metric
      2. Phase-2
        • rebalancingExpectedKeys long metric
        • rebalancingExpectedBytes long metric
        • rebalancingEvictedPartitionsLeft long metric
      3. Phase-3 (statistics must be enabled)
        • rebalancingKeysRate HitRate metric
        • rebalancingBytesRate HitRate metric
      4. Phase-4
        • Mark the rebalancing CacheMetrics as deprecated and remove them from the metrics framework (IGNITE-11961).
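
      As a rough illustration of the Phase-1 list, the sketch below models the six proposed values as plain counters. It is not the Ignite metrics API: the class `GroupRebalanceMetricsSketch` and its callback methods are hypothetical, timestamps are passed in by the caller, and `0` is this sketch's own marker for "not set yet".

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical holder for the proposed Phase-1 cache group metrics; not an Ignite class. */
class GroupRebalanceMetricsSketch {
    private final AtomicLong partitionsLeft = new AtomicLong();
    private final LongAdder receivedKeys = new LongAdder();
    private final LongAdder receivedBytes = new LongAdder();
    private final AtomicLong startTime = new AtomicLong();          // 0 = not started (sketch convention)
    private final AtomicLong finishTime = new AtomicLong();         // 0 = not finished (sketch convention)
    private final AtomicLong lastCancelledTime = new AtomicLong();  // 0 = never cancelled (sketch convention)

    /** Resets the counters and records the start of a new rebalance for the group. */
    void onRebalanceStarted(long partitions, long nowMillis) {
        receivedKeys.reset();
        receivedBytes.reset();
        partitionsLeft.set(partitions);
        startTime.set(nowMillis);
        finishTime.set(0);
    }

    /** Accounts for one received batch of entries. */
    void onBatchReceived(long keys, long bytes) {
        receivedKeys.add(keys);
        receivedBytes.add(bytes);
    }

    /** Marks one partition as fully rebalanced; the last one sets the finish time. */
    void onPartitionDone(long nowMillis) {
        if (partitionsLeft.decrementAndGet() == 0)
            finishTime.set(nowMillis);
    }

    /** Records the time of the most recent cancellation. */
    void onRebalanceCancelled(long nowMillis) {
        lastCancelledTime.set(nowMillis);
    }

    long rebalancingPartitionsLeft() { return partitionsLeft.get(); }
    long rebalancingReceivedKeys() { return receivedKeys.sum(); }
    long rebalancingReceivedBytes() { return receivedBytes.sum(); }
    long rebalancingStartTime() { return startTime.get(); }
    long rebalancingFinishTime() { return finishTime.get(); }
    long rebalancingLastCancelledTime() { return lastCancelledTime.get(); }

    public static void main(String[] args) {
        GroupRebalanceMetricsSketch m = new GroupRebalanceMetricsSketch();

        m.onRebalanceStarted(2, 1_000L);
        m.onBatchReceived(10, 2_048);
        m.onPartitionDone(1_500L);
        m.onPartitionDone(2_000L);

        // Prints: keys=10, bytes=2048, left=0, finish=2000
        System.out.println("keys=" + m.rebalancingReceivedKeys() + ", bytes=" + m.rebalancingReceivedBytes()
            + ", left=" + m.rebalancingPartitionsLeft() + ", finish=" + m.rebalancingFinishTime());
    }
}
```

      Because the values are kept per group, none of them depends on per-cache statistics being enabled, which matches the intent that only Phase-3 requires statistics.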


          People

            Unassigned
            Aleksandr Brazhnikov (aleksandr.b)

            Dates

              Created:
              Updated:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 3h 20m
