Samza / SAMZA-1356

Improve monitoring for state restore

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: None
    • Labels: None

    Description

      There are a couple of problems that affect our ability to troubleshoot state restore from the changelog.

      1. KeyValueStorageEngine logs a message for every 1M messages restored, but it prints nothing for stores smaller than that. We should add a message that reports the final number of entries restored.

      2. While the "restore-time" metric is a gauge, the KeyValueStorageEngineMetrics "messages-restored" and "messages-bytes" metrics are both counters. Counters are often graphed as deltas, so the value disappears after one data point. Since these values only matter at the beginning of the job, we should switch them to gauges so the value is retained for later monitoring.
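
      The two proposals can be illustrated with a minimal, self-contained sketch. This is not Samza code: `RestoreMonitorSketch`, `LongGauge`, `restore`, and `deltas` are hypothetical names invented for illustration, and the log interval is a parameter rather than the fixed 1M mentioned above. It shows a restore loop that always prints a final summary, sets gauge-style values that persist after the restore, and a helper showing why a cumulative counter graphed as deltas drops to zero after the first data point.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; none of these names are taken from Samza.
final class RestoreMonitorSketch {

    /** Stand-in for a gauge: keeps the last value that was set. */
    static final class LongGauge {
        private final AtomicLong value = new AtomicLong();
        void set(long v) { value.set(v); }
        long get() { return value.get(); }
    }

    /**
     * Replays changelog entries, logging progress every logInterval entries
     * and always logging a final summary, even for small stores.
     */
    static long restore(Iterator<byte[]> changelog, long logInterval,
                        LongGauge restoredMessages, LongGauge restoredBytes) {
        long count = 0;
        long bytes = 0;
        while (changelog.hasNext()) {
            byte[] entry = changelog.next();
            count++;
            bytes += entry.length;
            if (count % logInterval == 0) {
                System.out.println(count + " entries restored so far");
            }
        }
        // Final message: printed regardless of how few entries were restored.
        System.out.println("Restore complete: " + count + " entries, " + bytes + " bytes");
        // Gauges hold these totals for the rest of the job's lifetime.
        restoredMessages.set(count);
        restoredBytes.set(bytes);
        return count;
    }

    /** Converts successive cumulative readings into per-interval deltas. */
    static long[] deltas(long[] cumulative) {
        long[] out = new long[cumulative.length];
        for (int i = 0; i < cumulative.length; i++) {
            out[i] = (i == 0) ? cumulative[i] : cumulative[i] - cumulative[i - 1];
        }
        return out;
    }

    public static void main(String[] args) {
        LongGauge messages = new LongGauge();
        LongGauge bytes = new LongGauge();
        Iterator<byte[]> changelog =
                Arrays.asList(new byte[2], new byte[3], new byte[5]).iterator();
        restore(changelog, 2, messages, bytes);

        // A counter that stops changing after restore, sampled three times:
        long[] counterReadings = {messages.get(), messages.get(), messages.get()};
        System.out.println("Counter as deltas: " + Arrays.toString(deltas(counterReadings)));
        System.out.println("Gauge readings:    " + Arrays.toString(counterReadings));
    }
}
```

      In this sketch the gauges are set once at the end of the restore; updating them inside the loop as well would be an equally reasonable choice if intermediate progress should be visible in monitoring.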

            People

              Assignee: Jake Maes (jmakes)
              Reporter: Jake Maes (jmakes)
              Votes: 0
              Watchers: 2