Flume / FLUME-3092

Extend the FileChannel's monitoring metrics

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.0
    • Fix Version/s: 1.8.0
    • Component/s: File Channel
    • Labels: None

    Description

      There are already several generic metrics (e.g. eventPutAttemptCount and eventPutSuccessCount) that can be combined into compound metrics for monitoring the FileChannel's health.
      Some monitoring systems can't calculate such derived metrics, though, so I recommend adding the following extra counters to indicate whether a channel operation has failed or the channel is in an unhealthy state.

      • eventPutErrorCount: incremented if an IOException occurs during a put operation.
      • eventTakeErrorCount: incremented if an IOException or CorruptEventException occurs during a take operation.
      • checkpointWriteErrorCount: incremented if an exception occurs during a checkpoint write.
      • unhealthy: this flag indicates whether the channel failed to start successfully (i.e. whether the replay ran into a problem). It is similar to the already existing open flag, except that open is initially false and is set to true once the initialization (including the log replay) completes successfully. The unhealthy flag, by contrast, is false by default and is set to true if there is an error during startup.
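
      To illustrate the intended semantics of the counters listed above, here is a minimal sketch. It is not the actual FileChannel code: FileChannelErrorCountersSketch, its Action interface and SketchCorruptEventException (a local stand-in for Flume's CorruptEventException) are hypothetical names used only for this example.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

// Local stand-in for Flume's CorruptEventException so the sketch compiles on its own.
class SketchCorruptEventException extends Exception {
    SketchCorruptEventException(String message) { super(message); }
}

// Hypothetical counter holder; not the actual Flume class.
class FileChannelErrorCountersSketch {
    /** A channel operation that may fail. */
    interface Action { void run() throws Exception; }

    private final AtomicLong eventPutErrorCount = new AtomicLong();
    private final AtomicLong eventTakeErrorCount = new AtomicLong();
    private final AtomicLong checkpointWriteErrorCount = new AtomicLong();
    // False by default; set to true only if startup (including the replay) fails.
    private volatile boolean unhealthy = false;

    void put(Action op) throws Exception {
        try {
            op.run();
        } catch (IOException e) {
            eventPutErrorCount.incrementAndGet();
            throw e;
        }
    }

    void take(Action op) throws Exception {
        try {
            op.run();
        } catch (IOException | SketchCorruptEventException e) {
            eventTakeErrorCount.incrementAndGet();
            throw e;
        }
    }

    void writeCheckpoint(Action op) throws Exception {
        try {
            op.run();
        } catch (Exception e) {
            // Any exception during a checkpoint write counts as an error.
            checkpointWriteErrorCount.incrementAndGet();
            throw e;
        }
    }

    void start(Action replay) {
        try {
            replay.run();
        } catch (Exception e) {
            unhealthy = true;
        }
    }

    long getEventPutErrorCount() { return eventPutErrorCount.get(); }
    long getEventTakeErrorCount() { return eventTakeErrorCount.get(); }
    long getCheckpointWriteErrorCount() { return checkpointWriteErrorCount.get(); }
    boolean isUnhealthy() { return unhealthy; }
}
```

      Each counter only ever increases, so a monitoring system can alert on any change without having to combine several metrics.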

      Besides these flags, I'd also introduce a closed flag, which is the numeric representation (1: closed, 0: open) of the negated, already existing open flag.
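
      The mapping from the open flag to the closed flag could look roughly like this. ClosedFlagSketch and its closed method are hypothetical names for illustration, not part of Flume's API.

```java
// Hypothetical helper; the class and method names are assumptions for this sketch.
class ClosedFlagSketch {
    /** Returns 1 if the channel is closed and 0 if it is open. */
    static int closed(boolean open) {
        return open ? 0 : 1;
    }
}
```

      A numeric value is easier to graph and to set thresholds on than a boolean in monitoring systems that only handle numbers.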

            People

              Assignee: Denes Arvay
              Reporter: Denes Arvay