Flink / FLINK-3160

Aggregate operator statistics by TaskManager

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Implemented
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.0.0
    • Component/s: Runtime / Web Frontend
    • Labels: None

    Description

      The web client job info page presents a table of the following per-task statistics: start time, end time, duration, bytes received, records received, bytes sent, records sent, attempt, host, and status.

      Flink supports clusters with thousands of slots, and a job with high parallelism makes this job info page unwieldy and difficult to analyze in real time.

      It would be helpful to aggregate statistics by TaskManager, either optionally or automatically. Each aggregated row could then be expanded to reveal the current per-task statistics.

      Start time, end time, duration, and attempt do not apply to a TaskManager, since new tasks may be started for repeated attempts. Bytes received, records received, bytes sent, and records sent would be summed. Any throughput metrics could be averaged over the total task time or a time window. Status could show the number of running tasks on the TaskManager, or an idle state.
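
The aggregation rules described above could be sketched roughly as follows. This is only an illustration and not the implemented change: the `TaskStats` record, its field names, the `"RUNNING"` status string, and the use of `host` as the TaskManager key are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TaskStats:
    # Hypothetical per-task row; field names are illustrative, not Flink's API.
    host: str
    status: str
    bytes_received: int
    records_received: int
    bytes_sent: int
    records_sent: int


COUNTERS = ("bytes_received", "records_received", "bytes_sent", "records_sent")


def aggregate_by_task_manager(tasks):
    """Sum the I/O counters per host and count the running tasks."""
    rows = defaultdict(
        lambda: {**{name: 0 for name in COUNTERS}, "running_tasks": 0, "tasks": []}
    )
    for task in tasks:
        row = rows[task.host]  # assumption: one TaskManager per host
        for name in COUNTERS:
            row[name] += getattr(task, name)
        if task.status == "RUNNING":
            row["running_tasks"] += 1
        # Keep the per-task rows so the aggregated row can be expanded.
        row["tasks"].append(task)
    for row in rows.values():
        running = row["running_tasks"]
        row["status"] = f"{running} running" if running else "idle"
    return dict(rows)
```

Start time, end time, duration, and attempt are deliberately left out of the aggregated row; they remain available on the per-task entries kept under `tasks`.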

          People

            Assignee: Greg Hogan (greghogan)
            Reporter: Greg Hogan (greghogan)
            Votes: 0
            Watchers: 5
