Details

    • Reviewed

    Description

      A lot more happens during reads, so there is a lot more data to collect and publish in the IO stats, both for viewing in a summary at the end of a process and for retrieval from the stream while it is active.

      Some useful ones would seem to be:

      counters

      • is in memory: using 0 or 1 here lets aggregated reports count the total number of memory-cached files.
      • prefetching operations executed
      • errors during prefetching

      gauges

      • number of blocks in cache
      • total size of blocks
      • active prefetches
        + active memory used
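
      A minimal sketch of how the counters and gauges above could be held in one place. It is plain Java and does not use Hadoop's IOStatistics classes. The class name, method names and statistic keys are all hypothetical, and it assumes "active memory used" means the bytes held by in-flight prefetch buffers.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical holder for the counters and gauges listed above. */
class PrefetchStreamStats {
    // Counters: values that only ever grow (the in-memory flag is 0 or 1).
    private final AtomicLong inMemory = new AtomicLong();
    private final AtomicLong prefetchOps = new AtomicLong();
    private final AtomicLong prefetchErrors = new AtomicLong();

    // Gauges: values that rise and fall with the current state.
    private final AtomicLong blocksInCache = new AtomicLong();
    private final AtomicLong cachedBytes = new AtomicLong();
    private final AtomicLong activePrefetches = new AtomicLong();
    private final AtomicLong activeMemoryBytes = new AtomicLong();

    /** Mark the file as fully cached in memory; summing this across streams counts such files. */
    void markInMemory() {
        inMemory.set(1);
    }

    /** A prefetch was submitted and holds a buffer of the given size (assumption). */
    void prefetchStarted(long bufferBytes) {
        prefetchOps.incrementAndGet();
        activePrefetches.incrementAndGet();
        activeMemoryBytes.addAndGet(bufferBytes);
    }

    /** A prefetch ended, successfully or not, and released its buffer. */
    void prefetchFinished(long bufferBytes, boolean success) {
        activePrefetches.decrementAndGet();
        activeMemoryBytes.addAndGet(-bufferBytes);
        if (!success) {
            prefetchErrors.incrementAndGet();
        }
    }

    void blockCached(long bytes) {
        blocksInCache.incrementAndGet();
        cachedBytes.addAndGet(bytes);
    }

    void blockEvicted(long bytes) {
        blocksInCache.decrementAndGet();
        cachedBytes.addAndGet(-bytes);
    }

    /** Point-in-time copy of all values, keyed by hypothetical statistic names. */
    Map<String, Long> snapshot() {
        Map<String, Long> m = new TreeMap<>();
        m.put("stream_in_memory", inMemory.get());
        m.put("prefetch_operations", prefetchOps.get());
        m.put("prefetch_errors", prefetchErrors.get());
        m.put("blocks_in_cache", blocksInCache.get());
        m.put("cached_bytes", cachedBytes.get());
        m.put("active_prefetches", activePrefetches.get());
        m.put("active_memory_bytes", activeMemoryBytes.get());
        return m;
    }
}
```

      Keeping counters and gauges in separate groups matters for aggregation: counters can be summed across streams, while gauges only describe the stream at the moment of the snapshot.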

      duration tracking (count/min/max/mean)

      • time to fetch a block
      • time queued before the actual fetch begins
      • time a reader is blocked waiting for a block fetch to complete
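
      The count/min/max/mean tracking could look roughly like the sketch below: one instance per measured operation (block fetch, queue wait, reader blocked). The class and its methods are hypothetical, not an existing Hadoop API, and values are recorded in nanoseconds as an assumption.

```java
import java.util.function.Supplier;

/** Hypothetical tracker for count/min/max/mean of one kind of duration. */
class DurationStats {
    private long count;
    private long minNanos = Long.MAX_VALUE;
    private long maxNanos = Long.MIN_VALUE;
    private long totalNanos;

    /** Record one completed operation. */
    synchronized void record(long nanos) {
        count++;
        minNanos = Math.min(minNanos, nanos);
        maxNanos = Math.max(maxNanos, nanos);
        totalNanos += nanos;
    }

    synchronized long count() {
        return count;
    }

    /** Minimum recorded duration, or -1 if nothing has been recorded. */
    synchronized long min() {
        return count == 0 ? -1 : minNanos;
    }

    /** Maximum recorded duration, or -1 if nothing has been recorded. */
    synchronized long max() {
        return count == 0 ? -1 : maxNanos;
    }

    /** Arithmetic mean of recorded durations, or 0.0 if nothing has been recorded. */
    synchronized double mean() {
        return count == 0 ? 0.0 : (double) totalNanos / count;
    }

    /** Run an operation and record how long it took, even if it throws. */
    <T> T time(Supplier<T> operation) {
        long start = System.nanoTime();
        try {
            return operation.get();
        } finally {
            record(System.nanoTime() - start);
        }
    }
}
```

      A reader blocked on a fetch would wrap the wait in `time(...)`, so the "time blocked" figure is collected at the point where the blocking happens.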

      and some info on cache use itself

      • number of blocks discarded unread
      • number of prefetched blocks later used
      • number of backward seeks to a prefetched block
      • number of forward seeks to a prefetched block
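
      As a rough illustration of the cache-use figures above, the sketch below tracks which prefetched blocks have not yet been read, so that a later read or discard can be classified. All names are hypothetical, and block identity is assumed to be a simple integer block number.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical tracker for how well prefetched blocks are being used. */
class PrefetchCacheUse {
    // Blocks that were prefetched and have not been read yet.
    private final Set<Integer> unreadBlocks = new HashSet<>();
    private long prefetchedThenUsed;
    private long discardedUnread;
    private long forwardSeekHits;
    private long backwardSeekHits;

    synchronized void blockPrefetched(int block) {
        unreadBlocks.add(block);
    }

    /** Returns true if this is the first read of a prefetched block. */
    synchronized boolean blockRead(int block) {
        if (unreadBlocks.remove(block)) {
            prefetchedThenUsed++;
            return true;
        }
        return false;
    }

    /** A block left the cache; it counts as wasted only if it was never read. */
    synchronized void blockDiscarded(int block) {
        if (unreadBlocks.remove(block)) {
            discardedUnread++;
        }
    }

    /** Record a seek between blocks that landed on a block already in the cache. */
    synchronized void seek(int fromBlock, int toBlock, boolean targetIsCached) {
        if (!targetIsCached || fromBlock == toBlock) {
            return;
        }
        if (toBlock > fromBlock) {
            forwardSeekHits++;
        } else {
            backwardSeekHits++;
        }
    }

    synchronized long prefetchedThenUsed() { return prefetchedThenUsed; }
    synchronized long discardedUnread() { return discardedUnread; }
    synchronized long forwardSeekHits() { return forwardSeekHits; }
    synchronized long backwardSeekHits() { return backwardSeekHits; }

    /** Fraction of resolved prefetches that were discarded without being read. */
    synchronized double wastedFraction() {
        long resolved = prefetchedThenUsed + discardedUnread;
        return resolved == 0 ? 0.0 : (double) discardedUnread / resolved;
    }
}
```

      A high wasted fraction would suggest the prefetcher is reading ahead of an access pattern that never arrives, which is one way to tell when the cache is not working.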

      the key ones I care about are

      1. memory consumption
      2. whether we can determine when the cache is working (reads with a cache hit) and when it is not (misses, wasted prefetches)
      3. time blocked on executors

      The stats need to remain accessible on a stream even after it is closed, and to be aggregated into the FS. Once we get per-thread stats contexts we can publish there too, and collect them in worker threads for reporting in task commits.
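
      One possible shape for the close-time behaviour, as a sketch only: the stream keeps its own statistics after close() and merges them into a filesystem-level total exactly once. `FsStats` and `StatsStream` are hypothetical names, not Hadoop classes, and only counters are assumed to be merged, since summing gauges would not be meaningful.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical filesystem-level totals, summed over all closed streams. */
class FsStats {
    private final Map<String, Long> totals = new ConcurrentHashMap<>();

    void aggregate(Map<String, Long> streamCounters) {
        streamCounters.forEach((key, value) -> totals.merge(key, value, Long::sum));
    }

    long get(String key) {
        return totals.getOrDefault(key, 0L);
    }
}

/** Hypothetical stream whose counters outlive close(). */
class StatsStream implements AutoCloseable {
    private final FsStats fsStats;
    private final Map<String, Long> counters = new ConcurrentHashMap<>();
    private final AtomicBoolean closed = new AtomicBoolean();

    StatsStream(FsStats fsStats) {
        this.fsStats = fsStats;
    }

    void increment(String key, long amount) {
        counters.merge(key, amount, Long::sum);
    }

    /** Readable both while the stream is active and after it is closed. */
    Map<String, Long> getStats() {
        return Collections.unmodifiableMap(new HashMap<>(counters));
    }

    /** Merge into the filesystem totals once, even if close() is called repeatedly. */
    @Override
    public void close() {
        if (closed.compareAndSet(false, true)) {
            fsStats.aggregate(counters);
        }
    }
}
```

      The same merge step could target a per-thread context instead of, or as well as, the filesystem once such contexts exist.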

            People

              ahmar Ahmar Suhail
              stevel@apache.org Steve Loughran
              Votes: 1
              Watchers: 4

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 4h 10m