IMPALA-8544

Expose additional S3A / S3Guard metrics



    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • Backend


      S3A / S3Guard internally collects several useful metrics that we should consider exposing to Impala users. The full list of statistics can be found in o.a.h.fs.s3a.Statistic. The stats include the number of S3 operations performed (put, get, etc.), invocation counts for various FileSystem methods, and stream statistics (bytes read, bytes written, etc.).

      Some interesting stats that stand out:

      • "stream_aborted": "Count of times the TCP stream was aborted" - the number of TCP connection aborts; a high value would indicate performance issues
      • "stream_read_exceptions": "Number of exceptions invoked on input streams" - incremented whenever an IOException is caught while reading (these exceptions don't always get propagated to Impala because they trigger a retry)
      • "store_io_throttled": "Requests throttled and retried" - looks like it tracks the number of times the fs retries an operation because the original request hit a throttling exception
      • "s3guard_metadatastore_retry": "S3Guard metadata store retry events" - looks like it tracks the number of times the fs retries S3Guard operations
      • "s3guard_metadatastore_throttled" : "S3Guard metadata store throttled events" - similar to "store_io_throttled" but looks like it is specific to S3Guard

      We should consider how to expose these metrics via Impala logs / runtime profiles.
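
      As a rough illustration of the logging option, the sketch below picks the counters listed above out of a name-to-value map and formats the non-zero ones as a single line. The class S3aStatSummary and its summarize method are hypothetical and not part of Impala or Hadoop; how the map gets populated is left open here.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not an Impala or Hadoop class.
class S3aStatSummary {
  // Counter names taken from the list of interesting stats in this issue.
  static final List<String> INTERESTING = List.of(
      "stream_aborted",
      "stream_read_exceptions",
      "store_io_throttled",
      "s3guard_metadatastore_retry",
      "s3guard_metadatastore_throttled");

  // Builds "name=value, name=value" for the interesting counters that are
  // non-zero. Returns an empty string when there is nothing to report.
  static String summarize(Map<String, Long> counters) {
    StringJoiner out = new StringJoiner(", ");
    for (String name : INTERESTING) {
      long value = counters.getOrDefault(name, 0L);
      if (value != 0) {
        out.add(name + "=" + value);
      }
    }
    return out.toString();
  }
}
```

      Skipping zero-valued counters keeps the line short in the common case where nothing was retried or throttled.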

      There are a few options:

      • S3AFileSystem exposes StorageStatistics specific to S3A / S3Guard via the FileSystem#getStorageStatistics method. S3AStorageStatistics seems to include all the S3A / S3Guard metrics; however, I think the stats might be aggregated globally, which would make it hard to create per-query metrics
      • S3AInstrumentation exposes all the metrics as well, and looks like it is per-fs instance, so it is not aggregated globally; S3AInstrumentation extends o.a.h.metrics2.MetricsSource so perhaps it is exposed via some API (haven't looked into this yet)
      • S3AInputStream#toString dumps the statistics from o.a.h.fs.s3a.S3AInstrumentation.InputStreamStatistics and S3AFileSystem#toString dumps them all as well
      • S3AFileSystem updates the stats in o.a.h.fs.Statistics.StatisticsData as well (e.g. bytesRead, bytesWritten, etc.)
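
      Whichever source is used, one way to approximate per-query numbers from cumulative counters is to snapshot them before and after the work and report the difference. The sketch below does this over plain maps; StatDelta is a hypothetical name, and in a real integration the snapshots would presumably be filled by iterating StorageStatistics#getLongStatistics. Note the assumption: if the counters are aggregated globally, or the fs instance is shared, the delta also includes activity from other concurrent queries.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not an Impala or Hadoop class.
class StatDelta {
  // Returns after - before for every counter whose value changed.
  // Counters missing from the "before" snapshot are treated as 0.
  static Map<String, Long> delta(Map<String, Long> before,
                                 Map<String, Long> after) {
    Map<String, Long> result = new TreeMap<>();
    for (Map.Entry<String, Long> e : after.entrySet()) {
      long diff = e.getValue() - before.getOrDefault(e.getKey(), 0L);
      if (diff != 0) {
        result.put(e.getKey(), diff);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Example snapshots with made-up values.
    Map<String, Long> before = Map.of("stream_aborted", 2L);
    Map<String, Long> after =
        Map.of("stream_aborted", 7L, "stream_read_exceptions", 1L);
    System.out.println(delta(before, after));
  }
}
```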

      Impala also has an hdfs-fs-cache, so hdfsFs objects get shared across threads. Even per-fs-instance stats would therefore likely combine activity from multiple threads.





              Assignee: Unassigned
              Reporter: Sahil Takiar (stakiar)