  1. Hadoop Common
  2. HADOOP-15620 Über-jira: S3A phase VI: Hadoop 3.3 features
  3. HADOOP-15348

S3A Input Stream bytes read counter isn't getting through to StorageStatistics/instrumentation properly

    Details

    • Type: Sub-task
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0, 3.1.0
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels:
      None

      Description

      TL;DR: we should have common storage statistics for bytes read and bytes written, and S3A should use them in its instrumentation and have enum names to match.

      1. In S3AInputStream we call S3AInstrumentation.StreamStatistics.bytesRead(long), which adds the amount to bytesRead whenever a read(), readFully(), or forward seek() reads in data.
      2. In S3AInstrumentation.mergeInputStreamStatistics, that value is merged into streamBytesRead,
      3. which has a Statistics name of "stream_bytes_read".
      4. But that counter is served up in the Storage statistics as "STREAM_SEEK_BYTES_READ", which is the wrong name.
      5. And there isn't a common name for the counter across other filesystems.
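
      The flow in steps 1 to 4 can be modelled roughly as below. This is a simplified sketch, not the real S3A code: StreamStats and Instrumentation are hypothetical stand-ins for S3AInstrumentation.StreamStatistics and S3AInstrumentation, and the publish(String) helper is an assumption made purely to show what a caller sees when the counter is exposed under the misleading name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Simplified model of the bytes-read counter flow; not the real S3A classes. */
class BytesReadFlowSketch {

    /** Per-stream statistics (stand-in for S3AInstrumentation.StreamStatistics). */
    static final class StreamStats {
        private long bytesRead;

        /** Called from read(), readFully() and forward seek() when data is read. */
        void bytesRead(long bytes) {
            if (bytes > 0) {
                bytesRead += bytes;
            }
        }

        long getBytesRead() {
            return bytesRead;
        }
    }

    /** Filesystem-wide instrumentation (stand-in for S3AInstrumentation). */
    static final class Instrumentation {
        private final AtomicLong streamBytesRead = new AtomicLong();

        /** Folds one stream's total into the filesystem-wide counter. */
        void mergeInputStreamStatistics(StreamStats stats) {
            streamBytesRead.addAndGet(stats.getBytesRead());
        }

        /** Hypothetical helper: exposes the counter under a single name. */
        Map<String, Long> publish(String name) {
            Map<String, Long> published = new HashMap<>();
            published.put(name, streamBytesRead.get());
            return published;
        }
    }

    public static void main(String[] args) {
        StreamStats stats = new StreamStats();
        stats.bytesRead(4096); // data read by read()
        stats.bytesRead(512);  // data read while seeking forward

        Instrumentation instrumentation = new Instrumentation();
        instrumentation.mergeInputStreamStatistics(stats);

        // The total covers every read path, yet the published name suggests seek-only bytes.
        Map<String, Long> published = instrumentation.publish("STREAM_SEEK_BYTES_READ");
        System.out.println(published.get("STREAM_SEEK_BYTES_READ")); // 4608
        System.out.println(published.get("stream_bytes_read"));      // null
    }
}
```

      A caller looking for the counter under the name it would reasonably expect finds nothing, while the name that does resolve describes only a subset of what was counted.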

      For now, people can use the wrong name in the enum; we may want to retain it when adding the correct name, and maybe add an @Evolving/@LimitedPrivate scope pair to the enum.
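
      One possible shape for retaining the old name alongside a corrected one is sketched below. Everything here is an assumption for illustration: the constant STREAM_BYTES_READ, the Counters class, and the choice to key values by symbol are hypothetical and do not reflect any committed fix. In Hadoop itself the enum would presumably also carry the org.apache.hadoop.classification annotations mentioned above; they are left out so the sketch compiles on its own.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of keeping a misnamed statistic as an alias for a corrected one. */
class StatisticAliasSketch {

    /** Hypothetical enum; the corrected constant name is an assumption. */
    enum Statistic {
        STREAM_BYTES_READ("stream_bytes_read"),

        /** Misleading legacy name, retained so existing callers keep working. */
        @Deprecated
        STREAM_SEEK_BYTES_READ("stream_bytes_read");

        private final String symbol;

        Statistic(String symbol) {
            this.symbol = symbol;
        }

        String getSymbol() {
            return symbol;
        }
    }

    /** Counters keyed by symbol, so both constants resolve to the same value. */
    static final class Counters {
        private final Map<String, Long> values = new HashMap<>();

        void increment(Statistic statistic, long amount) {
            values.merge(statistic.getSymbol(), amount, Long::sum);
        }

        long get(Statistic statistic) {
            return values.getOrDefault(statistic.getSymbol(), 0L);
        }
    }

    public static void main(String[] args) {
        Counters counters = new Counters();
        counters.increment(Statistic.STREAM_BYTES_READ, 4608);

        // Old and new names report the same counter.
        System.out.println(counters.get(Statistic.STREAM_BYTES_READ));      // 4608
        System.out.println(counters.get(Statistic.STREAM_SEEK_BYTES_READ)); // 4608
    }
}
```

      Keying by symbol rather than by enum constant is what lets the deprecated name keep resolving; whether that matches how the real storage statistics are keyed would need checking against the code.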

            People

            • Assignee:
              Unassigned
              Reporter:
              stevel@apache.org Steve Loughran
            • Votes:
              0
              Watchers:
              1