Thanks for the feedback Steve! My response below:
We tested this on an HDInsight cluster using real loads against Azure Storage and used Storage Metrics (https://docs.microsoft.com/en-us/rest/api/storageservices/Storage-Analytics-Metrics-Table-Schema) to analyze the performance. For example, with the client-side throttling feature enabled for an account with 10 Gbps ingress limit, the total ingress throughput levels off at approximately 10 Gbps, whereas without the feature the throughput varies between 5 Gbps and 10 Gbps as the service rejects requests and the retry policy kicks in. For example, a teragen job that attempts to upload more than the ingress limit allows can complete 30 to 40% faster with the feature enabled. There can be tens or hundreds of wasb clients and they independently throttle themselves with no knowledge of each other, and together they minimize errors and maximize throughput.
I do not believe there will be any contention with TCP congestion control. I'm confident, but not overly so, which is why the feature is disabled by default. I'd like to obtain feedback and perhaps enable it by default in the future.
The Azure Storage service exposes account level metrics so we're not in the dark today. My team plans to review and implement client-side metrics for wasb. We're a small team, and metrics are important to get right, so we don't want to do it piecemeal, but instead will review and make comprehensive changes.
This change will not compile without the 5.4.0 SDK, as it uses the new ErrorReceivingResponseEvent, but I'll create a separate JIRA as you requested.
I agree the configuration key should have the suffix "enabled", but for consistency I'm using "enable" because all the other wasb configuration keys are this way.
ClientThrottlingInercept logs a debug message saying "Client-side throttling is enabled for the WASB file system." There are no other configurable options for the feature, so nothing else is included in the message.
The HTTP request/response parsing (BlobOperationDescriptor.getContentLengthIfKnown) is best effort. It should not throw or fail for the data that it processes. There are tests to validate this parsing logic and its integration with the Storage SDK, but it is not a general purpose parser--it works inconjunction with the SDK.
I switched to SLF4J.
Removed superfluous InterfaceAudience.Private.
Changed timer name to "wasb-timer-client-throttling-analyzer-<name>". The timer task returns quickly--no blocking, waiting, loops, etc.
I updated ClientThrottlingAnalyzer, TestBlobOperationDescriptor, and TestClientThrottlingAnalyzer as requeseted.