Description
This problem was encountered when trying to specify the container image nvcr.io/nvidia/tensorflow:19.12-tf1-py3.
When initiating Docker Registry authentication (https://docs.docker.com/registry/spec/auth/token/) with nvcr.io, the Mesos URI fetcher receives a 'WWW-Authenticate' header without the 'service' and 'scope' params, and fails here:
https://github.com/apache/mesos/blob/1e9b121273a6d9248a78ab44798bd4c1138c31ee/src/uri/fetchers/docker.cpp#L1083
This is an example of an unsuccessful request made by Mesos:
curl -s -S -L -i --raw --http1.1 -H "Accept: application/vnd.docker.distribution.manifest.v2+json,application/vnd.docker.distribution.manifest.v1+json,application/vnd.docker.distribution.manifest.v1+prettyjws" -y 60 https://nvcr.io/v2/nvidia/tensorflow/manifests/19.08-py3

HTTP/1.1 401 Unauthorized
Content-Type: text/html
Date: Wed, 22 Jan 2020 19:01:57 GMT
Server: nginx/1.14.2
Www-Authenticate: Bearer realm="https://nvcr.io/proxy_auth?scope=repository:nvidia/tensorflow:pull,push"
Content-Length: 195
Connection: keep-alive

<html>
<head><title>401 Authorization Required</title></head>
<body bgcolor="white">
<center><h1>401 Authorization Required</h1></center>
<hr><center>nginx/1.14.2</center>
</body>
</html>
At the same time, Docker is perfectly capable of pulling this image.
Note that the document "Token Authentication Specification" (https://docs.docker.com/registry/spec/auth/token/), on which the Mesos implementation is based, is vague about registries that do not provide 'scope'/'service' in the WWW-Authenticate header.
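The missing parameters can be seen by parsing the challenge from the response above in isolation. The following is only a minimal Python sketch, not the Mesos code: the helper name parse_challenge and the regex-based parsing are assumptions made for illustration, and the regex handles only quoted auth-params.

```python
import re


def parse_challenge(header_value):
    """Split a WWW-Authenticate value into (scheme, params).

    Hypothetical helper for illustration; it only understands
    key="value" pairs, which is enough for the header shown above.
    """
    scheme, _, rest = header_value.strip().partition(" ")
    params = dict(re.findall(r'(\w+)="([^"]*)"', rest))
    return scheme, params


# Header value copied from the 401 response returned by nvcr.io.
header = (
    'Bearer realm="https://nvcr.io/proxy_auth'
    '?scope=repository:nvidia/tensorflow:pull,push"'
)

scheme, params = parse_challenge(header)

# 'scope' appears only inside the realm URL, not as its own parameter,
# so a client that requires 'service' and 'scope' finds neither.
missing = [name for name in ("service", "scope") if name not in params]
print(scheme, sorted(params), missing)
```

The only auth-param present is 'realm'; the scope is embedded in the realm's query string, which is not where a client following the token spec would look for it.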
What Docker does differently (at the very least, in the case of nvcr.io):
It sends the initial request not to the manifest/blob URI, but to the repository root URI (https://nvcr.io/v2/ in this case):
GET /v2/ HTTP/1.1
Host: nvcr.io
User-Agent: docker/18.03.1-ce go/go1.9.5 git-commit/9ee9f402cd kernel/4.15.0-60-generic os/linux arch/amd64 UpstreamClient(Docker-Client/18.09.7 \(linux\))
In response, it receives a 401 with a "realm" that contains no query arguments:
HTTP/1.1 401 Unauthorized
Connection: close
Content-Length: 195
Content-Type: text/html
Date: Wed, 29 Jan 2020 12:22:43 GMT
Server: nginx/1.14.2
Www-Authenticate: Bearer realm="https://nvcr.io/proxy_auth"
Then, it composes the scope using the image ref and a hardcoded "pull" action:
https://github.com/docker/distribution/blob/a8371794149d1d95f1e846744b05c87f2f825e5a/registry/client/auth/session.go#L174
(in full accordance with this spec: https://docs.docker.com/registry/spec/auth/scope/)
and sends the following request to https://nvcr.io/proxy_auth :
GET /proxy_auth?scope=repository%3Anvidia%2Ftensorflow%3Apull HTTP/1.1
Host: nvcr.io
User-Agent: Go-http-client/1.1
(Note that 'push' is absent from the scope.)
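The scope composition described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not a port of the Docker or Mesos code: the function name build_token_url is hypothetical, and replacing any 'scope' already embedded in the realm is a choice made for this example rather than confirmed Docker behavior.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_token_url(realm, repository, actions=("pull",)):
    """Compose a token request URL from the realm and the image repository.

    Hypothetical helper: the scope follows the
    repository:<name>:<actions> form from the scope spec.
    """
    scope = "repository:{}:{}".format(repository, ",".join(actions))
    parts = urlsplit(realm)
    # Assumption of this sketch: drop a scope embedded in the realm
    # and use the one composed by the client instead.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "scope"]
    query.append(("scope", scope))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Realm as returned for GET /v2/ (no query arguments).
url = build_token_url("https://nvcr.io/proxy_auth", "nvidia/tensorflow")
print(url)
```

With the realm from the /v2/ response and the repository nvidia/tensorflow, this yields the same request target as the one Docker sends, with only the 'pull' action in the scope.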