Details
- Bug
- Status: Resolved
- Minor
- Resolution: Fixed
- None
- Tested with the org.apache.httpcomponents.httpclient 4.3.6 OSGi bundle distributed by Eclipse Orbit: <http://download.eclipse.org/tools/orbit/downloads/drops/R20160520211859/>
Description
While implementing my own HttpCacheStorage I noticed the following problematic cache revalidation behavior. FYI, this behavior also occurs with BasicHttpCacheStorage (created through CachingHttpClients.createMemoryBound()), so it is not caused by my HttpCacheStorage implementation. Consider this sequence of requests and responses:
- GET /something HTTP/1.1
- Accept: application/json
- HTTP/1.1 404 Not Found
- Cache-Control: max-age=60
This response is cached under the key /something. After 60 seconds, another GET request is performed and sent over the network, as the cached 404 response is stale.
- GET /something HTTP/1.1
- Accept: application/json
- HTTP/1.1 200 OK
- Vary: Accept
- Cache-Control: max-age=120
This response is cached under the key {Accept:application/json}/something and key /something’s variantMap is updated to refer to this key. After another 60 seconds, a third GET request is performed which again performs network I/O – even though it IMHO should not.
- GET /something HTTP/1.1
- Accept: application/json
- HTTP/1.1 200 OK
- Vary: Accept
- Cache-Control: max-age=120
This re-validation occurs because a stale 404 response for /something was cached – although its variantMap contains a fresh, selectable 200 response.
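To make the two cache keys mentioned above concrete, here is a minimal sketch of how a variant key in the notation of this report could be derived from the request URI, the Vary header and the request headers. The class VariantKeySketch and the method variantKey are hypothetical names for illustration only; this is not HttpClient's actual key generator, whose real key format may differ.

```java
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

public class VariantKeySketch {

    // Hypothetical helper, not part of HttpClient: builds a key in the
    // "{Header:value}/uri" notation used in this report.
    static String variantKey(String uri, String vary, Map<String, String> requestHeaders) {
        // Header names are case-insensitive, so look them up that way.
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        headers.putAll(requestHeaders);

        // Sort the selecting header names so the key does not depend on their order in Vary.
        Map<String, String> selecting = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String name : vary.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                selecting.put(trimmed, headers.getOrDefault(trimmed, ""));
            }
        }
        if (selecting.isEmpty()) {
            return uri; // no Vary header: the entry lives under the plain URI key
        }
        StringJoiner joined = new StringJoiner("&", "{", "}");
        for (Map.Entry<String, String> entry : selecting.entrySet()) {
            joined.add(entry.getKey() + ":" + entry.getValue());
        }
        return joined + uri;
    }

    public static void main(String[] args) {
        Map<String, String> request = Map.of("Accept", "application/json");
        System.out.println(variantKey("/something", "", request));       // key of the 404 entry
        System.out.println(variantKey("/something", "Accept", request)); // key of the 200 variant
    }
}
```

In this model the 404 sits under /something and the 200 under {Accept:application/json}/something, which is why a lookup that inspects only the root entry's freshness never gets as far as the fresh variant.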
FWIW, RFC 7234 has this to say about the subject:
The stored response with matching selecting header fields is known as
the selected response.
If multiple selected responses are available (potentially including
responses without a Vary header field), the cache will need to choose
one to use. When a selecting header field has a known mechanism for
doing so (e.g., qvalues on Accept and similar request header fields),
that mechanism MAY be used to select preferred responses; of the
remainder, the most recent response (as determined by the Date header
field) is used, as per Section 4.
According to this, the 200 response should have been selected, as its Date is newer than the 404 response's. Instead, another request for /something is sent to the server, even though the most recent cache entry is still fresh.
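The expected behavior can be sketched as follows: among the stored responses that match the request, pick the one with the most recent Date, and only then check its freshness. This is a simplified illustration under stated assumptions, not HttpClient's implementation. StoredResponse, select and isFresh are hypothetical names, the timestamps are invented, and the age is approximated as now minus Date (ignoring the Age header and response delay).

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class SelectedResponseSketch {

    // Hypothetical model of a cache entry: status, Date header, max-age.
    static final class StoredResponse {
        final int status;
        final Instant date;
        final Duration maxAge;

        StoredResponse(int status, Instant date, Duration maxAge) {
            this.status = status;
            this.date = date;
            this.maxAge = maxAge;
        }

        // Simplified freshness check: the age is approximated as now - Date.
        boolean isFresh(Instant now) {
            return Duration.between(date, now).compareTo(maxAge) < 0;
        }
    }

    // Picks the most recent matching response, as determined by its Date.
    static Optional<StoredResponse> select(List<StoredResponse> matching) {
        return matching.stream().max(Comparator.comparing((StoredResponse r) -> r.date));
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2016-01-01T00:00:00Z"); // invented start time
        StoredResponse notFound = new StoredResponse(404, t0, Duration.ofSeconds(60));
        StoredResponse ok = new StoredResponse(200, t0.plusSeconds(60), Duration.ofSeconds(120));

        Instant thirdRequest = t0.plusSeconds(120);
        StoredResponse selected = select(List.of(notFound, ok)).get();
        System.out.println(selected.status + " fresh=" + selected.isFresh(thirdRequest));
    }
}
```

With the timeline from this report, the third request arrives 120 seconds after the 404 was stored and 60 seconds after the 200 was stored, so the selected 200 response is still within its max-age of 120 and could be served without network I/O.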