Instead of using checksums to trigger fetcher cache updates, we can for starters use the content modification time (mtime), which is available for a number of download protocols, e.g. HTTP and HDFS.
Proposal: Instead of just fetching the content size, we fetch both size and mtime together. As before, if there is no size, then caching fails and we fall back on direct downloading to the sandbox.
Assuming a size is given, we compare the mtime from the fetch URI with the mtime known to the cache. If it differs, we update the cache. (As a defensive measure, a difference in size should also trigger an update.)
Not having an mtime available at the fetch URI is simply treated as a unique valid mtime value that differs from all others. This means that when initially there is no mtime, cache content remains valid until there is one. Thereafter, anew lack of an mtime invalidates the cache once. In other words: any change from no mtime to having one or back is the same as encountering a new mtime.
Note that this scheme does not require any new protobuf fields.