If you’ve ever had to deal with Ajax GET requests in Internet Explorer, you will undoubtedly have found out that IE caches responses that have content-type: “application/json”.
As you probably do most of your development using another browser which does not have this behavior (and who can blame you, IE’s developer tools are not great) you might forget about this “feature” of IE.
When the time comes to test in IE and this becomes a problem, the usual knee-jerk reaction is to turn off the cache entirely. I’ve seen people do this with jQuery and Angular, but I bet it’s common with whatever alternative technology you use.
I must confess, I’ve done this myself, more than once. And worse, I’ve added request headers like:
If-Modified-Since: Mon, 26 Jul 1997 05:00:00 GMT
Cache-Control: no-cache
Pragma: no-cache
Without having a clue of what Pragma or Cache-Control are. That’s never a good idea.
I’ve taken the example above directly from the top-voted answer to the Stack Overflow question “Angular IE Caching issue for $http”. It has loads of up-votes, 302 as I write this.
What’s really funny about that answer is that those headers don’t make much sense the way they are (they will work though). But let’s look at each individually.
If-Modified-Since, when used in a GET or HEAD request, makes that request “conditional” (that’s the term used in the documentation). For this header to work, the web server must be configured to handle it. The web server responds to a request with this header with either a 304 Not Modified or a 200 response code. To do this correctly the server must be able to determine whether the response has changed since the given date. For a dynamically generated response the server has no way of knowing that on its own; your application would have to track it. Unless you are serving static files, this won’t work out of the box.
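To make the conditional request concrete, here is a minimal sketch of the decision a server would have to make. The function name `respondToConditionalGet` and the `lastModified` argument are assumptions for illustration: the application has to supply that date itself, which is exactly the part that does not exist for most dynamically generated responses.

```javascript
// Hypothetical helper: decide between 304 and 200 for a conditional GET.
// `requestHeaders` is a plain object with lower-cased header names.
// `lastModified` is a Date that the application must track itself (assumption).
function respondToConditionalGet(requestHeaders, lastModified, body) {
  const headerValue = requestHeaders["if-modified-since"];
  const since = headerValue ? Date.parse(headerValue) : NaN;

  // HTTP dates have one-second resolution, so drop the milliseconds.
  const modifiedAt = Math.floor(lastModified.getTime() / 1000) * 1000;

  // Not modified since the date the client sent: no body needed.
  if (!Number.isNaN(since) && modifiedAt <= since) {
    return { status: 304, headers: {}, body: null };
  }

  // Header missing, unparsable, or the resource changed: send it in full.
  return {
    status: 200,
    headers: { "Last-Modified": new Date(modifiedAt).toUTCString() },
    body,
  };
}
```

With the 1997 date from the headers above, anything modified after that date falls through to the 200 branch, so a server that honors the header would send the full response every time.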
Cache-Control, when set to no-cache, means:
The “no-cache” request directive indicates that a cache MUST NOT use a stored response to satisfy the request without successful validation on the origin server. (emphasis mine)
This is from RFC 7234 (the RFC about caching in HTTP/1.1). How is the validation performed on the origin server then?
One way is for the server to respond to a GET request with a response that contains an ETag header. You can think of that ETag as a hash of the content. If the content changes, the ETag changes.
When the browser requests the same resource again, it includes the previous ETag value in an If-None-Match header. The server can then use this to verify whether the content that the browser has is still up to date (this is what is meant by successful validation on the origin server in the RFC). If the content is still up to date, the server responds with a 304 Not Modified and the browser uses the cached version of the resource.
None of this happens automatically, especially for a response that is dynamically generated.
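As a rough illustration of what you would have to write yourself, here is a minimal sketch of ETag validation. The helper names `computeETag` and `respondWithETag` are made up for this example, and hashing the whole body with SHA-1 is just one possible way to derive a tag. The comparison is deliberately simplified: it ignores weak validators, lists of tags, and the `*` wildcard.

```javascript
const crypto = require("crypto");

// Hypothetical helper: derive a validator from the response body.
function computeETag(body) {
  const hash = crypto.createHash("sha1").update(body).digest("hex");
  return `"${hash}"`; // entity tags are quoted strings
}

// `requestHeaders` is a plain object with lower-cased header names.
function respondWithETag(requestHeaders, body) {
  const etag = computeETag(body);

  // The browser sends the tag it has cached in If-None-Match.
  if (requestHeaders["if-none-match"] === etag) {
    // Successful validation: the cached copy is still good.
    return { status: 304, headers: { ETag: etag }, body: null };
  }

  // First request, or the content changed: send everything again.
  return { status: 200, headers: { ETag: etag }, body };
}
```

Note that in this sketch the server still has to generate the full body just to compute the hash, so it saves bandwidth but not server-side work.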
Finally, the Pragma: no-cache header is from HTTP/1.0. It was retained in HTTP/1.1 for backwards compatibility. It instructs proxies not to cache the request.
So why does adding these headers work? At some point in time IE and Firefox decided that Cache-Control: no-cache should just mean do not cache, and that was it. That header now behaves the way someone who did not read the full RFC would expect it to behave. In fact, it behaves like Cache-Control: no-store, which means “a cache MUST NOT store any part of either this request or any response to it”.
If I were to do this today I’d use only that header and it should be fine:
Cache-Control: no-store
Well, I’d use it only in the requests that I really wouldn’t want cached. What I wouldn’t do is set it globally, for example in Angular by using the $httpProvider to have the header added to every GET request automatically (like in the Stack Overflow answer mentioned at the beginning of the post), or by doing something like $.ajaxSetup({ cache: false }) in jQuery.
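One way to keep this per-request is a small helper that adds the header only where you ask for it. This is just a sketch: `withNoStore` is a made-up name, and the `/api/orders` URL in the comments is a placeholder. It relies on the fact that both Angular’s $http and jQuery’s $.ajax accept a `headers` object in their request configuration.

```javascript
// Hypothetical helper: return a copy of a request config with
// Cache-Control: no-store added, leaving the caller's object untouched.
function withNoStore(config = {}) {
  return {
    ...config,
    headers: { ...(config.headers || {}), "Cache-Control": "no-store" },
  };
}

// Possible usage (placeholder URL, not run here):
//   $http.get("/api/orders", withNoStore());          // Angular
//   $.ajax(withNoStore({ url: "/api/orders" }));      // jQuery
```

Requests made without the helper, such as the ones that fetch templates, keep the browser’s normal caching behavior.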
The reason is that you would also prevent caching in situations where you do want your responses to be cached. A very common example is your Angular templates. They are just static .html files, and if you disable caching globally you will make the browser fetch the templates every time.
This is just an example of why it’s not a good idea to use stuff off the internet without understanding it. In this particular case, though, it’s easy to understand why it would happen: the RFC about caching is almost 50 pages long and extremely hard to digest.