Varnish is, according to varnish-cache, a reverse proxy that acts as an HTTP accelerator: it caches static content on the server side to deliver it faster to Web clients. Installation is straightforward, since it is packaged for Debian and most other Linux distributions. At first, I used it to dispatch HTTP requests arriving on my private server to the virtual servers behind it. Varnish intercepts HTTP connections and forwards them to the appropriate backend. Here is a minimal configuration to make it work:
Redirect client connection to the appropriate backend server
Requests for this-awesome-blog.net are forwarded to the host hostname-of-my-awesome-blog, labeled blog. All other requests go to the default backend, hosted on the server whose hostname is hostname-of-another-awesome-blog.
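As an illustration of the routing just described, a configuration along these lines would do the job. This is a sketch under assumptions, not a definitive setup: it uses VCL 4.0 syntax (Varnish 3 used `set req.backend = ...` where VCL 4.0 uses `req.backend_hint`), and it assumes both backends listen on port 80.

```vcl
vcl 4.0;

# Default backend: receives every request not matched below.
# Port 80 is an assumption; adjust to the real backend port.
backend default {
    .host = "hostname-of-another-awesome-blog";
    .port = "80";
}

# Backend labeled "blog", serving this-awesome-blog.net.
backend blog {
    .host = "hostname-of-my-awesome-blog";
    .port = "80";
}

sub vcl_recv {
    # Choose the backend from the Host header of the request.
    if (req.http.host ~ "^this-awesome-blog\.net$") {
        set req.backend_hint = blog;
    } else {
        set req.backend_hint = default;
    }

    # Overwrite X-Forwarded-For with the address of the connecting client,
    # so that the backend can log it instead of the proxy address.
    set req.http.X-Forwarded-For = client.ip;
}
```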
Please note that I set the X-Forwarded-For HTTP header to the client IP address. More on that here. By default, Apache logs the source IP address of the TCP connection reaching the server, which behind Varnish is the address of the proxy. To avoid logging the proxy's IP address, here are the configuration lines that extract the address from the X-Forwarded-For header instead:
Internals
Here is how to check whether a page is served by Varnish from its cache (a hit) or by the backend server (a miss), and, by the way, how to add an extra HTTP header:
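A minimal sketch of such a check, assuming VCL 4.0. The header name X-Cache is a common convention chosen here for illustration, not something Varnish defines:

```vcl
# Fragment to add to an existing VCL 4.0 file.
sub vcl_deliver {
    # obj.hits counts how many times the object was served from cache;
    # it is 0 when the response was just fetched from the backend.
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
```

Requesting the same cacheable page twice and inspecting the response headers (for instance with `curl -I`) should then show the header change from MISS to HIT.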
This is a way to add any HTTP header. Note that many headers, such as X-Forwarded-For, are not defined in https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html, so HTTP clients and servers are not required to implement them.
Caching
Some responses may be cached (those to GET and HEAD requests, for instance) and others cannot (typically those to POST requests).
Here is the session recorded by Wireshark when I request an already cached page from my blog:
The client asks for the content only if it has been modified since Sun, 29 Dec 2013 01:57:12 GMT and if its ETag does not match the value “12fc-3204-4eea2a5494e00”. Varnish answers on behalf of Apache: the page has not changed since that date and its ETag still matches, hence the HTTP status code 304 (Not Modified). This means the browser can use the copy of the page held in its own cache, and the exchange stops here at the HTTP level. Note that Apache was not contacted: this is a cache hit. Had Varnish needed to query Apache to retrieve the content, it would have been a cache miss.
Here are some cache-related HTTP headers:
- Expires: Date and time after which the cached copy must be considered stale and fetched again. This puts an upper limit on how long content is cached. The cache may still be refreshed earlier, but it will always be refreshed afterwards.
- Cache-Control: Takes precedence over Expires when it carries a max-age directive, and offers more features