We had a bunch of machines all hitting the same URLs on www.example.com, so we set up a Squid reverse proxy at ourcache.example.com, so that a request for http://ourcache.example.com/foo/bar would be served (and cached) from http://www.example.com/foo/bar.
sudo apt-get install squid squid-cgi # squid-cgi enables the cache manager web interface
Edit /etc/squid/squid.conf. It's very well documented, and we only had to modify a few lines from the default Ubuntu Hardy config:
Allow other local machines to use our cache:
acl our_networks src 10.42.42.0/24
http_access allow our_networks
instead of the default of:
http_access allow localhost
Have the cache listen on port 80 and forward all requests to www.example.com:
http_port 80 defaultsite=www.example.com
cache_peer www.example.com parent 80 0 no-query originserver
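Putting those pieces together, the reverse-proxy portion of our squid.conf looked roughly like this (a sketch of the Squid 2.x syntax shipped with Hardy; note that newer Squid 3.x releases want an explicit accel flag on the http_port line, e.g. "http_port 80 accel defaultsite=www.example.com"):

```
# /etc/squid/squid.conf (excerpt) -- Squid 2.x syntax
# allow the local network to use the cache
acl our_networks src 10.42.42.0/24
http_access allow our_networks

# listen on port 80 and fetch misses from the origin server
http_port 80 defaultsite=www.example.com
cache_peer www.example.com parent 80 0 no-query originserver
```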
In our case, we wanted to cache pages whose URLs included query strings ("GET parameters"), such as http://www.example.com/search?query=foo (which is something you should only do in special cases, since such responses are often dynamic):
We also enabled logging of the full URL so we could see what was going on (though note that logging query strings is a potential privacy risk to your users).
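In squid.conf that means turning off query-term stripping in the access log (an assumption on the exact directive we used; strip_query_terms is the relevant option in Squid 2.6+, and it defaults to on):

```
# log full URLs including ?query=... in access.log (privacy risk)
strip_query_terms off
```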
Comment out the lines that exclude cgi-bin and GET parameter URLs from being cached:
#acl QUERY urlpath_regex cgi-bin \?
#cache deny QUERY
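On newer Squid packages the same exclusion may appear as a refresh_pattern line instead of the QUERY acl, and you would comment out or relax that line for the same effect (check your distribution's default squid.conf; this is how it looks in later defaults):

```
# newer default configs exclude dynamic content like this instead:
#refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
```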
Then we went to http://localhost/cgi-bin/cachemgr.cgi (blank login and password by default) to see how well our cache was working.
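If you prefer the command line, the same cache-manager statistics are available via squidclient, which ships alongside the squid packages (run on the cache machine itself):

```
# fetch the cache-manager summary from the local squid
squidclient mgr:info
# hit/miss ratios appear under the "Cache information" section
```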
After doing an "/etc/init.d/squid restart", we found that we could hit http://ourcache.example.com/foo/bar.html and get http://www.example.com/foo/bar.html, as expected.
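A quick command-line way to confirm that caching is actually happening is to request the same URL twice and watch Squid's X-Cache response header (a sketch: ourcache.example.com stands in for your real cache host, and the exact headers emitted depend on your configuration):

```
# the first request should report a MISS, the second a HIT
# (assuming the response is cacheable)
curl -s -D - -o /dev/null http://ourcache.example.com/foo/bar.html | grep -i '^X-Cache'
curl -s -D - -o /dev/null http://ourcache.example.com/foo/bar.html | grep -i '^X-Cache'
```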