
Browser and Proxy Cache Control

I've been investigating how TWiki pages can be made more cacheable, by browsers and by proxy caches. My main interest has been fixing the BackFromPreviewLosesText problem (now done!), but better control of caching is also useful for incremental downloads of TWiki to laptops or PDAs (ReadOnlyOfflineWiki and ViewTWikiOnPDA), and for improved performance generally.

Some people have also had problems with proxy caches serving out-of-date TWiki information (see MetaExpires and the brief mention in RenderOnceReadMostly, JohnTalintyre comment). Explicit control of caching by TWiki code is very likely to help here, since proxy caches typically fall back on heuristics only in the absence of explicit cache control HTTP headers (e.g. Expires and Cache-Control; Last-Modified is the usual input to those heuristics, and also lets a cache revalidate its copy). From reading MetaExpires, it seems that MichaelSparks has already done some of the changes in BackFromPreviewLosesText, so it would be good to see that code as well. Since sourceforge.net apparently has a server-side proxy cache (according to a comment on TWiki.org), this may affect all users of TWiki.org, even if they don't use a local proxy cache.
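As a rough illustration of why explicit headers matter, here is a minimal Python sketch of how a cache might decide how long a response stays fresh. It is not TWiki code and not any particular proxy's algorithm: the function name `freshness_lifetime` is hypothetical, the 10% figure is only the commonly suggested heuristic fraction, and directives such as no-cache and no-store are deliberately ignored.

```python
import re
from datetime import timedelta
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers: dict) -> timedelta:
    """Estimate how long a cache may serve a response without rechecking it.

    Simplified sketch: explicit headers win, heuristics are the fallback.
    """
    # 1. An explicit Cache-Control: max-age takes priority.
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    if match:
        return timedelta(seconds=int(match.group(1)))

    date = parsedate_to_datetime(headers["Date"])

    # 2. Otherwise use an explicit Expires header, measured from Date.
    if "Expires" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        return max(expires - date, timedelta(0))

    # 3. No explicit information: guess from the age of the page
    #    (assumed here: 10% of the time since Last-Modified).
    if "Last-Modified" in headers:
        last_modified = parsedate_to_datetime(headers["Last-Modified"])
        return max((date - last_modified) / 10, timedelta(0))

    # 4. Nothing to go on: treat the response as immediately stale.
    return timedelta(0)
```

With only Date and Last-Modified present, a page last edited ten days ago would be served from cache for about a day under this assumed heuristic, which is the kind of staleness that explicit headers avoid.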

Also, because TWiki scripts don't set a Content-Length header, the server can only mark the end of a page by closing the connection (unless it uses HTTP/1.1 chunked transfer encoding). HTTP/1.1 browsers and proxies then have to create new TCP/IP connections for the images on the page (which can be significant if there are lots of attachments), and can't re-use the same TCP/IP connection for other pages. Enabling such 'persistent connections' can make websites seem much faster, by avoiding connection setup overhead and allowing the single connection to build up its allowed data rate (the TCP congestion window) over several HTTP requests, rather than starting with a low data rate each time.
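A minimal sketch of the Content-Length idea, in Python rather than TWiki's Perl: the script buffers the whole page, measures it in bytes, and only then writes the headers. The function name `build_cgi_response` and the default content type are assumptions made for illustration.

```python
def build_cgi_response(body: str,
                       content_type: str = "text/html; charset=utf-8") -> bytes:
    """Return CGI output (headers, blank line, body) with a Content-Length."""
    # Encode first: Content-Length counts bytes, not characters.
    payload = body.encode("utf-8")
    header_lines = [
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",
    ]
    head = "\r\n".join(header_lines) + "\r\n\r\n"
    return head.encode("ascii") + payload
```

A CGI script would write the result with something like `sys.stdout.buffer.write(build_cgi_response(page))`. The cost is that the whole page has to be held in memory before anything is sent.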

Some useful resources about how caching works and how CGI scripts can make their output cacheable:

Some relevant TWiki pages:

What do people think about all this? Since it can prevent loss of edits, improve performance, speed downloading and improve the timeliness of cached pages, it seems like a useful area to investigate.

-- RichardDonkin - 20 Jan 2002

See http://www.oreilly.com/catalog/webcaching/chapter/ch05.html for an interesting chapter about interception caching (usually known as transparent caching) - you may be using a proxy cache without knowing it, either near the client or near the web server, in which case cache-control becomes an issue.

The chapter also mentions that, when you hit Refresh on some versions of InternetExplorer, the browser doesn't send a Cache-Control: no-cache header if it thinks it is talking to the origin web server. The result is that the transparent proxy cache serves an out-of-date (stale) copy of the page, while IE thinks it has a fresh copy directly from the origin web server (i.e. the TWiki server) - see MSKB:Q266121, which affects IE5.0 pre-SP2 and IE5.5 pre-SP1. This could explain problems such as ViewAfterSaveCachesOldPage - I've had this on TWiki.org, which could be due to SourceForge's apparent use of server-side proxy caches.
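The effect of the missing header can be sketched in a few lines of Python. This is a toy model of a cache's decision, not the behaviour of any specific proxy: `has_no_cache_directive` and `serve` are hypothetical names, and a real cache also considers freshness, max-age=0 and validators.

```python
def has_no_cache_directive(request_headers: dict) -> bool:
    """True if the client explicitly asked caches not to answer from storage."""
    cache_control = request_headers.get("Cache-Control", "")
    directives = {d.strip().lower() for d in cache_control.split(",")}
    # Pragma: no-cache is the older HTTP/1.0 way of asking for the same thing.
    pragma = request_headers.get("Pragma", "").lower()
    return "no-cache" in directives or "no-cache" in pragma


def serve(request_headers: dict, cached_page, fetch_from_origin):
    """Toy proxy: answer from cache unless the client demands a fresh copy."""
    if cached_page is None or has_no_cache_directive(request_headers):
        return fetch_from_origin()
    return cached_page
```

In this model a Refresh that arrives without the directive gets the cached copy back, even though the user expected the page to be re-fetched from the TWiki server.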

There's also an intro to why web caching is useful at http://linux.oreillynet.com/pub/a/linux/2002/02/28/cachefriendly.html - promoting a new O'Reilly book on Web Caching.

Auto-refreshing browser sidebars (e.g. OperaSidebar) and other types of InstantNotification make caching quite important for managing the load on TWiki servers.

-- RichardDonkin - 18 Mar 2002

There is now an improved writeHeaderHandler in the Plugin API, which makes it possible for plugins to experiment with cache control headers without stomping on the fix for BackFromPreviewLosesText. The latter is now implemented in TWikiAlphaRelease and on TWiki.org, along with RefreshEditPage, meaning that you should never lose edits due to caching problems.

Some outstanding bugs/issues that appear to be cache-related:

-- RichardDonkin - 20 Apr 2002

The HTTP_EQUIV_ON_VIEW and related settings are, generally speaking, only used by browsers - most web servers don't parse the HTML document before transmitting it (e.g. Apache does not), and most proxy caches don't analyse the HTML document either (e.g. AOL's proxy caches). So it's best to avoid using these settings for cache control purposes, and to instead experiment with plugins for more sophisticated cache control. If this works well, it could be put into the TWiki core code.
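For comparison, real cache control headers travel in the HTTP header block, where servers and proxy caches can see them without parsing any HTML. The following Python sketch shows the kind of header values a plugin might generate; it is not the TWiki Plugin API, and the function name `cache_control_headers` and its parameters are assumptions for illustration.

```python
from email.utils import formatdate


def cache_control_headers(last_modified_epoch: float,
                          max_age: int,
                          now_epoch: float) -> list:
    """Build (name, value) pairs for explicit HTTP cache control.

    Times are seconds since the Unix epoch; max_age is in seconds.
    """
    return [
        # When the topic was last saved, so caches can revalidate their copy.
        ("Last-Modified", formatdate(last_modified_epoch, usegmt=True)),
        # Absolute expiry time, understood by HTTP/1.0 caches.
        ("Expires", formatdate(now_epoch + max_age, usegmt=True)),
        # Relative lifetime, which HTTP/1.1 caches prefer over Expires.
        ("Cache-Control", f"max-age={max_age}"),
    ]
```

A script would print each pair as `Name: value` before the blank line that ends the headers. Sending both Expires and Cache-Control is a common way to cover older and newer caches at once.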

These settings also make it easy to break things by changing cache settings - see BugInHttpEquiv for details.

-- RichardDonkin - 02 Jul 2003

CacheControlHeaders includes a recent implementation and discussion of HTTP cache control headers, and pointers to server-side caching work.

-- RichardDonkin - 02 Jan 2006

Topic revision: r13 - 2008-09-02 - TWikiJanitor
 
Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.