this is an informational document. although technical in
nature, it attempts to make the concepts involved understandable and applicable
in real-world situations. because of this, some aspects of the material are
simplified or omitted, for the sake of clarity. if you are interested in the
minutiae of the subject, please explore the references at the end.
a web cache sits between one or more web servers (also
known as origin servers) and a client or many clients, and watches
requests come by, saving copies of the responses (such as html pages, images and
files, collectively known as representations) for itself. then, if
there is another request for the same url, it can use the response that it has,
instead of asking the origin server for it again.
there are two main reasons that web caches are used:
to reduce latency — because the request is
satisfied from the cache (which is closer to the client) instead of the origin
server, it takes less time for it to get the representation and display it.
this makes the web seem more responsive.
to reduce network traffic — because
representations are reused, it reduces the amount of bandwidth used by a
client. this saves money if the client is paying for traffic, and keeps their
bandwidth requirements lower and more manageable.
if you examine the preferences dialog of any modern web browser (like
internet explorer, safari or mozilla), you’ll probably notice a “cache” setting.
this lets you set aside a section of your computer’s hard disk to store
representations that you’ve seen, just for you. the browser cache works
according to fairly simple rules. it will check to make sure that the
representations are fresh, usually once a session (that is, once in the
current invocation of the browser).
this cache is especially useful when users hit the “back” button or click a
link to see a page they’ve just looked at. also, if you use the same navigation
images throughout your site, they’ll be served from browsers’ caches almost
instantaneously.
web proxy caches work on the same principle, but on a much larger scale. proxies
serve hundreds or thousands of users in the same way; large corporations and
isps often set them up on their firewalls, or as standalone devices (also known
as intermediaries).
because proxy caches aren’t part of the client or the origin server, but
instead are out on the network, requests have to be routed to them somehow. one
way to do this is to use your browser’s proxy setting to manually tell it what
proxy to use; another is using interception. interception
proxies have web requests redirected to them by the underlying network
itself, so that clients don’t need to be configured for them, or even know about
them.
proxy caches are a type of shared cache; rather than just
having one person using them, they usually have a large number of users, and
because of this they are very good at reducing latency and network traffic.
that’s because popular representations are reused a number of times.
also known as “reverse proxy caches” or “surrogate caches,” gateway caches
are also intermediaries, but instead of being deployed by network administrators
to save bandwidth, they’re typically deployed by webmasters themselves, to make
their sites more scalable, reliable and better performing.
requests can be routed to gateway caches by a number of methods, but
typically some form of load balancer is used to make one or more of them look
like the origin server to clients.
content delivery networks (cdns) distribute gateway caches
throughout the internet (or a part of it) and sell caching to interested web
sites.
this tutorial focuses mostly on browser and proxy caches, although some of
the information is suitable for those interested in gateway caches as well.
web caching is one of the most misunderstood technologies on the internet.
webmasters in particular fear losing control of their site, because a proxy
cache can “hide” their users from them, making it difficult to see who’s using
the site.
unfortunately for them, even if web caches didn’t exist, there are too many
variables on the internet to assure that they’ll be able to get an accurate
picture of how users see their site. if this is a big concern for you, this
tutorial will teach you how to get the statistics you need without making your
site cache-unfriendly.
another concern is that caches can serve content that is out of date,
or stale. however, this tutorial can show you how to configure
your server to control how your content is cached.
cdns are an interesting
development, because unlike many proxy caches, their gateway caches are aligned
with the interests of the web site being cached, so that these problems aren’t
seen. however, even when you use a cdn, you still have to consider that there
will be proxy and browser caches downstream.
on the other hand, if you plan your site well, caches can help your web site
load faster, and save load on your server and internet link. the difference can
be dramatic; a site that is difficult to cache may take several seconds to load,
while one that takes advantage of caching can seem instantaneous in comparison.
users will appreciate a fast-loading site, and will visit more often.
think of it this way; many large internet companies are spending millions of
dollars setting up farms of servers around the world to replicate their content,
in order to make it as fast to access as possible for their users. caches do the
same for you, and they’re even closer to the end user. best of all, you don’t
have to pay for them.
the fact is that proxy and browser caches will be used whether you like it or
not. if you don’t configure your site to be cached correctly, it will be cached
using whatever defaults the cache’s administrator decides upon.
all caches have a set of rules that they use to determine when to serve a
representation from the cache, if it’s available. some of these rules are set in
the protocols (http 1.0 and 1.1), and some are set by the administrator of the
cache (either the user of the browser cache, or the proxy administrator).
generally speaking, these are the most common rules that are followed (don’t
worry if you don’t understand the details; they are explained below):
if the response’s headers tell the cache not to keep it, it won’t.
if the request is authenticated or secure (i.e., https), it won’t be
cached by shared caches.
a cached representation is considered fresh (that is,
able to be sent to a client without checking with the origin server) if:
it has an expiry time or other age-controlling header set, and is still
within the fresh period, or
if the cache has seen the representation recently, and it was modified
relatively long ago.
fresh representations are served directly from
the cache, without checking with the origin server.
if a representation is stale, the origin server will be asked
to validate it, or tell the cache whether the copy that it
has is still good.
under certain circumstances — for example, when it’s disconnected from a
network — a cache can serve stale responses without checking with the origin
server.
if no validator
(an <code>etag</code> or <code>last-modified</code> header)
is present on a response, and it doesn’t have any explicit
freshness information, it will usually (but not always) be considered
uncacheable.
together, freshness and validation are
the most important ways that a cache works with content. a fresh representation
will be available instantly from the cache, while a validated representation
will avoid sending the entire representation over again if it hasn’t
changed.
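the rules above can be sketched in python as a small decision function. this is only an illustration of the order in which a cache might apply them; the function names, the dictionary keys and the 10% heuristic fraction are assumptions made for the example, not something the http protocols require.

```python
from datetime import datetime, timedelta, timezone

# Assumption: treat a response with no explicit freshness information as fresh
# for a fraction of the time that had passed since it was last modified.
HEURISTIC_FRACTION = 0.1


def is_fresh(now, stored_at, max_age=None, expires=None, last_modified=None):
    """Return True if a stored response may be served without asking the origin."""
    age = now - stored_at
    if max_age is not None:            # explicit: Cache-Control: max-age
        return age < timedelta(seconds=max_age)
    if expires is not None:            # explicit: Expires
        return now < expires
    if last_modified is not None:      # heuristic: seen recently, modified long ago
        return age < (stored_at - last_modified) * HEURISTIC_FRACTION
    return False                       # nothing to go on


def next_step(now, entry):
    """entry is a dict describing a stored response; returns what the cache does."""
    if is_fresh(now, entry["stored_at"], entry.get("max_age"),
                entry.get("expires"), entry.get("last_modified")):
        return "serve from cache"
    if entry.get("etag") or entry.get("last_modified"):
        return "validate with origin"  # stale, but a validator is available
    return "fetch from origin"
```

a fresh entry never touches the network; a stale entry with a validator costs one small round trip; everything else is a full download.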
there are several tools that web designers and webmasters can use to
fine-tune how caches will treat their sites. it may require getting your hands a
little dirty with your server’s configuration, but the results are worth it. for
details on how to use these tools with your server, see the sections
below.
html authors can put tags in a document’s <head> section that describe
its attributes. these meta tags are often used in the belief
that they can mark a document as uncacheable, or expire it at a certain
time.
meta tags are easy to use, but aren’t very effective. that’s because they’re
only honored by a few browser caches, not proxy caches (which almost never read
the html in the document). while it may be tempting to put a pragma: no-cache
meta tag into a web page, it won’t necessarily cause it to be kept fresh.
if your site is hosted at an isp or hosting farm and
they don’t give you the ability to set arbitrary http headers
(like <code>expires</code> and <code>cache-control</code>), complain
loudly; these are tools necessary for doing your job.
on the other hand, true http headers give you a lot of
control over how both browser caches and proxies handle your representations.
they can’t be seen in the html, and are usually automatically generated by the
web server. however, you can control them to some degree, depending on the
server you use. in the following sections, you’ll see what http headers are
interesting, and how to apply them to your site.
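http headers are sent by the server before the html, and only seen by the
browser and any intermediate caches. typical http 1.1 response headers consist
of a status line followed by fields such
as <code>date</code>, <code>cache-control</code>, <code>expires</code>, <code>last-modified</code>, <code>etag</code>, <code>content-length</code> and <code>content-type</code>.
the html follows these headers, separated by a blank line. see
the implementation sections below
for information about how to set http headers.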
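as an illustration, the following python sketch assembles a header block of the kind described above. the function name and the sample values are made up for the example; a real server generates these headers itself.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def example_response_headers(now, last_modified, max_age, body):
    """Build an illustrative HTTP/1.1 response header block for a bytes body."""
    expires = now + timedelta(seconds=max_age)
    lines = [
        "HTTP/1.1 200 OK",
        "Date: " + format_datetime(now, usegmt=True),
        f"Cache-Control: max-age={max_age}, must-revalidate",
        "Expires: " + format_datetime(expires, usegmt=True),
        "Last-Modified: " + format_datetime(last_modified, usegmt=True),
        f"Content-Length: {len(body)}",
        "Content-Type: text/html",
    ]
    # Headers end with a blank line; the body would follow it.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    now = datetime(1998, 10, 30, 13, 19, 41, tzinfo=timezone.utc)
    modified = datetime(1998, 6, 29, 2, 28, 12, tzinfo=timezone.utc)
    print(example_response_headers(now, modified, 3600, b"<html></html>"))
```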
many people believe that assigning a <code>pragma: no-cache</code> http header to a representation will make it uncacheable.
this is not necessarily true; the http specification does not set any guidelines
for pragma response headers; instead, pragma request headers (the headers that a
browser sends to a server) are discussed. although a few caches may honor this
header, the majority won’t, and it won’t have any effect. use the headers below
instead.
the <code>expires</code> http header is a basic means of
controlling caches; it tells all caches how long the associated representation
is fresh for. after that time, caches will always check back with the origin
server to see if a document is changed. <code>expires</code> headers
are supported by practically every cache.
most web servers allow you to set <code>expires</code> response
headers in a number of ways. commonly, they will allow setting an absolute time
to expire, a time based on the last time that the client retrieved the
representation (last access time), or a time based on the last
time the document changed on your server (last modification
time).
<code>expires</code> headers are especially good for making static
images (like navigation bars and buttons) cacheable. because they don’t change
much, you can set an extremely long expiry time on them, making your site appear
much more responsive to your users. they’re also useful for controlling caching
of a page that is regularly changed. for instance, if you update a news page
once a day at 6am, you can set the representation to expire at that time, so
caches will know when to get a fresh copy, without users having to hit
‘reload’.
the only value valid in
an <code>expires</code> header is an http date; anything else will most
likely be interpreted as ‘in the past’, so that the representation is
uncacheable. also, remember that the time in an http date is greenwich mean time
(gmt), not local time.
for example: <code>Expires: Fri, 30 Oct 1998 14:19:41 GMT</code>
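a header in this format can be produced with python’s standard library. the sketch below also shows one way a cache might treat a value that is not a valid http date as already expired; both function names are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def expires_header(when):
    """Format a timezone-aware datetime as an Expires header line in GMT."""
    return "Expires: " + format_datetime(when.astimezone(timezone.utc), usegmt=True)


def is_expired(value, now):
    """Treat anything that is not a parseable HTTP date as 'in the past'."""
    try:
        expires = parsedate_to_datetime(value)
    except (TypeError, ValueError):   # the exception type depends on the Python version
        return True
    if expires.tzinfo is None:        # no usable time zone: do not trust it
        return True
    return expires <= now
```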
it’s important to make sure that your web server’s
clock is accurate if you use the <code>expires</code> header. one way to do
this is using the network time protocol (ntp); talk to your local system administrator to find
out more.
although the <code>expires</code> header is useful, it has some
limitations. first, because there’s a date involved, the clocks on the web
server and the cache must be synchronised; if they have a different idea of the
time, the intended results won’t be achieved, and caches might wrongly consider
stale content as fresh.
another problem with <code>expires</code> is that it’s easy to
forget that you’ve set some content to expire at a particular time. if you don’t
update an <code>expires</code> time before it passes, each and every
request will go back to your web server, increasing load and latency.
http 1.1 introduced a new class of
headers, <code>cache-control</code> response headers, to give web
publishers more control over their content, and to address the limitations
of <code>expires</code>.
useful <code>cache-control</code> response headers include:
<code>max-age=</code>[seconds] — specifies the maximum
amount of time that a representation will be considered fresh. similar
to <code>expires</code>, this directive is relative to the time of the
request, rather than absolute. [seconds] is the number of seconds from the
time of the request you wish the representation to be fresh for.
<code>s-maxage=</code>[seconds] — similar
to <code>max-age</code>, except that it only applies to shared (e.g.,
proxy) caches.
<code>public</code> — marks authenticated responses
as cacheable; normally, if http authentication is required, responses are
automatically private.
<code>private</code> — allows caches that are
specific to one user (e.g., in a browser) to store the response; shared caches
(e.g., in a proxy) may not.
<code>no-cache</code> — forces caches to submit the
request to the origin server for validation before releasing a cached copy,
every time. this is useful to assure that authentication is respected (in
combination with public), or to maintain rigid freshness, without sacrificing
all of the benefits of caching.
<code>no-store</code> — instructs caches not to keep
a copy of the representation under any conditions.
<code>must-revalidate</code> — tells caches that
they must obey any freshness information you give them about a representation.
http allows caches to serve stale representations under special conditions; by
specifying this header, you’re telling the cache that you want it to strictly
follow your rules.
<code>proxy-revalidate</code> — similar
to <code>must-revalidate</code>, except that it only applies to proxy
caches.
when
both <code>cache-control</code> and <code>expires</code> are
present, <code>cache-control</code> takes precedence. if you plan to
use the <code>cache-control</code> headers, you should have a look at
the excellent documentation in http 1.1; see the references at the end.
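to make the interaction between these directives concrete, here is a python sketch of how a cache might read a <code>cache-control</code> value. the function names are hypothetical, and the parser is deliberately simplified: it splits on commas and so does not handle quoted arguments that themselves contain commas.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name.strip():
            directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def may_store(cache_control, shared):
    """False if the directives forbid this kind of cache from keeping a copy."""
    cc = parse_cache_control(cache_control)
    if "no-store" in cc:
        return False
    return not (shared and "private" in cc)


def freshness_seconds(cache_control, shared, expires_in=None):
    """Seconds the response stays fresh; expires_in is the time left per Expires."""
    cc = parse_cache_control(cache_control)
    if "no-cache" in cc:
        return 0                        # must be validated on every use
    if shared and cc.get("s-maxage"):
        return int(cc["s-maxage"])      # shared caches prefer s-maxage
    if cc.get("max-age"):
        return int(cc["max-age"])       # Cache-Control wins over Expires
    return expires_in
```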
earlier, we said that validation is used by servers and caches to communicate
when a representation has changed. by using it, caches avoid having to download
the entire representation when they already have a copy locally, but they’re not
sure if it’s still fresh.
validators are very important; if one isn’t present, and there isn’t any
freshness information
(<code>expires</code> or <code>cache-control</code>) available, caches
will not store a representation at all.
the most common validator is the time that the document last changed, as
communicated in the <code>last-modified</code> header. when a cache has a
representation stored that includes
a <code>last-modified</code> header, it can use it to ask the server
if the representation has changed since the last time it was seen, with
an <code>if-modified-since</code> request.
http 1.1 introduced a new kind of validator called the etag.
etags are unique identifiers that are generated by the server and changed every
time the representation does. because the server controls how the etag is
generated, caches can be sure that if the etag matches when they make
an <code>if-none-match</code> request, the representation really is the
same.
almost all caches use last-modified times as validators; etag validation is
also becoming prevalent.
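servers are free to build etags however they like. the python sketch below shows one possible scheme, a truncated hash of the content, together with the comparison a server might make against an <code>if-none-match</code> request header. both function names are hypothetical, and splitting on commas is a simplification.

```python
import hashlib


def make_etag(body):
    """Derive an ETag from the response bytes; it changes whenever the bytes do."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _strip_weak(tag):
    """Drop a leading W/ so weak and strong forms of a tag compare equal."""
    return tag[2:] if tag.startswith("W/") else tag


def etag_matches(if_none_match, etag):
    """True if the If-None-Match value lists the given ETag, or is '*'."""
    candidates = [c.strip() for c in if_none_match.split(",")]
    if "*" in candidates:
        return True
    return _strip_weak(etag) in {_strip_weak(c) for c in candidates}
```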
most modern web servers will generate
both <code>etag</code> and <code>last-modified</code> headers
to use as validators for static content (i.e., files) automatically; you won’t
have to do anything. however, they don’t know enough about dynamic content (like
cgi, asp or database sites) to generate them; see the notes on cache-aware scripts below.
besides using freshness information and validation, there are a number of
other things you can do to make your site more cache-friendly.
use urls consistently — this is the golden rule of
caching. if you serve the same content on different pages, to different users,
or from different sites, it should use the same url. this is the easiest and
most effective way to make your site cache-friendly. for example, if you use
“/index.html” in your html as a reference once, always use it that way.
use a common library of images and other elements
and refer back to them from different places.
make caches store images and pages that don’t change
often by using a <code>cache-control: max-age</code> header with a large value.
make caches recognise regularly updated pages by
specifying an appropriate max-age or expiration time.
if a resource (especially a downloadable file) changes, change its
name. that way, you can make it expire far in the future, and
still guarantee that the correct version is served; the page that links to it
is the only one that will need a short expiry time.
don’t change files unnecessarily. if you do,
everything will have a falsely
young <code>last-modified</code> date. for instance, when updating
your site, don’t copy over the entire site; just move the files that you’ve changed.
use cookies only where necessary — cookies are
difficult to cache, and aren’t needed in most situations. if you must use a
cookie, limit its use to dynamic pages.
minimize use of ssl — because encrypted pages are
not stored by shared caches, use them only when you have to, and use images on
ssl pages sparingly.
check your pages with — it can help you apply
many of the concepts in this tutorial.
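the tip about changing a resource’s name when it changes is often automated by putting a short hash of the content into the file name. a minimal python sketch, with a hypothetical function name and naming scheme:

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path, content):
    """Return path with a short content hash inserted before the extension."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:8]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

because the name changes only when the bytes change, the file itself can be given a very long freshness lifetime, and only the page that links to it needs a short one.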
by default, most scripts won’t return a validator
(a <code>last-modified</code> or <code>etag</code> response
header) or freshness information
(<code>expires</code> or <code>cache-control</code>). while some
scripts really are dynamic (meaning that they return a different response for
every request), many (like search engines and database-driven sites) can benefit
from being cache-friendly.
generally speaking, if a script produces output that is reproducible with the
same request at a later time (whether it be minutes or days later), it should be
cacheable. if the content of the script changes only depending on what’s in the
url, it is cacheable; if the output depends on a cookie, authentication
information or other external criteria, it probably isn’t.
the best way to make a script cache-friendly (as well as perform better)
is to dump its content to a plain file whenever it changes. the web server can
then treat it like any other web page, generating and using validators, which
makes your life easier. remember to only write files that have changed, so
the <code>last-modified</code> times are preserved.
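a minimal python sketch of that last point: write the generated output only when it differs from what is already on disk, so the file’s modification time (and therefore its <code>last-modified</code> header) is left alone otherwise. the function name is hypothetical.

```python
def write_if_changed(path, content):
    """Write bytes to path only if they differ from the current file.

    Returns True if the file was written, False if it was left untouched.
    """
    try:
        with open(path, "rb") as existing:
            if existing.read() == content:
                return False      # unchanged: keep the old modification time
    except FileNotFoundError:
        pass                      # first run: nothing to compare against
    with open(path, "wb") as out:
        out.write(content)
    return True
```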
another way to make a script cacheable in a limited fashion is to set an
age-related header for as far in the future as practical. although this can be
done with <code>expires</code>, it’s probably easiest to do so
with <code>cache-control: max-age</code>, which will make the request
fresh for an amount of time after the request.
if you can’t do that, you’ll need to make the script generate a validator,
and then respond
to <code>if-modified-since</code> and/or <code>if-none-match</code> requests.
this can be done by parsing the http headers, and then responding
with <code>304 not modified</code> when appropriate. unfortunately,
this is not a trivial task.
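the core of that logic can be sketched in python as follows. it assumes the script already knows the current etag and last modification time of what it is about to send; the function name and the lower-case header dictionary are assumptions made for the example, and real code also has to handle details such as weak etags.

```python
from email.utils import parsedate_to_datetime


def conditional_status(request_headers, etag, last_modified):
    """Return 304 if the client's copy is still good, otherwise 200.

    request_headers: dict with lower-case header names.
    last_modified: timezone-aware datetime of the current representation.
    """
    inm = request_headers.get("if-none-match")
    if inm is not None:
        # When If-None-Match is present, If-Modified-Since is ignored.
        tags = [t.strip() for t in inm.split(",")]
        return 304 if etag in tags or "*" in tags else 200
    ims = request_headers.get("if-modified-since")
    if ims is not None:
        try:
            since = parsedate_to_datetime(ims)
        except (TypeError, ValueError):
            return 200            # unparseable date: send the full response
        # HTTP dates have one-second resolution, so drop microseconds first.
        if since.tzinfo is not None and last_modified.replace(microsecond=0) <= since:
            return 304
    return 200
```

a 304 response carries no body, so the script can skip generating the page entirely when it knows nothing has changed.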
some other tips:
don’t use post unless it’s appropriate. responses to
the post method aren’t kept by most caches; if you send information in the
path or query (via get), caches can store that information for the
future.
don’t embed user-specific information in the
url unless the content generated is completely unique to that
user.
don’t count on all requests from a user coming from the same
host, because caches often work together.
generate <code>content-length</code> response
headers. it’s easy to do, and it will allow the response of your
script to be used in a persistent connection. this allows
clients to request multiple representations on one tcp/ip connection, instead
of setting up a connection for every request. it makes your site seem much
faster.
see the implementation notes for more specific information.
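for the <code>content-length</code> tip, the one detail that is easy to get wrong is that the value counts bytes, not characters. a short python sketch, with a hypothetical helper name:

```python
def with_content_length(headers, body_text, charset="utf-8"):
    """Encode the body and return (headers including Content-Length, body bytes)."""
    body = body_text.encode(charset)
    out = dict(headers)
    out["Content-Length"] = str(len(body))   # length of the encoded bytes
    return out, body
```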
a good strategy is to identify the most popular, largest representations
(especially images) and work with them first.
the most cacheable representation is one with a long freshness time set.
validation does help reduce the time that it takes to see a representation, but
the cache still has to contact the origin server to see if it’s fresh. if the
cache already knows it’s fresh, it will be served directly.
if you must know every time a page is accessed, select one small item on a
page (or the page itself), and make it uncacheable by giving it suitable
headers. for example, you could refer to a 1x1 transparent uncacheable image
from each page. the <code>referer</code> header will contain
information about what page called it.
be aware that even this will not give truly accurate statistics about your
users, and is unfriendly to the internet and your users; it generates
unnecessary traffic, and forces people to wait for that uncached item to be
downloaded. for more information about this, see “on interpreting access
statistics” in the references.
many web browsers let you see
the <code>expires</code> and <code>last-modified</code> headers
in a “page info” or similar interface. if available, this will give you a
menu of the page and any representations (like images) associated with it, along
with their details.
to see the full headers of a representation, you can manually connect to the
web server using a telnet client.
to do so, you may need to type the port (by default, 80) into a separate
field, or you may need to connect
to <code>www.example.com:80</code> or <code>www.example.com 80</code> (note the space). consult your telnet client’s documentation.
once you’ve opened a connection to the site, type a request for the
representation. for instance, if you want to see the headers
for <code>http://www.example.com/foo.html</code>, connect
to <code>www.example.com</code>, port <code>80</code>, and type:
press the return key every time you see <code>[return]</code>; make sure
to press it twice at the end. this will print the headers, and then the full
representation. to see the headers only, substitute head for get.
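the same check can be scripted. this python sketch builds the request text a telnet user would type, with a carriage return and line feed wherever the return key would be pressed, and can send it as a head request over a plain socket. the function names are hypothetical, and the <code>connection: close</code> line is an addition so that the server closes the connection after replying.

```python
import socket


def build_request(host, path, method="GET"):
    """Return the raw request bytes for the given host and path."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}", "Connection: close"]
    # The final blank line (two returns in a row) ends the request headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch_headers(host, path, port=80, timeout=10):
    """Send a HEAD request and return the status line and headers as text."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(host, path, method="HEAD"))
        data = b""
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
```

calling <code>fetch_headers("www.example.com", "/foo.html")</code> needs network access; <code>build_request</code> can be inspected offline.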
by default, pages protected with http authentication are considered private;
they will not be kept by shared caches. however, you can make authenticated
pages public with a cache-control: public header; http 1.1-compliant caches will
then allow them to be cached.
if you’d like such pages to be cacheable, but still authenticated for every
user, combine the <code>cache-control: public</code> and <code>no-cache</code> headers. this tells the cache
that it must submit the new client’s authentication information to the origin
server before releasing the representation from the cache. this would look
like: <code>cache-control: public, no-cache</code>
whether or not this is done, it’s best to minimize use of authentication; for
example, if your images are not sensitive, put them in a separate directory and
configure your server not to force authentication for it. that way, those images
will be naturally cacheable.
ssl pages are not cached (or decrypted) by proxy caches, so you don’t have to
worry about that. however, because caches store non-ssl requests and urls
fetched through them, you should be conscious about unsecured sites; an
unscrupulous administrator could conceivably gather information about their
users, especially in the url.
in fact, any administrator on the network between your server and your
clients could gather this type of information. one particular problem is when
cgi scripts put usernames and passwords in the url itself; this makes it trivial
for others to find and use their login.
if you’re aware of the issues surrounding web security in general, you
shouldn’t have any surprises from proxy caches.
it varies. generally speaking, the more complex a solution is, the more
difficult it is to cache. the worst are ones which dynamically generate all
content and don’t provide validators; they may not be cacheable at all. speak
with your vendor’s technical staff for more information, and see the
implementation notes below.
the expires header can’t be circumvented; unless the cache (either browser or
proxy) runs out of room and has to delete the representations, the cached copy
will be used until then.
the most effective solution is to change any links to them; that way,
completely new representations will be loaded fresh from the origin server.
remember that any page that refers to these representations will be cached as
well. because of this, it’s best to make static images and similar
representations very cacheable, while keeping the html pages that refer to them
on a tight leash.
if you want to reload a representation from a specific cache, you can either
force a reload (in firefox, holding down shift while pressing ‘reload’ will do
this by issuing a <code>pragma: no-cache</code> request header) while
using the cache. or, you can have the cache administrator delete the
representation through their interface.
if you’re using apache, consider allowing them to use .htaccess files and
providing appropriate documentation.
otherwise, you can establish predetermined areas for various caching
attributes in each virtual server. for instance, you could specify a directory
/cache-1m that will be cached for one month after access, and a /no-cache area
that will be served with headers instructing caches not to store representations
from it.
whatever you are able to do, it is best to work with your largest customers
first on caching. most of the savings (in bandwidth and in load on your servers)
will be realized from high-volume sites.
caches aren’t required to keep a representation and reuse it; they’re only
required to not keep or use them under some
conditions. all caches make decisions about which representations to keep based
upon their size, type (e.g., image vs. html), or by how much space they have
left to keep local copies. yours may not be considered worth keeping around,
compared to more popular or larger representations.
some caches do allow their administrators to prioritize what kinds of
representations are kept, and some allow representations to be “pinned” in
cache, so that they’re always available.
generally speaking, it’s best to use the latest version of whatever web
server you’ve chosen to deploy. not only will they likely contain more
cache-friendly features, new versions also usually have important security and
performance improvements.
apache uses
optional modules to include headers, including both expires and cache-control.
both modules are available in the 1.2 or greater distribution.
the modules need to be built into apache; although they are included in the
distribution, they are not turned on by default. to find out if the modules are
enabled in your server, find the httpd binary and run <code>httpd -l</code>; this should print a list of the available modules (note that this
only lists compiled-in modules; on later versions of apache,
use <code>httpd -M</code> to include dynamically loaded modules as well).
the modules we’re looking for are expires_module and headers_module.
if they aren’t available, and you have administrative access, you can
recompile apache to include them. this can be done either by uncommenting the
appropriate lines in the configuration file, or using
the <code>--enable-module=expires</code> and <code>--enable-module=headers</code> arguments
to configure (1.3 or greater). consult the install file found with the apache
distribution.
once you have an apache with the appropriate modules, you can use mod_expires
to specify when representations should expire, either in .htaccess files or in
the server’s access.conf file. you can specify expiry from either access or
modification time, and apply it to a file type or as a default. see the mod_expires documentation for more information, and speak with your local apache
guru if you have trouble.
to apply <code>cache-control</code> headers, you’ll need to use the
mod_headers module, which allows you to specify arbitrary http headers for a
resource. see the mod_headers documentation.
here’s an example .htaccess file that demonstrates the use of some
headers.
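a sketch of what such a file might contain is shown below. it assumes mod_expires and mod_headers are loaded and that the server allows these directives in .htaccess files; the file types, lifetimes and file name are only examples.

```apache
# turn on mod_expires for this directory and its subdirectories
ExpiresActive On

# expire GIF images one month (2592000 seconds) after they are accessed
ExpiresByType image/gif A2592000

# expire everything else one day after it was last modified
# (this uses the alternative, more readable syntax)
ExpiresDefault "modification plus 1 day"

# add a Cache-Control header to index.html only
<Files index.html>
Header append Cache-Control "public, must-revalidate"
</Files>
```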
.htaccess files allow web publishers to use commands normally only found
in configuration files. they affect the content of the directory they’re in
and their subdirectories. talk to your server administrator to find out if
they’re enabled.
note that mod_expires automatically calculates and inserts
a <code>cache-control:max-age</code> header as appropriate.
apache 2’s configuration is very similar to that of 1.3; see the 2.2 mod_expires and mod_headers documentation
for more information.
microsoft’s internet
information server makes it very easy to set headers in a somewhat flexible way.
note that this is only possible in version 4 of the server, which will run only
on nt server.
to specify headers for an area of a site, select it in
the <code>administration tools</code> interface, and bring up its
properties. after selecting the <code>http headers</code> tab, you
should see two interesting areas; <code>enable content expiration</code> and <code>custom http headers</code>. the first
should be self-explanatory, and the second can be used to apply cache-control headers.
see the asp section below for information about setting headers in active
server pages. it is also possible to set headers from isapi modules; refer to
msdn for details.
as of version 3.6, enterprise server does not provide any obvious way to set
expires headers. however, it has supported http 1.1 features since version 3.0.
this means that http 1.1 caches (proxy and browser) will be able to take
advantage of cache-control settings you make.
to use cache-control headers, choose <code>content management | cache control directives</code> in the administration server. then, using the
resource picker, choose the directory where you want to set the headers. after
setting the headers, click ‘ok’. for more information, see the server’s documentation.
one thing to keep in mind is that it may be easier to
set http headers with your web server rather than in the scripting language. try
both.
because the emphasis in server-side scripting is on dynamic content, it
doesn’t make for very cacheable pages, even when the content could be cached. if
your content changes often, but not on every page hit, consider setting a
cache-control: max-age header; most users access pages again in a relatively
short period of time. for instance, when users hit the ‘back’ button, if there
isn’t any validator or freshness information available, they’ll have to wait
until the page is re-downloaded from the server to see it.
cgi scripts are one of the most popular ways to generate content. you can
easily append http response headers by adding them before you send the body;
most cgi implementations already require you to do this for
the <code>content-type</code> header. for instance, a perl script can print
each header on its own line, followed by a blank line, before the body.
since it’s all text, you can easily
generate <code>expires</code> and other date-related headers with
in-built functions. it’s even easier if you use <code>cache-control: max-age</code>;
a value of <code>max-age=600</code> will make the script cacheable for 10 minutes after the request, so that
if the user hits the ‘back’ button, they won’t be resubmitting the request.
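the same pattern, sketched in python rather than perl: the headers go first, one per line, then a blank line, then the body. the function name is hypothetical.

```python
import sys


def cgi_response(body_html, max_age=600):
    """Return a complete CGI response: headers, a blank line, then the body."""
    headers = [
        "Content-Type: text/html",
        f"Cache-Control: max-age={max_age}",   # 600 seconds = 10 minutes
    ]
    return "\n".join(headers) + "\n\n" + body_html


if __name__ == "__main__":
    # A CGI script writes its response to standard output.
    sys.stdout.write(cgi_response("<html><body>hello</body></html>\n"))
```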
the cgi specification also makes request headers that the client sends
available in the environment of the script; each header has ‘HTTP_’ prepended to
its name. so, if a client makes
an <code>if-modified-since</code> request, it will show up
as <code>HTTP_IF_MODIFIED_SINCE</code>.
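in python, a cgi script can look such a header up in its environment like this. the helper name is hypothetical; the name mapping (upper case, hyphens turned into underscores, a leading <code>HTTP_</code>) is the one cgi uses.

```python
import os


def request_header(name, environ=None):
    """Return the value of a request header from the CGI environment, or None."""
    if environ is None:
        environ = os.environ
    key = "HTTP_" + name.upper().replace("-", "_")
    return environ.get(key)
```

for example, <code>request_header("if-modified-since")</code> reads <code>HTTP_IF_MODIFIED_SINCE</code>, and returns nothing if the client did not send that header.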
see also the library, which
automatically handles etag generation and
validation, <code>content-length</code> generation and gzip
content-coding for perl and python cgi scripts with a one-line include. the
python version can also be used to wrap arbitrary cgi scripts.
ssi (often used with the extension .shtml) is one of the first ways that web
publishers were able to get dynamic content into pages. by using special tags in
the pages, a limited form of in-html scripting was available.
most implementations of ssi do not set validators, and as such are not
cacheable. however, apache’s implementation does allow users to specify which
ssi files can be cached, by setting the group execute permissions on the
appropriate files, combined with the <code>xbithack full</code> directive. for more information, see the .
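as an illustration of the permission change described above, here is a small python sketch that adds the group execute bit to an ssi file. the function name <code>mark_ssi_cacheable</code> is hypothetical, and the sketch assumes a unix-like filesystem and that <code>xbithack full</code> is already enabled in the server configuration.

```python
import os
import stat


def mark_ssi_cacheable(path):
    """Add the group execute bit to path and return the resulting mode bits."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Only the group execute bit is added; all other permissions are preserved.
    os.chmod(path, mode | stat.S_IXGRP)
    return stat.S_IMODE(os.stat(path).st_mode)
```

it only makes sense to set the bit on files whose included content does not change more often than the file itself, since the server has no other way of knowing when the output becomes stale.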
php is a server-side
scripting language that, when built into the server, can be used to embed
scripts inside a page’s html, much like ssi, but with a far larger number of
options. php can be used as a cgi script on any web server (unix or windows), or
as an apache module.
by default, representations processed by php are not assigned validators, and
are therefore uncacheable. however, developers can set http headers by using
the <code>header()</code> function.
for example, this will create a cache-control header, as well as an expires
header three days in the future:
remember that the <code>header()</code> function must come before
any other output.
as you can see, you’ll have to create the http date for
an <code>expires</code> header by hand; php doesn’t provide a function
to do it for you (although recent versions have made it easier; see the ). of
course, it’s easy to set a <code>cache-control: max-age</code> header, which is
just as good for most situations.
for more information, see the .
the same library also automatically handles <code>etag</code> generation and
content-coding for php scripts with a one-line include.
cold fusion is a commercial
server-side scripting engine, with support for several web servers on windows,
linux and several flavors of unix.
cold fusion makes setting arbitrary http headers relatively easy, with
the <code>cfheader</code> tag.
unfortunately, their example for setting
an <code>expires</code> header, as below, is a bit misleading.
it doesn’t work like you might think, because the time (in this case, when
the request is made) doesn’t get converted to a http-valid date; instead, it
just gets printed as a representation of cold fusion’s date/time object. most
clients will either ignore such a value, or convert it to a default, like
january 1, 1970.
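to see why such a value is of no use to a client, here is a python sketch of the check a cache might make before trusting an <code>expires</code> value. the function name <code>parse_expires</code> is hypothetical, and the malformed sample is only an assumed shape for a printed date/time object; what a real cache does with a bad value varies.

```python
from email.utils import parsedate_to_datetime

GOOD = "Thu, 01 Dec 1994 16:00:00 GMT"
BAD = "{ts '1994-12-01 16:00:00'}"  # assumed shape of a non-HTTP date value


def parse_expires(value):
    """Return a datetime for a well-formed HTTP date, or None otherwise."""
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Unparseable: the caller has no usable freshness information.
        return None


if __name__ == "__main__":
    for sample in (GOOD, BAD):
        print(sample, "->", parse_expires(sample))
```

a caller that gets <code>None</code> back would have to treat the representation as if no expiry had been given, or as already stale.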
however, cold fusion does provide a date formatting function that will do the
job; <code></code>.
in combination with <code></code>,
it’s easy to set expires dates; here, we set a header to declare that
representations of the page expire in one month;
you can also use the <code>cfheader</code> tag to
set <code>cache-control: max-age</code> and other headers.
remember that web server headers are passed through in some deployments of
cold fusion (such as cgi); check yours to determine whether you can use this to
your advantage, by setting headers on the server instead of in cold fusion.
when setting http headers from asps, make sure you
either place the response method calls before any html generation, or
use <code>response.buffer</code> to buffer the output. also, note that some
versions of iis set a <code>cache-control: private</code> header on
asps by default, and must be declared public to be cacheable by shared caches.
active server pages, built into iis and also available for other web servers,
also allows you to set http headers. for instance, to set an expiry time, you
can use the <code>expires</code> property of the <code>response</code> object,
specifying the number of minutes from the request to expire the
representation. <code>cache-control</code> headers can be added like
this:
in asp.net, <code>response.expires</code> is deprecated; the proper
way to set cache-related headers is with <code>response.cache</code>;
the http 1.1 spec has many extensions for making pages cacheable, and is the
authoritative guide to implementing the protocol. see sections 13, 14.9, 14.21,
and 14.25.
an excellent introduction to caching concepts, with links to other online
resources.
jeff goldberg’s informative rant on why you shouldn’t rely on access
statistics and hit counters.
examines http resources to determine how they will interact with web caches,
and generally how well they use the protocol.
one-line include in perl cgi, python cgi and php scripts automatically
handles etag generation and validation, content-length generation and gzip
content-encoding — correctly. the python version can also be used as a wrapper
around arbitrary cgi scripts.
this document is copyright © 1998-2013 mark nottingham. this work is
licensed under a .
all trademarks within are property of their respective holders.
although the author believes the contents to be accurate at the time of
publication, no liability is assumed for them, their application or any
consequences thereof. if any misrepresentations, errors or other need for
clarification is found, please contact the author immediately.