
2. Terminology and Concepts for Hierarchical Caching

2.1 Cache

In this document, the term cache refers to an HTTP proxy that stores the responses to some requests.

2.2 Objects

An object is a generic term for any document, image, or other type of data available on the Web. Uniform Resource Locators (URLs) identify objects unambiguously, and can refer to data available from HTTP, FTP, Gopher, and other types of servers. Because much of today's Web content consists of images, audio, video, and binary files, we prefer to say that the cache stores objects rather than documents or pages.

2.3 Hits and Misses

A cache hit means a valid copy of the requested object exists in a cache. A cache miss means that the object does not exist, or is no longer valid. A cache must forward cache misses toward the origin server.
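The hit/miss distinction can be sketched in a few lines of Python. This is only an illustration, not how any real proxy stores objects: the class name TinyCache, the fetch_from_origin callback, and the fixed ttl used to decide validity are all assumptions made for the example.

```python
import time


class TinyCache:
    """Toy in-memory cache keyed by URL (illustrative only)."""

    def __init__(self, fetch_from_origin):
        # Hypothetical callback that retrieves an object on a miss.
        self.fetch_from_origin = fetch_from_origin
        self.store = {}  # url -> (body, expires_at)

    def get(self, url, now=None, ttl=60):
        now = time.time() if now is None else now
        entry = self.store.get(url)
        # Hit: a copy exists and is still considered valid.
        if entry is not None and now < entry[1]:
            return "HIT", entry[0]
        # Miss: no copy, or the copy is no longer valid, so the
        # request is forwarded toward the origin server.
        body = self.fetch_from_origin(url)
        self.store[url] = (body, now + ttl)
        return "MISS", body
```

The first request for a URL is a miss, a repeat within the assumed ttl is a hit, and a request after the ttl has passed is a miss again.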

2.4 Origin Server

The origin server is the authoritative source for an object. For a URL, the origin server is simply the host named in the hostname part of the URL.
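As a small illustration, the hostname part of a URL can be extracted with Python's standard urllib.parse module. The helper name origin_host and the example URLs are assumptions made for this sketch.

```python
from urllib.parse import urlsplit


def origin_host(url):
    """Return the hostname part of a URL, i.e. the origin server's name."""
    return urlsplit(url).hostname


# The port and path are not part of the hostname.
print(origin_host("http://www.example.com:8080/index.html"))
print(origin_host("ftp://ftp.example.org/pub/file.tar.gz"))
```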

2.5 Hierarchy vs. Mesh

Web caches can be arranged hierarchically, or in a mesh. When the cache topology has a tree-like structure, we usually use the term hierarchy. If the structure is rather flat, we call it a mesh. In either case these terms simply refer to the fact that caches can be "connected" to each other.

2.6 Neighbours, Peers, Parents, Siblings

The terms neighbour and peer are synonymous, and generically refer to other caches in a hierarchy or mesh. The terms parent and sibling refer to the relationship between a pair of caches.

In a parent relationship, the child cache will forward requests to its parent cache. If the parent does not hold a requested object, it will forward the request on behalf of the child. A cache hierarchy should closely follow the underlying network topology. Parent caches should be located along the network paths towards the greater Internet. For example, if your Internet Service Provider (ISP) operates a cache, it should probably be a parent to yours, since your Web traffic will have to travel along your ISP's infrastructure anyway.

With a sibling relationship, a peer may only request objects already held in the cache; a sibling cannot forward cache misses on behalf of the peer. The sibling relationship should be used for caches "nearby" but not in the direction of your route to the Internet. For example, it may make sense for a number of department-specific caches within an organization to have sibling relationships among them. This approach is even more compelling when there is no parent cache available for the organization as a whole.

Note that a single peering relationship is one-way. A child can forward requests to its parent, but not vice-versa, unless another peering relationship exists for the other direction.
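The difference between parent and sibling relationships, and their one-way nature, can be sketched as a toy model. Everything here is an assumption made for illustration: the Cache class, its method names, the dictionary standing in for origin servers, and the choice to ask siblings before parents. It does not describe how any particular cache implementation queries its peers.

```python
class Cache:
    """Toy cache with one-way peering relationships (illustrative only)."""

    def __init__(self, name, origin):
        self.name = name
        self.origin = origin  # dict standing in for origin servers: url -> body
        self.store = {}       # objects currently held by this cache
        self.peers = []       # (relation, Cache); relation is "parent" or "sibling"

    def add_peer(self, relation, cache):
        # One-way: this records that self may ask `cache`, not the reverse.
        self.peers.append((relation, cache))

    def lookup_only(self, url):
        """What a sibling offers: an object it already holds, or nothing."""
        return self.store.get(url)

    def resolve(self, url):
        """What a parent offers: a hit, or the object fetched on a miss."""
        if url in self.store:
            return self.store[url]
        # Siblings can only answer with objects they already hold.
        for relation, peer in self.peers:
            if relation == "sibling":
                body = peer.lookup_only(url)
                if body is not None:
                    self.store[url] = body
                    return body
        # A parent forwards the miss on behalf of the child.
        for relation, peer in self.peers:
            if relation == "parent":
                body = peer.resolve(url)
                self.store[url] = body
                return body
        # No parent available: go to the origin server directly.
        body = self.origin[url]
        self.store[url] = body
        return body
```

In this model, a department cache with one parent and one sibling gets an object from the sibling only if the sibling already holds it; otherwise the parent fetches it. The parent's own peers list stays empty unless a separate relationship is added in the other direction.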

2.7 Fresh, Stale, Refresh

These terms refer to the status of cached objects. If an object is fresh, it will be returned as a cache hit. If the object is stale, Squid will refresh it by adding an If-Modified-Since (IMS) request header and forwarding the request toward the origin server.
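The fresh/stale decision and the resulting IMS header can be sketched as follows. This is a simplified illustration rather than Squid's actual freshness logic: the function name plan_request, the cached-entry fields expires and last_modified, and the use of a single expiry time are assumptions made for the example.

```python
from email.utils import formatdate


def plan_request(cached, now):
    """Decide how to handle a request for an object already in the cache.

    `cached` is a dict with 'expires' and 'last_modified' as epoch seconds
    (hypothetical fields). Returns (action, extra_request_headers).
    """
    if now < cached["expires"]:
        # Fresh: answer from the cache, nothing is forwarded.
        return "serve_from_cache", {}
    # Stale: forward the request with an If-Modified-Since header so the
    # origin server can reply that the cached copy is still current.
    ims = formatdate(cached["last_modified"], usegmt=True)
    return "forward", {"If-Modified-Since": ims}


entry = {"expires": 1000, "last_modified": 0}
print(plan_request(entry, now=500))
print(plan_request(entry, now=2000))
```

If the origin server answers such a conditional request with 304 Not Modified, the cached copy can be reused without transferring the object again.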

