A Web cache, also called a proxy server, is a network entity that satisfies HTTP requests on behalf of an origin web server. The web cache has its own disk storage and keeps copies of recently requested objects in this storage. As shown in fig. 2.11, a user’s browser can be configured so that all of the user’s HTTP requests are first directed to the Web cache. As an example, suppose a browser is requesting the object http://www.someschool.edu/campus.gif. Here is what happens:
- The browser establishes a TCP connection to the Web cache and sends an HTTP request for the object to the web cache.
- The web cache checks to see if it has a copy of the object stored locally. If it does, the web cache returns the object within an HTTP response message to the client browser.
- If the web cache does not have the object, the web cache opens a TCP connection to the origin server, that is, to www.someschool.edu. The web cache then sends an HTTP request for the object into the cache-to-server TCP connection. After receiving this request, the origin server sends the object within an HTTP response to the Web cache.
- When the web cache receives the object, it stores a copy in its local storage and sends a copy, within an HTTP response message, to the client browser (over the existing TCP connection between the client browser and the Web cache).
Note that a cache is both a server and a client at the same time. When it receives requests from and sends responses to a browser, it is a server. When it sends requests to and receives responses from an origin server, it is a client.
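The steps above can be sketched in a few lines of Python. This is only an illustrative toy, not how any particular proxy is implemented: the class name `WebCache`, the function `fake_origin`, and the in-memory dictionary standing in for disk storage are all assumptions made for the example, and no real TCP connections or HTTP messages are involved.

```python
from typing import Callable, Dict, List, Tuple


class WebCache:
    """Toy in-memory stand-in for a Web cache (hypothetical, no networking)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        # A dictionary plays the role of the cache's local disk storage.
        self._store: Dict[str, bytes] = {}
        self._fetch_from_origin = fetch_from_origin

    def handle_request(self, url: str) -> Tuple[bytes, bool]:
        """Act as a server to the browser; return (object, was_cache_hit)."""
        if url in self._store:
            # Hit: answer directly from the local copy.
            return self._store[url], True
        # Miss: act as a client toward the origin server.
        obj = self._fetch_from_origin(url)
        # Keep a copy so later requests for this URL are hits.
        self._store[url] = obj
        return obj, False


# Hypothetical origin server that records every request it receives.
origin_requests: List[str] = []


def fake_origin(url: str) -> bytes:
    origin_requests.append(url)
    return b"contents of " + url.encode()


cache = WebCache(fake_origin)
url = "http://www.someschool.edu/campus.gif"
first = cache.handle_request(url)   # miss: forwarded to the origin
second = cache.handle_request(url)  # hit: served from local storage
print(first[1], second[1], len(origin_requests))
```

The second request never reaches the origin, which is the behavior the list describes. A real cache would also have to decide when a stored copy is stale, which this sketch ignores.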
Typically a web cache is purchased and installed by an ISP. For example, a university might install a cache on its campus network and configure all of the campus browsers to point to the cache. Or a major residential ISP (such as AOL) might install one or more caches in its network and preconfigure its shipped browsers to point to the installed caches.
Web caching has been deployed in the internet for two reasons. First, a web cache can substantially reduce the response time for a client request, particularly if the bottleneck bandwidth between the client and the origin server is much less than the bottleneck bandwidth between the client and the cache. If there is a high-speed connection between the client and the cache, as there often is, and if the cache has the requested object, then the cache will be able to deliver the object rapidly to the client. Second, as we will soon illustrate with an example, web caches can substantially reduce traffic on an institution’s access link to the internet. By reducing traffic, the institution (for example, a company or university) does not have to upgrade bandwidth as quickly, thereby reducing costs. Furthermore, web caches can substantially reduce web traffic in the internet as a whole, thereby improving performance for all applications.
To gain a deeper understanding of the benefits of caches, let’s consider an example in the context of fig. 2.12. This figure shows two networks: the institutional network and the rest of the public internet. The institutional network is a high-speed (100 Mbps) LAN. A router in the institutional network and a router in the internet are connected by a 15 Mbps link. The origin servers are attached to the internet but are located all over the globe. Suppose that the average object size is 1 Mbits and that the average request rate from the institution’s browsers to the origin servers is 15 requests per second. Suppose that the HTTP request messages are negligibly small and thus create no traffic in the networks or in the access link (from institutional router to the internet router). Also suppose that the amount of time it takes from when the router on the internet side of the access link in fig. 2.12 forwards an HTTP request (within an IP datagram) until it receives the response (typically within many IP datagrams) is two seconds on average. Informally, we refer to this last delay as the “Internet delay.”
The total response time – that is, the time from the browser’s request of an object until its receipt of the object – is the sum of the LAN delay, the access delay (that is, the delay between the two routers), and the internet delay. Let’s now do a very crude calculation to estimate this delay. The traffic intensity on the LAN is
(15 requests/sec) x (1 Mbits/request)/(100 Mbps) = 0.15
whereas the traffic intensity on the access link (from the internet router to the institution router) is
(15 requests/sec) x (1 Mbits/request)/(15 Mbps) = 1
A traffic intensity of 0.15 on a LAN typically results in, at most, tens of milliseconds of delay; hence, we can neglect the LAN delay. However, as discussed earlier, as the traffic intensity approaches 1 (as is the case of the access link in figure 2.12), the delay on a link becomes very large and grows without bound. Thus, the average response time to satisfy requests is going to be on the order of minutes, if not more, which is unacceptable for the institution’s users. Clearly something must be done.
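As a quick check on the two traffic intensities, the arithmetic can be reproduced with a short Python sketch. The function and constant names are made up for this example; the numbers are the ones used in the calculation above (15 requests per second, 1 Mbit objects, a 100 Mbps LAN, and a 15 Mbps access link).

```python
def traffic_intensity(requests_per_sec: float,
                      object_size_mbits: float,
                      link_rate_mbps: float) -> float:
    """Offered load (Mbits/sec) divided by link rate (Mbps)."""
    return requests_per_sec * object_size_mbits / link_rate_mbps


REQUEST_RATE = 15.0        # requests per second
OBJECT_SIZE_MBITS = 1.0    # average object size in Mbits
LAN_RATE_MBPS = 100.0      # institutional LAN
ACCESS_RATE_MBPS = 15.0    # access link between the two routers

lan_intensity = traffic_intensity(REQUEST_RATE, OBJECT_SIZE_MBITS, LAN_RATE_MBPS)
access_intensity = traffic_intensity(REQUEST_RATE, OBJECT_SIZE_MBITS, ACCESS_RATE_MBPS)

print(f"LAN traffic intensity:         {lan_intensity:.2f}")
print(f"Access link traffic intensity: {access_intensity:.2f}")
```

Changing `ACCESS_RATE_MBPS` makes it easy to see how a faster access link would lower the intensity on that link.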
One possible solution is to increase the access rate from 15 Mbps to, say, 100 Mbps. This will lower the traffic intensity on the access link to 0.15, which translates to negligible delays between the two routers. In this case, the total response time will roughly be two seconds, that is, the internet delay. But this solution also means that the institution must upgrade its access link from 15 Mbps to 100 Mbps, a costly proposition.
Now consider the alternative solution of not upgrading the access link but instead installing a web cache in the institutional network. This solution is illustrated in figure 2.13. Hit rates (the fraction of requests that are satisfied by a cache) typically range from 0.2 to 0.7 in practice. For illustrative purposes, let’s suppose that the cache provides a hit rate of 0.4 for this institution. Because the clients and the cache are connected to the same high-speed LAN, 40 percent of the requests will be satisfied almost immediately, say, within 10 milliseconds, by the cache. Nevertheless, the remaining 60 percent of the requests still need to be satisfied by the origin servers. But with only 60 percent of the requested objects passing through the access link, the traffic intensity on the access link is reduced from 1.0 to 0.6. Typically, a traffic intensity less than 0.8 corresponds to a small delay, say, tens of milliseconds, on a 15 Mbps link. This delay is negligible compared with the two-second internet delay. Given these considerations, the average delay is therefore
0.4 x (0.01 seconds) + 0.6 x (2.01 seconds)
which is just slightly greater than 1.2 seconds. Thus, this second solution provides an even lower response time than the first solution, and it doesn’t require the institution to upgrade its link to the internet. The institution does, of course, have to purchase and install a web cache. But this cost is low: many caches use public-domain software that runs on inexpensive PCs.
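The weighted average can be verified with a small Python sketch. The names are hypothetical, and splitting the 2.01 seconds into a 2-second internet delay plus a 0.01-second access-link delay is an assumption made here for illustration; the text gives only the combined figure.

```python
def average_response_time(hit_rate: float,
                          hit_delay_s: float,
                          miss_delay_s: float) -> float:
    """Expected delay when a fraction hit_rate of requests is served by the cache."""
    return hit_rate * hit_delay_s + (1.0 - hit_rate) * miss_delay_s


HIT_RATE = 0.4
CACHE_DELAY_S = 0.01       # request satisfied by the cache on the LAN
INTERNET_DELAY_S = 2.0     # average internet delay from the example
ACCESS_DELAY_S = 0.01      # assumed small access-link delay at intensity 0.6

# Traffic intensity on the 15 Mbps access link when only misses cross it.
access_intensity_with_cache = (1.0 - HIT_RATE) * 15.0 * 1.0 / 15.0

with_cache = average_response_time(
    HIT_RATE, CACHE_DELAY_S, INTERNET_DELAY_S + ACCESS_DELAY_S
)

print(f"Access link intensity with cache: {access_intensity_with_cache:.2f}")
print(f"Average response time with cache: {with_cache:.2f} s")
```

Varying `HIT_RATE` between 0.2 and 0.7, the range quoted for practice, shows how sensitive the average is to the cache's effectiveness.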
Through the use of Content Distribution Networks (CDNs), web caches are increasingly playing an important role in the internet. A CDN company installs many geographically distributed caches throughout the internet, thereby localizing much of the traffic. There are shared CDNs (such as Akamai and Limelight) and dedicated CDNs (such as Google and Microsoft). We will discuss CDNs in more detail in Module 7.