Cookies

We mentioned earlier that an HTTP server is stateless. This simplifies server design and has permitted engineers to develop high-performance web servers that can handle thousands of simultaneous TCP connections. However, it is often desirable for a web site to identify users, either because the server wishes to restrict user access or because it wants to serve content as a function of the user’s identity. For these purposes, HTTP uses cookies. Cookies, defined in [RFC 6265], allow sites to keep track of users. Most major commercial web sites use cookies today.

As shown in Figure 2.10, cookie technology has four components: (1) a cookie header line in the HTTP response message; (2) a cookie header line in the HTTP request message; (3) a cookie file kept on the user’s end system and managed by the user’s browser; and (4) a back-end database at the web site. Using Figure 2.10, let’s walk through an example of how cookies work. Suppose Susan, who always accesses the web using Internet Explorer from her home PC, contacts Amazon.com for the first time. Let us suppose that in the past she has already visited the eBay site. When the request comes into the Amazon web server, the server creates a unique identification number and creates an entry in its back-end database that is indexed by the identification number. The Amazon web server then responds to Susan’s browser, including in the HTTP response a Set-cookie: header, which contains the identification number. For example, the header line might be:

Set-cookie: 1678

When Susan’s browser receives the HTTP response message, it sees the Set-cookie: header. The browser then appends a line to the special cookie file that it manages. This line includes the hostname of the server and the identification number in the Set-cookie: header. Note that the cookie file already has an entry for eBay, since Susan has visited that site in the past. As Susan continues to browse the Amazon site, each time she requests a web page, her browser consults her cookie file, extracts her identification number for this site, and puts a cookie header line that includes the identification number in the HTTP request. Specifically, each of her HTTP requests to the Amazon server includes the header line:

Cookie: 1678

In this manner, the Amazon server is able to track Susan’s activity at the Amazon site. Although the Amazon web site does not necessarily know Susan’s name, it knows exactly which pages user 1678 visited, in which order, and at which times! Amazon uses cookies to provide its shopping cart service: Amazon can maintain a list of all of Susan’s intended purchases, so that she can pay for them collectively at the end of the session.
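The exchange described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class names SiteServer and Browser, the starting identification number, the eBay identification number, and the in-memory dictionaries standing in for the back-end database and the cookie file are all hypothetical. The bare-number header values follow the simplified form used in this text; real Set-Cookie headers carry name=value pairs, as specified in RFC 6265.

```python
import itertools
import time


class SiteServer:
    """Hypothetical stand-in for a web site and its back-end database."""

    def __init__(self, hostname, first_id=1678):
        self.hostname = hostname
        self._next_id = itertools.count(first_id)
        self.database = {}  # identification number -> list of (path, timestamp)

    def handle(self, path, request_headers):
        """Process one request and return the response headers."""
        response_headers = {}
        user_id = request_headers.get("Cookie")
        if user_id is None or user_id not in self.database:
            # First visit: create an ID and a database entry indexed by it.
            user_id = str(next(self._next_id))
            self.database[user_id] = []
            response_headers["Set-cookie"] = user_id
        # Record which page was requested and when.
        self.database[user_id].append((path, time.time()))
        return response_headers


class Browser:
    """Hypothetical browser that manages a cookie file keyed by hostname."""

    def __init__(self):
        self.cookie_file = {}  # hostname -> identification number

    def request(self, server, path):
        """Send one request and return the request headers that were sent."""
        headers = {}
        if server.hostname in self.cookie_file:
            headers["Cookie"] = self.cookie_file[server.hostname]
        response = server.handle(path, headers)
        if "Set-cookie" in response:
            # Append an entry: server hostname plus identification number.
            self.cookie_file[server.hostname] = response["Set-cookie"]
        return headers


amazon = SiteServer("amazon.com")
susan = Browser()
susan.cookie_file["ebay.com"] = "8734"  # assumed entry from an earlier visit

first = susan.request(amazon, "/")
second = susan.request(amazon, "/books")
print(first)   # {}
print(second)  # {'Cookie': '1678'}
print([path for path, _ in amazon.database["1678"]])  # ['/', '/books']
```

The first request carries no cookie header, so the server issues an identification number; every later request to the same hostname carries it back, which is what lets the server build an ordered, timestamped history for user 1678.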

If Susan returns to Amazon’s site, say, one week later, her browser will continue to put the header line Cookie: 1678 in the request messages. Amazon also recommends products to Susan based on web pages she has visited at Amazon in the past. If Susan also registers herself with Amazon, providing her full name, e-mail address, postal address, and credit card information, Amazon can then include this information in its database, thereby associating Susan’s name with her identification number (and all of the pages she has visited at the site in the past!). This is how Amazon and other e-commerce sites provide “one-click shopping”: when Susan chooses to purchase an item during a subsequent visit, she doesn’t need to re-enter her name, credit card number, or address.

From this discussion we see that cookies can be used to identify a user. The first time a user visits a site, the user can provide a user identification (possibly his or her name). During subsequent sessions, the browser passes a cookie header to the server, thereby identifying the user to the server. Cookies can thus be used to create a user session layer on top of stateless HTTP. For example, when a user logs into a web-based e-mail application (such as Hotmail), the browser sends the cookie’s information to the server, permitting the server to identify the user throughout the user’s session with the application.
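A session layer of this kind might look like the following minimal sketch, which uses Python’s standard http.cookies module to build and parse header values in the name=value form that RFC 6265 specifies. The cookie name session_id, the functions log_in and identify, and the in-memory sessions table are assumptions made for illustration, not a description of how any particular site implements sessions.

```python
import secrets
from http.cookies import SimpleCookie

sessions = {}  # session identifier -> user name (hypothetical in-memory store)


def log_in(user_name):
    """Create a session and return the Set-Cookie header value for the response."""
    session_id = secrets.token_hex(16)  # hard-to-guess identifier
    sessions[session_id] = user_name
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True  # hide the cookie from page scripts
    return cookie["session_id"].OutputString()


def identify(cookie_header):
    """Return the user named by a request's Cookie header value, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    if morsel is None:
        return None
    return sessions.get(morsel.value)


set_cookie_value = log_in("susan")
# A browser sends back only the name=value pair, not the attributes.
cookie_header = set_cookie_value.split(";", 1)[0]
print(identify(cookie_header))         # susan
print(identify("session_id=unknown"))  # None
```

Unlike the sequential number in the walkthrough, the identifier here is random, on the assumption that a guessable value would let one user present another user’s cookie.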

Although cookies often simplify the Internet shopping experience for the user, they are controversial because they can also be considered an invasion of privacy. As we just saw, using a combination of cookies and user-supplied account information, a web site can learn a lot about a user and potentially sell this information to a third party. Cookie Central includes extensive information on the cookie controversy.