The protection of privacy is one of the most important issues on the Internet today. Internet users routinely report that privacy protection is one of their greatest concerns. More Internet sites are collecting personal information from users through online registrations, surveys, and forms. Information is also collected from users surreptitiously with "cookies." Web users are understandably concerned about the potential loss of privacy.
We surveyed the Top 100 web sites as reported by www.100hot.com on June 5, 1997. According to 100hot, the site "lists the most popular sites on the web excluding browser companies, ISP's, colleges, and Adult sites." The list is compiled daily in cooperation with Alta Vista.
We are aware that there are several other services that compile lists of popular Internet sites, but we think the 100hot list provides a good sample of popular sites. A review of these sites also offers a snapshot of current privacy practices on the Internet.
For purposes of this survey, we decided to examine the collection of personal information and the existence of privacy policies on the Internet. We did not look at the adequacy of security standards, such as whether credit card transactions receive sufficient protection, the availability of good encryption, or the privacy issues related to "spam" (unsolicited commercial e-mail). These are all important issues for on-line privacy and should be examined in a separate study.
One of the first issues we considered was whether personal information is collected at the surveyed web site. For the first part of this query, we were specifically interested in whether the site collected Personally Identifiable Information (PII), such as name or address, directly from the user. We counted email addresses as PII, even though it is possible to spoof an email address and it is not always clear to whom an email address refers.
Many web sites (49 of our sample) collect personal information through on-line registrations, mailing lists, surveys, user profiles, and order fulfillment requirements. However, some web sites, such as CNN, TV Guide, the Washington Post, and the Weather Channel, do not generally collect any personally identifiable information.
We were not able to determine whether web sites are linking data collected on-line with other databases. This classic computer matching technique is oftentimes one of the first indicators of a privacy problem. It is also likely to emerge as a significant issue in the near future. For example, America Online is matching its active member list with demographic and psychographic data obtained from Donnelley Marketing ["America Online Snoops Into Subscriber's Incomes, Children," Privacy Times, May 30, 1997]. We think this issue bears further examination.
There are other search methods we might have tried, such as running a search engine with the domain name and the word "privacy," but this seemed to us to be beyond the call of duty. We felt that users should be able to locate privacy policies quickly and easily and that a privacy notice should be clear and conspicuous.
We excluded policies posted on a web site that were actually internal privacy policies for a company and its employees.
We found that only 17 of the sites that we visited actually had privacy policies, and few were easy to find.
There are many different privacy policies, but all good policies share certain characteristics: they explain the responsibilities of the organization that is collecting personal information and the rights of the individual who provided the personal information. Typically, this means that an organization will explain why information is being collected, how it will be used, and what steps will be taken to limit improper disclosure. It also means that individuals will be able to obtain their own data and make corrections if necessary.
Several web sites provided reasonably good privacy notices. Amazon.com, for example, tells users that it does not rent or sell its mailing list to anyone. But Amazon also advises users, "If you would like to make sure we never sell or rent information about you to third parties, just send an e-mail message to email@example.com." We thought this statement created unnecessary ambiguity in an otherwise good policy.
Several sites post notices stating that individuals using their sites cannot transmit information that violates privacy, but have no privacy policies themselves.
In examining the few privacy policies that we found, we considered the extent to which users are able to restrict the secondary use of their personal information. Eight of the surveyed sites provide some degree of use limitation. These limitations mainly concern whether the collecting organization is authorized to share (or sell) the information to a third party.
One of the important goals of most privacy laws is to ensure that individuals have the ability to inspect personal information that is collected by others and to make corrections if necessary. This is to ensure that individuals know what information about them is available to others, and also to encourage data collectors to be more forthcoming about how personal information is gathered.
We were interested in finding whether web sites made it possible for users to access the information that the site collected about them.
We couldn't find any site in our sample that currently allows users to access their own file, with the exception of Firefly. The Firefly web site allows users to create a personal profile, to access the profile, and to revise the profile. Firefly provides a good example of user control over a personal profile on the Internet.
We were interested in whether users could access sites without disclosing personally identifiable information. Given their nature, we did not look closely at surreptitious techniques that may allow web servers to collect identifying information, such as email addresses or TCP/IP addresses, from web clients.
We found that every site at least provides access to the home page and most sites let users visit many services on the site without disclosing any personally identifiable information.
We thought the widespread practice of allowing anonymous browsing, even on the most popular web sites, was an important indicator of how privacy is actually protected on the Internet. By avoiding the collection of personal information, web sites encourage users to visit sites. In the physical world, we note that very few stores require the collection of personal information before allowing someone to enter.
We suspect that preserving anonymity may be the easiest way to protect on-line privacy.
There has been a great deal of controversy about the cookies feature in browser software. On the one hand, cookies make it possible for a web server to "recognize" a web client and enable certain features that are useful for surfing and on-line commerce, such as retaining screen preferences, storing passwords, and creating virtual shopping carts.
At the same time, cookies also enable the surreptitious collection of information from the user.
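The mechanism described above can be illustrated with a minimal sketch. It uses Python's standard `http.cookies` module as a modern stand-in and is not drawn from any surveyed site. The cookie name `visitor_id`, the functions `issue_cookie` and `recognize`, and the `known_visitors` table are all hypothetical names chosen for illustration.

```python
from http.cookies import SimpleCookie


def issue_cookie(visitor_id):
    """Build the Set-Cookie header value a server might send on a first visit.

    The cookie name "visitor_id" is an assumption made for this sketch.
    """
    cookie = SimpleCookie()
    cookie["visitor_id"] = visitor_id
    cookie["visitor_id"]["path"] = "/"
    return cookie["visitor_id"].OutputString()


def recognize(cookie_header, known_visitors):
    """Look up a returning client from the Cookie header it sends back.

    Returns the stored record for that visitor, or None if the client
    sent no visitor_id cookie or the value is unknown.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("visitor_id")
    if morsel is None:
        return None
    return known_visitors.get(morsel.value)


# Hypothetical server-side record keyed by the cookie value.
known_visitors = {"abc123": {"pages_viewed": ["/sports", "/weather"]}}

set_cookie_value = issue_cookie("abc123")                  # sent by the server
returning = recognize("visitor_id=abc123", known_visitors)  # sent back by the browser
anonymous = recognize("", known_visitors)                   # client with no cookie
```

The same identifier that restores a shopping cart or a screen preference also lets the server attach a log of pages viewed to one client, which is why the feature cuts both ways.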
We were interested to see how many of the top 100 web sites enabled the cookies feature. We visited each web site and then checked our cookies file to see if a new line was added. We did not, of course, visit every page or every linked site at each site we visited, so we may have missed some pages that generate cookies.
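The check described above, comparing the cookies file before and after a visit, could be automated along the following lines. This is a sketch under stated assumptions, not the procedure actually used in the survey: it assumes the tab-separated, seven-field Netscape-style cookies file, and the function names and sample data are hypothetical.

```python
def parse_cookie_lines(text):
    """Return the set of (domain, name) pairs found in a cookies file.

    Assumes the Netscape-style layout: seven tab-separated fields
    (domain, flag, path, secure, expiration, name, value), with
    comment lines starting with '#'.
    """
    entries = set()
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 7:
            continue  # skip lines that do not match the assumed layout
        domain, _flag, _path, _secure, _expires, name, _value = fields
        entries.add((domain, name))
    return entries


def new_cookies(before_text, after_text):
    """List the (domain, name) pairs present after a visit but not before."""
    return sorted(parse_cookie_lines(after_text) - parse_cookie_lines(before_text))


# Hypothetical snapshots of the cookies file taken before and after one visit.
BEFORE = (
    "# Netscape HTTP Cookie File\n"
    ".example.com\tTRUE\t/\tFALSE\t946684799\tprefs\tblue\n"
)
AFTER = BEFORE + ".news.example\tTRUE\t/\tFALSE\t946684799\tvisitor_id\tabc123\n"

print(new_cookies(BEFORE, AFTER))  # [('.news.example', 'visitor_id')]
```

As with the manual check, a comparison of this kind only detects cookies set by the pages actually visited, so it can undercount.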
Of the 100 sites, 24 enable cookies. The cookies feature is often used for registration and password storing, but may also be used to create logs of user interests and preferences (for instance, tracking particular articles that a user accesses at an on-line news site).
We thought it was noteworthy that none of the sites that enabled cookies told the user that information about the user was being placed on the user's system. We think that more could be done to make such transactions "transparent" -- that is to say, readily apparent to the user.
Even though privacy is one of the top concerns among Internet users, few web sites today actually have privacy policies or provide users with information about privacy practices. This makes it almost impossible for users to make informed decisions about their on-line activities.
Many have argued for notice and consent procedures and self-regulation to protect on-line privacy. But a review of the top 100 web sites reveals that only a handful provide any meaningful privacy notice. There is also virtually no indication that any meaningful steps have been taken to protect user privacy by self-regulatory means.
In the absence of meaningful privacy policies, net surfers today also have little assurance that personal information provided at a web site will not be misused. Not surprisingly, many users are reluctant to disclose personal information and some provide false information when asked.
Although privacy policies are virtually non-existent on the Internet today, we found that anonymity continues to play an important role in protecting on-line privacy. Many of the top web sites allow users to visit without giving up personal information. Anonymity plays a particularly important role for those sites, such as CNN, that are providing news and information to the on-line community.
Techniques to provide users with more information about privacy practices, such as eTRUST and other similar branding techniques, should be encouraged. These services should provide clear and meaningful designations for privacy practices. They should also be backed up with regular auditing. We also have doubts about proposed techniques, such as P3, that require users to disclose privacy preferences. We think that good privacy policies should provide meaningful information for users about web site practices and not require users to disclose personal information. Many users are also likely to consider their privacy preferences to be, well, private.
We suspect that one of the simplest and most effective solutions to on-line privacy is to continue the practice of anonymity. Anonymity is already widespread on the Internet -- virtually all of the sites that we surveyed allowed users to use the site without disclosing who they were. When personally identifiable information is collected, web sites should develop clear privacy policies.
Users of web-based services and operators of web-based services have a common interest in promoting good privacy practices. Strong privacy standards provide assurance that personal information will not be misused, and should encourage the development of on-line commerce. We also believe it is a matter of basic fairness to inform web users when personal information is being collected and how it will be used.
Protecting privacy will be one of the greatest challenges for the Internet. Until clear practices are established and good policies put in place, our advice is simply this: "Surfer beware."
GVU's WWW User Surveys. One of the best sources for information about the attitudes of Internet users toward privacy issues is the semi-annual survey conducted by the Graphics, Visualization, and Usability Center of the Georgia Institute of Technology. More information about public attitudes toward privacy may be found at the EPIC Privacy Survey page.
OECD Privacy Guidelines. Many privacy policies are derived from the 1980 Guidelines on Privacy and Transborder flows of the Organization for Economic Cooperation and Development (OECD). Other related policies may be found at the International Privacy Documents archive of Privacy International.
EPIC Privacy Archive. The EPIC Privacy Archive contains an extensive collection of documents, reports, news items, policy analysis and laws related to privacy issues.