Thanks to the Internet, certain issues such as copyright and privacy rights have come to the attention of the public as never before. Still, although both concepts touch on innumerable commonplace interactions in our daily lives, not many of us can clearly explain the complex balance of privileges and responsibilities encapsulated in these vague terms. Polls show that the majority of Internet users are concerned about the tracking and profiling that take place as they use the Internet, yet few users actually make use of the technologies that do exist to protect their privacy. To do so would require that they understand how Web sites gather information about them and how that information can be used. This article will make you and your patrons more informed Internet users and will help you make decisions about Internet privacy.
Much in the same way that the telephone network connects telephones through a system of wires, switches and complex connections, the Internet connects computers to each other over a global network. The computer can be the personal computer on your desktop at work, it can be the computer at your home that is used by your entire family, or it can be a supercomputer that is used by thousands of scientists. The Internet sees only computers, not the people who use them. We humans are entirely invisible to the Internet, invisible and irrelevant.
Given this, it might seem odd that this network is responsible for an astonishing invasion of our privacy. If the Internet doesn't even know that we exist, how can it gather information about us?
The data gathering functions that exist today have been purposefully built into the applications of the World Wide Web. Internet privacy is not a technical issue -- it is social, it is legal, it is economic, and the technology is developed to serve these purposes.
Prior to 1994, commercial activity was not allowed over the Internet. The network was receiving federal funding as a research project and therefore could not be used for commercial purposes. That meant no advertising of any kind. When the Internet was privatized in April of 1994, the door was opened to an entirely new network, one that has been called the "commodity" Internet.
When web sites first began including advertisements on their pages, there was very little that they could use to entice advertisers to their particular site. At most, a web server kept a simple log of activity against the site. Site owners used these logs to boast about the number of "hits" their site received. Hit counts, however, told nothing about the visitors to the site.
Since that time, Internet software has been modified to provide more information about the interaction, and techniques have been developed to deliver information about "customers" to web sites. For example, browsers have been enhanced to pass along additional information, sometimes referred to as "browser chatter." What do browsers chatter about? First, they pass along information about the browser and its capabilities: the brand of the browser (Netscape or Internet Explorer), the version number, what plug-ins are available. This helps sites know what they can send to the browser. But they also pass along non-technical information such as the previous site visited, including the actual search that was done if the user found the site through a search engine.
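As a rough illustration of browser chatter, the sketch below reads a made-up request of the kind a browser might send to a web server. The request text, the host names, and the helper name read_chatter are all invented for this example. User-Agent and Referer are the standard HTTP header names that carry the browser description and the previous page. The search parameter name "q" is an assumption, since search engines differ.

```python
from urllib.parse import urlsplit, parse_qs

# A made-up request, invented for illustration only.
SAMPLE_REQUEST = (
    "GET /cars/new-models.html HTTP/1.0\r\n"
    "Host: news.example.com\r\n"
    "User-Agent: Mozilla/4.7 [en] (Win98; I)\r\n"
    "Referer: http://search.example.com/find?q=new+car+prices\r\n"
    "\r\n"
)

def read_chatter(raw_request):
    """Return the browser description, previous page and search terms, if sent."""
    lines = raw_request.split("\r\n")
    headers = {}
    for line in lines[1:]:          # skip the request line
        if not line:                # a blank line ends the headers
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    referer = headers.get("referer", "")
    query = parse_qs(urlsplit(referer).query)
    return {
        "browser": headers.get("user-agent", ""),
        "previous_page": referer,
        # "q" is an assumed parameter name; real search engines differ.
        "search_terms": query.get("q", [""])[0],
    }

chatter = read_chatter(SAMPLE_REQUEST)
print(chatter["browser"])
print(chatter["search_terms"])
```

Nothing in this sketch requires the user to type anything into the site being visited. Everything it reads arrives automatically with the page request.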
An important piece of information for commercial web sites is that of returning visitors. How can these web sites know who their regular customers are? The web server logs include the Internet address of the computer that visits the site, but these addresses refer to the computer that visited, not the person. It seems logical that the same computer means the same person, but that is not true for those who log on to the Internet through a dial-up account. The Internet service provider has a range of Internet addresses that it assigns to computers as the dial-up connection is made. When that person logs off, the address goes back into the pool and is assigned to another user.
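The address pool can be pictured with a small toy model. This is only a sketch: the class name AddressPool, the customer names, and the addresses are invented, and a real service provider may hand out addresses in a different order.

```python
from collections import deque

class AddressPool:
    """Toy model of a service provider's pool of dial-up addresses."""

    def __init__(self, addresses):
        self.free = deque(addresses)
        self.in_use = {}            # customer -> address currently assigned

    def dial_in(self, customer):
        address = self.free.popleft()
        self.in_use[customer] = address
        return address

    def hang_up(self, customer):
        # The address goes back into the pool for the next caller.
        self.free.appendleft(self.in_use.pop(customer))

pool = AddressPool(["192.0.2.10", "192.0.2.11"])

first = pool.dial_in("alice")
pool.hang_up("alice")
second = pool.dial_in("bob")

# A web server log would show one address for two different people.
print(first, second, first == second)
```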
Because the server logs give inadequate information to identify return visitors, web sites identify their visitors using a technique called a "cookie." A cookie is a small text file that the web site writes to your hard drive. The cookie file itself can be very simple, containing the name of the web site that "owns" that cookie and a unique identifier that it has assigned to your computer. The first time you visit the site a unique ID, let's say "1234567", is assigned and written to the cookie file. Each time you return to that site it can read the cookie that it set and say "hey, here's 1234567 again."
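The exchange described above might look roughly like this on the web site's side. This is a minimal sketch using Python's standard http.cookies module. The cookie name visitor_id, the function name handle_visit, and the starting number are assumptions made for illustration, not how any particular site works.

```python
from http.cookies import SimpleCookie
from itertools import count

_next_id = count(1234567)       # hypothetical ID counter for this sketch

def handle_visit(cookie_header):
    """Return (visitor_id, cookie_to_set_or_None) for one page request."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    if "visitor_id" in jar:
        # Returning visitor: the browser sent back the ID set earlier.
        return jar["visitor_id"].value, None

    # First visit: assign an ID and ask the browser to store it.
    visitor_id = str(next(_next_id))
    reply = SimpleCookie()
    reply["visitor_id"] = visitor_id
    reply["visitor_id"]["path"] = "/"
    return visitor_id, reply["visitor_id"].OutputString()

first_id, set_cookie = handle_visit(None)
second_id, nothing_new = handle_visit("visitor_id=" + first_id)
print(first_id, set_cookie)
print(second_id, nothing_new)
```

On the second call the site sets nothing new. It simply recognizes the number that the browser handed back.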
While that seems quite innocent, it doesn't tell the whole story. Behind the scenes, the web site can be storing a database of information about when you visit the site and what web pages you access. Exactly what information is being gathered is unknown to you, but in general the goal is to build a profile of you as a customer.
Let's say that on one visit to the site you click on an article about the latest car models coming out of Detroit. On another visit you look up some daily sports scores. Now when you visit the site the banner ads that are sent to your screen will be those that might appeal to a car and sports enthusiast. On the other hand, if you visit the site and read book reviews and do the crossword puzzle, you are more likely to see advertising for an online bookstore. This is called "targeted advertising." Targeted ads are more likely to catch the customer's attention and lead to eventual sales.
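A site's profile database and its ad selection could be sketched as follows. The topic names, ad categories, and function names here are hypothetical, and real profiling systems are certainly more elaborate than a simple count of page topics.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from page topics to ad categories.
ADS_BY_TOPIC = {
    "cars": "auto dealer",
    "sports": "sporting goods",
    "books": "online bookstore",
    "puzzles": "online bookstore",
}

profiles = defaultdict(Counter)     # visitor ID -> count of topics viewed

def record_page_view(visitor_id, topic):
    profiles[visitor_id][topic] += 1

def choose_banner_ad(visitor_id):
    """Pick the ad that matches the visitor's most-viewed topic."""
    history = profiles[visitor_id]
    if not history:
        return "general interest"
    top_topic, _ = history.most_common(1)[0]
    return ADS_BY_TOPIC.get(top_topic, "general interest")

record_page_view("1234567", "cars")
record_page_view("1234567", "cars")
record_page_view("1234567", "sports")
record_page_view("7654321", "books")
record_page_view("7654321", "puzzles")

print(choose_banner_ad("1234567"))
print(choose_banner_ad("7654321"))
```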
When cookies were first created there was a conscious effort to make them secure and relatively private. Only the site that creates the cookie can read the cookie. Because of this rule, cookies can contain information that you wouldn't want others to read, such as passwords that you may have set for access to the web site. If you have ever signed up at a site for a personalized view and discovered when you return you don't have to type in your password, it is because your ID and password have been stored in a cookie file. This is very handy for those of us who have a tendency to forget our passwords.
But it turns out that there is a flaw in the cookie protocol as it exists today, and that flaw has been exploited by businesses that call themselves "network advertisers."
Most web sites that you visit today have banner ads on them. These ads are under the control of a small number of companies that act as middle men between advertisers and web sites. The banner ads are actually sent to the screen by the network advertiser, not by the web site that you have visited. Banner ads can also set and read cookies; they are like a web page inside a web page. By placing banner ads, and their cookies, on thousands of web pages, these advertising companies are able to gather data about your web surfing habits over a wide range of the World Wide Web. You are now "1234567" not just on one site but wherever you go on the web.
You may be under the impression that you receive a cookie from a banner ad only when you click on the ad. In fact it is the display of the banner ad on your screen that sets and reads the cookie. Even if you never click on a banner ad you are being profiled by network advertisers as you surf the web.
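The following toy model shows why the display of a banner is enough. It assumes a single advertising company whose banners appear on three unrelated sites. The class name AdNetwork, the page addresses, and the ID numbering are all invented for the sketch.

```python
from collections import defaultdict
from itertools import count

class AdNetwork:
    """Toy model of one advertiser serving banners to many web sites."""

    def __init__(self):
        self._ids = count(1234567)
        self.history = defaultdict(list)    # network ID -> pages where a banner was shown

    def serve_banner(self, cookie, page_url):
        # Displaying the banner is enough; no click is needed.
        network_id = cookie or str(next(self._ids))
        self.history[network_id].append(page_url)
        return network_id                   # value the browser stores as the cookie

network = AdNetwork()
browser_cookie = None                       # the browser's cookie for the ad network

for page in ["news.example.com/cars",
             "shop.example.org/wine",
             "health.example.net/symptoms"]:
    browser_cookie = network.serve_banner(browser_cookie, page)

print(browser_cookie, network.history[browser_cookie])
```

None of the three sites can read the others' cookies, yet the advertiser ends up with one list covering all three visits.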
The profiles that are created are aggregate in nature. Customers are grouped into profile categories that correspond to various marketing goals, with some groups being more desirable than others, of course. These profiles tell advertisers who visits the site and what kind of customers they might be.
Cookies create an identity on the Internet, but the identity is still tied to a computer, not a person. Depending on who uses that computer, the profile that is created may not correspond to a single person. For example, a cookie on your home computer could be creating a profile of a person who reads The Wine Spectator and loves Barbie dolls. This does not create a useful customer profile.
Advertisers need information about people, not about computers, and one way to get closer to the individual behind the computer is to offer personalized services. Many sites now allow you to create an identity and to personalize the site to your taste. There are a number of these "My" sites on the Internet ("My Yahoo", "My Excite") where you can create your own web page from a selection of services offered by the site. This generally means selecting categories of services that will appear on the screen when you visit. As you return to the site and log on to your "personalized" view you are creating an identity that is pseudonymous but that corresponds to a single person.
This identity is still private, at least it is until you give out certain information about yourself. And you might have already done it.
Many web site policies will state that they do not gather "personally identifiable information" such as your name, address and phone number. Those data elements identify you directly, but there are many other data elements that can be used to triangulate your identity through the use of outside sources. For example, on many web sites you can get your local weather by typing in your zip code. On other web sites, your daily horoscope is available if you give your date of birth (day, month and year). This combination of zip code and date of birth makes it possible to find you in a variety of public files, including voter registration records and motor vehicle department files.
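To see how triangulation might work, consider this sketch. Every record in it is fictitious, and it assumes that the zip code and the birth date have already been linked to the same visitor ID, for instance by a network advertiser. The function name triangulate and the layout of the records are likewise assumptions.

```python
# All records below are fictitious.
weather_site_log = {"1234567": {"zip": "94704"}}
horoscope_site_log = {"1234567": {"birth_date": "1961-03-14"}}

public_records = [
    {"name": "Pat Example", "zip": "94704", "birth_date": "1961-03-14"},
    {"name": "Lee Sample", "zip": "94704", "birth_date": "1975-08-02"},
    {"name": "Kim Placeholder", "zip": "10001", "birth_date": "1961-03-14"},
]

def triangulate(visitor_id):
    """Return the names in the public file that match both known details."""
    zip_code = weather_site_log[visitor_id]["zip"]
    birth_date = horoscope_site_log[visitor_id]["birth_date"]
    return [record["name"] for record in public_records
            if record["zip"] == zip_code and record["birth_date"] == birth_date]

print(triangulate("1234567"))
```

Neither detail alone singles out one person in this made-up file, since two records share the zip code and two share the birth date. Only the combination narrows the list to a single name.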
In a simpler example, your e-mail address may not seem to identify you, but if you include your name and address in the signature of your e-mail, and if you have ever sent e-mail to a discussion list that is archived on the Internet, anyone with your e-mail address can find your full contact information.
It isn't just the data that you give out today that may identify you; data that you have given out, or that has been gathered about you, over your entire life can be combined in new and unpredictable ways. Because of this vast "mine" of information, no one can really say what is and what isn't "personally identifiable."
Of course, if you make purchases over the Internet you will give out your name, address and credit card information as part of the transaction. Because you know that this is personally identifiable information you can make conscious choices about who you do business with, selecting only reputable businesses, perhaps ones you have also dealt with off line. You may, however, be giving out other information in an inadvertent or off-hand manner without being aware that it could identify you at some future date.
The question of Internet privacy has been on the federal government's radar for years, but there still isn't a clear solution. The U.S. does not have a general privacy law that can be applied to this new technology, so the Department of Commerce is promoting "industry self-regulation." Self-regulation is based on the premise that it is in the best interest of the online industry to provide a level of privacy to win the confidence of consumers. Self-regulation can't actually be enforced, but the Federal Trade Commission (FTC), whose responsibility it is to make sure that consumers are protected, has asked businesses to follow its stated "fair information practices" regarding consumer data and to provide a privacy policy on their web site. Consumers are advised to look for the privacy policy and read it to determine if they want to do business with that site.
When companies violate their own privacy policies and harm consumers in doing so, the FTC can sanction those companies under its general consumer protection laws. Some companies have received sanctions and others have been made to bring their practices in line with their own privacy policies. Unfortunately, there is no sanction for not providing a privacy policy at all and undoubtedly some sites prefer that solution because it seems to eliminate a source of liability for the company. Also, web site privacy policies do not usually cover the practices of third parties such as the network advertisers who gather information through banner ads.
It is estimated that Internet companies lose millions or even billions of dollars of potential business every year because the fear of privacy invasion is keeping many people from doing business on the Internet. This fact is a strong motivation for Internet businesses to embrace a solution that raises consumer confidence. If self-regulation fails to achieve this goal, legislation may be needed.
In a global economy, major differences in how countries treat personal data are bound to lead to some tension in the marketplace. Companies in the U.S. are already being forced to think about data privacy in their interaction with foreign customers. The European Union passed data privacy legislation in 1995 which went into effect in 1998, and Canada will be enacting its Personal Information Protection and Electronic Documents Act in 2001. These foreign laws may have an influence on how the U.S. approaches the privacy issue.
Internet privacy was addressed by the presidential candidates in the 2000 elections and we can expect that some privacy bills will be introduced in the upcoming Congress. These bills will be controversial and hotly contested, and not necessarily unjustly. Privacy is an extremely complex issue with many personal, political, economic, and technical components. And although technology has a role in facilitating the invasion of our privacy, legislation that focuses on technology instead of consequences will soon be outdated. Creation of viable privacy legislation will take time and a great deal of debate.
There are a few simple steps that you can take toward protecting your privacy while you are online. These steps won't give you perfect privacy, but they will allow you to "opt out" of some of the more common ways that data can be gathered about you.
The same is true for other services, such as online postcards. By using any of these sites you are giving information about yourself or others. Is that really the way you want to say "Happy Birthday" to a friend or co-worker?
Free trials of services usually ask for your name, an address or institutional information, and an e-mail address. While you could give fictitious information for some of this, these sites generally will e-mail your password to you, thus compelling you to give a real e-mail address in order to participate. Again, check the privacy policy to see what use may be made of that information.
If you feel strongly about controlling your identity on the Internet, there are services that can allow you to surf the web either anonymously or pseudonymously. These services protect your identity while allowing you to continue to make use of what the web has to offer. Some examples of companies with identity solutions are ZeroKnowledge, Privada, and Anonymizer.
One of the skills of our modern age is an awareness of our personal data and who may be gathering it. We can't help but talk to strangers as we transact business today, so we must be ever vigilant in terms of who gets information about us. We also cannot expect that we can keep perfect control over our personal data. Living a normal life means giving out some information about yourself. This is not paranoia and we needn't live in fear.
We need to learn to develop new patterns of behavior, to ask certain questions before giving out information about ourselves. It's a new kind of common sense that I call "privacy literacy." Like other kinds of literacy, once learned it becomes almost second nature.
Cookie Cutters: Cookie Central
Privacy Services: ZeroKnowledge, Privada, Anonymizer
Here are some general areas that you should investigate in preparing your library's privacy policy:
The library policy should be short and clear. You can always refer users to another document for more details, but few users will want the details. Tell your users that your library, following the policy of the American Library Association, aims to provide confidentiality for all of its users. Let them know if their library use is protected by state law. Give some concrete examples of privacy protections ("No one can find out what books you have checked out in the past. When books are returned, the circulation record is erased."). Admit to areas where personally identifiable information is gathered or logged, such as e-mail addresses, and let the user know if those are covered by your policy. And end with the contact information for someone in your library who can answer questions about privacy and your library.
Make the link to your privacy policy large and obvious, preferably placed so that it will appear on the first visible screen when users enter your site. Repeat the link on any pages that ask for information from users.