Searcher
Vol. 8, No. 7 • July/Aug. 2000
• FEATURE •
Privacy Perspectives for Online Searchers: 
Confidentiality with Confidence?
by Josh Duberman, Partner, Pivotalinfo LLC
and Michael Beaudet, Manager of New Technologies,
Information Technology Department, PE Corporation

SIDEBARS
Good Old Consumer Reports
Recent Privacy Problems
Tips to Protect Individual Privacy Online
Privacy Resources
Tech Talk on Privacy
Information Professionals — What to Do About Privacy
It may have once been true that on the Internet no one knew you were a dog, as illustrated in an old New Yorker cartoon. But as they say on the Web now, these days marketers probably know your favorite brand of dog food. In fact, you may have told them yourself, in exchange for a discount coupon.

An amazing amount of personal information is readily available on the Net. For example, if you’ve forgotten the date of a friend’s birthday, just check http://anybirthday.com. That brings up another disquieting thought. Just how many companies or Web sites use a birth date to verify identification? Once you know the birthday, it’s simple to send a free, digital greeting card, but when you e-mail that card to a friend, have you also given it to marketers looking for new customers? In fact, have you compromised your friend’s privacy or passwords?

It is not only individuals who worry about privacy. So do companies, though businesses may call the problem one of confidentiality. We all know company information is readily available on the Net. As the greatest competitive intelligence gold mine of modern times, the Internet is now rife with cyber-prospectors and claim-jumpers. Many information professionals are familiar with the dichotomy of searching the Internet for as much information on the competition as possible, while trying to help prevent our own companies and our fellow employees from posting too much.

In fact, we ourselves must be careful not to disclose too much. As information professionals, sometimes the very questions we ask can be too revealing — even without identifying information attached. Searchers could seriously compromise companies’ interests if certain patent or trademark questions became known. Also, given the rapid improvements in database mining techniques, it’s possible that merely an increased number of questions concerning a particular company or technology could tip off investors to impending business activities.

While the marketing arms race rages on the new Web frontier, information professionals must carefully balance privacy concerns against convenience, efficiency, and cost issues. It’s another information dimension to take into account, along with more familiar themes of information ownership/rights, fair use, verifiability, currency, etc. But missteps along the virtual privacy dimension of information could have disastrous consequences for clients and information professionals alike. This article presents some of the issues and questions involved in online privacy from the information professional’s perspective. We offer it as a resource for making more informed decisions in this rapidly changing area.
 

Privacy Concerns of Information Professionals Versus Consumers
Privacy is a hot topic this election year, with discussions in government legislatures, court decisions, and lots of articles in both the popular and professional press. Much of this media coverage has focused on consumer privacy. While some of these issues overlap, the privacy concerns of information professionals can differ significantly from consumer concerns in several areas:

1. The definition of “sensitive” information — For consumers, sensitive “identifying” information includes name, address, credit-card numbers, SSNs, etc. Information professionals have broader concerns along this axis of information privacy, since, as noted above, even the questions they ask can be too revealing.

2. Motive — Consumers seek privacy primarily to avoid information harvesting for marketing purposes, while searchers seek to guard their clients from competitive intelligence gathering as well.

3. Roles played — Consumers act primarily for themselves and their families, whereas information professionals usually serve as information intermediaries, acting for their clients and companies. Info pros may also have information privacy responsibilities extending beyond their own direct actions and decisions. They may play a leadership role in educating their companies, clients, and co-workers about information issues and provide input for information policy formulation.

Some privacy solutions may apply equally well to both info pros and consumers. For example, consumers are advised to designate only one credit card for Internet purchases and to scrutinize bills carefully in an attempt to minimize any losses through security lapses. Of course, info pros should follow the same practice with company credit cards. Some of the advice relating to “sensitive questions” could apply to both consumers and information professionals, though the advice may apply to different subjects, e.g., medical conditions versus potential acquisition targets.

Some privacy concerns regarding online research may primarily interest information professionals. However, it’s interesting to note how many topics, previously of interest solely to info pros, have migrated into broader arenas of discussion. Many Internet-related topics now frequently discussed across the length and breadth of the Net, such as search engine technology, electronic copyright issues, and, of course, privacy concerns, were introduced in the information science literature.
 

Query Confidentiality — When It Can Hurt to Ask
If the substance of certain queries became known, particularly queries concerning intellectual property and legal subjects, it might cause significant problems for clients. For example, a trademark must be maintained and “in use” in order to remain valid. If a competitor were alerted that someone was inquiring about their trademark, they could quickly “dust it off” and put it back in use. Additionally, if the company holding the trademark became aware of external interest in the trademark, the company’s executives would be much better prepared for negotiating a higher price for licensing the trademark’s use.

Another example might be a search for a domain name. As the market price paid for catchy domain names soars, you might wonder about the privacy of the “Is this domain name available?” search itself. Could a domain name search site watch for specific interesting queries, then grab the idea and register it quickly before you have a chance? Could the same site regularly notify paying clients when inquiries appear close to their licensed domain names? Or might someone use a network sniffer [see the “Tech Talk” sidebar beginning on page 44 for more information] to track inquiries on a specific site?

In patent searching, one could really find out quite a lot about a proposed invention by examining the search strategy, particularly within the sciences and in technical fields. The information thus revealed might even give the idea away completely — or at least allow others to start developments along the same path.

Asking certain questions might trigger other kinds of privacy attacks. For example, inquiries into particularly sensitive areas might alert a company to someone’s interest; the company could then try to get a subpoena and force the ISP or search site to reveal who was asking. Particular inquiries might alert a government department or organization, which might have an even easier time finding out who was asking.

And of course, in matters involving litigation, it becomes crucial to find out who knew what and when they knew it. Knowing what one party’s researchers searched for in very specific subjects could give a good deal of circumstantial evidence to the opposing side. It certainly might give more force to a company or opposing party’s arguments for subpoenas or discovery motions.
 

Is Your Computer Secure?
It’s important to start a quest for online privacy by securing your computer from intrusion. Various products can assist you, including access control software and hardware, encryption software, security testing sites and software suites, firewalls (especially important for high-speed, “always on” connections), and physical security systems. There are special solutions for laptop security while traveling, including cables, theft alarms, and even some ways to restrict the angles of view on your laptop to prevent your airline seatmate from seeing your work.

[Detailed discussion of most of these items falls beyond the scope of this article, but information is available in the sidebars. For more information, also see S. Kennedy, “Through the Virtual Back Door — Cyber-Sneaks Can Enter Your Computer Without You Even Knowing,” Information Today, vol. 17, no. 2, February 2000, http://www.infotoday.com/it/feb00/kennedy.htm, and various computer accessory sites such as http://www.a2zsolutions.com/ and http://www.pcguardian.com.]

One particular “personal” firewall, ZoneAlarm 2.0, is worth mentioning here since it provides two-way protection. It alerts you whenever an outside computer tries to communicate with your computer through the Internet or if a program in your computer tries to communicate with others on the Internet. You can set ZoneAlarm to pass all communications by a particular program — your browser, for example — or to refuse all communications, or to alert and ask you each time. This can be very effective in detecting and defeating programs that attempt to communicate your internal information to outsiders without your knowledge or consent. There’s a detailed discussion of ZoneAlarm 2.0 at http://grc.com/su-firewalls.htm; it’s free for noncommercial use and available at http://www.zonelabs.com/.
 

Technology Basics — E-Mail and Surfing
To understand how to protect your privacy online, you should learn some of the mechanics of what happens during online sessions. The choices you make about which software you use can determine what data others can gather about you or your company.

Did you know that your browser routinely sends several bits of information about you and your computer with every Web page you access, including browser environment variables and possibly even cookies? [See the “Tech Talk” sidebar for definitions of italicized terms.] Your information can then be forwarded to advertisers, who can compile databases of information about everyone who accesses any of their ad sites.
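To make the point above concrete, here is a minimal sketch in Python of the kind of request text a browser sends along with each page fetch. Everything in it is assumed for illustration: the host names, the browser identification string, the cookie name, and the helper functions build_request and disclosed_fields are hypothetical, not taken from any real browser or site.

```python
# Illustrative only: all header values below are made-up examples,
# not captured from a real browser session.

def build_request(host, path, user_agent, referer=None, cookies=None):
    """Return the text of a simple HTTP GET request, as a browser might send it."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"User-Agent: {user_agent}",  # identifies browser and operating system
    ]
    if referer:
        # The page you came from, which may include earlier search terms
        lines.append(f"Referer: {referer}")
    if cookies:
        cookie_text = "; ".join(f"{name}={value}" for name, value in cookies.items())
        lines.append(f"Cookie: {cookie_text}")
    return "\r\n".join(lines) + "\r\n\r\n"


def disclosed_fields(request_text):
    """List the header names that the receiving server can read."""
    header_lines = request_text.split("\r\n")[1:]  # skip the GET line
    return [line.split(":", 1)[0] for line in header_lines if line]


request = build_request(
    host="www.example.com",
    path="/index.html",
    user_agent="Mozilla/4.7 [en] (Win98; I)",
    referer="http://search.example.com/find?q=acme+patent",
    cookies={"visitor_id": "12345"},
)
print(request)
print(disclosed_fields(request))
```

Note that the user never types any of these fields; the browser adds them automatically, which is why tools that filter or manage them can matter.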

Using software such as Cookie Managers or Proxies can help limit privacy violations while surfing. New technologies are emerging which are a hybrid of several forms of privacy management, such as Zero Knowledge Systems’ Freedom [http://www.freedom.net]. Freedom allows you to create customized “nyms” (short for pseudonyms) that you can use to take advantage of the personalization features of some Web sites, while assuming an alternate identity at other sites.

You can also think about how to limit possible privacy violations when using e-mail. How can you guarantee that only the intended recipient has access to the message? Unlike the return receipt options of LAN (Local Area Network) based e-mail systems such as Lotus Notes or Microsoft Exchange, Internet-based e-mail systems may not technically ensure that the recipient has received the message at all.

Many users take similar approaches to both e-mail and Web surfing privacy management. Establishing separate e-mail accounts for different projects can create temporary “new identities.” Free Web-based e-mail services such as HotMail [http://www.hotmail.com] are readily available, but review other possibilities at http://www.free-email-address.com/. Don’t overlook the option of using (pseudo-)anonymous re-mailing. Encryption tools such as PGP can encrypt and digitally sign e-mail, thereby hiding it from prying eyes, but these tools can be unwieldy and difficult to use.

Another problem with e-mail is that e-mail messages can last forever, as illustrated recently in the Microsoft anti-trust hearings. There is a new breed of Web-based e-mail service that can solve many of these problems. Their advanced features can include time-based expiration and even self-destructing messages, return receipts, and automatic encryption of incoming and outgoing messages. Examples of these new e-mail services include Hush Communications HushMail [http://www.hushmail.com] and ZipLip [http://www.ziplip.com].

Information about you can be collected even when you are not browsing or using e-mail. An example of this trend is the Sponsored Software promoted by companies such as Radiate [http://www.radiate.com]. Radiate pays shareware and commercial software manufacturers for placing advertisements within their applications. Radiate then collects information gathered from the applications’ users and returns the aggregated data to its clients.
 

Searching — Pick a Search Engine That Doesn’t Pick You Over
Many Internet search engines routinely collect users’ search terms. Such collection might even form an integral part of their business plans. Like many other Internet portals and sites, many Internet search engines are funded by advertising revenue. These sites get more money if they can attract more people to view the ads, and even tailor those ads to their users’ interests — which are often revealed by the search terms requested. Have you ever noticed a banner ad on a search site change to reflect a theme similar to the subject you’re searching? Sometimes, it can become almost comical. Try putting in an arcane subject or a long, complex search statement at Yahoo! and watch the little Amazon.com box jump up on your results page claiming it has whole books on whatever subject you entered.
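One reason search terms are so easy to collect is that many search sites carry the query in the address of the results page itself. The following Python sketch shows how little work it takes to recover the terms from such an address. The URL, the parameter name "q", and the function search_terms_from_url are assumptions made for illustration; real sites use varying parameter names.

```python
from urllib.parse import urlsplit, parse_qs


def search_terms_from_url(url, param="q"):
    """Recover the search words from a results-page address.

    `param` is the (assumed) name of the query-string field holding the search.
    """
    query = urlsplit(url).query
    values = parse_qs(query).get(param, [])
    return values[0].split() if values else []


# A hypothetical address as it might appear in a site's records
logged_url = "http://search.example.com/find?q=acme+widget+patent+infringement&page=1"
print(search_terms_from_url(logged_url))
```

Anyone holding the address, whether the search site, an advertiser receiving it as a referring page, or an intermediary, can run the same kind of extraction.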

As early as 1996, both Lycos and Infoseek announced plans to record users’ search terms and build profiles in order to offer customized advertising and content. Robin Johnson, then-president of Infoseek, said, “The idea is to capture someone’s behavior and use that information to put future searches into context” [Inter@ctive Week, 7/23/96, http://www.zdnet.com/intweek/daily/960723b.html].

Here’s an extract from Google’s privacy policy: “Google may share information about you with advertisers, business partners, sponsors, and other third parties. However, we only divulge aggregate information about our users and will not share personally identifying information with any third party without your express consent. For example, we may disclose how frequently the average Google user visits Google, or which other query words are most often used with the query word ‘Linux’” [http://www.google.com/privacy.html].

Some “search voyeur” sites even allow users to view others’ searches. Search Engine Watch’s “What People Search For” page lists search sites that offer Live Search Displays, Top Keyword Lists, and Keyword Databases at http://searchenginewatch.internet.com/facts/searches.html. Would you make a different decision about which search site to use if you knew others might see your search terms — or repeat your searches?
 

Patents — Doubly Careful
Maintaining the confidentiality of search terms becomes particularly important in patent and intellectual property searches. In these searches, just the questions asked could give away important competitive advantages.

The IBM Intellectual Property Network Internet patent search site is a good example of how user privacy concerns could be answered in a straightforward manner. It clearly says that no information is distributed outside IBM, nor used for competitive intelligence within IBM [http://patent.womplex.ibm.com/welcome]. In addition, the policy states, “Search queries are never correlated with a specific IP address or user” [http://patent.womplex.ibm.com/privacy].

The confidentiality of patent queries submitted to the U.S. Patent and Trademark Office (USPTO) server [http://www.uspto.gov] came under discussion on the Patent Information Users’ Group list (PIUG-L) a while ago. Since governmental organizations have the responsibility to provide information to the public under the Freedom of Information Act (FOIA), patent searchers wanted to know exactly what information the site collected on searchers.

Jane Myers of the USPTO replied, “The PTO maintains logs … [which] could be requested under FOIA. Our Solicitor would decide what to release. Logs contain the IP address of those accessing the PTO Web site, which could be used to determine further information about the origin of the request.… Logs contain time of access but not length of time online. Patents retrieved during a search are not logged; search strategies are not logged.” [See the full text of this quote at http://www.derwent.com/piugl98/0762.html.]
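For readers unfamiliar with server logs, the sketch below shows in Python what an entry holding an IP address and time of access might look like and how those two items can be pulled out. The entry is invented, and the layout (the widely used Common Log Format) is an assumption; the source does not say what format the PTO's logs use.

```python
import re

# Assumed layout: Common Log Format. The PTO's actual log format is not
# described in the article; this sample entry is entirely made up.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)"'
)


def parse_entry(line):
    """Return the IP address, access time, and request line, or None if no match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


sample = (
    '192.0.2.17 - - [14/Mar/2000:10:32:05 -0500] '
    '"GET /patft/index.html HTTP/1.0" 200 5120'
)
entry = parse_entry(sample)
print(entry["ip"], entry["time"])
```

An entry like this shows who connected and when, but not how long they stayed, which is consistent with the description quoted above.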

In an interesting turn, these confidentiality questions led to further discussion of the somewhat reflexive practice of using an FOIA request to get information on previous FOIA requests, as a source of competitive intelligence. An example of the results from one such request, which shows the subjects and requestors of previous FOIA requests, is available at http://www.derwent.com/piugl98/0841.html.

[More messages in this discussion are available at http://www.derwent.com/piugl98/.]

It’s possible that this PIUG-L discussion had a far-reaching effect, since the privacy policy previously posted on the USPTO Web site [see http://www.derwent.com/piugl98/0675.html] was changed shortly thereafter. The current USPTO privacy policy states: “Please note that PTO does not record or log the parameters of search requests submitted to these databases. Such uncollected information has thus never been disclosed through sale or FOIA request, intentionally or otherwise, to any third party. PTO does not plan to change this operational policy” [http://www.uspto.gov/patft/index.html].
 

Traditional Subscriber-Based Search Services
Search services based on subscriber revenues have financial incentives to keep customer information and queries confidential, since users will take their business elsewhere if they don’t get the privacy they expect. Certainly, help desk personnel at traditional search services have long stated that queries and subscriber identities are kept highly confidential. Given the importance of this issue to users, you might expect that formal confirmation of this policy would be prominently placed in the material available from the services. But when we looked, we did not find clear statements of these policies on the confidentiality of queries; at least, they were not readily apparent. Perhaps more rigorous searches would uncover them, and, we admit, we did not contact the search services directly. (Perhaps this issue might be addressed more comprehensively in a future article.)

Traditional search services are definitely sensitive to the issue of query confidentiality. Dialog’s ERA (Electronic Redistribution and Archiving) service, begun back in the early 1990s, created a stir when users realized that some file producers wanted the accession number of the items redistributed — through which, presumably, the queries could be deduced. Some file producers wanted the identities of the users as well.

Some traditional search services do have well-publicized policies regarding the privacy of personally identifying information — at least in the databases provided. LEXIS was badly burned when it offered access to the P-Trak database containing address and telephone listings from credit bureau files — and which originally included Social Security numbers as well. LEXIS got a lot of negative publicity — even though other vendors carried the same material, and still do, in some cases. LEXIS purged all Social Security numbers and issued a very clear and prominent privacy policy at http://www.lexis-nexis.com/lncc/general/privacy.html.

Dialog made an early mention of its policy in a Chronolog article from November 1983: “Because we are concerned about the confidentiality needs of our customers, we have a policy against revealing customer identities” (Dialog Chronolog, File 410, Accession No. 0001749). In a personal communication, Dan Wagner, Dialog CEO, affirmed, “You have our assurance that your e-mail address will NOT ever be distributed to anyone outside of The Dialog Corporation. We view a customer’s address as confidential information” (5/26/98). But now Dialog has a new owner, Thomson Corporation. Does that mean equal, better, or less privacy protection?
 

Secure Communications and Vendor-Side Data Holes
Some search services are sensitive enough to confidentiality issues that a secure communications channel for online searching is offered. One can search using SSL (Secure Sockets Layer) on a number of search services, including DialogWeb, STN on the Web, and IBM’s Intellectual Property Network. However, some vendors reserve SSL solely for the transmission of credit-card information, even though this might allow others to read passwords and session contents.

Recently UC Berkeley graduate student Richard Fromm detailed his efforts to convince eBay to encrypt passwords in an interesting Web posting at http://avocado.dhs.org/ebpd/: “The pitfalls of sending passwords in the clear have been recognized for many years. The only surprising thing is that too many people still don’t take security seriously and continue to repeat the same mistakes over and over again.” When eBay didn’t increase their security in response, Fromm states that he wrote an eBay password daemon that can sniff network traffic for eBay user IDs and passwords. (See Authentication entries in “Tech Talk” sidebar.) This daemon program is available for downloading on the site, though Fromm disclaims any malicious or illegal use. Perhaps eBay will have implemented increased security of communications by the time this article is published.
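To see why sending a password “in the clear” is a problem, consider the Python sketch below. It builds the body of an ordinary, unencrypted Web form submission and then reads the credentials straight back out of it, as anyone who can see the traffic could. The field names "userid" and "pass", the sample values, and both functions are hypothetical; they are not eBay's actual form fields or Fromm's program.

```python
from urllib.parse import urlencode, parse_qs


def form_body(user_id, password):
    """Encode a login form the way a browser does for an unencrypted submission."""
    # Field names are assumed for illustration only.
    return urlencode({"userid": user_id, "pass": password})


def read_credentials(captured_body):
    """Recover the user ID and password from a captured form body."""
    fields = parse_qs(captured_body)
    return fields["userid"][0], fields["pass"][0]


body = form_body("searcher01", "s3cret!")
print(body)                    # URL-encoded, but not encrypted
print(read_credentials(body))  # trivially reversed
```

URL encoding only makes the text safe to transmit; it hides nothing. Protecting the submission requires an encrypted channel such as SSL.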

There are also other communication security issues. Some search services promote e-mail delivery of search results, billing, and account information. Do any services allow you to request encryption for these e-mails?

Of course, even if the communication channel is totally secure, the vulnerability of your confidential information on someone else’s site depends on the security of that site. Look at recent incidents in which hackers stole credit-card information. For example, one hacker tried to extort money from CD Universe after claiming to have stolen 300,000 credit-card numbers in January 2000. John Ryan, CEO of encryption software supplier Entrust, said, “This wouldn’t have happened if the data had been encrypted” (Dick Satran, “NetTrends: Devices Add to Security Challenge,” Reuters, 1/19/2000).
 

Privacy Policy Problems — Read Them Early and Often
Although some Web sites still lack a posted privacy policy, an increasing number of sites have them — though they may require some searching to find. Once found, it’s important to read the policy carefully, so that you can be sure you agree with it. Some privacy policies can be unclear, ambiguous, hard to understand, or may refer to unspecified relationships with unspecified companies. You may also need to read the “Terms and Conditions,” User/Subscriber/Service Agreements, or equivalents, since these may modify the privacy policy. For example, the privacy policy at one Web site stated clearly that no information would be shared without user permission, but the accompanying subscriber agreement declared that by subscribing, users automatically gave permission for their information to be shared. [For a listing and analysis of several privacy possibilities (along with the companies’ response), see “A Surprise in Every Package,” Industry Standard, March 13, 2000, p. 208+ http://www.thestandard.com/article/display/0,1151,12453,00.html.]

Regardless of the current wording of the privacy policy or other legal notices, the phrase “changes can be made at any time” is relatively common in these agreements. In fact, the agreements often state that these changes may be made without notice. Users must regularly check the notices for updates or changes. Here is one example from Amazon.com, though the company did add an “opt-out” clause: “Amazon.com does not sell, trade, or rent your personal information to others. We may choose to do so in the future with trustworthy third parties, but you can tell us not to by sending a blank e-mail message to never@amazon.com…. If we decide to change our privacy policy, we will post those changes on this page so that you are always aware of what information we collect, how we use it, and under what circumstances we disclose it” [http://www.amazon.com/exec/obidos/subst/misc/policy/privacy.html/102-0375170-5013609].
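Checking a policy for unannounced changes need not mean rereading it each time. One possible approach, sketched here in Python, is to save a short “fingerprint” of the policy text and compare it against the text on a later visit. The function names and sample policy sentences are invented for illustration, and fetching the page itself is left out.

```python
import hashlib


def fingerprint(policy_text):
    """Return a SHA-256 digest of the policy text, ignoring spacing differences."""
    normalized = " ".join(policy_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(saved_fingerprint, current_text):
    """True if the current policy text differs from the saved fingerprint."""
    return fingerprint(current_text) != saved_fingerprint


# Hypothetical policy wording, saved on a first visit
saved = fingerprint("We do not sell your personal information.")

# Later visits: same wording with different line breaks, then new wording
print(has_changed(saved, "We do not sell\nyour  personal information."))
print(has_changed(saved, "We may sell your personal information."))
```

A changed fingerprint says only that something is different, not what; the policy still has to be reread to find out whether the change matters.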

Armand Prieditis, CEO of Unconventional Wisdom, has developed a number of questions for rating privacy policies, including the following: Is the policy prominent and easily accessible? Is it clear? Is it short? What information is collected? Is there an opt-out choice available? Is there provision for users to make changes, updates, or deletions to their personal data? Is there a contact given at the company for questions relating to their privacy practices? Is there special handling for information about children? [See “Fourteen Features of a Good Privacy Policy,” Prieditis’ slide from his talk at the Search Engines 2000 Conference, at http://www.infonortics.com/searchengines/sh00/prieditis_files/frame.htm.]

For some examples of relatively consumer-oriented or “privacy-friendly” policies, see those of Junkbusters [http://www.junkbusters.com/ht/en/aboutus.html#policy] and Ralph Nader for President [http://www.votenader.com/privacy.html]. Carol Ebbinghouse reviewed the privacy policies of many sites that searchers care about (“Privacy: Another Licensing Issue,” Searcher, vol. 7, no. 2, February 1999, pp. 18+). The Smart Computing Guide to PC Privacy (vol. 8, no. 4, April 2000, pp. 123-8) carries a list synopsizing the privacy policies of the 50 most popular sites [http://www.smartcomputing.com/editorial/article.asp?article=articles/archive/g0804/45g04/45g04.asp&guid=bcecpk10].

A number of other reviews of privacy policies are available, but, caveat surfer — we recommend that you check the current information of any given site and, if you use the site frequently, keep checking the policies regularly.

In an effort to mitigate the necessity for reading multiple, often confusing, privacy policies, The World Wide Web Consortium [http://W3c.org] is developing the Platform for Privacy Preferences (P3P) [http://www.w3.org/P3P/]. Due out in the summer of 2000, P3P will enable users to choose their own preferences concerning the kind and quantity of information they are willing to provide. Users will be warned when they surf a site which has a privacy policy that goes beyond their pre-set privacy limits. At this writing Microsoft has just promised to provide free Internet tools for P3P in the fall of 2000. However, P3P is the subject of some controversy. Some critics feel that there are insufficient incentives for Web sites to enroll in the program. Junkbusters president Jason Catlett said that wide adoption remains years away (The New York Times, 4/7/2000). Catlett said that companies are using P3P as “an excuse to use in their lobbying against enforceable privacy rights for the American consumer: a Pretext for Privacy Procrastination” [http://www.cfp2000.org/papers/catlett.pdf].

Even if you approve a site’s privacy policy, what assurance do you have that the site will actually comply with its posted policy? A recent study by the California HealthCare Foundation alleges that a number of health care Web sites shared personal consumer health information with other sites in violation of their own privacy policies. The Federal Trade Commission has been asked to review these charges. [For more information see http://ehealth.chcf.org/index_show.cfm?doc_id=34.]
 

Privacy Seals — Good Housekeeping?
A number of online privacy seals have been established in an attempt to reassure consumers about often confusing privacy policy provisions. These seals include TRUSTe, CPA WebTrust, BBBOnline, and SecureAssure. All of these seals set standards that participating sites must meet. [For details, check http://www.truste.com/, http://www.cpawebtrust.com/, http://www.bbbonline.org/businesses/privacy/index.html, and http://www.secureassure.org/.] Note that sites bearing the same privacy seal may have substantially different privacy policies, so you still need to read the privacy policy and other agreements.

Critics charge that an inherent conflict of interest exists with certificate programs subsidized by fees from participating sites. They also charge them with a lack of enforcement actions. The programs seldom withdraw the seals, even for flagrant violations. TRUSTe in particular has been called an attempt by industry to avoid government oversight, with critics charging that it “…proves industry self-regulation on privacy won’t work” (Industry Standard, March 20, 2000, p. 168).

Independent third-party certification could be very useful in promoting consumer trust, especially regarding sensitive issues. One particularly useful application might lie in guaranteeing site security against hacker attacks, since companies are understandably reluctant to detail security arrangements openly on their Web sites.
 

The Role of Government — Laws and Regulations

The greatest dangers to liberty lurk in insidious encroachment by men of zeal, well-meaning, but without understanding.
— Supreme Court Justice Louis Brandeis, the leading advocate of the legal right to privacy in the U.S.
Some consider government and law enforcement a bigger danger to personal privacy than corporations, though this may not necessarily apply to the privacy concerns of information professionals. Certainly the role of the government is easily one of the more controversial points in any discussion on the subject of privacy. Here are some very brief notes on recent items of interest to information professionals.

The FBI, SEC, and FTC have all recently announced Web investigation initiatives, each in its respective sphere. The FBI and Department of Justice requested widespread surveillance powers and the budget to link many private and public databases and to buy new data-mining tools. “If there is going to be a Big Brother, it is us, the FBI,” said FBI Supervisory Special Agent Paul George. Putting this remark in context, George also said, “There are reasons law enforcement should and does have the power to arrest and search…. There are worse things than having your privacy violated — like murder. I don’t know how [others] can say that there is no price to privacy or price to security in this equation. In order to prevent crime, information has to be collected ... if justified.”

Junkbusters’ Jason Catlett responded that without proper regulations about when and how data can be collected, such an assertion makes everyone a suspect. Catlett continued, “It’s like they are saying that we have a lot of robbers, so in order to protect the banks — rather than make them more secure — they are requiring the identity of everyone who walks in front of banks” (ZDNet News, 4/6/2000).

The SEC has proposed to monitor Web activities involving investment offers that provide false or misleading information, claims of “inside” information, and other fraud-related behavior. The plan has elicited some controversy, since the SEC intends to unmask anonymous message posters, and some object on constitutional grounds (The Wall Street Journal, 3/28/2000). The Federal Trade Commission also plans to protect privacy by investigating Web site practices relating to the gathering of personal information.

Congress is involved in a number of investigations of Internet privacy issues with legislation on the subject expected either this year or next. The administration has long been a proponent of industry self-regulation, but privacy is an issue with strong popular support, and this is an election year, so there may well be changes ahead.

The administration has recently relaxed its prohibition on the exportation or use of strong encryption by U.S. citizens outside the U.S. This may have direct consequences for information professionals in multinational companies and those doing business with foreign companies, permitting additional options in cryptography software. However, the administration continues to back law enforcement requests for encryption “back doors” and electronic communication eavesdropping capabilities.

The existence of the long-suspected international electronic communications interception network, Echelon, was recently confirmed from documents obtained under FOIA by National Security Archive senior fellow Jeffrey Richelson. However, despite the fears of a number of organizations — including the government of France — that Echelon was used for industrial espionage, Richelson said, “My research suggests that it’s much more limited than the extreme cases make out” [http://www.wired.com/news/politics/0,1283,34932,00.html]. In a telling point for information professionals, Richelson contended that the National Security Agency is overwhelmed by the amount of information on the Internet, and “its ability to collect and process information is not nearly as immense as some of these accounts make it out to be…. This agency is not doing all that well against the new information technology” (The New York Times, 2/24/2000).

The European Union and the U.S. have come to a tentative agreement on “safe-harbor” provisions under which U.S. companies will protect the privacy of European Internet shoppers. The EU had threatened to stop doing business with the U.S. unless more was done to protect EU citizens’ legal right to privacy. The U.S. and the EU had agreed to these rights in the 1980 Guidelines Governing the Protection of Privacy and Transborder Flows of Personal Data [http://www.oecd.org/dsti/sti/it/secur/prod/PRIV-EN.HTM], often seen as a basic guide to privacy and fair information practices.

It appears that “opt-in” permission from EU citizens will be required before their information is transferred or sold, and U.S. companies could be prosecuted under criminal law for any lapses (Financial Times, 2/23/2000). This agreement is worth monitoring, since it could eventually have broader implications for information professionals and U.S. citizens as well, who might ask, “Why do they get rights we don’t?” — especially in an election year.

The new Children’s Online Privacy Protection Act (COPPA), effective as of 4/21/2000, is one of the few explicit privacy laws in the U.S.; it requires that parental permission be obtained before information is gathered on children younger than 13. FTC attorney Loren G. Thompson said, “It’s a high priority for the agency … we will be enforcing this law and looking at violators closely.” The FTC will check Web sites at random for compliance; each violation could cost operators $11,000 (The New York Times, 4/21/2000). However, COPPA was based on the FTC’s 1996 privacy rules, and it doesn’t address possibilities available with current technology, such as Net-capable cell phones or PDAs.

Critics also say that COPPA may have unintended consequences. Parry Aftab, of the online privacy firm CyberAngels, estimated the cost of COPPA compliance at $60,000 to $100,000 per site [The New York Times, 5/12/2000, http://www.nytimes.com/library/tech/00/05/cyber/articles/12coppa.html]. A number of children’s sites, including Thomas the Tank Engine, Ecrush, and NCBi, have suspended e-mail newsletters to avoid violating COPPA — thus disappointing their young fans [Wired News, 5/13/2000, http://www.wired.com/news/politics/0,1283,36325,00.html].
 

ISPs and Subpoena Power — Big Brother’s Best Buddy
All of a surfer’s behavior and messages can easily be tracked at the ISP (Internet Service Provider) level. Predictive Networks [http://www.predictivenetworks.com] is attempting to harvest that data by offering subscribers lower-cost surfing in exchange, with personalized ads and content. However, note that subscribers clearly must opt in to the program, unlike other, less-obvious monitoring programs.

In 1998, FBI director Louis Freeh told Congress that he would like to require that ISPs keep records of user IP addresses and screen names. Law enforcers could then obtain this information by subpoena if necessary.

Of course, others could also subpoena ISPs for this information, and this power offers a major threat to online anonymity — and thus to free speech, some say. Ironically, the Orange County Register newspaper has been accused of using a subpoena to remove a critic’s anonymity (The New York Times, 7/24/98). In another case, an anonymous employee claims that he was dismissed from his job after he had criticized his employer, who then used a subpoena to discover his identity (Cleveland Plain Dealer, 4/11/2000).

Often ISPs have little incentive to fight subpoenas and may not even allow their subscribers sufficient time or notice to do so either. As noted elsewhere, this subpoena power can also have significant implications for information professionals, possibly enabling full identity and query disclosure. Concerns over such vulnerability could push users toward systems such as Zero Knowledge’s Freedom, which is essentially exempt from subpoena disclosure since no record is kept of user IDs.
 

Database Mining and User Profiling
Some of the techniques involved in database mining and user profiling are very familiar to information professionals. Indexing and categorization are our stock in trade, and of course so is “the dossier effect” — building a mosaic of information around a particular subject or person through which the educated eye can infer implicit additional information. And the Internet can certainly supply a lot of information — perhaps more than ever before. As Ian Goldberg of Zero-Knowledge Systems said, “Everything that can be linked together will be” (“Stay Anonymous If You Want to Stay Unnetted,” Karlin Lillington, Irish Times, 2/18/2000, p. 61).

When collaborative filtering and predictive modeling are performed on huge amounts of warehoused data, the effect can really call for a new look at the question, “What is personally identifying information anyway?” Today marketers integrate census data with many other information sources. This is one reason why there was such a reaction to DoubleClick’s proposed mining of Abacus data for profiling.

Powerful data mining, profiling, and predictive tools developed for the marketing industry could also be applied to competitive intelligence purposes. All the information traffic to and from a particular site could be sniffed and then stored in a huge database for analysis, threading, deconstruction, etc. Or at least it could be analyzed eventually, no doubt when sufficiently capable tools are available. Remember, when Alexa captured a 2-terabyte snapshot of the entire Web in 1997 and presented a copy to the Library of Congress, the archive was not very searchable, due to technology limitations.

It may be difficult to interpret information taken out of context. When a mother mentioned in a telephone conversation that her son had “bombed” in a school play, the Canadian Security Agency identified her as a potential terrorist (Jeffrey Rosen, “The Eroded Self,” The New York Times, 4/30/2000).

Information professionals may well understand another problem facing data miners — incompatible database software and data structures. One solution may arrive soon — CPEX [Customer Profile Exchange, http://www.cpex.org], a new XML-based industry specification scheduled for release this summer. While it could become a privacy advocate’s nightmare, it might also incorporate privacy controls. Junkbusters’ Jason Catlett said, “There’s an old saying that if you automate a mess, you just get a bigger mess. The sharing of personal information is a big mess in this country right now. That said, the CPEX developers are clearly thinking about privacy because they know it’s a potential party-stopper…. I think it’s a good idea if standards provide a way for companies to easily abide by fair information practices” (Wired News, 11/15/99).

A small number of Internet advertising agencies produce almost all Web banner ads. These companies include 24/7 Media, Flycast, Real Media, MatchLogic, DoubleClick-NetGravity, Adforce, and Engage Technologies-Adsmart Network. (See “Your Browser Is Selling You Out,” PC Computing, 3/2000, p. 92, for links to privacy policies and opt-out information.) Detailed information on these and additional profilers is available in “Special Report: The Privacy Problem,” The Industry Standard, 3/13/2000, http://www.thestandard.com/article/display/0,1151,12587,00.html.

Some of these agencies promise advertisers very focused and targeted marketing. Such promises require Web profilers to impinge on what some would consider private information. At present, targeted advertising on the Web isn’t selling. Advertisers just want to reach as large an audience as possible and are only interested in the most basic categories, such as age group, gender, or geographic location. As Martin Smith of MatchLogic, a targeted Web advertising company, explained, “The clients say we have to prove we are faster, better and cheaper for them to use us. We will. This is like television in 1950” (Saul Hansell, “So Far, Big Brother Isn’t Big Business,” The New York Times, 5/7/2000). In other words, though Web profilers may not be a big problem today, there’s nothing to prevent them from becoming a big problem tomorrow.
 

Privacy Versus Personalization
To date, advertising or sponsorship has served as the main funding source for all the “free” Internet surfing, searching, and content, as well as for much of the software and hardware innovations that enable it. And what is the coin of the realm on the advertising-supported Web? As Forrester Research analyst Chris Charron said, “If the business model is advertising, the commodity is personal data” (Wired News, 6/24/98). Certainly information professionals understand that information isn’t free. It’s really a question of understanding what the price will be beforehand and agreeing to pay it. Full disclosure should include details of who else gets the collected information and how it will be used.

When users see options to personalize their Internet connections, e.g., on portals or through specific user-interest profiles, no one argues that personalization by itself is a bad goal. Everyone wants the convenience and efficiencies promised by the concept of personalized service, but the word “personalization” may now carry additional implications because of previous industry practices.

Analyst Andrew Shein of the Electronic Privacy Information Center (EPIC) says, “...companies that are collecting detailed information from online consumers without their knowledge and consent are not personalizing — they are invading privacy” (Wired News, 11/16/99).
 

Privacy as a Selling Point
There are powerful inducements for the success of e-commerce. Forrester Research senior analyst John Nail estimated that privacy concerns kept online consumers from spending an additional $2.8 billion online in 1999 (ZDNet News, 5/9/2000).

Getting e-commerce right means making sure consumers are comfortable, and polls show that many users are worried about online privacy. For example, 81 percent of Internet patent information users expressed concern over the lack of security of Internet sites [“Managing Patent Information: An Emerging Two-Tiered Approach,” available from Derwent, http://www.derwent.com]. As Steve Larsen, VP of online ad company Net Perceptions, said, “Unless people can trust the people on the other side, commerce won’t happen. Good privacy policies just make good business sense” (Wired News, 11/16/99).

Some marketers are hearing the consumer call for privacy so clearly that the issue of privacy is becoming a marketing strategy. The Privacy Consortium is a new group of 26 Internet ad companies based on “permission marketing.” The Consortium’s guidelines require that businesses state what information is collected, offer a clear “opt-out” policy, and perform a yearly privacy audit. “I insist we eat our own dog food,” said co-chair Bonnie Lowell, who continued, “If you don’t want to comply, don’t join” (The New York Times, 4/3/2000).

A number of new products and services focus on consumer privacy. Bonnie Lowell’s company Younology sells the program “My Orby,” which shows consumers how they are being tracked on the Web.

Search site TopClick advertises itself as “The Internet’s PRIVATE Search Engine.” TopClick has also collected a large number of privacy resources, including a good privacy news service, and sponsors a “Partners in Privacy” Affiliate program.

Zero-Knowledge Systems is one of a number of personal identity-management companies in the new infomediary market space. These infomediaries “will protect the privacy of their clients” and “help customers maximize the value of their data,” said John Hagel III and Marc Singer, in Net Worth — Shaping Markets When Customers Make the Rules (McKinsey & Co., 1999, 0-87584-889-3). Other players include PrivaSeek, Lumeria, Popular Demand, Enonymous, PrivacyBank, Ezlogin, and Proxymate.
 

So What’s an Information Professional to Do?
Information professionals certainly understand the benefits of both personalization and privacy in the delivery of information. Actually, it’s what we already do — it’s our job to deliver personalized, tailored information with confidentiality. Much of the current discussion involving Internet topics sounds very familiar.

In a recent interview, Yahoo! co-founder Jerry Yang could have been speaking about information professionals when he said, “People ask me, ‘How are you going to keep people from just bypassing you and going directly to your partners?’ The answer is … we will be bypassed if we stop adding value for the customer. That is a very difficult challenge.” Yang spoke about Yahoo!’s plans for improvements that will give users a “personalizable and comprehensive shopping experience,” adding, “…to build a relationship, it’s critical to do it over time. We hope we have developed a level of trust and ease that only time can buy” (“Yang: Yahoo! Getting Personal,” Matthew Broersma, ZDNet News, 3/2/99).

Information professionals have a long history of commitment to information privacy. Recall the librarians who refused to reveal patrons’ borrowing histories during the FBI’s notorious “spies in the stacks” Library Awareness program [Herbert N. Foerstel, Surveillance in the Stacks — The FBI’s Library Awareness Program, Greenwood Press, 1991, http://info.greenwood.com/books/0313267/0313267154.html]. Our privacy policies are posted. We need to publicize our observance of those guidelines and distance ourselves from questionable practices, especially in regard to obtaining and distributing confidential and personal information. [See the second from last entry in the “Information Professionals — What to Do About Privacy” sidebar on page 47.]

Information professionals get a lot of use from the Internet and its “free” information. Most of us can’t really do our jobs without it any more. We may even have a duty to search the Internet, as T. R. Halvorson described in “Searcher Responsibility for Quality in the Web World” [Searcher, vol. 6, no. 9, October 1998, http://www.infotoday.com/searcher/oct98/halvorson.htm].

Companies are also required to take “reasonable measures” to protect confidential information. As we learn more about the lack of privacy on the Web, it certainly starts to seem reasonable to use some of the privacy protection tools described here. You don’t have to wait until you’re in an accident to remember to use a seat belt every time you’re in a car.

Information professionals can bring a lot of advantages to the discussion of confidentiality and privacy on the Web. The current situation can be a real opportunity for us to take a proactive stance to protect privacy on the Internet, show leadership, and make contributions. We can start by educating ourselves, our companies, and our vendors.

There’s lots of activity in the online privacy arena. The situation is changing rapidly, the laws are in flux, and parts of this article may well be dated by the time it’s published. For every calendar year, at least 7 years zoom by on the Web. Yes, you guessed it, everyone’s a dog on the Internet — and we have to keep on learning new tricks.


Acknowledgments: Though all the errors are our own, we’d like to give our thanks to many, including April Thatcher, Tania Beaudet, T. R. Halvorson, Dawn, Amelia Kassel, Anne Mintz, Kim Means, Kim Emmons, Nancy Lambert, Lynn Peterson, John Fisher, Zoia Horn, David Kalow, Rick Weisberg, Dave Farber, Cleo, too-often-unsung help desk personnel and list-serve posters, and other colleagues and co-workers.

Good Old Consumer Reports
The age-old protector of consumer interests, Consumer Reports, continues to fight on in the Information Age. In the May 2000 issue, a comprehensive online privacy report appeared as the first of a three-part series [“Special Report: How to Protect Your Privacy, Part I,” May 1, 2000, vol. 65, no. 5, p. 43+]. The eight-page article covers the basics of online privacy, the technology involved, privacy policies, and financial privacy. Later this year, Consumer Reports plans to publish two additional reports that will cover medical information and profiling by marketers. The May article actually mentioned lack of privacy while performing online medical research in an example excerpted below (used with permission of course):
Companies with an interest in knowing who’s curious about them can use cookies to find out. For example, when we entered the term “Chrysler” in the search engine at Web portal AltaVista, the site transmitted our information request to the automaker’s computer, not just to the usual ad network. Thus, if we wanted to investigate whether a Chrysler product was subject to a class-action lawsuit, the company would be tipped off to the time and date of visits by anyone searching for related key words such as a vehicle model name, say, or a product recall. Courts in recent cases have issued subpoenas enabling companies that believe they are the targets of potentially damaging civil actions to discover the identity of individuals who were under the assumption that their Web-based legal research remained anonymous. With a few exceptions, the Web portals have complied. We think it’s a breach of trust for these sites to share information such as Web searches that can later subject visitors to legal jeopardy.
Recent Privacy Problems
Some recent problems involving online privacy have attracted a lot of media attention to the subject.

In the fall of 1999, RealNetworks was accused of violating its privacy policy by collecting user information from its widely downloaded RealJukeBox music player program. RealNetworks offered a patch to disable the reporting functions and revised its privacy policy. Several class action suits were filed: California plaintiffs asked for $500 per user as market value for the information sent, and Pennsylvania plaintiffs asked for $30 per user as software refunds. RealNetworks counter-sued, saying that problems must be settled via arbitration as specified in its software licensing agreement. [For more information, see http://www.smartcomputing.com/editorial/article.asp?article=articles/archive/g0804/41g04/41g04.asp&guid=xtgic4u4.]

The largest U.S. Internet ad agency, DoubleClick, was recently embroiled in a very public controversy concerning its plan to merge user names with previously anonymous Internet behavior profiles. (See Clear GIFs in the “Tech Talk” sidebar for a discussion of some of the technology involved.) DoubleClick had intended to obtain some of the real-world information from its recently acquired subsidiary, the direct marketing company Abacus. DoubleClick backed down from the plan, announced that it would submit to independent audits of its privacy practices, and launched a number of privacy initiatives, including a consumer privacy education site, PrivacyChoices [http://www.privacychoices.org/].

Nevertheless, DoubleClick still faces FTC and SEC reviews, as well as six lawsuits — and the incident may serve as impetus for a number of bills and congressional actions in this election year [http://www.isp-planet.com/politics/cookie_crumbles.html].
 

Three Books = Three Privacy Perspectives
Several recently published books illuminate some of the trends, policies, and technologies involved in the greater privacy issues of our society.

Much of the current struggle for privacy is rooted in the modern marketing industry. Douglas Rushkoff’s disturbing book, Coercion — Why We Listen to What ‘They’ Say [Riverhead Books, 1999, 1-57322-115-5], reveals the current tricks of the trade. Discussing salesmen’s scripts and the techniques of cults, Rushkoff shows how far marketing has come since the 1957 publication of Vance Packard’s best-selling The Hidden Persuaders. Rushkoff writes about various manipulative tools, including visual and scent cues, shopping center design, multilevel marketing groups and various kinds of advertising. Interviews with sales and advertising professionals, telemarketers, and consumers complete the picture.

Rushkoff writes, “Corporations and consumers are in a coercive arms race. Every effort we make to regain authority over our actions is met by an even greater effort to usurp it.” This arms race is certainly reflected in current online privacy issues and technologies.

Rushkoff’s definitions address the balance between privacy and going too far: “Persuasion is simply an attempt to steer someone’s thinking by using logic. Influence is the act of applying readily discernible pressure: I want you to do this; I have power over you, so do it. Coercion seeks to stymie our rational processes in order to make us act against — or, at the very least, without — our better judgment.”

This underscores the importance of permission-based “opt-in” marketing. We may well feel that we are being coerced when our information is taken online without our knowledge and used to manipulate us and convince us to buy.

David Brin’s The Transparent Society — Will Technology Force Us to Choose Between Privacy and Freedom? [Addison-Wesley, 1998, 0-201-32802-X] offers erudite and compelling arguments for the ultimate of “Sunshine Laws”: a city of glass houses governed by mutual accountability.

Brin says that surveillance technology will soon be everywhere, resulting in an inevitable loss of privacy. The biggest danger is that too few people will watch. Brin advocates “reciprocal transparency” — an open society in which everyone has access to the same information about everyone else. It’s a challenging and often fascinating book, with eclectic philosophical, historic, and technological discussions and examples, and is well worth reading.

Many of Brin’s points are echoed elsewhere. Nobel Laureate Arno Penzias, author of Ideas and Information, recently said that our current “urban anonymity” is a historical anomaly. Our origins began in small villages where everyone knew everything about each other; and in the near future everything will again be known with everyone’s online data trails stored in computer databases [James Fallows, “Frontier Days,” The Industry Standard, 11/14/99, http://www.thestandard.com/article/display/0,1151,7618,00.html].

Brin is both a scientist and a science-fiction author, and his technology predictions regarding ubiquitous surveillance certainly seem accurate. Improved surveillance and pattern recognition systems were recently installed to protect the Las Vegas Mirage Resort from casino cheaters [http://www.viisage.com/March29_2000.htm]. Mitsubishi’s struggles to market a new accident-recording car camera, along with additional possible applications (on police guns, in elevators, and on school kids’ jackets to protect them from bullies), are discussed in Robert Buderi’s article “Great Idea, Tough Sell” [Upside, 2/28/2000, http://www.upside.com/texis/mvm/story?id=38b6dea80].

Speaking directly to online anonymity, Brin predicts, “So you use anonymous remailers to reconvey all your messages, so that nobody (except the remailer owner) can trace your identity? Better be careful. Experts at linguistic analysis are developing effective ways to appraise and detect spelling and grammar patterns that are unique to each individual” (p. 287).

Journalist and computer security expert Simson Garfinkel’s Database Nation — The Death of Privacy in the 21st Century [O’Reilly, 2000, 1-56592-653-6] offers an entertaining, frightening, and absorbing account of how our lives may be affected by the increasing loss of privacy in the near future. Garfinkel lists numerous current threats to privacy, including the end of due process, biometrics, misuse of medical and genetic information, systematic capture of everyday events and data, runaway marketing, and commodification of personal information. The book has real strengths in its breadth of coverage and depth of detail describing the history and technologies involved. More information is available, as well as the full text of the chapter on medical records, at http://www.databasenation.com.

Garfinkel tells stories of real people suffering from present-day privacy breaches and extrapolates a chilling future of disappearing privacy. But these are really cautionary tales, wake-up calls meant to alert the reader and prevent the outcome before it’s too late. Publisher O’Reilly compares Database Nation to Rachel Carson’s historic Silent Spring, which helped launch the environmental movement of the ’60s.

Garfinkel similarly argues for a government role in preventing industry encroachment on individual rights. In a recent interview, he said that industry self-regulation hasn’t worked in the past [The Pizzo Files, 4/27/2000, http://www.oreillynet.com/pub/a/network/2000/04/27/garfinkel/index.html]: “We tried using the marketplace to regulate the chemical industry in the 1950s, and the result was that we killed a lot of species, we polluted rivers, and the air was unbreathable in many cities. The marketplace doesn’t regulate issues when there are externalities. You need to have regulation so that companies are forced to bear the brunt of what they throw onto society. And privacy is very much like that.”

Garfinkel continued with particular relevance for information professionals: “…most people in our society are not really well-versed enough to protect their privacy by making informed decisions, just as they aren’t really well versed enough to protect their health by reading the ingredients and deciding if a particular ingredient on a bottle is known to cause cancer or not. Instead what we do is we have a law that says if a substance is known to cause cancer you can’t put it in the food supply. But we don’t have rules right now that say if a product is known to cause privacy problems you can’t put it in the information industry.”

Perhaps it’s time that information professionals learned about the wider privacy implications of current industry practices and joined in efforts to preserve what privacy is necessary for the practice of our profession. Database Nation — The Death of Privacy in the 21st Century is an important book, worth reading, and a partial inspiration for this article.

Tips to Protect Individual Privacy Online
Here is a quick list of the Electronic Frontier Foundation’s Top 12 Ways to Protect Your Online Privacy. [For complete details go to http://www.eff.org/pub/Privacy/eff_privacy_top_12.html.]

1. Do not reveal personal information inadvertently.
2. Turn on cookie notices in your Web browser and/or use cookie management software.
3. Keep a “clean” e-mail address.
4. Don’t reveal personal details to strangers or just-met “friends.”
5. Realize you may be monitored at work, avoid sending highly personal e-mail to mailing lists, and keep sensitive files on your home computer.
6. Beware of sites that offer some sort of reward or prize in exchange for your contact or other information.
7. Do not reply to spammers for any reason.
8. Be conscious of Web security.
9. Be conscious of home computer security.
10. Examine privacy policies and seals.
11. Remember that YOU decide what information about yourself to reveal, when, why, and to whom.
12. Use encryption!

For more lists of privacy tips, go to the following sites: 

Privacy Tips from PrivacyScan
http://www.privacyscan.com/privacytips.html
It includes software recommendations.

Fact Sheet # 18: Privacy in Cyberspace
http://www.privacyrights.org/FS/fs18-cyb.htm
From the Privacy Rights Clearinghouse.

Privacy Tips from Privacy Journal
http://townonline.koz.com/servlet/visit_ProcServ?DBPAGE=cge&GID=00001000010887059862929943&PG=01001000010896973444229799

Privacy Resources
Software

COTSE — Privacy Resources
http://www.cotse.com/privres.htm
Extensive list with abbreviated but very helpful annotations.

Privacy Market Place
http://www.topclick.com/pc_marketplace.html
From TopClick. Very extensive list.

EPIC Online Guide to Practical Privacy Tools
http://www.epic.org/privacy/tools.html
Includes some unique software.

PrivacyPlace.com — Marketplace
http://www.privacyplace.com/marketplace.html
Includes some unique software and descriptions.
 

General Resources: Organizations and Data Sources

Smart Computing Guide to PC Privacy
http://www.smartcomputing.com/editorial/stoc.asp?guid=xtgic4u4&vol=8&iss=4&type=4
Entire issue carries lots of detail on software, technology, incidents, history, organizations, and resources [vol. 8, no. 4, April 2000].

EPIC Online Guide to Privacy Resources
http://www.epic.org/privacy/privacy_resources_faq.html

“MacWorld’s Internet Privacy Guide,”
MacWorld, July 2000, pp. 62-69, 72-77.
This two-part article provides protective guidelines for Mac users that cover both general online issues and e-mail in particular.

TopClick’s Privacy Resources:

Organizations —
http://www.topclick.com/pc_organizations.html

Resources, organized by general topic —
http://www.topclick.com/pc_resources.html

Electronic Frontier Foundation (EFF)
http://www.eff.org
Home page has links to archives of resources, information, and policy-related material in various areas, including privacy, medical privacy, and digital surveillance.

Junkbusters
http://www.junkbusters.com/ht/en/links.html#reduce
Links to a wide variety of resources, including information and software, re: junk mail, spam, telemarketing, cookies, filtering, and privacy. Other resources include http://privacy.net/Resources/, http://www.onion-router.net/Other_Sites.html.
 

Keeping Up on New Developments

TopClick’s Privacy Center
http://www.topclick.com/pc_news.html
Lists Internet Privacy Headlines, searchable by general topic, with links to the full stories.

EPIC
http://www.epic.org
Lists latest news on home page as well as links to many other resources.

The Privacy Forum Digest
http://www.vortex.com/privacy

Tech Talk on Privacy
Here are brief explanations of some of the more technical aspects of Internet communication technologies and software and their bearing on privacy issues. Some additional resources are listed below, including a number of references to specific chapters in the useful publication Smart Computing Guide to PC Privacy [vol. 8, no. 4, April 2000, abbreviated here as SCGPP, with complete table of contents available at http://www.smartcomputing.com/editorial/stoc.asp?guid=mne9hwm4&vol=8&iss=4&type=4].

Environment Variables are bits of data captured by the server about your Web browser. The information these variables contain includes the network (or IP) address of your computer, the username you are currently using to log into the server, the type of computer and browser you use, any Cookies set on your computer for the site you are browsing, the name of the Web page you are requesting, and the address of the last page you requested. This means that if you traveled from Microsoft.com to Netscape.com, Netscape can tell that you have come from the Microsoft.com site. See more details at SCGPP’s “Understanding IP Addresses.”
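As a rough illustration of the variables just described, the Python sketch below shows what a server-side script could read about a visitor. The variable names (REMOTE_ADDR, REMOTE_USER, HTTP_USER_AGENT, HTTP_COOKIE, SCRIPT_NAME, HTTP_REFERER) are the standard CGI names. The function name summarize_visitor, the labels, and the sample values are hypothetical and chosen only for this example.

```python
import os

# Standard CGI variable names, paired with the plain-language items in the text.
CGI_FIELDS = {
    "REMOTE_ADDR": "network (IP) address",
    "REMOTE_USER": "username used to log in",
    "HTTP_USER_AGENT": "computer and browser type",
    "HTTP_COOKIE": "cookies set for this site",
    "SCRIPT_NAME": "page being requested",
    "HTTP_REFERER": "last page requested",
}


def summarize_visitor(environ=None):
    """Return the visitor details a CGI-style script can read from its environment."""
    if environ is None:
        environ = os.environ  # a real CGI script receives these from the Web server
    return {label: environ.get(name, "(not sent)") for name, label in CGI_FIELDS.items()}


if __name__ == "__main__":
    # Hypothetical request: a visitor arriving at one site from another.
    sample = {
        "REMOTE_ADDR": "192.0.2.10",
        "HTTP_USER_AGENT": "Mozilla/4.7 [en] (Win98; I)",
        "SCRIPT_NAME": "/index.html",
        "HTTP_REFERER": "http://www.microsoft.com/",
    }
    for label, value in summarize_visitor(sample).items():
        print(f"{label}: {value}")
```

Note that the script does nothing special to obtain these values; the browser and server supply them with every request.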

HTTPS and SSL are semi-interchangeable terms used to describe the current standard in Web encryption. HTTPS refers to the standard Web connection protocol HTTP with the addition of the SSL (Secure Sockets Layer) encryption method. SSL can encrypt data using a variety of key-lengths, including 128-bit keys in North America and 56-bit keys in Europe and Asia. You can feel reasonably assured when using an SSL encrypted site that the data sent to and from your Web browser is secure. SSL uses a trusted third-party authority, known as a Certificate Authority, to verify that you are communicating with a trusted site. The Certificate Authority is responsible for issuing certificates only to companies that can prove their identity. You can confirm that you are communicating with the correct site by clicking on the Security button within Netscape or by selecting Certificates from within the Internet Properties of Microsoft’s Internet Explorer. For more information, see SCGPP’s “What Is SSL & How Does It Work?”
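The certificate check described above can also be done from a script. The following is a minimal sketch using Python's standard `ssl` module, which today negotiates TLS, the successor to SSL. The helper name `fetch_certificate_names` is hypothetical, and it is defined but not called here because it needs a live network connection.

```python
import socket
import ssl

# A default client context requires a certificate signed by a trusted
# Certificate Authority and checks that it matches the host name.
context = ssl.create_default_context()
print("certificate required:", context.verify_mode == ssl.CERT_REQUIRED)
print("host name checked:", context.check_hostname)


def fetch_certificate_names(host, port=443):
    """Connect to host and return (site name, issuing authority) from its certificate."""
    with socket.create_connection((host, port), timeout=10) as sock:
        # wrap_socket raises an ssl.SSLError if verification fails.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    subject = dict(rdn[0] for rdn in cert["subject"])
    issuer = dict(rdn[0] for rdn in cert["issuer"])
    return subject.get("commonName"), issuer.get("organizationName")
```

If verification fails, the connection is refused before any data is exchanged, which is the scripted equivalent of a browser's certificate warning.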

On the Internet, there are two primary methods for user Authentication, which is how you tell the server you are accessing who you are. These methods are Session Authentication and Application-Level Authentication and usually differ in how your login information is requested.

The Session Authentication method uses the browser’s built-in username and password mechanism: A pop-up dialog appears asking for your username and password for the site you are accessing, or possibly for a realm within that site. Some browsers will also ask if you would like them to remember your username and password for future visits. Although your user information is stored in a file on your hard disk, these files are different from Cookies and cannot be accessed in the same manner. Once you have entered your username and password, both are encoded using the Base64 algorithm and forwarded to the server with each successive request within the secured area of the site. Base64 obscures the text but is not true encryption, since anyone who captures the request can reverse it without a key. This makes the Session Authentication method only slightly harder to read than Application-Level Authentication; if HTTPS is used with either authentication method, then both are equally secure.
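The mechanism described here appears to correspond to what the HTTP specification calls Basic authentication. The sketch below shows how little protection Base64 offers: it is a reversible encoding that needs no key. The function names are invented for this example, and the sample credentials are the well-known ones used in the HTTP authentication specification.

```python
import base64


def encode_basic_credentials(username, password):
    """Build the Authorization header value a browser sends with each request."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def decode_basic_credentials(header_value):
    """Recover the username and password from a captured header value."""
    scheme, token = header_value.split(" ", 1)
    if scheme != "Basic":
        raise ValueError("not a Basic authentication header")
    username, password = base64.b64decode(token).decode("utf-8").split(":", 1)
    return username, password


header = encode_basic_credentials("Aladdin", "open sesame")
print(header)                            # what travels over the network
print(decode_basic_credentials(header))  # what an eavesdropper can recover
```

Because decoding requires nothing but the captured header, the credentials are only as private as the connection that carries them.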

Application-Level Authentication is an increasingly popular method for managing user logins. Application-Level Authenticated sites tend to use standard Web forms to capture your authentication data, such as username and password. This data goes to the server as unencrypted clear text, unless you have used HTTPS. It’s important to note that Application-Level Authentication relies on Cookies to verify a user’s identity once he or she has logged onto a site. So Cookie Management solutions will only work with Application-Level Authenticated sites.

Cookies are bits of data that Web sites can store on your computer for a pre-determined period of time. There are two types of Cookies that Web developers can use: persistent and non-persistent. Non-persistent Cookies store temporary data on your browser, contain information particular to your current user session, and are destroyed when you quit your browser application. Persistent Cookies are set by the server along with an expiration date and remain stored on your browser until the expiration date has passed, the Cookie is overwritten by a new Cookie set by the same site, or the user deletes the Cookie.
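The difference between the two types comes down to whether the server attaches an expiration date. Here is a small sketch using Python's standard `http.cookies` module; the cookie names, values, and the helper `is_persistent` are made up for illustration.

```python
from http.cookies import SimpleCookie

cookies = SimpleCookie()

# Non-persistent Cookie: no expiration date, so the browser discards it on exit.
cookies["session_id"] = "abc123"

# Persistent Cookie: the server supplies an expiration date.
cookies["visitor_id"] = "12345"
cookies["visitor_id"]["expires"] = "Fri, 31 Dec 2010 23:59:59 GMT"
cookies["visitor_id"]["path"] = "/"


def is_persistent(morsel):
    """A Cookie persists only if it carries an expiration date or a maximum age."""
    return bool(morsel["expires"]) or bool(morsel["max-age"])


for name, morsel in cookies.items():
    kind = "persistent" if is_persistent(morsel) else "non-persistent"
    print(f"{morsel.OutputString()}  ->  {kind}")
```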

Cookies can usually only be retrieved by the Web site that set them. The Web site developer may choose to encrypt Cookies. However, encryption only provides some anonymity for your Cookie Jar. Though the content of the Cookie itself cannot be determined, the URL of the site for which it is valid is available in clear text. For a detailed specification about Cookies, go to Cookie Central [http://www.cookiecentral.com], or read SCGPP’s “Cookie Crumb Trails.”

There are a few choices for managing your Cookies. The Cookie management software that came bundled with your computer gives you three options: Allow all Cookies, Disallow all Cookies, or Ask before setting each Cookie. There are disadvantages to disallowing all Cookies sent to your computer. You then also eliminate the non-persistent Cookies that allow Web applications to set session variables, which may affect your ability to log into some secured sites.

Cookie Management software can allow you a greater level of granular control over the Cookies that your browser accepts or rejects. Many versions of this software can automatically deny Cookies from specific sites while allowing them for other, trusted sites. These packages allow you to take advantage of the personalization features of sites such as Amazon.com [http://www.amazon.com] or Dow Jones Interactive [http://www.djinteractive.com], while denying the potential privacy risks of online advertising companies such as DoubleClick [http://www.doubleclick.com]. For more information, check some of the sites mentioned in the “Privacy Resources” sidebar and SCGPP’s “Control Your Cookie Consumption.”
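The decision such software makes for each Cookie can be sketched in a few lines. This is an illustrative model only, not the logic of any particular product; the function `decide_cookie`, its mode names, and the site lists are assumptions made for the example.

```python
def decide_cookie(domain, mode="per_site", trusted=(), blocked=()):
    """Return 'accept', 'reject', or 'ask' for a Cookie offered by domain."""

    def matches(site):
        # A listed site covers the site itself and any of its subdomains.
        return domain == site or domain.endswith("." + site)

    # The three blanket options found in browsers' built-in settings.
    if mode == "allow_all":
        return "accept"
    if mode == "deny_all":
        return "reject"
    if mode == "ask":
        return "ask"

    # Granular control: blocked sites lose, trusted sites win, others prompt.
    if any(matches(site) for site in blocked):
        return "reject"
    if any(matches(site) for site in trusted):
        return "accept"
    return "ask"


trusted_sites = ("amazon.com", "djinteractive.com")
blocked_sites = ("doubleclick.com",)

for host in ("www.amazon.com", "ad.doubleclick.com", "www.example.org"):
    print(host, "->", decide_cookie(host, trusted=trusted_sites, blocked=blocked_sites))
```

Checking the blocked list first is a deliberate choice in this sketch: if a site appears on both lists by mistake, the more cautious answer wins.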

Some security-conscious users also disable ActiveX, Java, and JavaScript on their browsers to avoid possible privacy violations. The disadvantage is that then you can’t view Java-enabled sites with these browsers. As an alternative, you might try running several browsers or versions of browsers, with different capabilities enabled. More information appears at SCGPP’s “Preventing Possible Web Intrusions.”

Recently there has been a lot of concern over the use of Clear GIFs as a way to violate a browser’s anonymity. GIF stands for the Graphics Interchange Format, a graphics file format invented by CompuServe to allow its users to view the graphics and online images of other users. While GIF has slowly been pushed from the Web imaging world by the PNG (Portable Network Graphics) and JPEG (Joint Photographic Experts Group) formats, it is still viewable within most Web browsers.

The way a Clear, or single pixel, GIF works is very similar to the monitoring methods used by online advertising agencies. A particular Web page has an embedded image URL that references a GIF on another server. That second server then has access to the information passed by users visiting the first Web site. For more detailed information about Clear GIFs, see SCGPP’s “Beware of Web Bugs & Clear GIFs.”
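One way to spot a likely Clear GIF is to look for one-pixel images served from a host other than the page's own. The sketch below does that with Python's standard HTML parser. The class name `WebBugFinder`, the sample page, and the host names are hypothetical, and the one-pixel, third-party test is only a heuristic.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class WebBugFinder(HTMLParser):
    """Collect 1x1 images that are loaded from a different host than the page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_host = urlparse(page_url).hostname
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        host = urlparse(src).hostname  # None for relative (same-site) URLs
        tiny = attributes.get("width") == "1" and attributes.get("height") == "1"
        third_party = host is not None and host != self.page_host
        if tiny and third_party:
            self.suspects.append(src)


sample_page = """
<html><body>
<img src="/images/logo.gif" width="120" height="60">
<img src="http://tracker.example.net/clear.gif?page=home" width="1" height="1">
</body></html>
"""

finder = WebBugFinder("http://www.example.com/index.html")
finder.feed(sample_page)
print(finder.suspects)
```

When the browser fetches the flagged image, the second server receives the same kind of request data (address, browser type, referring page) that the first site does.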

Even more frightening than Clear GIFs is the concept of Malicious Code. Malicious Code on the Internet is probably one of the biggest threats to user online privacy. It most easily compares to a computer virus, like the LoveBug virus that hit the Internet on May 4, 2000. That virus was written using VBScript, which can run not only in Microsoft’s Exchange software but in Microsoft’s Internet Explorer software as well. Once your browser is infected, Malicious Code could also track your visits to various Web sites over the course of a session. For more information, see SCGPP’s “Keep It Sealed” and “When Viruses Attack.”

A Network Sniffer is a computing device, either hardware or software, that can analyze the packets of data traveling over a computer network. Sniffers, as the devices are often called, can capture traffic using a variety of filters, including all the traffic originating on a given browser, all the traffic traveling between a given browser and a specified server, or all the traffic received by a specified server. Once this traffic has been captured, it is fairly straightforward to view the contents of the traffic. In fact, most packet analysis software will even decode common encoding formats such as Base64.

By definition, a Proxy is the authority or power to act on another’s behalf; this is true on the Internet as well. A proxy is used to repackage a request from a browser for a given site so that the request appears to come from the proxy itself. At the same time, a proxy can filter the content returned to the browser. Several variations on the Proxy server exist that allow you to Manage Cookies, block advertising, and limit certain types of data from being sent to your browser. For information on specific proxies and associated issues, check out ProxyMate [http://www.lpwa.com], the Internet Junk Buster [http://internet.junkbuster.com], the Department of Defense’s Onion Router project [http://www.onion-router.net], Anonymizer [http://www.anonymizer.com], or Zero-Knowledge Systems’ Freedom [http://www.freedom.net]. You can test your Web anonymity at various sites, including Richard Smith’s Test Page for Web Anonymizing Services [http://www.tiac.net/users/smiths/anon/test.htm and http://www.anonymizer.com/3.0/index.shtml], clicking on “Who are you?”
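The repackaging step can be pictured as a filter over the request headers. The following is a simplified sketch of that one step, not the behavior of any of the products named above; the header list and the function name `anonymize_request_headers` are assumptions for illustration.

```python
# Headers that reveal who is asking or where they came from (illustrative list).
REVEALING_HEADERS = {"referer", "cookie", "user-agent", "from"}


def anonymize_request_headers(headers):
    """Return a copy of the browser's headers with the revealing ones removed."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in REVEALING_HEADERS
    }


browser_request = {
    "Host": "www.example.com",
    "User-Agent": "Mozilla/4.7 [en] (Win98; U)",
    "Referer": "http://intranet.example.org/patent-search.html",
    "Cookie": "visitor_id=12345",
    "Accept": "text/html",
}

forwarded = anonymize_request_headers(browser_request)
print(forwarded)
```

Because the proxy opens its own connection to the destination, the site also sees the proxy's network address instead of the browser's.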

Like shareware and freeware programs, Sponsored Software is free to download and appears to have little or no cost. In lieu of paying for the use of the commercial-quality software, such as Qualcomm’s Eudora [http://www.eudora.com], you may be able to run the program if you will agree to view advertisements. Many of these software publishers use software by companies such as Radiate (formerly Aureate Media) that can display ads on your computer even when you’re not surfing the Web. All that is needed is for the sponsored software to be running while you are connected through your ISP or company network. Since this software can report information about your habits, it has also been called an “adbot” or “spyware.” Recently questions have arisen as to whether the Radiate software can collect other information about your computer or your surfing habits even while you aren’t using the sponsored software, but studies performed using a Sniffer [http://kumite.com/myths/myths/myth036.htm] have proven this to be untrue.

The Web isn’t the only way that people communicate over the Internet; most people also use E-Mail. Remember that sending E-Mail is roughly analogous to sending a postcard using the traditional postal system. Neither type of message has its privacy guaranteed. The message is available to anyone who wishes to read it. In the physical world we have the option of sending a letter instead of a postcard, which hides the content of the message within an envelope, making it difficult for others to read the content. In the electronic world we do not have physical envelopes, but we do have a few options.

These options include encryption, e.g., using software such as PGP to make your e-mail unreadable to anyone but the intended recipient. Alternatively, you could choose to mask your identity by creating an alias or pseudonym using services such as Yahoo!’s Yahoo!Mail [http://mail.yahoo.com] or Microsoft’s HotMail [http://www.hotmail.com]. Or you might choose to take advantage of the several anonymous or semi-anonymous Remailer Services on the Internet. Lastly you can use one of the new generations of secure e-mail services such as Hush Corporation’s HushMail [http://www.hushmail.com] or ZipLip [http://www.ziplip.com], which enable secure encrypted messaging using your standard Web browser.

According to the History of PGP FAQ, PGP was created in the mid-1980s by Phil Zimmermann. PGP is based upon a concept called Public Key Encryption in which each individual has two keys, one public and one private. The public key is made available to everyone with whom the user would like to exchange private e-mail. The private key should never be shared with anyone. If you encrypt a message to a user using that user’s PGP public key, then only that user can read the message, using his or her private key. To use PGP, you need to have a “key ring” of all the public keys of individuals with whom you’d like to communicate privately.

Another benefit of using public key encryption is the ability to digitally sign a message. By using your private key to encrypt a message you make it unreadable to anyone who does not have your public key. Using the public key to decode the message guarantees the recipient that the message was signed with your private key. For more information, see SCGPP’s “Master the Art of PGP.”
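The relationship between the two keys can be demonstrated with a toy example of RSA, one well-known public key algorithm. This is a teaching sketch only: the primes are absurdly small, the "message" is a single number, and real PGP differs in many respects (for instance, it signs a digest of the message and not the message itself). All names and values are chosen for illustration.

```python
# Toy key pair built from two tiny primes; real keys use primes hundreds of digits long.
p, q = 61, 53
n = p * q                   # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                      # public exponent: (n, e) is the public key
d = pow(e, -1, phi)         # private exponent: (n, d) is the private key (Python 3.8+)


def encrypt(message, public_exponent=e):
    """Anyone holding the public key can encrypt."""
    return pow(message, public_exponent, n)


def decrypt(ciphertext, private_exponent=d):
    """Only the holder of the private key can decrypt."""
    return pow(ciphertext, private_exponent, n)


def sign(message, private_exponent=d):
    """Signing applies the private key to the message."""
    return pow(message, private_exponent, n)


def verify(message, signature, public_exponent=e):
    """Anyone holding the public key can check the signature."""
    return pow(signature, public_exponent, n) == message


message = 65
ciphertext = encrypt(message)
print("round trip ok:", decrypt(ciphertext) == message)
print("signature ok:", verify(message, sign(message)))
print("altered message ok:", verify(message + 1, sign(message)))
```

The same pair of exponents serves both purposes: the public key locks what only the private key can open, and the private key stamps what the public key can check.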

A Remailer Service allows you to forward e-mail to anyone on the Internet while the remailer removes your return address information. Some remailers are truly anonymous and offer no way for the recipient to reply to the sender of the message, while others are considered pseudo-anonymous. The pseudo-anonymous servers do strip off the author’s name and e-mail address but instead of discarding this information completely, the servers keep the information on file in order to forward responses to the anonymous messages back to the originating author. A wealth of more information on Remailer Services appears at Andre Bacard’s site on Web and e-mail privacy [http://www.andrebacard.com/privacy.html].

However, only truly anonymous remailers can be considered secure, as the administrators of the pseudo-anonymous server anon.penet.fi learned. In 1996 the Church of Scientology, with the cooperation of Interpol, succeeded in forcing the administrators of anon.penet.fi to reveal the identity of an individual accused of distributing church secrets while hiding behind the anonymity of the remailer. More information about this incident appears at http://www2.thecia.net/~rnewman/scientology/anon/penet.html.
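The weakness of a pseudo-anonymous remailer is the lookup table it must keep in order to route replies. The sketch below models that design with Python's standard `email` module. The class name, the alias format, the stripped header list, and the addresses are all hypothetical, and no real remailer's behavior is being reproduced.

```python
import itertools
from email import message_from_string


class PseudoAnonymousRemailer:
    """Strip identifying headers but remember who sent what, so replies can be routed."""

    IDENTIFYING_HEADERS = ("From", "Reply-To", "Sender", "Received", "Message-ID")

    def __init__(self):
        self._aliases = {}                 # alias -> real address (the sensitive table)
        self._counter = itertools.count(1)

    def forward(self, raw_message):
        message = message_from_string(raw_message)
        alias = f"an{next(self._counter)}@remailer.example"
        self._aliases[alias] = message["From"]
        for header in self.IDENTIFYING_HEADERS:
            del message[header]            # no error if the header is absent
        message["From"] = alias
        return message

    def reveal(self, alias):
        """What an administrator could be compelled to disclose."""
        return self._aliases.get(alias)


raw = (
    "From: alice@example.com\n"
    "To: bob@example.org\n"
    "Subject: Hello\n"
    "\n"
    "This is the message body.\n"
)

remailer = PseudoAnonymousRemailer()
forwarded_message = remailer.forward(raw)
print(forwarded_message["From"])
print(remailer.reveal(forwarded_message["From"]))
```

A truly anonymous remailer would simply omit the table, which removes both the ability to reply and anything an outside party could demand.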

Information Professionals — What to Do About Privacy
Privacy/Confidentiality Audit
  • Check out your system. Test your security with free online evaluations from http://grc.com, http://www.onion-router.net/Tests.html or, for high-speed connections, http://www.dslreports.com/secureme. [See other sites in the “Privacy Resources” sidebar on page 42 and “Tech Talk” on page 44.]
  • Check out yourself and your company on the Net — search for your own name, your “nom de surf,” your co-workers, your company and its products/services using various search engines, http://www.deja.com and http://www.company.sleuth.com/index.cfm. You can also order a report from PrivacyScan for $39.95: http://www.privacyscan.com/orderreport.html
  • Check out your company’s privacy issues. Talk to the IT department; check policies; check actual practices, even records storage. (One worker found a bunch of improperly secured personnel records, copied information, and, later, many employees had their identities stolen and multiple charges on their credit cards.)
  • Check out your vendors, search engine/reference/online database sites, information services, etc. What are the policies on both privacy and usage? How is the security? What assurances do you have?
  • Keep up-to-date with new developments in technology, software, laws, etc., that affect privacy issues.


Implement Solutions

  • Consider using cookie manager, e-mail encryption, and anonymous surfing software.
  • Establish new “noms de surf” from free Net e-mail services when necessary.
  • Use outside consultants who don’t have to identify their clients, especially when you don’t want others to know about your company’s interest in a subject. (By the way, in the spirit of full disclosure: One of the authors is a consultant who gets a considerable amount of business for this very reason.)
  • Anytime you’re asked to submit confidential information, check for SSL (https://…). Also, consider restarting your browser to create a new session before and after submitting such information to avoid clickstream capture.
  • When faced with decisions affecting privacy, such as, “Why keep your bookmarks, e-mail, and/or storage online?,” think through the risk-benefit equation.


Publicize/Evangelize

  • Use affirmative, proactive information privacy notification techniques in a manner similar to the use of copyright protection notices. For example, consider using headers on search output or e-mail. (“The information contained in this e-mail is confidential and is intended for the use of the individual or entity named above. If you have received this e-mail in error, please contact us. Thank you.”)
  • Educate others inside and outside your company. Have information ready to give to both the overly suspicious and the overly casual. Soothe paranoid users and frighten users not scared enough.
  • Formulate privacy guidelines. Ask search services and sites to agree to them.
  • Publicize that we information professionals ourselves conform to ethical guidelines in which confidentiality figures prominently (and make sure that we do): ALA Code of Ethics [http://www.ala.org/alaorg/oif/ethics.html]; AIIP Code of Ethical Business Practice [http://www.aiip.org/purethics.html]; ASIS Professional Guidelines [http://www.asis.org/AboutASIS/professional-guidelines.html]; and SCIP Code of Ethics for CI Professionals [http://www.scip.org/ci/ethics.html].
  • Vote your conscience with your ballot and your wallet when it comes to the politicization of privacy issues. Ask your colleagues and professional organizations to do the same.

