My Rules of Information

Vol. 10 No. 1 — January 2002

• FEATURE •
by Marylaine Block • Editor, ExLibris, the weekly e-zine

A few years back, just before doing my first bibliographic instruction session for a class of freshmen, I had to figure out the few most important things we could teach them: the things we information professionals knew and the students didn't, the lessons that would make all the difference between finding and not finding what they needed. I emerged from my office with a piece of paper with four sentences on it: my four rules of information. I have added to them over the years, but the fact that my colleagues and I still know and practice them seems to me the signal difference between us and our users.

I didn't invent the rules. I merely codified them. Codification — another one of the things that information professionals routinely do when people ask them questions.

Rule One: Go Where It Is

Ordinary people may think that people like us, people good at finding information, succeed because we know special tricks to play with search engines. Instead, what we really know is that, for many questions, search engines won't help at all, because the information needed is not on the Internet. Perhaps it's hiding in a 1935 issue of Harper's, or an 1865 issue of the New York Times, or a book that compares and contrasts the administration of health insurance in various European nations, or an unpublished dissertation, or a proprietary market survey conducted by Procter & Gamble, or a Senate committee hearing from 1965.

Whether consciously aware of it or not, when anybody asks us a question, the first thing we do is sort through our mental maps of the information territory. By the time we say, "Let's try running a search on MEDLINE," we've already assessed the user's need (information about a specific therapy for a medical condition) and knowledge-level (medical professional or student) and decided where the relevant and useful information can most likely be found (articles in the medical literature).

No matter what the question, we go through that same mapping and sorting process: asked for reproductions of art works, we go to our art encyclopedias or to the Internet; asked what a dollar would buy in 1966, we go to Historical Statistics or Statistical Abstracts or to ads in local newspapers from 1966. Different tools retrieve different kinds of information, and part of the art of librarianship is knowing which tool works best for each job.

When a librarian asked me for prime sources for information on the Delaware watershed, my first reaction was:

the Army Corps of Engineers.
the U.S. Fish and Wildlife Service.
the Environmental Protection Agency.
equivalent agencies for the state of Delaware.

Sensing a theme, I ran a search through searchgov.com, and sure enough, found a wealth of documents from those and other federal and state agencies. But reasoning that most of the people concerned with watershed problems would be scientists, I also used SciSeek.com to search through science sites on the Net, where I found a good deal of other information related to the environment, chemistry, and engineering of the watershed.

Then I ran the search through EBSCOhost to scan several full-text article databases, where I found articles drawn from both the science literature and travel and sport magazines.

Librarians also understand that formats are not interchangeable. Magazine and newspaper articles explain even complex subjects to lay readers in understandable language. Scholarly and professional journals publish original research (the mere words "research" or "study" in a user's question may automatically send us to a full-text journal database). But since research is necessarily confined to one very small, controllable aspect of an issue, it's like a puzzle piece. When we want to see the pattern the puzzle pieces make, or when we want a thorough backgrounder on a topic, we look for books, which generalize from and make sense of the original research. Government documents will supply statistics, laws, financial information, and even public records of who we are, what we own, where we've been.

We know the strengths and limitations of each format. The Internet is wonderful for pictures and demonstrations, for government documents, for FAQ files, for discussion forums, and for delivering full-text databases. But we know that it's next to useless for journal literature and government documents from before 1995; for this material, we still need to use our old indexes and our backfiles of periodicals. We also know better than to trust the authority and accuracy of the Internet. You may find a quote you seek on the Internet, even multiple versions of it, but do not count on finding a correct source for it. Our attitude toward Net sources is as suspicious as a Cold War warrior's: Trust, but only after you verify.

We know who is most likely to produce various kinds of information. For most hardheaded statistical data, we'll start with Statistical Abstracts, but for softer lifestyle data, we may turn to professional marketing research done on behalf of advertisers who need to know how to pitch products. Asked how often American teenage boys take showers, I tried a full-text business periodicals database, looking for market research often published in magazines like American Demographics. (The entirely counter-intuitive answer, incidentally, is that over a third of teenage boys shower at least twice a day.)

We know that sometimes the best sources are ordinary people, individually or in groups, who are passionate about a subject. When our users want solid, authoritative information about diabetes, for example, we lead them to the Web page for the American Diabetes Association. When users want to talk to fellow sufferers about their experiences and what it's like to live with the disease, we lead them to support groups.

When we want to know whether a brand-new technology or game works, we listen in on discussion groups on the Net. And when a topic is utterly obscure, we go online, because the Net is the perfect place for eccentrics to share their passion for bagpipes, or medieval maps, or bad Scrabble hands.

Think of us searchers as travel agents in the world of information: we help our travelers get there faster because we know the best routes and whether to go by plane, train, or automobile.

Rule Two: The Answer You Get Depends on the Questions You Ask
Corollary: If You Don't Like the Answers, Change the Questions

People assume that librarians must know all the answers, but what we really know is how to ask good questions. We know how to slide up and down that continuum from general to narrow until we find the exact set of parameters that work.

One way we work that slide is with language. If we don't find enough with a specific term, we move to the next level of generality; if we find too much, we look for more specific terms.

For instance, when asked to find research about whether fat people make less money than average-sized people do for the same work, some of the words we would try out might be obesity or weight, salary or wages or pay, discrimination or differential. Or we could use a more general statement: Obesity and employment discrimination, which would retrieve research on all varieties of discrimination: interviewing, salaries, evaluations, promotions, etc. Whichever combination of words we use, we know we will get a different set of results, so of course we will use all the logical terms that occur to us. What's more, when we hit pay dirt, we will use any new terms we pick up from the results to continue the search.
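The word-swapping described above can be sketched in a few lines of Python. This is only an illustration: the synonym groups come from the example in the text, while the function names and the generic AND/OR query syntax are assumptions, not the syntax of any particular database.

```python
from itertools import product

# Synonym groups from the example in the text, one list per concept.
CONCEPTS = [
    ["obesity", "weight"],
    ["salary", "wages", "pay"],
    ["discrimination", "differential"],
]


def boolean_query(concepts):
    """Join synonyms with OR inside each concept, and concepts with AND."""
    groups = ["(" + " OR ".join(terms) + ")" for terms in concepts]
    return " AND ".join(groups)


def all_combinations(concepts):
    """List every one-term-per-concept combination, to be tried one by one."""
    return [" ".join(combo) for combo in product(*concepts)]


if __name__ == "__main__":
    print(boolean_query(CONCEPTS))
    for query in all_combinations(CONCEPTS):
        print(query)
```

Two terms times three terms times two terms gives twelve separate combinations, each of which may retrieve a different set of results. New terms picked up from good hits can simply be appended to the relevant group.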

But we slide up and down that continuum in other ways, too. When we decide to search using subject headings, we start at the narrow end, trying to guarantee we only get documents entirely about our topic. When we're desperate to find anything at all — when what we need is damn fool luck — we start at the general end with a keyword search. Once we find something, we then use our best tricks to parlay it into more.

We start at the general end when we use OR terms like a drift net to catch every fish in a quarter-mile radius and work our way down to the narrow end by using AND terms to throw out the illegal fish.
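In set terms, the drift net is a union and the throwing-back is an intersection. The sketch below assumes a tiny invented index (the terms and document numbers are made up for illustration) and hypothetical function names; real databases do the same thing at a much larger scale.

```python
# Toy inverted index: term -> set of document ids (invented for illustration).
INDEX = {
    "obesity": {1, 2, 5},
    "weight": {2, 3, 6},
    "salary": {1, 3, 4},
    "wages": {2, 4},
    "discrimination": {1, 2, 4, 6},
}


def search_or(index, terms):
    """Union: every document that mentions at least one of the terms."""
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits


def search_and(index, terms):
    """Intersection: only documents that mention all of the terms."""
    sets = [index.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def drift_net(index, concept_groups):
    """OR within each group (cast wide), then AND across groups (throw back)."""
    result = None
    for group in concept_groups:
        caught = search_or(index, group)
        result = caught if result is None else result & caught
    return result or set()


if __name__ == "__main__":
    groups = [["obesity", "weight"], ["salary", "wages"], ["discrimination"]]
    print(sorted(drift_net(INDEX, groups)))
```

With this toy index, the OR on the first group catches five documents, and the two AND steps cut the catch down to documents 1 and 2.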

When we decide to search a smaller chunk of the information universe — a card catalog, or MEDLINE, or a specialized search engine like searchgov.com — we're also searching at the narrow end of the continuum.

The risk in doing a narrow search is that we might miss germane sources that just don't carry the terms we had chosen or ones that the database or search engine used did not index. When we start at the broad end, we risk finding nothing but false drops, as when I tested out search engines by looking for information on the singer known simply as "E." (Try it. It ain't pretty.)
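The trade-off between the two risks can be put in numbers. Information retrieval people usually call the two measures precision (how much of what we retrieved is germane) and recall (how much of the germane material we retrieved); those terms are not used in the text above, and the document sets below are invented purely for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Invented example: documents 1-4 are the germane ones.
RELEVANT = {1, 2, 3, 4}
NARROW_SEARCH = {1, 2}              # nothing but good hits, yet half are missed
BROAD_SEARCH = set(range(1, 11))    # finds all four, buried among false drops

if __name__ == "__main__":
    print(precision_recall(NARROW_SEARCH, RELEVANT))
    print(precision_recall(BROAD_SEARCH, RELEVANT))
```

In this made-up case the narrow search scores perfect precision but only half the recall, and the broad search does the reverse, which is why sliding between the two ends pays off.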

By sliding up and down that continuum, combining different terms, in different kinds of searches, in different kinds of sources, always suspecting there's something more to be found, we increase our chances of finding not just an answer for our patron, but a good answer.

Rule Three: The Answer Should Match the Information Need

Librarians need to understand not only the question, but the kind of answer that will make the patron happy. How much have we helped people when we give them answers that aren't the kind they wanted: an armload of books to someone who wanted an encyclopedia article, or a set of Web pages to someone who wanted verbal answers to specific questions, or arcane articles in medical journals to a patient who needs understandable information on a disease she'd just been diagnosed with, or a set of citations to someone who simply wanted to print out a few articles and take them home to study?

Take this as a given: Librarians are curious people who get lost in the thrill of the hunt. We can always follow clues more doggedly and find more information than the patron has any use for or interest in. Unless we're assisting a scholar with research, our problem is not finding information, it's knowing when to stop — with a polite suggestion, of course, that other avenues exist which users could pursue should they want more.

Rule Four: Research Is a Multi-Stage Process

Sometimes the hunt has to be indirect. To find information on that singer, E, I needed to start with a rock music encyclopedia or Web site. I went to the Universal Band List, where I got a biography, a discography, info on his current band, The Eels, their official Web site, and tour information.

And when someone does want to dig out every last jot and tittle of information on a topic, this brings out our bloodhound instincts and every bit of skill we have. First, we will go every place where we might conceivably find our answers, searching not just one database, but every plausible database. We will look in journal databases, Dissertation Abstracts, OCLC's WorldCat, indexes of conference papers, etc. We will scour the Internet, using not only general search engines but also specialized search engines and expert sites and invisible Web databases.

Whenever we find something, we scour it for further clues, following up items in bibliographies, searching for more work by those authors, finding the authors' e-mail addresses, doing citation searches to see who's quoting them. Whenever we pick up new terms to search with, we go back to places we've already searched and use them. When we find something that is just what our patron had in mind, we use whatever devices the databases or search engines allow — clickable subject headings or a "more like this" feature — to reshuffle the deck and find more items like that one.
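The habit of going back with new terms can be sketched as a loop. This is a deliberately simplified illustration: the records, their subject terms, and the function name are all invented, and a real searcher would judge which new terms are worth reusing instead of accepting every one.

```python
# Toy records: id -> set of subject terms (all invented for illustration).
RECORDS = {
    1: {"obesity", "wages"},
    2: {"wages", "weight bias"},
    3: {"weight bias", "hiring"},
    4: {"gardening"},
}


def multi_stage_search(records, seed_terms, max_rounds=5):
    """Search, harvest new terms from the hits, and search again."""
    terms = set(seed_terms)
    found = set()
    for _ in range(max_rounds):
        # A record is a hit if it shares at least one term with our list.
        hits = {rid for rid, subjects in records.items() if subjects & terms}
        found |= hits
        # Collect terms from the hits that we have not searched with yet.
        new_terms = set().union(*(records[rid] for rid in hits)) - terms
        if not new_terms:
            break  # nothing new to try, so the search has settled
        terms |= new_terms
    return found, terms


if __name__ == "__main__":
    found, terms = multi_stage_search(RECORDS, {"obesity"})
    print(sorted(found), sorted(terms))
```

Starting from the single term "obesity", the loop reaches records 1, 2, and 3 in successive rounds and never touches the unrelated record 4. The `max_rounds` limit stands in for the searcher's sense of when to stop.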

Rule Five: Information Is Meaningless Until Queried by Human Intelligence

Fact: Sweden is the biggest user of catsup.

Fact: Fifty-one percent of St. Louis residents say they have never visited the Arch.

Fact: According to the NEC Research Institute, 1.5% of Web sites are pornographic.

Now that you know that, are you wiser or happier? In fact, is there any earthly reason for you to care? Without context those facts are not information but raw data. They become information only when we ask a question like one of these:

If I plan to market salsa in Sweden, what competition will I face?
Should St. Louis aim some of its tourist promotional advertising at its own citizens?
How big a problem is pornography on the Web? (Note that this question requires a lot more data than just this bit.)

The world is as full of data as it is of stuff: arrowheads and pottery shards and mollusk fossils, old letters and diaries, recipe books from the fifties, Lego sets, and Barbie dolls.

All of it is meaningless until somebody does something with it: asks a question, puts it together with other bits of data, thinks about its meaning. It stays meaningless until somebody puts the shards together and discovers a past civilization, or finds in the tattered letters evidence of a political conspiracy, or learns from those old recipes when canned soups and other packaged foods infiltrated our cooking habits.

There is little point in randomly accumulating data unless you know what you want to do with it. You need to start with a question, or a thesis, preferably a target statement that tells you not only what information you will want to look at, but also what information will not help you at all. If you say you want to find out about the economic effects of patents, you may focus entirely on winners and losers. This means you might ignore squabbles over patent protection and arguments about what is patentable. Your data will focus on stock prices and balance sheets and price lists.

Rule Six: Question Your Answers — Information May Be True But Still Wrong

I live in Davenport, Iowa. In May 2001, we hosted God knows how many network news reporters, all pointing their cameras at our baseball stadium, which was surrounded, and filled, by the Mississippi River. The cameras showed the nation our inundated River Drive and the small army of volunteers filling sandbags. Small wonder that every one of my relatives phoned with offers to send water wings, which I didn't need.

The reporters were telling the truth, as far as it went. What they neglected to do was tell the rest of the story, turn the cameras around, or even sideways. Had they done so, the nation would have realized that Davenport is built on a monster-sized hill, and that 99% of the city was unaffected by the flood, except for bad traffic conditions.

The day the flood crested, the sun was shining and I was watching a construction crew build a sunroom onto my house.

This is a cautionary tale. These were perfectly honest reporters. They weren't trying to distort the truth, but they did anyway. And remember, some of the sources we draw on, like politicians on either side telling the story of the year 2000 election drama in Florida, are trying to distort the truth, to make the data reflect their version of the truth. We need to understand that all our knowledge is incomplete and provisional, subject to change as new evidence and new theories come along. Thirty years ago, dinosaurs were cold-blooded, and now they're not. The dinosaurs didn't change, folks, but the evidence and the interpretation did. So we tend not to be too cocky about the "answers" we hand out to people.

We know enough to question our data. When we know that a zero result is not possible, we rethink our search strategy — did we misspell the word or the name? Are we looking in the wrong place? We question statistical data, asking "Sez who?" and "How do they know?" and "What was their method?" If somebody gives a precise number for the percentage of adult Americans who pick their noses, we know enough to wonder just how many people would answer that question honestly. We don't just settle for the first answer that comes along; we confirm, confirm, confirm.

Rule Seven: Ask a Librarian

Well, duh. Of course we ask librarians.

Because we know our collections cold.
Because sometimes people give up when the answers aren't in the places they expected to find them. (How often is the real question concealed behind the question, "Where's the Readers' Guide?")
Because we try to figure out the actual information need and fit it to the way our systems are organized.
Because we are better at thinking up and down a continuum — if we don't have books on Siamese cats, we do have books on cat breeds and cat care; we also have magazine indexes and databases that will find us articles on Siamese cats; we may even have the right sort of books in the children's collection where the patron didn't think to look.
Because we know how to make the databases sit up, roll over, and lick our faces. The fact that our users did not find an answer doesn't mean it doesn't exist. (The fact that we didn't might, however.)
Because, unlike our users, we start out with the gut-deep conviction that the answer exists, and by God, on our honor as librarians, we are going to find it.

The question is, how come hardly anybody but us seems to know that?

Did those rules strike a chord of recognition? They should have, because every good librarian I know operates by them all the time. Which explains why we can so consistently and easily make a few mystical passes over our catalogs and computers and come up with answers that amaze the laity.

I often suggested to students that information is a lot like pizza: the hungrier you are, the more you eat. The more thorough your search needs to be, the more you need to search through all the available resources. This is my guess about what formats of information occupy what percentage of the total of information cumulated over the past three centuries. I believe documents produced by local, national, and international governments over the centuries are the largest single source of information, followed by books and periodicals. Even with well over a billion pages, and adding sites at the rate of millions a day, the Internet still has a lot of catching up to do before it can compete. The remaining small segments include things like dissertations, conference papers, videos, movies, photographs, maps, databases, etc.

Each individual piece of the information pizza can be sliced even thinner. Even a tiny slice like magazines or journals can be subdivided into all the different databases that index or abstract them: Medline, ERIC, Biological Abstracts, Agricola. If you really want a thorough search, you need to check every likely itty-bitty slice.

These are the original rules of information, pretty much as I scrawled them and copied them off. They've grown a bit since then.

1. Go where it is.
2. The answer you get depends on the question you ask.
3. Research is a multi-stage process.
4. Ask a librarian.

Marylaine Block's e-mail address is mblock@netexpress.net
