A few years back, just before doing my first bibliographic instruction session
for a class of freshmen, I had to figure out what the few, most important
things were we could teach them, the things we information professionals
knew and the students didn't, the lessons that would make all the difference
between finding and not finding what they needed. I emerged from my office
with a piece of paper with four sentences on it: my four rules of information.
I have added to them over the years, but the fact that I and my colleagues
still know and practice them seems to me the signal difference between
us and our users.
I didn't invent
the rules. I merely codified them. Codification — another one of the things
that information professionals routinely do when people ask them questions.
Rule One: Go Where It Is
You may think that people like us, people good at finding information, succeed
because we know special tricks to play with search engines. Instead, what
we really know is that, for many questions, search engines won't help at
all, because the information needed is not on the Internet. Perhaps
it's hiding in a 1935 issue of Harper's, or an 1865 issue of the
New York Times, or a book that compares and contrasts the administration
of health insurance in various European nations, or an unpublished dissertation,
or a proprietary market survey conducted by Procter & Gamble, or a
Senate committee hearing from 1965.
Whether we're aware of it or not, when anybody asks us a question, the first thing we
do is sort through our mental maps of the information territory. By the
time we say, "Let's try running a search on MEDLINE," we've already assessed
the user's need (information about a specific therapy for a medical condition)
and knowledge-level (medical professional or student) and decided where
the relevant and useful information can most likely be found (articles
in the medical literature).
No matter what
the question, we go through that same mapping and sorting process: asked
for reproductions of art works, we go to our art encyclopedias or to the
Internet; asked what a dollar would buy in 1966, we go to Historical
Statistics or Statistical Abstracts or to ads in local newspapers
from 1966. Different tools retrieve different kinds of information, and
part of the art of librarianship is knowing which tool works best for each
question.
When a librarian
asked me for prime sources for information on the Delaware watershed, my
first reaction was:
the Army Corps of Engineers
the U.S. Fish and Wildlife Service
for the state of Delaware.
Sensing a theme, I
ran a search through searchgov.com, and sure enough, found a wealth of
documents from those and other federal and state agencies. But reasoning
that most of the people concerned with watershed problems would be scientists,
I also used SciSeek.com to search through science sites on the Net, where
I found a good deal of other information related to the environment, chemistry,
and engineering of the watershed.
Then I ran the
search through EBSCOhost to scan several full-text article databases,
where I found articles drawn from both the science literature and travel
and sport magazines.
We understand that formats are not interchangeable. Magazine and newspaper
articles explain even complex subjects to lay readers in understandable
language. Scholarly and professional journals publish original research
(the mere words "research" or "study" in a user's question may automatically
send us to a full-text journal database). But since research is necessarily
confined to one very small, controllable aspect of an issue, it's like
a puzzle piece. When we want to see the pattern the puzzle pieces make,
or when we want a thorough backgrounder on a topic, we look for books,
which generalize from and make sense of the original research. Government
documents will supply statistics, laws, financial information, and even
public records of who we are, what we own, where we've been.
We know the strengths
and limitations of each format. The Internet is wonderful for pictures
and demonstrations, for government documents, for FAQ files, for discussion
forums, and for delivering full-text databases. But we know that it's next
to useless for journal literature and government documents from before
1995; for this material, we still need to use our old indexes and our backfiles
of periodicals. We also know better than to trust the authority and accuracy
of the Internet. You may find a quote you seek on the Internet — even multiple
versions of it — but do not count on finding a correct source for it. Our
attitude toward Net sources is as suspicious as a Cold War warrior's: Trust,
but only after you verify.
We know who is
most likely to produce various kinds of information. For most hardheaded
statistical data, we'll start with Statistical Abstracts, but for
softer lifestyle data, we may turn to professional marketing research done
on behalf of advertisers who need to know how to pitch products. Asked
how often American teenage boys take showers, I tried a full-text business
periodicals database, looking for market research often published in magazines
like American Demographics. (The entirely counter-intuitive answer,
incidentally, is that over a third of teenage boys shower at least twice
We know that sometimes
the best sources are ordinary people, individually or in groups, who are
passionate about a subject. When our users want solid, authoritative information
about diabetes, for example, we lead them to the Web page for the American
Diabetes Association. When users want to talk to fellow sufferers about
their experiences and what it's like to live with the disease, we lead
them to support groups.
When we want to
know whether a brand-new technology or game works, we listen in on discussion
groups on the Net. And when a topic is utterly obscure, we go online, because
the Net is the perfect place for eccentrics to share their passion for
bagpipes, or medieval maps, or bad Scrabble hands.
Think of us searchers
as travel agents in the world of information: we help our travelers get
there faster because we know the best routes and whether to go by plane,
train, or automobile.
Rule Two: The Answer You Get Depends on the Questions You Ask
If You Don't Like the Answers, Change the Questions
People assume that
librarians must know all the answers, but what we really know is how to
ask good questions. We know how to slide up and down that continuum from
general to narrow until we find the exact set of parameters that work.
One way we work
that slide is with language. If we don't find enough with a specific term,
we move to the next level of generality; if we find too much, we look for
more specific terms.
For instance, when
asked to find research about whether fat people make less money than average-sized
people do for the same work, some of the words we would try out might be
or weight, salary or wages or pay,
Or we could use a more general statement: Obesity and employment discrimination,
which would retrieve research on all varieties of discrimination: interviewing,
salaries, evaluations, promotions, etc. Whichever combination of words
we use, we know we will get a different set of results, so of course we
will use all the logical terms that occur to us. What's more, when we hit
pay dirt, we will use any new terms we pick up from the results to continue
the search.
But we slide up
and down that continuum in other ways, too. When we decide to search using
subject headings, we start at the narrow end, trying to guarantee we only
get documents entirely about our topic. When we're desperate to
find anything at all — when what we need is damn fool luck — we start at
the general end with a keyword search. Once we find something, we then
use our best tricks to parlay it into more.
We start at the
general end when we use OR terms like a drift net to catch every fish in
a quarter-mile radius and work our way down to the narrow end by using
AND terms to throw out the illegal fish.
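For readers who like to see the drift net in action, here is a minimal sketch of that OR-then-AND narrowing. It assumes a toy collection of four made-up titles and two hypothetical helper functions (matches_any and search); real databases and search engines each have their own query syntax, so take this as an illustration of the logic, not of any particular system.

```python
# Hypothetical mini-collection; the titles are invented for illustration.
DOCS = {
    1: "obesity and wage discrimination in hiring",
    2: "overweight workers and salary gaps",
    3: "weight training for athletes",
    4: "pay scales in the public sector",
}


def matches_any(text, terms):
    """OR: true if at least one of the terms appears as a word in the text."""
    words = set(text.split())
    return any(term in words for term in terms)


def search(docs, *concept_groups):
    """AND together groups of OR'ed synonyms; return the matching doc ids."""
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if all(matches_any(text, group) for group in concept_groups)
    )


weight_terms = ["obesity", "overweight", "weight"]
pay_terms = ["salary", "wage", "wages", "pay"]

# OR terms alone: the drift net catches everything about weight.
broad = search(DOCS, weight_terms)

# AND a second concept: the weight-training title is thrown back.
narrow = search(DOCS, weight_terms, pay_terms)

print(broad)
print(narrow)
```

Each list of synonyms widens the net, and each additional list passed to search narrows the catch, which is the same slide from general to specific described above.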
When we decide
to search a smaller chunk of the information universe — a card catalog,
or MEDLINE, or a specialized search engine like searchgov.com — we're also
searching at the narrow end of the continuum.
The risk in doing
a narrow search is that we might miss germane sources that just don't carry
the terms we had chosen or ones that the database or search engine used
did not index. When we start at the broad end, we risk finding nothing
but false drops, as when I tested out search engines by looking for information
on the singer known simply as "E." (Try it. It ain't pretty.)
By sliding up and
down that continuum, combining different terms, in different kinds of searches,
in different kinds of sources, always suspecting there's something more
to be found, we increase our chances of finding not just an answer
for our patron, but a good answer.
Rule Three: The Answer Should Match the Information Need
We need to understand not only the question, but the kind of answer that will make
the patron happy. How much have we helped people when we give them answers
that aren't the kind they wanted: an armload of books to someone who wanted
an encyclopedia article, or a set of Web pages to someone who wanted verbal
answers to specific questions, or arcane articles in medical journals to
a patient who needs understandable information on a disease she'd just
been diagnosed with, or a set of citations to someone who simply wanted
to print out a few articles and take them home to study?
Take this as a
given: Librarians are curious people who get lost in the thrill of the
hunt. We can always follow clues more doggedly and find more information
than the patron has any use for or interest in. Unless we're assisting
a scholar with research, our problem is not finding information, it's knowing
when to stop — with a polite suggestion, of course, that other avenues
exist which users could pursue should they want more.
Rule Four: Research Is a Multi-Stage Process
Sometimes the hunt
has to be indirect. To find information on that singer, E, I needed to
start with a rock music encyclopedia or Web site. I went to the Universal
Band List, where I got a biography, a discography, info on his current
band, The Eels, their official Web site, and tour information.
And when someone
wants to dig out every last jot and tittle of information on a topic, this
brings out our bloodhound instincts and every bit of skill we have. First,
we will go every place where we might conceivably find our answers, searching
not just one database, but every plausible database. We will look in journal
databases, Dissertation Abstracts, OCLC's WorldCat, indexes of conference
papers, etc. We will scour the Internet, using not only general search
engines but also specialized search engines and expert sites and invisible
Whenever we find
something, we scour it for further clues, following up items in bibliographies,
searching for more work by those authors, finding the authors' e-mail addresses,
doing citation searches to see who's quoting them. Whenever we pick up
new terms to search with, we go back to places we've already searched and
use them. When we find something that is just what our patron had in mind,
we use whatever devices the databases or search engines allow — clickable
subject headings or a "more like this" feature — to reshuffle the deck
and find more items like that one.
Rule Five: Information Is Meaningless Until Queried by Human Intelligence
Fact: Sweden is
the biggest user of catsup.
percent of St. Louis residents say they have never visited the Arch.
Fact: According to the NEC Research Institute, 1.5% of Web sites are pornographic.
Now that you know
that, are you wiser or happier? In fact, is there any earthly reason for
you to care? Without context those facts are not information but raw data.
They become information only when we ask a question like one of these:
If I plan to market
salsa in Sweden, what competition will I face?
Should St. Louis aim
some of its tourist promotional advertising at its own citizens?
How big a problem
is pornography on the Web? (Note that this question requires a lot
more data than just this bit.)
The world is as full
of data as it is of stuff: arrowheads and pottery shards and mollusk fossils,
old letters and diaries, recipe books from the fifties, Lego sets, and
All are meaningless
until somebody does something with it — asks a question, puts it
together with other bits of data, thinks about its meaning, until
somebody puts the shards together and discovers a past civilization, or
finds in the tattered letters evidence of a political conspiracy, or learns
from those old recipes when canned soups and other packaged foods infiltrated
our cooking habits.
There is little
point in randomly accumulating data unless you know what you want to do
with it. You need to start with a question, or a thesis, preferably a target
statement that tells you not only what information you will want
to look at, but also what information will not help you at all. If you
say you want to find out about the economic effects of patents, you may
focus entirely on winners and losers.
This means you
might ignore squabbles over patent protection and arguments about what
is patentable. Your data will focus on stock prices and balance sheets
and price lists.
Rule Six: Question Your Answers
— Information May Be True But Still Wrong
I live in Davenport,
Iowa. In May 2001, we hosted God knows how many network news reporters,
all pointing their cameras at our baseball stadium, which was surrounded,
and filled, by the Mississippi River. The cameras showed the nation our
River Drive — inundated — and the small army of volunteers filling sandbags.
Small wonder every relative I had phoned with offers to send waterwings,
which I didn't need.
The reporters were
telling the truth, as far as it went. What they neglected to do was tell
the rest of the story, turn the cameras around, or even sideways. Had they
done so, the nation would have realized that Davenport is built on a monster-sized
hill, and that 99% of the city was unaffected by the flood, except for
bad traffic conditions.
The day the flood
crested, the sun was shining and I was watching a construction crew build
a sunroom onto my house.
This is a cautionary
tale. These were perfectly honest reporters. They weren't trying to distort
the truth, but they did anyway. And remember, some of the sources we draw
on, like politicians on either side telling the story of the year 2000
election drama in Florida, are trying to distort the truth, to make
the data reflect their version of the truth. We need to understand that
our knowledge is incomplete and provisional, subject to change as new evidence
and new theories come along. Thirty years ago, dinosaurs were cold-blooded,
and now they're not. The dinosaurs didn't change, folks, but the evidence
and the interpretation did. So we tend not to be too cocky about the "answers"
we hand out to people.
We know enough
to question our data. When we know that a zero result is not possible,
we rethink our search strategy — did we misspell the word or the name?
Are we looking in the wrong place? We question statistical data, asking
"Sez who?" and "How do they know?" and "What was their method?" If somebody
gives a precise number for the percentage of adult Americans who pick their
noses, we know enough to wonder just how many people would answer that
question honestly. We don't just settle for the first answer that comes
along; we confirm, confirm, confirm.
Rule Seven: Ask a Librarian
Well, duh. Of course
we ask librarians.
The question is, how
come hardly anybody but us seems to know that?
Because we know our
people give up when the answers aren't in the places they expect to
find them. (How often is the real question concealed behind the question,
"Where's the Readers' Guide?")
Because we try to
figure out the actual information need and fit it to the way our systems
work.
Because we are better
at thinking up and down a continuum — if we don't have books on Siamese
cats, we do have books on cat breeds and cat care; we also have magazine
indexes and databases that will find us articles on Siamese cats; we may
even have the right sort of books in the children's collection where the
patron didn't think to look.
Because we know how
to make the databases sit up, roll over, and lick our faces. The fact that
our users did not find an answer doesn't mean it doesn't exist. (The fact
that we didn't might, however.)
Because, unlike our
users, we start out with the gut-deep conviction that the answer exists,
and by God, on our honor as librarians, we are going to find it.
Did those rules
strike a chord of recognition? They should have, because every good librarian
I know operates by them all the time. Which explains why we can so consistently
and easily make a few mystical passes over our catalogs and computers and
come up with answers that amaze the laity.