With the demise of so many dot-com companies in the past year, the
term "dot-com" has practically become an expletive that’s acceptable even
in the University of Oxford’s revered Bodleian Library (as in, "We lost
to Cambridge, dot-com it!"). But don’t British companies use the .co.uk
or .ac.uk suffixes in their URLs? Well, they do, but the best and
most savvy ones also have a .com suffix so that their sites will
be mirrored in the U.S.—e.g., ingenta.com or catchword.com.
In addition to those two splendid services, there’s xrefer.com, an excellent,
Oxford-based dot-com. This free service masterfully integrates the best
British ready-reference sources via cross-references and cross-links. There
are many excellent, free American dictionaries and encyclopedias that work
mostly in a stand-alone fashion (like the Merriam-Webster Dictionary),
and rarely in an aggregated mode (like Bartleby.com). xrefer goes farther
than any of the other aggregators of ready-reference sources by integrating,
not just aggregating, 50 or so awesome sources through cross-referencing.
One of the basic tenets of professional searching is that you must
search several sources to get sufficiently comprehensive results. Within
a certain discipline, Database A may cover many journals that Database
B doesn’t, and vice versa. That’s the reason that more and more professional
online services finally offer multiple database searching simultaneously—a
feature that Dialog introduced nearly 2 decades ago. It’s also the reason
for the popularity of the metasearch engines that simultaneously run the
query against several search engines, as well as for the fairly wide support
of the CrossRef project (http://www.crossref.org)
among publishers that link to each other’s full-text archives. Much of
the cross-linking between these archives is based on SFX software (http://www.sfxit.com),
the brainchild of Herbert Van de Sompel, a professor at Cornell who developed
it while still a doctoral student.
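The idea of running one query against several databases and merging the hits can be sketched roughly as follows. This is a minimal illustration only: the two databases, their journal titles, and the search and search_all functions are hypothetical stand-ins, not the interface of Dialog or of any metasearch engine.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for two bibliographic databases; each maps a
# journal title to the article titles it indexes.
DATABASE_A = {
    "Journal of Alpha": ["Metasearch basics", "Linking archives"],
    "Journal of Beta": ["Linking archives revisited"],
}
DATABASE_B = {
    "Journal of Beta": ["Linking archives revisited"],
    "Journal of Gamma": ["Cross-linking in practice"],
}


def search(database, term):
    """Return (journal, title) pairs whose title contains the term."""
    term = term.lower()
    return [
        (journal, title)
        for journal, titles in database.items()
        for title in titles
        if term in title.lower()
    ]


def search_all(databases, term):
    """Run the same query against every database and merge the hits."""
    with ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(lambda db: search(db, term), databases))
    merged = []
    for hits in result_lists:
        for hit in hits:
            if hit not in merged:  # drop records found in more than one database
                merged.append(hit)
    return merged


hits = search_all([DATABASE_A, DATABASE_B], "linking")
```

In this toy data, each database contributes a journal the other lacks, and the record they share is reported only once.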
xrefer.com is a different story, as the company is building this ready-reference
mega-database in house from the files supplied by the original content
creators, weaving their local web of cross-references and cross-links.
Complement and Corroborate
In ready-reference services, simultaneously using multiple sources
not only enhances the information from one source with further details
from another, but also corroborates the facts reported by one source. The
Bartleby.com site—which deserves praise for pioneering the creation of
a high-quality reference suite—offers searching in one fell swoop. Among
many others, it contains the most current editions of the American Heritage
Dictionary; Brewer’s Dictionary of Phrase and Fable; the just-updated (in
February) Columbia Encyclopedia, sixth edition; various Roget’s thesauri;
the CIA’s World Factbook; and two quotation dictionaries.
So, a search in Bartleby.com about, say, the centaur Nessus brings up
an entry about him from the encyclopedia, as well as another about Hercules,
a quote from Antony and Cleopatra from Bartlett’s Familiar Quotations
("Now the shirt of Nessus is upon me"), and several entries from the Dictionary
of Phrase and Fable—along with irrelevant hits.
The irrelevant hits are retrieved because, for example, your search
term appears on the top of an unrelated page showing the previous entry
word (the so-called "guide word"). That’s why the entry about "nest" is
retrieved for Nessus (see Figure 1)—because
it happens to follow the entry about Nessus, even though it’s on a separate page.
Another useful suite, the Infoplease reference collection, doesn’t make
this mistake when searching the same Columbia Encyclopedia, but it shows
the entry about Hercules twice—for unknown reasons—along with an impeccable
definition for Nessus from the Random House Webster’s Unabridged Dictionary.
(See Figure 2.)
In xrefer you may also find false drops that are unavoidable, especially
in single-word searches, such as those results that list people whose first,
middle, or last name is Hercules, not to mention the C-130 Hercules military
airplane. However, these false drops are easy to spot in the well-laid-out
short-entry results list, so you can skip them and focus on the obviously
relevant items. When you look up the detailed entries, you’ll find a list
of related terms guaranteed to be relevant to the entry you’re looking at,
either from the same source or from one of the 50 other sources. The first five related
terms are shown automatically along with the source, and the rest may be
displayed optionally. Those entries will then have their own related terms,
and thus you are taken on an intellectually fascinating journey, from which
you can disembark anytime, anywhere.
The key to this guaranteed relevance is that the cross-references are
selected by humans: the editors of those dictionaries, encyclopedias, and
guides. Assuming that the editors did a good job (and with these sources
you can rest assured they did), the users will be offered actual related
links. But the editors who decide the related terms for the indexes in
the back of the printed editions do so only within the boundaries of the
dictionary or encyclopedia they edit, don’t they? Yes, but here’s where
the creators of xrefer.com demonstrate their cleverness. They’ve hatched
the idea of creating—in their parlance—xreferences (i.e., cross-references
and cross-links among various ready-reference sources).
Because the company has filed for a patent, the process of creating
the cross-links is not explained. Therefore, I’m only speculating. The
digital formats of the dictionaries and encyclopedias probably include
the indexes of subject terms, personal names, and location names that typically
appear at the back of the printed versions. Finding exact matches, close
matches, and see-also references and creating a matrix of these is what
computers are meant for.
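Since the actual process is undisclosed, the following is only a sketch of the kind of matching speculated about above: comparing the index terms of several sources and recording exact and close matches in a matrix. The source names, the index terms, the build_matrix function, and the 0.8 similarity cutoff are all assumptions made for illustration, not xrefer's data or method.

```python
from collections import defaultdict
from difflib import get_close_matches

# Hypothetical index terms per source; the titles are placeholders, not
# the actual xrefer data.
INDEXES = {
    "Encyclopedia": ["Hercules", "Nessus", "Centaur"],
    "Phrase dictionary": ["Nessus", "Shirt of Nessus", "Herakles"],
    "Quotations": ["Hercules", "Centaurs"],
}


def normalize(term):
    """Fold case and trim whitespace so trivial differences do not matter."""
    return term.strip().lower()


def build_matrix(indexes, cutoff=0.8):
    """Map each (source, term) to matching (source, term) pairs elsewhere."""
    matrix = defaultdict(set)
    for source, terms in indexes.items():
        for other, other_terms in indexes.items():
            if other == source:
                continue  # only link across different sources
            lookup = {normalize(t): t for t in other_terms}
            for term in terms:
                # Exact matches score 1.0, so they pass the cutoff as well.
                matches = get_close_matches(
                    normalize(term), list(lookup), n=3, cutoff=cutoff
                )
                for match in matches:
                    matrix[(source, term)].add((other, lookup[match]))
    return matrix


matrix = build_matrix(INDEXES)
```

With this sample data, "Centaur" is linked to "Centaurs" as a close match, while a variant spelling such as "Herakles" falls below the cutoff for "Hercules". That is the sort of link that would have to come from an editor's see-also reference rather than from string matching.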
Presumably, the special characters in these formats (whether they follow
XML, SGML, or any other syntax) do more than control the typography in
printing and denote the function of a word (adjective, verb, etc.) within
the entries; they also enhance this relationship matrix significantly. Again,
extracting and mapping cross-references and cross-links from 500,000 entries
is a piece of cake for software, once the algorithm is defined by humans.
Obviously, the more ready-reference sources xrefer.com licenses, the larger,
smarter, and richer this matrix of cross-references and cross-links will
be. And the collection is already a jolly good one.
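As a rough illustration of how markup could feed the relationship matrix, the sketch below pulls tagged cross-references out of a small XML sample. The element and attribute names (entry, headword, pos, xref, target) are invented for this example; the publishers' real file formats are not described in this column.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the element and attribute names are invented for
# illustration and do not reflect any publisher's actual format.
SAMPLE = """
<dictionary>
  <entry id="nessus">
    <headword>Nessus</headword>
    <pos>noun</pos>
    <def>A centaur killed by <xref target="hercules">Hercules</xref>.</def>
  </entry>
  <entry id="hercules">
    <headword>Hercules</headword>
    <pos>noun</pos>
    <def>A hero; see also <xref target="nessus">Nessus</xref>.</def>
  </entry>
</dictionary>
"""


def extract_cross_references(xml_text):
    """Return {entry id: [target ids]} for every tagged cross-reference."""
    root = ET.fromstring(xml_text)
    links = {}
    for entry in root.iter("entry"):
        links[entry.get("id")] = [x.get("target") for x in entry.iter("xref")]
    return links


links = extract_cross_references(SAMPLE)
```

Once the cross-references are tagged this explicitly, collecting them is a simple traversal; the human work lies in defining which tags count as links.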
The xrefer Collection
There are more than 50 ready-reference sources in xrefer, and each
one is better than the next. The publishers include Oxford University Press,
Penguin, Bloomsbury, Grove, and Macmillan. That would make any good reference
librarian salivate; the titles themselves would make his or her heart beat
faster. Among the sources you’ll find are such gems from Oxford University
Press as its English Reference Dictionary; Paperback Encyclopedia; and
the dictionary series of Music, Art, Law, Geography, Business, Biology,
Earth Sciences, and Quotations.
Bloomsbury offers its Bloomsbury Thesaurus, Bloomsbury Thematic Dictionary
of Quotations, Bloomsbury Biographical Dictionary of Quotations, Bloomsbury
Guide to Art, and Bloomsbury Dictionary of Contemporary Slang. Macmillan’s
Encyclopedia 2001 and the Macmillan Dictionary of Women’s Biography are
rounded out with much-respected sources from its imprint, Grove: the New
Grove Concise Dictionary of Music and the New Grove Dictionary of Jazz.
If that weren’t enough, Penguin checks in with The Penguin Encyclopedia
of Places, The Penguin Biographical Dictionary of Women, The Penguin Dictionary
of Sociology, The Penguin Dictionary of Psychology, and The Penguin Business
If all this sounds like a dream team, it is. And this isn’t the whole
lineup, just a sampler. Up until now, only British ready-reference sources
were included in xrefer, although xrefer.com has made a deal with Houghton Mifflin
and launched three of its dictionaries, including the excellent American
Heritage Dictionary of Idioms. Incorporating American dictionaries and
encyclopedias opens another wide spectrum for xrefer’s expansion and cross-linking.
It’s quite telling that xrefer reveals more than 200 hits for the term
"Oxford," including many that have entries about someone who studied in
Oxford. The knee-jerk reaction to and the common association with Oxford
is England, Oxford University, education, Oxford University Press, the
Bodleian Library, and well, Cambridge, just as the xreferences would show
you. (See Figure 3.) Many who visit the
xrefer Web site will perhaps also start associating Oxford with Web publishing.
As for Cambridge, next month I’ll discuss it in this column. No, not Oxford’s
archrival, but Cambridge Scientific Abstracts, Inc., which apparently has
implemented a very smart Web-publishing strategy that may take aggregation
and integration to yet a higher level.
Péter Jacsó is associate professor of library and information
science at the University of Hawaii’s Department of Information and Computer
Sciences. His e-mail address is email@example.com.