Information Today
Volume 18, Issue 4 — April 2001
• Internet Insights •
xrefer: An Xcellent British Dot-Com
This free service integrates the best U.K. ready-reference sources
by Péter Jacsó

With the demise of so many dot-com companies in the past year, the term "dot-com" has practically become an expletive that’s acceptable even in the University of Oxford’s revered Bodleian Library (as in, "We lost to Cambridge, dot-com it!"). But don’t British companies use the or suffixes in their URLs? Well, they do, but the best and most savvy ones also have a .com suffix so that their sites will be mirrored in the U.S.—e.g., or

In addition to those two splendid services, there’s xrefer, an excellent, Oxford-based dot-com. This free service masterfully integrates the best British ready-reference sources via cross-references and cross-links. There are many excellent, free American dictionaries and encyclopedias that work mostly in a stand-alone fashion (like the Merriam-Webster Dictionary), and rarely in an aggregated mode. xrefer goes farther than any of the other aggregators of ready-reference sources by integrating, not just aggregating, 50 or so awesome sources through cross-referencing and cross-linking.

Cross-Referencing, Cross-Linking
One of the basic tenets of professional searching is that you must search several sources to get sufficiently comprehensive results. Within a certain discipline, Database A may cover many journals that Database B doesn’t, and vice versa. That’s the reason that more and more professional online services finally offer simultaneous searching of multiple databases, a feature that Dialog introduced nearly 2 decades ago. It’s also the reason for the popularity of the metasearch engines that simultaneously run the query against several search engines, as well as for the fairly wide support of the CrossRef project among publishers that link to each other’s full-text archives. Much of the cross-linking between these archives is based on SFX software, the brainchild of Herbert Van de Sompel, a professor at Cornell who developed it while still a doctoral student. xrefer is a different story, as the company is building this ready-reference mega-database in house from the files supplied by the original content creators, weaving its own local web of cross-references and cross-links.

Complement and Corroborate
In ready-reference services, simultaneously using multiple sources not only enhances the information from one source with further details from another, but also corroborates the facts reported by one source. The site—which deserves praise for pioneering the creation of a high-quality reference suite—offers searching in one fell swoop. Among many others, it contains the most current editions of the American Heritage Dictionary; Brewer’s Dictionary of Phrase and Fable; the just-updated (in February) Columbia Encyclopedia, sixth edition; various Roget’s thesauri; the CIA’s World Factbook; and two quotation dictionaries.

So, a search there about, say, the centaur Nessus brings up an entry about him from the encyclopedia, as well as another about Hercules, a quote from Antony and Cleopatra from Bartlett’s Familiar Quotations ("Now the shirt of Nessus is upon me"), and several entries from the Dictionary of Phrase and Fable, along with irrelevant hits.

The irrelevant hits are retrieved because, for example, your search term appears on the top of an unrelated page showing the previous entry word (the so-called "guide word"). That’s why the entry about "nest" is retrieved for Nessus (see Figure 1)—because it happens to follow the entry about Nessus, even though it’s on a separate Web page.
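The guide-word effect described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the page layout, the field names (guide_word, headword, body), the sample text, and the search_pages function are assumptions made for illustration, not a description of how any of the sites mentioned actually index their pages. It only shows why indexing the whole page, guide word included, retrieves "nest" for a Nessus query, while indexing the entry alone does not.

```python
def search_pages(pages, query, include_guide_word):
    """Return the headwords of pages whose indexed text contains the query.

    Hypothetical model: each page holds one entry plus a "guide word"
    naming the previous entry, as described in the article.
    """
    q = query.lower()
    hits = []
    for page in pages:
        # The entry itself is always indexed.
        text = page["headword"] + " " + page["body"]
        # Indexing the page header too is what produces the false hit.
        if include_guide_word:
            text += " " + page["guide_word"]
        if q in text.lower():
            hits.append(page["headword"])
    return hits


# Toy data, invented for this example.
pages = [
    {"guide_word": "Nerva", "headword": "Nessus",
     "body": "In Greek mythology, a centaur killed by Hercules."},
    {"guide_word": "Nessus", "headword": "nest",
     "body": "A structure built by birds to hold eggs."},
]

whole_page_hits = search_pages(pages, "Nessus", include_guide_word=True)
entry_only_hits = search_pages(pages, "Nessus", include_guide_word=False)
```

With the guide word indexed, both pages match; with the entry text alone, only the Nessus page does.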

Another useful suite, the Infoplease reference collection, doesn’t make this mistake when searching the same Columbia Encyclopedia, but it shows the entry about Hercules twice—for unknown reasons—along with an impeccable definition for Nessus from the Random House Webster’s Unabridged Dictionary. (See Figure 2.)

In xrefer you may also find false drops that are unavoidable, especially in single-word searches, such as those results that list people whose first, middle, or last name is Hercules, not to mention the C-130 Hercules military airplane. However, these false drops are easy to spot in the well-laid-out short-entry results list, so you can skip them and focus on the obviously relevant items. When you look up the detailed entries, you’ll find a list guaranteed to be related to the one you’re looking at, either from the same source or from one of the 50 other sources. The first five related terms are shown automatically along with the source, and the rest may be displayed optionally. Those entries will then have their own related terms, and thus you are taken on an intellectually fascinating journey, from which you can disembark anytime, anywhere.

The key to this guaranteed relevance is that the cross-references are selected by humans: the editors of those dictionaries, encyclopedias, and guides. Assuming that the editors did a good job (and with these sources you can rest assured they did), the users will be offered actual related links. But the editors who decide the related terms for the indexes in the back of the printed editions do so only within the boundaries of the dictionary or encyclopedia they edit, don’t they? Yes, but here’s where the creators of xrefer demonstrate their cleverness. They’ve hatched the idea of creating what they call xreferences (i.e., cross-references and cross-links among various ready-reference sources).

Because the company has filed for a patent, the process of creating the cross-links is not explained. Therefore, I’m only speculating. The digital formats of the dictionaries and encyclopedias probably include the indexes of subject terms, personal names, and location names that typically appear at the back of the printed versions. Finding exact matches, close matches, and see-also references and creating a matrix of these is what computers are meant for.
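To make that speculation concrete, here is a minimal sketch of how a matrix of cross-links could be derived from headwords and see-also references. It is emphatically not xrefer's process, which is undisclosed. The data layout, the source names, the sample entries, and the normalize and build_xref_matrix functions are all assumptions, and "close match" is reduced here to ignoring case and punctuation.

```python
from collections import defaultdict


def normalize(term):
    """Lowercase and drop punctuation so close variants compare equal."""
    kept = (ch for ch in term.lower() if ch.isalnum() or ch == " ")
    return "".join(kept).strip()


def build_xref_matrix(sources):
    """Build cross-links among entries of several reference sources.

    sources: {source_name: {headword: [see-also terms]}}  (assumed layout)
    Returns {(source, headword): sorted list of linked (source, headword)}.
    """
    # Map each normalized headword to every entry that carries it.
    by_term = defaultdict(set)
    for source, entries in sources.items():
        for headword in entries:
            by_term[normalize(headword)].add((source, headword))

    links = defaultdict(set)
    for source, entries in sources.items():
        for headword, see_also in entries.items():
            here = (source, headword)
            # Link to matching headwords and to see-also targets,
            # in the same source or in any other one.
            for term in [headword, *see_also]:
                for target in by_term.get(normalize(term), ()):
                    if target != here:
                        links[here].add(target)
    return {entry: sorted(targets) for entry, targets in links.items()}


# Toy data, invented for this example.
sources = {
    "Encyclopedia": {"Nessus": ["Hercules"], "Hercules": []},
    "Phrase and Fable": {"Hercules": [], "Shirt of Nessus": ["Nessus"]},
}
matrix = build_xref_matrix(sources)
```

In this toy run, the encyclopedia's Nessus entry links to the Hercules entries in both sources, which is the kind of link across sources that an index confined to a single printed volume could not offer.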

Presumably, the special characters in these formats (whether they follow XML, SGML, or any other syntax) do more than control the typography in printing and denote the function of a word (adjective, verb, etc.) within the entries; they can also enhance this relationship matrix significantly. Again, extracting and mapping cross-references and cross-links from 500,000 entries is a piece of cake for software, once the algorithm is defined by humans. Obviously, the more ready-reference sources xrefer licenses, the larger, smarter, and richer this matrix of cross-references and cross-links will be. And the collection is already a jolly good one.

The xrefer Collection
There are more than 50 ready-reference sources in xrefer, and each one is better than the next. The publishers include Oxford University Press, Penguin, Bloomsbury, Grove, and Macmillan. That would make any good reference librarian salivate; the titles themselves would make his or her heart beat faster. Among the sources you’ll find are such gems from Oxford University Press as its English Reference Dictionary; Paperback Encyclopedia; and the dictionary series of Music, Art, Law, Geography, Business, Biology, Earth Sciences, and Quotations.

Bloomsbury offers its Bloomsbury Thesaurus, Bloomsbury Thematic Dictionary of Quotations, Bloomsbury Biographical Dictionary of Quotations, Bloomsbury Guide to Art, and Bloomsbury Dictionary of Contemporary Slang. Macmillan’s Encyclopedia 2001 and the Macmillan Dictionary of Women’s Biography are rounded out with much-respected sources from its imprint, Grove: the New Grove Concise Dictionary of Music and the New Grove Dictionary of Jazz.

If that weren’t enough, Penguin checks in with The Penguin Encyclopedia of Places, The Penguin Biographical Dictionary of Women, The Penguin Dictionary of Sociology, The Penguin Dictionary of Psychology, and The Penguin Business Dictionary.

If all this sounds like a dream team, it is. And this isn’t the whole lineup, just a sampler. Up until now, only British ready-reference sources were included in xrefer, although xrefer made a deal with Houghton-Mifflin and launched three of its dictionaries, including the excellent American Heritage Dictionary of Idioms. Incorporating American dictionaries and encyclopedias opens another wide spectrum for xrefer’s expansion and cross-linking.

It’s quite telling that xrefer reveals more than 200 hits for the term "Oxford," including many that have entries about someone who studied in Oxford. The knee-jerk reaction to and the common association with Oxford is England, Oxford University, education, Oxford University Press, the Bodleian Library, and well, Cambridge, just as the xreferences would show you. (See Figure 3.) Many who visit the xrefer Web site will perhaps also start associating Oxford with Web publishing. As for Cambridge, next month I’ll discuss it in this column. No, not Oxford’s archrival, but Cambridge Scientific Abstracts, Inc., which apparently has implemented a very smart Web-publishing strategy that may take aggregation and integration to yet a higher level.

Péter Jacsó is associate professor of library and information science at the University of Hawaii’s Department of Information and Computer Sciences. His e-mail address is

© 2001 Information Today, Inc.