The Ebb and Flow of Reference Products
Volume 38, Number 4 - July/August 2014

The More Things Change …

Controversies about accuracy have swirled for a long time. A very traditional authoritative tool, the Dictionary of American Biography, for years included an entry based on a hoax book claiming a nonexistent diary of Horatio Alger. Although the hoax was corrected in later supplements to DAB, the printed indexes crossing the main work and supplements were difficult to track and use. Recent controversies have surrounded the Diccionario Biográfico Español and its treatment of Franco supporters. A comparison of accuracy in science topics in Wikipedia vs. the Encyclopedia Britannica, published in Nature in December 2005 [1] with several follow-ups, reported a small and similar number of serious errors in both encyclopedias. All reference sources are biased, and all will have some errors as long as humans have any involvement in their creation. Errors (or hoaxes or biases) may be noted, commented on, and corrected more quickly and effectively in the digital era, but long-time observers are not ready to declare the erosion of quality based on quantity of inaccuracies. The biggest difference between free, socially managed sites and editorially controlled sites lies in the frequency of data correction and, for better or worse, in the stability of content presented.

Publishers have used the "updated" label with no or minimally updated content in both print and digitized sources. The Johns Hopkins Guide[s] to Literary Theory and Criticism in 2001 and 2005 provide evidence of minimal change across editions. Most language dictionaries fail to provide much added value in subsequent editions.

Products change publisher or change names with the same publisher. For example, Passport GMI is still recognized by many librarians under its old name of Euromonitor. URLs often change through sale to another publisher or reorganization of server addresses. The internet is the ultimate speedy and affordable vanity press, but the quality of content in vanity press items is probably not different across the print vs. online eras.

Volatility in the publishing industry may contribute to an appearance of erosion among reference sources. Titles change publishers; a recent notable example is the Statistical Abstract of the United States, a victim of government agency cost-cutting, picked up by ProQuest to the gratitude of many librarians. But these economic realities affect all publications, not just reference sources.

Veteran collection managers can point both to presses they no longer buy on autopilot and to presses that now command attention in a subject area. So what may seem like a reduction in quality is often merely a moveable feast independent of the print vs. online debate. We will continue to count on expert librarians to guide the rest of us in tracking this evolution.

Grey literature still exists in the online world. Depending on one's frame of reference, informal items such as blogs that are not covered by traditional indexing and abstracting services, or any indexing or full text locked behind a paywall, can be considered "grey." Some users consider any item not online as "grey" or, another favorite label, "fugitive."

Proper indexing of content plays a major role in determining whether a reference source is useful and therefore valuable. An excellent controlled vocabulary managed and applied by humans is considered by many as an indicator of a quality reference work. Machine indexing of uncontrolled full-text keywords is easy to produce but does not always provide the subtlety needed or desired, such as with The New York Times. Errors creep in with either situation.

When publishers share their metadata with discovery tool producers, some of the problem of siloed information, trapped in the platform of each publisher, subsides. However, the haystack issue rears its head again, at the expense of the nuance from product-specific schema. If publishers do not share metadata, discovery tool users will not connect to those products. Then, as usage figures drop, publishers may see libraries cancel subscriptions.

Keeping track of which reference sources are available to a user community—print or online, free or owned/licensed, by subject—is increasingly difficult. LibGuides work better than catalogs for easily managing tools in a variety of formats, but uneven quality across guides may result when the task is assigned to multiple topic specialists. Challenges for libraries include choosing whether to include records for free sources in the catalog, managing record dumps if available for titles in packages, managing titles for which records are not available, tracking whether records are available in the catalog or in the discovery system but not both, and publicizing the resources offered. Sources may be useless if our users can't link into them.

The Future of Reference Sources

Studies such as "Quality of Health-Related Online Search Results" (Kitchens, Brent, Christopher A. Harle, and Shengli Li, Decision Support Systems, Vol. 57, January 2014, pp. 454–462) identify low-quality web sources. Alert publishers at all levels—commercial, governmental, other not-for-profit—can track similar studies to find topics that merit production of higher-quality reference sources.

Many librarians would like to recommend a blend of human and machine indexing for best results. Human indexing includes the development, maintenance, and application of a subject thesaurus or controlled vocabulary to link across synonyms and concepts. Structured wayfinding is critical, and breadcrumbs are useful. Most products would benefit from continuous improvement, asking a small focus group to quickly recommend the options to include in drop-down and limit menus. For example, Knovel needs a way to limit to dictionary entries (Figure 11, p. 51), and Oxford Reference obscures its format limits to the novice user (Figure 12, p. 51). Linked metadata and semantic indexing may help overcome some of the hurdles of poorly developed sources. Indexing of blogs, where comments matter, and linking of blog entries to and from the original full text under discussion would assist users in tracking context and error corrections.

Linking across platforms should increase. The new Web of Science platform provides a "look up full text" button that links into Google Scholar, and Google Scholar results can be set to provide links into the Web of Science (Figure 13 and Figure 14, p. 51). These seamless linkages provide users with opportunities to flow across sources and perhaps land in better-vetted ones without noticing.

Finding and reusing data is the next frontier in reference. Semantically embedded data is getting indexed in ProQuest's "deep indexing" of figures and tables (Figure 15, p. 51) and SpringerImages. Both offer a fabulous way of bringing out data included in journal articles. Adding data analysis tools would assist users in reaching conclusions based on selected data found in reference sources. Publishers must build and manage local and meta-indexes of data and documents stored in institutional and discipline-based repositories. Next on the list is a method of indexing the externally linked data stored in supplementary files or in data repositories, as well as the data embedded in the article. Retrieving isolated data presents the same contextual issues for users that we see with keyword searching. Data is most useful when available in both human- and machine-readable formats for findability and portability.

Developing content and interfaces to facilitate access from mobile devices is critical to publisher longevity. In addition, publishers must allow remote authentication to work with apps so users do not have to pay for material in an app available to them through a library website, such as the Oxford dictionaries. If users can't read authoritative content on their mobile devices, they will turn to other content that is readily viewable. Adding these functionalities will help keep online reference sources useful and used in the future.

Final Thoughts

As seen above, newer and better types of reference products are possible in the online environment. Librarians are counting on creative minds to continue developing new reference tools and added functionality.

As always, users must apply their own standards of credibility and trust in an information source. Librarians can only hope that users will continue to prefer authoritative and vetted sources for higher-education and research-level activities. And of course, we continue to hope that users will remember to turn to their librarians when frustrated or, preferably, before, when they require an authoritative source. The challenge for librarians is to continue educating users in how to recognize quality and authority when it really counts.

So, has the quality of reference sources eroded in the online era? The answer has to be a resounding "Yes and no." Accuracy aside, the linking functions, the ability to access sources easily, and the opportunities to point out and correct errors quickly and thoroughly weigh on the side of enhancement rather than erosion. Machine indexing, semantic tools, and linkages will result in increased findability and perhaps offer more options for users to catch and report discrepancies. Echoes of "Why didn't I know about these sources earlier?" will continue to haunt librarians. But for those occasions when facts are required, reasonably accurate items are still findable and widely available to our users.


1. Giles, Jim, "Internet Encyclopaedias Go Head to Head," Nature, Vol. 438, Dec. 15, 2005, pp. 900-01. See also the Supplementary Information with the test articles and reviewers' comments, the response by Encyclopedia Britannica, Inc., and the further response by Nature.

Denise Beaubien Bennett is an engineering librarian at the University of Florida’s Marston Science Library. In her spare time, she serves as both the general editor of the Guide to Reference and the division editor for the Science-Technology-Medicine sections of the Guide.


