Searcher
Vol. 9 No. 5 May 2001
FEATURE
Playing Twenty Questions to Test Low-Cost, Free, or Subscription Databases for End-User Online Service
Nicholas G. Tomaiuolo Reference Department, Elihu Burritt Library
Central Connecticut State University
Since 1999, when I last looked at low-cost alternatives to traditional databases, libraries have increased their spending on databases from $17 million to over $50 million per year [1]. Depending on the individual library's mission, these databases advance research, consumer awareness, patient care, financial knowledge, and general education, among other things.

Several years ago database producers began putting the power of searching, document identification, and document delivery in the hands of end users, and that trend continues. Everyone can search ERIC, PubMed, CARL UnCover, and other bibliographic services without charge, use the bibliographic information, and selectively pursue the full text on their own, either through the search service (e.g., CARL) or at their libraries. The phenomenon of promoting periodical databases to end users generates two interesting possibilities. Libraries could easily make these databases available to their patrons by linking to them on their Web pages and so avoid the expense of database subscriptions. The other possibility is that end users will discover these services and build their own virtual libraries at their desktops, circumventing the library. The debate that remains, however, is how well the low-cost/free Web services compare with the traditional electronic database resources that most librarians prefer.

Some readers may be suspicious of free services. How long will these services be around? Are the services experiments by database producers to "hook" end users and subsequently sell them the services? For example, will Britannica.com go from free to subscription due to the online advertising slump reported by newsbytes.com on March 13th? That's what we hear. To what degree are their publication coverage and currency stable? Well, it shouldn't really matter. If a resource is free today, and the end user's query warrants using it, then we should not quibble over the particulars of the resource. Naturally, librarians may not wish to set themselves up for the dilemma of how to handle their patrons and their budgets if a free resource goes south or begins charging, but opportunities are meant to be exploited. So let's get that free bibliographic information or full text while we can.

It's well known that CARL, ERIC, Electric Library, and PubMed are free to search, and that each provides a reliable search engine and displays results in usable bibliographic citation formats. Northern Light's Special Collection, another widget in the searcher's toolbox, has recently been joined by three other free/low-cost services: FindArticles.com, Contentville, and XanEdu. To determine the comparative effectiveness of these bibliographic/full-text services with each other and with traditional subscription databases, I played my own version of "Twenty Questions." Using my 20 test questions, I measured how these services stood up in terms of searchability, overlap of results, and "bang for the buck."

These 20 questions were searched against each service. I discovered that the results depended on the search interfaces, capability of the search engines, depth and breadth of journal coverage, and other assorted variables. But, in conducting the searches, I tried to fully deploy all of each service's search capabilities to ensure that the strengths of each service were exploited. The search results are intended as a rough guideline to the usability of the services. Table 1 (below) provides information concerning the features of each database.
 

Database Assessments

Category #1: Free Searching and Free Full Text

FindArticles
FindArticles, the only no-strings-attached free service, covers 300 "reputable" magazines and journals dating back to 1998, providing full text for almost every article (I have occasionally gotten hits where no article was viewable). The data comes from a selective feed of Gale Group's InfoTrac articles. Gale Group's own InfoTrac service covers far more titles and offers deeper archival coverage even of the same sources.

Although it has been around for over a year, FindArticles still seems like a prototype effort, which may account for the quirkiness of its search engine. I read the "Search Tips" carefully, but the efficacy of my searches was, more often than not, guided by my intuition rather than by the well-written but, sad to say, often erroneous search help documents.

For example, when reading the help for Boolean searching, I saw a reference to an Advanced Search option. Unfortunately, I couldn't find this option, and my e-mail to FindArticles elicited a form reply that referred me to its FAQ. The "Search Tip for Boolean Searching" states that entry of WORD1 +WORD2 (e.g., airbag +safety) mandates that WORD2 must appear in results with WORD1. However, this didn't seem to be the case. Figure 1 shows I retrieved 61,724 hits for a search formatted in obedience to FindArticles' help screen. Figure 2 shows 399 were retrieved with the entry +WORD1 +WORD2 (e.g., +airbag +safety), a strategy that FindArticles' help did not mention. This means that if the end user does not understand basic searching nuances, the retrieval will be confounding. [Professor Péter Jacsó, who provided a comprehensive review of this database [2], zeroes in on another FindArticles software snafu in "Software Makes LookSmart LookDumb" from January 2001's Information Today.]
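One plausible reading of the discrepancy is that the engine treats an unmarked term as optional and only a "+" term as required, so that airbag +safety matches every document containing "safety." The Python sketch below illustrates that convention on a tiny made-up corpus. It is an assumption about how such an operator can behave, not a description of FindArticles' actual software; the `DOCS` corpus and the `search` function are hypothetical.

```python
# Hypothetical illustration: "+term" is required, a bare term is optional.
# This models one common "+" convention; it is NOT FindArticles' actual code.

DOCS = {
    1: "airbag safety recall announced",
    2: "playground safety guidelines",
    3: "food safety inspection results",
    4: "airbag deployment injuries in children",
}

def search(query, docs=DOCS):
    """Return sorted ids of documents matching a +required/optional query."""
    terms = query.lower().split()
    required = {t[1:] for t in terms if t.startswith("+") and len(t) > 1}
    optional = {t for t in terms if not t.startswith("+")}
    hits = []
    for doc_id, text in docs.items():
        words = set(text.lower().split())
        if not required <= words:
            continue  # a required term is missing
        # With no required terms, at least one optional term must match.
        if not required and not (optional & words):
            continue
        hits.append(doc_id)
    return sorted(hits)

broad = search("airbag +safety")    # only "safety" is mandatory
narrow = search("+airbag +safety")  # both words are mandatory
print(len(broad), len(narrow))
```

Under this convention the first query returns every "safety" document, while the second returns only documents containing both words, which is the same direction of difference as the 61,724 versus 399 hits reported above.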

As a searching purist, I'd say if database software doesn't perform well, it isn't worth a penny; but one can hardly disregard a database that doesn't charge a cent for access to articles in the Journal of the American Academy of Child and Adolescent Psychiatry, American Journal of Economics and Sociology, Online, Lancet, Topics in Early Childhood Special Education, National Review, and the New Statesman (not to mention Wrestling Digest and Trailer Life) because of a couple of quirks. As we all know, even the most expensive subscription databases have bugs.
Testing, Testing

The categories:

Free

FindArticles
[http://www.findarticles.com]

Free Searching/pay-per-view full text

Northern Light Special Collection [http://www.northernlight.com/power.html]

Contentville
[http://www.contentville.com]

Subscription for end users

XanEdu
[http://www.xanedu.com]

Library subscription favorites

Ebsco Academic Search Elite 
[http://www.ebsco.com]

Gale Group's InfoTrac Expanded Academic ASAP
[http://www.galegroup.com]

The searches:

Arts and Humanities

  • The moai of Easter Island
  • Ship of Theseus problem
  • "Eyes wide shut" and Traumnovelle
  • Funding for the arts
  • Doppelganger in film
  • Staging Shakespeare's plays
Social Science
  • Classroom activities for gifted children
  • Domestic violence and alcohol
  • Andrew Johnson's Impeachment
  • Grade inflation in higher education
  • Online shopping and privacy
  • What do lottery winners do with their prize money?
  • Sanctions against Iraq
Science
  • Methylphenidate and children
  • Drilling in ANWR
  • Cell phones and cancer
  • Responsibility for the Shuttle Challenger disaster
  • Needlestick accidents
  • Air bags and death
  • Whistleblowing

Pros
Free service
Full text

Cons
Cantankerous search engine
Occasionally bewildering retrieval
Annoying "ads" accompany search and results pages
 

Category #2: Free Searching and Pay Per View

Contentville
With an array of titles, including American Demographics, Guns and Ammo, InQuest Gamer, Business Week, the New Yorker, Wired, Smithsonian, and Rolling Stone, a searcher can find articles by P. J. O'Rourke, Paul Theroux, and Stephen King on this service. Speeches by Winston Churchill, Edward Everett, Tipper and Al Gore, Victor Hugo, and Daniel Webster, among many others, make Contentville an attractive resource for undergrads looking for "primary sources" (see Figure 3). Most individual items are available for immediate download and cost $2 or $3 (charged to a credit card). Additional features include access to television transcripts and screenplays, but these items are more expensive.

One problem with this service is its spotty journal coverage. I know Steve Martin has written for the New Yorker, but I could not find any of his pithy contributions through Contentville. Of course, Contentville has received a fair amount of publicity recently in advertising its agreement to work within current Tasini case rulings and not use freelance articles, regardless of publisher feeds, without author permission. Perhaps Mr. Martin has his own Web site sales program. Similarly, I would have expected more than 11 hits on my "RU486" search. The 11 retrieved, however, came from the Economist, Time, Family Planning Perspectives, the Progressive, and Science News, which suggested balance in its magazine coverage.

Contentville probably would not rank as a top contender for the end-user market. Pay-per-view rival Northern Light's broader content base and competitive charges make it superior. Even FindArticles, despite its erratic search software, provides more value. Undergraduates or consumers may find Contentville useful, particularly because of its coverage of specific interest magazines (for example, All About Beer and the Alfred Hitchcock Mystery Magazine).

Pros
Inexpensive pay per view
Eclectic periodical list

Cons
Limited search engine
Shallow coverage of listed periodicals (i.e., questionably selective)
 

Northern Light Special Collection
With its aggressive commitment to offer resources for affordable online viewing, Northern Light's publication list has grown by 40 percent to 7,100 titles since early 1999. The bulk of its Special Collection comes from titles supplied by Gale Group and Bell and Howell Information and Learning, although Northern Light has increased direct dealings with publishers, especially in building up its news wire and news media feeds. As one of the innovators in free searching and pay per view, its charges are much more reasonable than UnCover's. For example, an article I wrote for Searcher some time ago costs $14.50 for fax or desktop delivery via UnCover. Northern Light charges $2.95 for the same article. (FindArticles has it for free and Contentville hasn't heard of me, yet.)

To use Northern Light's Special Collection, go straight to its "Power Search" option and select the "Special Collection" radio button, or click on http://www.northernlight.com/power.html. Not all the publications are journals; the Special Collection includes reference books, wire stories, television transcripts, college newspapers, and industry reports. In fact, a great deal of the retrieval for many of my 20 questions came from newspapers and wire reports, though this isn't always the case. Northern Light used to offer a way to "de-select" certain categories of information, including news wires and newspapers. Northern Light dropped that feature for some reason, sad to say, but a Northern Light representative with whom we spoke said that the company was considering re-instating it due to user complaints.

In keyword mode, the end user is searching the full text of the articles in the Special Collection. This is occasionally problematic and sometimes results in bewildering retrieval, because no proximity operators are available, although you may search phrases in quotes. A title search is more direct, but the end user may miss some relevant documents. Notice the respectable titles Northern Light retrieved for a search on "methylphenidate and children" in Figure 4.

The most valuable aspect of Northern Light's service is that it can be searched without charge and that the bibliographic citations retrieved are sufficient to locate the article, again without charge. Even if you choose not to pay Northern Light to read an article you find, you still get a citation good enough to tap a library's collection. When the end user chooses to buy, the document normally costs between $1 and $4. To get started with Northern Light, one needs to register for a member account; this requires a credit card. Another great feature is the free "Alerts" that end users can easily set up.

Pros
Inexpensive pay per view
Large database of publications to search against

Cons
Search against full text sometimes yields irrelevant hits; no proximity operators
Title list includes numerous non-scholarly sources
 

Category #3: Subscriptions for End Users

XanEdu
XanEdu from Bell and Howell Information and Learning (BHIL) markets its ReSearch Engine, MBA ReSearch Engine, CoursePaks, LitPacks, and CasePaks to undergraduates, graduate students, and faculty. Three-month subscriptions cost $19.90, 6-month subscriptions $29.90, and a complete yearly subscription, $49.90. In effect, as one of its Web pages states, the "ReSearch Engine gives you access to millions of full-text articles from thousands of the world's leading magazines, newspapers, and scholarly journals. It's current, accurate, and comprehensive information you can access anywhere, anytime."

As a BHIL product, XanEdu's "look and feel" mirrors ProQuest products, as does most of its content. But whatever happened to journal lists? As a librarian who pays attention to what indexes and databases claim to cover, I feel that XanEdu's lack of a periodical list is peculiar. This is a conspicuous oversight; Ebsco, InfoTrac, Northern Light, Contentville, and FindArticles all list covered publications.

Nonetheless, retrieval from a XanEdu search usually contains references from respected journals or newspapers. A search on the critical-thinking topic "reflection in action" resulted in full text from Journal of Management Inquiry, Teaching in Higher Education, British Journal of Nursing, Current Anthropology, Christian Science Monitor, Social Work, Journal of Multicultural Counseling and Development, Personnel Psychology, and American Behavioral Scientist.

After the subscriber logs in, the XanEdu ReSearch engine defaults to a "Topic Find" form. If the searcher enters "sanctions and Iraq" in this form, for example, the engine retrieves a directory-like page which, as Figure 5 shows, seems somewhat less than helpful for either an experienced searcher or a novice information seeker. But in my Google search for information about XanEdu, I found a professor's informational Web page mentioning that this is exactly how a teacher would want students to explore a topic.

On one of her Web pages, Professor Dawn Rodriguez (University of Texas at Brownsville) wrote:

One of my students was interested in learning more about crime in the United States, a broad topic, but one of interest in an election year. With XanEdu Research Engine, she was able to explore sub-topics until she found an area of interest to her. She clicked on social science, then criminal justice, then criminal punishment, then juvenile justice. There, she saw that there were collections of articles on each of these topics: Delinquency prevention, history of juvenile justice, juvenile corrections, juvenile court processes, juvenile delinquency, juvenile defenders and police, juvenile probation and parole. Juvenile court processes appealed to her. By skimming the list of titles that the search engine located, she got a good sense of the issues and the range of views on this topic: "Courts designed to stop teens at one mistake," "Punishing choices: how to try teens convicted of major crimes," and "Juvenile justice failures shame our judicial system."

From the list of "hits," she chose an article that provided her with an overview of varying viewpoints about the issue: "Just Punishments: Federal Guidelines and Public Views Compared" from Contemporary Sociology. This article gave her a sense of some important issues in the field. Also, the list of subject terms listed at the top of the article included some terms she could use for continued research. Also, since this article was from an authoritative source (a journal in the discipline, not just a newspaper article), she knew that I would be pleased with her choice. [3]

To paraphrase the instructor, it would be a better strategy to search for "Civil War," find the topic, and explore the possibilities than to search for "Antietam Bridge." This is certainly logical if the end user hasn't determined a topic.

If you've already decided on a specific topic, however, you'll soon discover XanEdu's only prominent gaffe. It takes a bit of poking around to find the keyword search screen; it doesn't appear on the XanEdu home page. As Professor Péter Jacsó stated, "XanEdu was designed for topical searches through controlled vocabulary, but not for author, journal name or plain keyword searches." Jacsó continued, "The idea of controlled vocabulary searching is noble, but the implementation is not user-friendly." [4] Actually, from the "Topic Find" or "Advanced Find" form, the user need only click on the first directory path, and the Keyword Search option will appear along with the first citation. Plus, if one is "lucky" enough and the ReSearch Engine can't pick up the specified keywords in a directory category, it will go directly to the Keyword Search form. As Figure 6 shows, a keyword search on "Iraq and sanctions" provides more straightforward results than a topic-oriented search.

At $49.90 per year, the XanEdu ReSearch Engine is an excellent value for undergraduates and graduate students. Searchers will find full text, in many cases accompanied by graphics, or full-page PDF articles from a wide range of publications, including many daily newspapers, magazines, and scholarly journals. Librarians might note that since individuals may purchase access so cheaply, it raises the question of whether to maintain a smorgasbord of pricey menu items such as Lexis-Nexis Academic Universe and Ebsco Academic Search Elite.

Pros
Relatively inexpensive subscriptions
All articles are full text
Robust search engine

Cons
Retrieval often includes more newspapers than journals
Keyword search not immediately evident
Retrieval "maxes out" at 50 hits
 

Category #4: Library Subscription Favorites

EbscoHost Academic Search Elite 
Gale Group's InfoTrac Expanded Academic ASAP
Undoubtedly, if the searching budget allows thousands of dollars for subscriptions to full-text databases with quality search software, the traditional subscription database vendors are preferable for institutions. The cost-benefit of subscribing to these traditional services should be compared, at least for evaluative purposes, with individual access to new, less-expensive alternatives that are rapidly emerging.

Having used EbscoHost's Academic Search Elite for several years, I have never been excessively impressed. Why, when trying to view an article for a retrieved citation tagged with a "full text icon," should the screen inform the searcher "Full Text rights to this article have not been licensed to EBSCO. For more information, please contact [the publisher]"? Of course, this rarely happens, but when you've paid for access to full text, it doesn't seem fair that you should have to go down that path. Sure, it covers 2,550 publications (1,650 peer-reviewed). Yet I still see students performing basic searches on the topics du jour (e.g., "capital punishment," "medicinal marijuana") and finding full-text articles only one-third of a page in length from peer-reviewed publications such as the Canadian Medical Association Journal or three-quarters of a page long in the Humanist. Though interesting, these items are usually insubstantial.

Gale Group's InfoTrac Expanded Academic ASAP is an excellent, fully appointed alternative to Ebsco's Academic Search Elite. Articles from more than 1,500 scholarly, trade, and general-interest publications are indexed. The search help is particularly well written, and I could not find any significant anomalies in the search software.

Is access to Academic Search Elite or Expanded Academic ASAP worth the price? Both are solid products for public and university libraries. Yet one should consider the percentage of overlap that less expensive services such as XanEdu or pay-per-view databases might offer.

Pros
Reliable results due to solid search engines
Numerous major and minor publications

Cons
Expensive; not viable for the end user
 
Questia: A New Player
Questia made its debut as predicted in January 2001. It currently describes itself as a research service accompanied by a core liberal arts collection of around 50,000 digitized books. The service markets to college students. No journal articles are available yet, but Questia says that it has signed up a number of university presses, such as Oxford, Harvard, and Chicago. When Questia does open up journal access, it promises to include older volumes.

Currently users can explore Questia by topic, author, or title and highlight text, create margin notes, compile bibliographies, and save work in folders. Questia has advertised that its book collection has been carefully selected by 10 collection development librarians, who have "hand-picked" the most valuable texts in each discipline. However, a good part of the collection appears to be out-of-print book titles from university and other publishers. A search can have varied results. For instance, searching the words Budapest, Hungary yielded a couple of useful books, such as Budapest, a Cultural Guide (Oxford, 1998) by Michael Jacobs, but clicking on the link "more like this" provided only the same title. An author search on Graham Greene resulted in six interesting hits, but only one was written by Graham Greene.

A search on Charles Darwin as subject provided some useful texts but left out important contributions by such authors as Michael Ruse and Ernst Mayr. A search for Thomas Hardy's Return of the Native led to only one title, a book about making the film.

The real reason some titles don't appear is probably that publishers would not grant access. Displaying text must be done page by page in a tiny window that can open very slowly. One page can be printed at a time, but not downloaded. Questia has described its service as a "complement" to college libraries and that seems accurate. One hopes students realize this and don't mistake it for a complete academic collection.

Questia is currently engaged in a mega-promotion effort. Multiple ads are appearing in university newspapers. Deans receive solicitations to allow free trials for honors students. Questia has also designed its home screen with students in mind. Take a look at the "Question Marquis in his e-boudoir" on the left side of Figure 7. He wants to spend more time on "passion and dueling" and less time on research, so he uses Questia. Is this kind of silliness going to appeal to college students? It remains to be seen.

Raw Retrieval Data
As most searchers can see from the information presented in Table 2, wide disparities exist in retrieval from these six services. I made every effort to search the databases as efficiently as possible, but in many cases irrelevant items were retrieved, as well as an abundance of results from sources that proved less than comprehensive (e.g., brief newspaper articles). Although I discussed FindArticles' cranky software, I believe I performed my searches accurately: when 3,004 hits were reported for a search on "funding for the arts," many of the items on the first few pages of results that I looked at were relevant!

Table 2 shows that FindArticles may provide an acceptable free alternative, and it is definitely a service that the penurious will want to explore. After all, it always found something on the topics I searched.

Contentville, conversely, was not so comprehensive. It retrieved five or fewer hits in 50 percent of the searches. Of course, Contentville's subtitle is "The Cross-Content Search," and valuable items such as transcripts and speeches were retrieved also (but weren't counted in this evaluation). The attractive aspect of Contentville lies in the niche publications it has decided to include.

Northern Light's Special Collection performs well. The downside of this service is that the Special Collection seems to embrace as many transitory newspaper and wire reports as journal sources. Take the last search on "sanctions and Iraq." Page after page of results from this set came from publications such as the Bergen County Record, the Dayton Daily News, Florida Today, and the Providence Journal. Fine newspapers perhaps, but hardly political science journals. A suggestion might be to separate the "Special Collection" into various subdivisions. The "Custom Search Folders" into which the system automatically collates retrieval proved, at times, inefficient in delineating what types of documents were located within the folders.

The XanEdu numbers may be somewhat misleading because it never displays more than 50 hits per keyword search. But its retrieval was substantive. Of course, it also uses many news sources: during my search on the impeachment of Andrew Johnson, USA Today, Chicago Sun-Times, Gannett News Service, and the San Francisco Chronicle showed up a fair number of times. But in the same results I saw references (and full text) from the Yale Law Journal, Presidential Studies Quarterly, the New Republic, the Nation, the Spectator, and Newsweek.

Ebsco and InfoTrac worked as expected. Both are worthwhile subscription databases and, while there isn't a great deal of redundancy between the two (at least upon inspection of the first 10 to 20 hits from each search), both usually retrieve an ample number of citations/articles for most purposes. Please note, however, one conspicuous inconsistency. InfoTrac found 90 articles on "classroom activities for gifted children," but Ebsco found only seven. Other than that case, the two services generally run "neck and neck."
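The tallies behind these observations can be re-derived from Table 2. The Python sketch below copies the hit counts as printed in the table (columns in the order FindArticles, Contentville, Northern Light, XanEdu, InfoTrac, Ebsco) and counts how often Contentville returned five or fewer hits and how often XanEdu reached its 50-hit display ceiling. The variable names are illustrative only.

```python
# Hit counts transcribed from Table 2, one tuple per test question, in the
# order: FindArticles, Contentville, Northern Light, XanEdu, InfoTrac, Ebsco.
RESULTS = {
    "Doppelganger in film": (30, 0, 132, 7, 137, 132),
    "Eyes Wide Shut and Traumnovelle": (5, 1, 66, 9, 29, 24),
    "Funding for the arts": (3004, 117, 216, 50, 665, 824),
    "Moai": (4, 1, 42, 14, 19, 24),
    "Ship of Theseus": (3, 0, 7, 4, 16, 16),
    "Staging and Shakespeare": (174, 7, 10, 50, 68, 58),
    "Airbags and death": (23, 8, 18, 32, 10, 14),
    "Drilling and ANWR": (53, 5, 7, 50, 61, 24),
    "Methylphenidate and children": (222, 12, 4, 50, 61, 24),
    "Needlestick accidents": (20, 21, 88, 26, 20, 21),
    "Cell phones and cancer": (109, 5, 37, 19, 11, 13),
    "Shuttle Challenger/responsibility": (13, 0, 145, 30, 8, 4),
    "Andrew Johnson's Impeachment": (47, 11, 15, 50, 16, 60),
    "Classroom activities for gifted children": (39, 0, 45, 11, 90, 7),
    "Online shopping and privacy": (1474, 10, 6, 50, 39, 21),
    "Whistleblowing": (31, 11, 62, 50, 107, 81),
    "Grade inflation in higher education": (69, 3, 132, 28, 4, 5),
    "Lottery winners/prize money": (134, 4, 23, 9, 9, 16),
    "Domestic violence and alcohol": (218, 1, 11, 50, 46, 37),
    "Sanctions / Iraq": (762, 214, 1168, 50, 1098, 1062),
}

FINDARTICLES, CONTENTVILLE, XANEDU = 0, 1, 3  # column positions

total = len(RESULTS)
contentville_sparse = sum(1 for row in RESULTS.values() if row[CONTENTVILLE] <= 5)
xanedu_capped = sum(1 for row in RESULTS.values() if row[XANEDU] == 50)
findarticles_min = min(row[FINDARTICLES] for row in RESULTS.values())

print(f"Contentville, five or fewer hits: {contentville_sparse} of {total}")
print(f"XanEdu at the 50-hit ceiling: {xanedu_capped} of {total}")
print(f"Fewest FindArticles hits on any question: {findarticles_min}")
```

Counted this way, Contentville returned five or fewer hits on 11 of the 20 questions, roughly the half reported above; XanEdu reached its ceiling on 9 questions; and FindArticles never came back empty.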

Another variable I attempted to assess was overlap in retrieval among the databases. In particular, I wanted to ascertain whether end users could find enough material on the low-cost or free services to fulfill baseline research needs. Naturally, the high-profile traditional databases should fare better, but can end users find substantive resources through the other databases under discussion? One way to test the efficacy of the inexpensive services is to compare their retrieval to each other and to their traditional counterparts.

The publication lists are so diverse that a scholar might want to use all the available services. The first 10 hits for "airbags and death" from Northern Light's Special Collection consisted entirely of newspaper articles. Even FindArticles' first 10 hits included more research-oriented material, including articles from Chest and Lancet.
 

Conclusion
Why spend $10,000 to $20,000 per year for a full-text database when you and your clientele can perform searches and view the documents for free (FindArticles), at a small pay-per-view charge (Northern Light), or for a nominal subscription (XanEdu)? Naturally, it depends on the needs of the end user, but more options are emerging, and it "pays" to look at all of them. Northern Light, FindArticles, and XanEdu are attracting end users, and so is Questia, a company that heavily targets end users in its advertising. [See sidebar above and Figure 7.] Will we see more of these services? With the dot-com burnout still underway, the answer to that question remains very uncertain. After all, Britannica.com has announced that it plans to go back to charging. End users would be wise to use these services now while they have them, and information professionals should expect the opportunity for real bargains to persist, burnout or no burnout.
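A rough break-even calculation makes the opening question concrete. The Python sketch below uses only figures quoted in this article: $10,000 to $20,000 per year for a library database, $2.95 for a Northern Light article, and $49.90 for a year of XanEdu. Treating $2.95 as a typical per-article price is an assumption, since Northern Light documents normally cost between $1 and $4; the function name is illustrative.

```python
import math

LIBRARY_SUBSCRIPTION_RANGE = (10_000, 20_000)  # dollars per year, from the text
PAY_PER_VIEW_PRICE = 2.95   # assumed typical Northern Light article price
XANEDU_ANNUAL = 49.90       # XanEdu yearly subscription per end user

def break_even_units(budget, unit_price):
    """Largest whole number of units the budget buys at unit_price."""
    return math.floor(budget / unit_price)

for budget in LIBRARY_SUBSCRIPTION_RANGE:
    articles = break_even_units(budget, PAY_PER_VIEW_PRICE)
    subscribers = break_even_units(budget, XANEDU_ANNUAL)
    print(f"${budget:,}: {articles:,} articles or {subscribers} XanEdu subscriptions")
```

On these assumptions, a $10,000 subscription costs about as much as 3,389 pay-per-view articles or 200 individual XanEdu subscriptions a year, and a $20,000 subscription about twice that.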

Note: Figure 8 illustrates an "end user's free and low cost research page." This page is available at http://library.ctstateu.edu/~bibman/enduseroptions.html.


 

Table One: Overview of Services
Service                             Help  Journal List  # pubs  Full-text  Subscribe  PayPerView  Free Search  Full Cit. with free search
Contentville                        yes   yes           500+    yes        no         yes         yes          no
Ebsco Academic Search Elite         yes   yes           2550    mostly     yes        no          no           NA
FindArticles                        yes   yes           300     yes        FREE       FREE        yes          yes
InfoTrac Expanded Academic ASAP     yes   yes           1500    mostly     yes        no          no           NA
Northern Light Special Collection   yes   yes           7100    yes        no         yes         yes          yes
XanEdu ReSearch Engine              yes   ?             ?       yes        yes        no          no           NA


 

Table Two: "Twenty Questions" Search Results
Search                                    FindArticles  Contentville  NLSpecColl  XanEdu  InfoTrac  Ebsco
Arts & Humanities
Doppelganger in film                      30            0             132         7       137       132
"Eyes Wide Shut" and Traumnovelle         5             1             66          9       29        24
Funding for the arts                      3004          117           216         50      665       824
Moai                                      4             1             42          14      19        24
Ship of Theseus                           3             0             7           4       16        16
Staging and Shakespeare                   174           7             10          50      68        58
Science
Airbags and death                         23            8             18          32      10        14
Drilling and ANWR                         53            5             7           50      61        24
Methylphenidate and children              222           12            4           50      61        24
Needlestick accidents                     20            21            88          26      20        21
Cell phones and cancer                    109           5             37          19      11        13
Shuttle Challenger/responsibility         13            0             145         30      8         4
Social Sciences
Andrew Johnson's Impeachment              47            11            15          50      16        60
Classroom activities for gifted children  39            0             45          11      90        7
Online shopping and privacy               1474          10            6           50      39        21
Whistleblowing                            31            11            62          50      107       81
Grade inflation in higher education       69            3             132         28      4         5
Lottery winners/prize money               134           4             23          9       9         16
Domestic violence and alcohol             218           1             11          50      46        37
Sanctions / Iraq                          762           214           1168        50      1098      1062

References

1. Bowker Annual. 45th edition. R.R. Bowker, New Providence, NJ. 2000, pp. 418-425.

2. Jacsó, Péter, "FindArticles.com Embraces Free Content Trend," Information Today, vol. 17, no. 10, November 2000, pp. 38-40.

3. Rodriguez, Dawn, "The Key to Effective Research: Getting the Right Start" [http://unix.utb.edu/~drodrigu/webdraft.html], accessed March 2, 2001; University of Texas at Brownsville.

4. Jacsó, Péter, "On the Way to Information Xanadu," Information Today, vol. 17, no. 9, October 2000, pp. 38-40.

 

Nicholas G. Tomaiuolo's e-mail address is tomaiuolon@ccsu.edu
© 2001