Magazines > Searcher > July / August 2003

Vol. 11 No. 7 — July/August 2003
Tribunes and Tribulation: The Top 100 Newspaper Archives (or Lack Thereof)
By Larry Krumenaker
Publisher, Hermograph Press

The newspaper industry has finally figured out a way to join the digital world...and not lose print subscribers. How? Make it more cost effective to get the paper delivered than to use an archive online!

Let me prove my thesis.

I publish the book Net.Journal Directory, the Catalog of Periodicals Archived on the World Wide Web, and its online version, Net.Journal Finder. Net.Journal was created when a hobby of collecting inexpensive archive locations mushroomed into this book in the mid-1990s. I had become used to having LexisNexis or Dialog at my beck and call when I worked in broadcasting or corporate libraries. Information withdrawal is a painful experience, brought on when personal subscriptions to either service stretch beyond the limits of one's wallet. (Notice that I said wallet, not credit card; credit card pay-per-view services and even simple Web access were not even on the infohighway radar screen then.) As a budding writer, I wanted to do research in periodical backfiles inexpensively. I began to record archive Web addresses (translation: gophers, telnets, BBS, not the World Wide Web, at least not yet).

And if you want to search newspapers, which ones are you most likely to rummage around in? Other than for local stories, you will probably want to search the largest papers, the so-called "papers of record." The Top 100 papers, in terms of circulation, is a list published annually by Editor and Publisher. The most recent one available for this article is a bit dated, 2001. Still, the list doesn't change much from year to year, especially on the low end, low defined as nearer number 100 than number 1.

A couple of Top 100 factoids. First, for the purposes of this article, we will only look at 99. The E&P list also includes Investor's Business Daily. You are not likely to see Garfield or horoscopes, local news, or much wire service copy there. It's not the regular guy kind of paper, so I dropped it from this investigation, though I'll still refer to the Top 100. All the papers listed are published 7 days a week, except two, which are 6-day papers. All are in the English language except for one in Spanish.

This winter rumors started flying among the e-mail mailing lists used by information professionals about archives being pulled off the big services, specifically Dialog and LexisNexis. On a mission of investigation from Searcher's editor, I got to work.

Dear Reader, remember that this is all subject to change between my work on the article and your reading it. My research used an excerpt from Net.Journal's database of nearly 30,000 titles. I also rechecked all the papers' own Web archives. The trend towards fee-based archives continues...but there were some surprises and some real bargains.

The Big Players

It's interesting to look back at Net.Journal #1 and see that virtually nobody listed in that 1997 book is still online! Despite some tumbles, both Dialog and LexisNexis (a.k.a. DialogWeb and LexisNexis, respectively) remain online and still have newspapers; indeed, the Nexis side of LexisNexis began as a newspaper archive. NewsNet, IQ, Knowledge Index, BRS (and what WAS the name of their consumer service again?), all gone. Newspaper archives online? Hardly.

You can still find newspapers in Dialog, LexisNexis, and now also Gale Group's InfoTrac Web, ProQuest, and Factiva (see Table 1 on pp. 30-31). Out of the top 100, 90 appear on LexisNexis, but Factiva has 90 as well. Factiva is the successor to Dow Jones News/Retrieval, another major newspaper archive from Net.Journal Directory 1, and Dow Jones Interactive. Many of Factiva's papers come from its partnership with ProQuest (once UMI, the great newspaper microfilmer). On ProQuest's own service, ProQuest Direct, you will find 55 of the Top 100. Dialog has 52 in its own full-text files and 19 more in File 781, the ProQuest file on Dialog. New to the newspaper archive business comes InfoTrac Web, the Gale Group library service, with 47 titles.

Note that we are only considering full-text archives. We've deliberately left out selected full-text files (SFT), like Business Dateline. That eliminated other periodical archive files on LexisNexis and Dialog and files with papers found on OCLC's FirstSearch service. Though we have striven for completeness, NOTHING is ever completely full text any more. You aren't going to get all the articles anyway, but at least you probably will get all kinds of news areas in so-called full-text archives — politics, business, science, general news, etc. — whereas in the SFT files, you'll get just business or some other highly filtered selectivity.

A casual examination of Table 1 doesn't indicate any particular advantage in terms of coverage dates for any service compared to the newspapers' own Web archives. Sometimes a paper's Web site has a deeper, longer archive, sometimes not. It does show a lot of papers don't have an archive at all!

The Middle Players

There is one Web-only newspaper archive service and several pretenders. One of my favorite Web sites ever is NewsLibrary. Originally a Knight-Ridder service that collected all of K-R's newspapers, it made a one-price-fits-all archive service and became one of the earliest reasonably priced, pay-per-view periodical sites in history. At (usually) $2.95 a pop, it's no better or worse price-wise than the $3 cash-and-carry on LexisNexis, and it matches most full-text Dialog files and Factiva's prices as well. (ProQuest and InfoTrac, which niche-market to libraries, offer all-you-can-eat services at too high a cost for mere humans, so there is no way to compare them on this scale.) There are always around 100 titles, though the titles have changed from time to time and aren't always Knight-Ridder pubs either. You can search them individually or collectively for free, check the abstracts and citations, and then pay for what you want.

NewsLibrary is now the property of NewsBank but otherwise remains unchanged, and I hope it will stay that way. NewsLibrary has 61 of the top 100 papers and, based on ease of use and cost, constitutes a good competitor to the Big Ones. Many newspapers use NewsLibrary as their archive operator and don't actually have their own long-term archives on their own Web sites. (This explains the gaps in the last column of Table 1.) But look first to the NewsLibrary column before you give up on a Web-found archive.

There are at least three pseudo-services out there. One is RealCities, a Knight-Ridder property, which would be another NewsLibrary (it charges the same rates) except that you can't search the papers en masse. It's really more a Web hosting service for papers. URLs in the sidebar with an "/mld/" in them are RealCities papers. Another such service is the ProQuest Archiver. About three dozen newspapers use this host service; again, you can't search these papers en masse, nor are they all the same price. The archive links always go to a URL with "pqasb" in the address. Finally, there's one that's just frankly poorly done, by the Web design firm called Alliance Alert. Most of the "state" dot-com services in my listings go to them. For example, the archive information page for the Star-Ledger isn't linked to any other page; the newspaper librarian had to tell me where it was.

One other service that has more than half of the Top 100 newspapers...and many more isn't listed here. The Financial Times of London operates a service called the World Press Monitor, at roughly $14 a month. It contains 500 papers and magazines from all around the world. But most of the papers from the U.S. are available only in selected full text, primarily because these come from the Knight-Ridder Business news wire. Still, if I had to choose two low-cost periodical services for my personal credit card, the World Press Monitor and NewsLibrary would be my choices for the general consumer.

The Individual Players

Table 2 on pp. 33-34 lists the newspapers themselves, grouping them into four colored bands by circulation size. Chances are, if you are a newspaper searcher on LexisNexis or Dialog, you already use one or more of those in the top band: USA Today, The Wall Street Journal, or The New York Times, papers that sell more than 1 million copies every day. When you do a comprehensive national search, you probably search the next band, the 500,000 to a million circulators, and maybe some of the third band, the 250,000 and up group.

In a physical newspaper library, there's the current news and the "morgue," where papers go for storage and future research. Online, there are similar depositories, actually three of them.

The first one is simple: Are today's articles viewable? A simple click on a headline answers that question. Most papers allow you to see the current issue's stories and at no cost. Most of the "no, you can't" papers are the big ones in the top 12, in fact half of them. There are only five others in the remaining 87. In some cases, you must sign up and register before you can read articles, but again usually for no money at all. If all you need is current news (though often it is 24 hours old, not all have "breaking news" sections), just about any U.S. newspaper on the Web will do. (To find practically any newspaper or broadcast news source in the U.S. — and outside — nothing beats the inimitable as a starting point.)

Where does news go between today's life and the future morgue? Paper purgatory! That backlogged stack on the morgue librarian's desk has an online equivalent. Here, you'll find yesterday's news, and often more for a short period, usually 7 days. Some Web sites go for as much as 2 or 3 months (and sometimes that's the only archive on the Web site!); others apparently consider yesterday's news not worth knowing. Most often these transition zones between hot stuff and cold clippings are cost-free. Generally, if you register for today's news, you register for last week's, too.

Finally, then, there is the long-term archive and the central part of this search. As noted above, more than half use NewsLibrary...but there are different flavors of that as well. Some archives charge by the piece; others charge you for access per unit of time. Whenever there is a "T" in a Table 2 column, this means you spend something like $5.95 for 24 hours of access. Sometimes you can download as much as you want; other archives have an upper limit, say, 10 articles. Naturally, as with print subscriptions, you pay less per unit if you buy in larger quantities. I've listed the Large Economy Size rate in the next column. Have a large limit on your MasterCard? You'd better! Rates go into the hundreds of dollars, up to about $2,000. If you're a business, school, or library, you sometimes have to set up site licenses, no credit card allowed. (Whenever you see NL, this means the NewsLibrary range of charges, from $2.95 for each article up to a maximum of 1,000 articles for $1,99.)
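To make the per-piece versus per-time choice concrete, here is a minimal sketch in Python. It uses the example figures from the paragraph above ($2.95 an article, roughly $5.95 for a 24-hour pass, an optional cap of 10 articles). The function name `cheapest_plan`, its parameters, and the assumption that you simply buy another pass once you hit a cap are mine, for illustration only; real sites set their own rules.

```python
def cheapest_plan(n_articles, per_article_cents=295, pass_cents=595, pass_limit=None):
    """Compare paying per article with buying 24-hour passes.

    Prices are in cents to avoid floating-point rounding.
    pass_limit is the maximum number of articles one pass allows
    (None means unlimited). Assumption: everything is downloaded
    within the pass window, and a new pass is bought once a cap is hit.
    """
    per_piece_total = n_articles * per_article_cents

    if n_articles == 0:
        passes_needed = 0
    elif pass_limit is None:
        passes_needed = 1
    else:
        passes_needed = -(-n_articles // pass_limit)  # ceiling division
    pass_total = passes_needed * pass_cents

    if per_piece_total <= pass_total:
        return ("per-article", per_piece_total)
    return ("day-pass", pass_total)


if __name__ == "__main__":
    for n in (1, 2, 3, 25):
        plan, cents = cheapest_plan(n, pass_limit=10)
        print(f"{n} articles: {plan} at ${cents / 100:.2f}")
```

With these example prices, the timed pass starts to win at the third article, which is why the "T" archives can be the better deal for anyone doing more than a quick lookup.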

What gives each Web site, well, character is the various ways it goes about setting up the archive (or setting YOU up). You'll find extra pricing and archive information and some of the oddities listed in Comments. For example, some sites have two long-term archives, sometimes both free, sometimes one that charges a fee. Sometimes the site tells you there's a limit to the purgatory archive, then you do a keyword search and get articles from several years ago, free! The long-term archive would charge you for them. A nice trick to know. Some Web archives are browser unfriendly; others are so slow it would be quicker to run to the store and get a copy of the paper edition.

Another tip for those on very tight budgets. For a hot story — and definitions of hot change from paper to paper depending upon local connections — you may find that a newspaper has linked to earlier stories as background material. In a sense, the reporters have done the archive checking for you. Often stories tagged and linked to a current story carry no charges, though the same stories would cost if you retrieved them on your own from the newspaper's archive.

Some newspapers go heavily into paper recycling and some have found an online equivalent. For example, after 7 days, the Chicago Sun-Times Web archive is history, literally. Others of this ilk might go as long as 60 days before recycling the electrons. For these, there truly is no choice; if you want a story from an earlier edition, you may have to look deep into a stack of yellowing paper out in the garage or head for the university library's microfilm reader.

Paper or Digital?

"They" say that everything is free on the Web. Not quite. Some newspapers are definitely far from free. The Wall Street Journal charges you for the print edition, then an electronic surcharge on top of that, plus possibly a single article cost. But there are some bargains, even freebies, out there amongst the newspaper archives.

What defines a bargain? With most archives roughly comparable in price, certainly at the pay-per-view range, a bargain depends on the size of the archive and how online costs compare to offline costs. Print edition subscriptions also vary greatly in amount. Annual costs can range from a few Jacksons to several Franklins. You can investigate possible effects on your budget by comparing the cost of however many articles you need in a year (or month) with the annual subscription.
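The comparison described above is a simple break-even calculation, sketched here in Python. The $150 subscription price is a made-up figure for illustration, and the function name `break_even_articles` is my own; only the $2.95 per-article rate comes from the archives discussed in this article.

```python
def break_even_articles(annual_subscription_cents, per_article_cents):
    """Smallest number of archive articles per year whose total cost
    meets or exceeds the annual print subscription.

    Prices are integers in cents so the arithmetic is exact.
    """
    if per_article_cents <= 0:
        raise ValueError("a free archive never reaches the subscription price")
    # Ceiling division: round up to the next whole article.
    return -(-annual_subscription_cents // per_article_cents)


if __name__ == "__main__":
    # Hypothetical paper: $150.00 a year in print, $2.95 per archived article.
    n = break_even_articles(15000, 295)
    print(f"Archive spending catches up with the subscription at {n} articles a year")
```

If you expect to pull fewer articles than that in a year, pay per view; if more, the subscription (or a bulk archive rate) deserves a look.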

But clearly, free is often better than fee, and free is better for 10 years of archive than for one. So I've created Krumenaker's Newspaper Bargain Index. I've divided the print edition annual cost by the non-discounted single article cost (or the cost for 24 hours when access is by time) and multiplied by the range of the archive in years. As you can guess, free archives are the best: a cost divided by zero is infinity, therefore free archives have an infinite value! Some are more infinite than others, so the NBI can get very metaphysical; infinity + 10 is better than infinity + 2. Sites with no long-term archive have zero value, mathematically or otherwise. In between, the higher the NBI, the more value in the archive.
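For readers who want to run the numbers themselves, here is a minimal Python sketch of the index as described above. The function and field names are my own, the four sample papers are invented placeholders rather than entries from Table 2, and breaking ties among free archives by depth in years is my reading of "some are more infinite than others."

```python
import math


def bargain_index(annual_print_cost, unit_archive_cost, archive_years):
    """NBI = (annual print cost / undiscounted unit archive cost) * archive years.

    unit_archive_cost is the single-article price, or the 24-hour price
    when access is sold by time.
    """
    if archive_years <= 0:
        return 0.0        # no long-term archive: zero value
    if unit_archive_cost == 0:
        return math.inf   # free archive: infinite value
    return annual_print_cost / unit_archive_cost * archive_years


def rank_archives(papers):
    """Best bargain first; free archives are ordered by how far back they go."""
    return sorted(
        papers,
        key=lambda p: (
            bargain_index(p["print_cost"], p["unit_cost"], p["years"]),
            p["years"],
        ),
        reverse=True,
    )


# Invented sample data, for illustration only.
sample = [
    {"name": "Paper A", "print_cost": 180.0, "unit_cost": 2.95, "years": 10},
    {"name": "Paper B", "print_cost": 150.0, "unit_cost": 0.0, "years": 2},
    {"name": "Paper C", "print_cost": 150.0, "unit_cost": 0.0, "years": 16},
    {"name": "Paper D", "print_cost": 200.0, "unit_cost": 2.95, "years": 0},
]

if __name__ == "__main__":
    for paper in rank_archives(sample):
        nbi = bargain_index(paper["print_cost"], paper["unit_cost"], paper["years"])
        print(paper["name"], nbi)
```

The free 16-year archive sorts ahead of the free 2-year one, the fee-based archive follows, and the paper with no long-term archive lands at the bottom with a score of zero.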

Of the freebies, the best bargain is the St. Petersburg Times. It's totally free and you can go back 16 years! Why they do this, I don't know, but St. Pete's is my Mecca when I search newspapers now. In second place comes the Greensburg (PA) Tribune Review (near Pittsburgh), but it would help a lot more if it didn't have such a mix-up of two archives. Other valuable sites, with up to 7 years of free archiving, include the San Francisco Chronicle (#11 in the top 100), the Milwaukee Journal Sentinel, Seattle Times, Jacksonville Times-Union (what is it with Florida papers?), and the Las Vegas Review-Journal.

Among the fee-based archives, the top of the heap mathematically is low-ranked (#86) Salt Lake Tribune, with a long and inexpensive inventory. But don't forget the three archives that go back to the late 1970s: #5 Washington Post, #8 Newsday, and #14 Boston Globe. The first two have 2-week long free archives.

Which are the worst bargains among the fee-based archives? These would be the Tacoma News Tribune, Toledo Blade, Palm Beach Post, San Diego Union-Tribune, and the New York Post. Why? Because the price of the archive comes close to matching the price of the subscription...and some of these have very small archives. They may be good papers; they just don't have cost-effective archives.


My recommendation: if you are just looking for U.S. newspapers, and nothing else, or have a limited budget, go for the FT.COM and NewsLibrary services, keep St. Petersburg in your bookmark list, get an annual subscription to your local big paper, and buy a long-term discount archive rate at the Washington Post. If you have to search other kinds of files as well, LexisNexis has most of the Top 100 papers, as do Dialog and Factiva, and all three have the means to search them simultaneously with other kinds of data. There aren't many other choices. If your institutional budget has the bucks, one of the other periodical warehouse services, e.g., ProQuest or InfoTrac, will give you a wider choice of periodicals than just going Web, but little of the non-periodical data universe. If you need today's news (or the recent week's), the Web's the best bargain around, for sure, and you can check the news anywhere, from anywhere.