History Lesson: CAB International
by Marydee Ojala
What connection do two British entomologists sent to Africa in the first decade of the 1900s have with a modern electronic database that is branching out into creating futuristic Web communities? Without them, there would be no insect collection at London's Natural History Museum, no abstract journals of agricultural research, no CAB International (formerly known as Commonwealth Agricultural Bureaux), and no abstract databases in a wide range of agricultural, veterinary, environmental, tourism, forestry, and biological specialties. Granted, those entomologists had no idea they were on the cutting edge of a not-yet-invented information industry. And it's doubtful they would have categorized their journey under the Leisure & Tourism segment of CABI's current product line. When they were doing their African field work in veterinary medicine and finding insect pests they knew nothing about, the computer hadn't been invented and information access was highly limited.

In many ways, the story of the development of CABI is emblematic of the entire history of the online information industry. Data from scientists in the field were published in learned society journals. In turn, synopses of these articles fed the abstract journals designed to help scientists keep up-to-date with research in their field. When the production of these journals was computerized, the resulting electronic versions of the journals became databases. Search companies developed to provide common access languages to these databases. More recently, database-producing companies have looked to the Web both as a distribution channel for their established information and for innovative ways to repurpose that data.

In the case of CABI, there was the additional component of classification of the new knowledge being contributed by the scientists in the field. This began as a card index and evolved into abstract journals. The BioScience Division of CABI traces its ancestry to the card file. (The second division of today's CABI is Publications.) The U.K. government noticed the success of the BioScience Division and set up other units for the applied life sciences. In 1928, these became the Commonwealth Agricultural Bureaux, with centers around the country.


It was the late 1960s when the promise of computer production beckoned. According to Chris Ison, Sales and Training Manager for CABI, who has firsthand knowledge of the databases' development, CABI borrowed INSPEC's software to run a trial of the new production system. By 1973, all the journals were computerized. Intriguingly, the initial physical computerization was done at the Mars chocolate factory down the road in Slough and turnaround time was 14 to 16 weeks, an eternity in today's terms. 

The first Online Information meeting, then known as the International Online Information Meeting, was held in 1976 and CABI exhibited. In fact, CABI holds the distinction of being at all the Online Information shows since then, a full 25 years. Going online with Dialog and SDC Orbit was the next logical step. Today, the full CAB ABSTRACTS is online with Dialog as File 50 and CAB HEALTH is File 162. STN assigns CAB ABSTRACTS the acronym CABA. Other hosts, such as DataStar and DIMDI, break up the databases into specialty subsets that can be searched separately in addition to carrying the full databases. The combined database (CAB ABSTRACTS and CAB HEALTH) is also on the Internet as CABDirect (www.cabdirect.org) with records dating back to 1973.

CABI was also a pioneer when it came to putting its databases on CD-ROM. Comments Ison proudly, "We were the first publisher to demo a CD database at an online show. That was in Sydney, Australia." He goes on to note that the CDs are still popular, particularly in developing countries where online access remains problematic. Putting backfiles on CD is, however, expensive and CABI does not plan to retrospectively add data. The exception is forestry, where the TREECD, available through SilverPlatter, goes back to 1939.


Looking at the electronic records brought home the shortcomings of the paper system. Suddenly, the duplicated data and inconsistencies in indexing became evident. "It was in the 15% range," says Ison. "This led to a quality improvement project that included the creation of the CABI Thesaurus. This was fine for the databases going forward, but we couldn't clean up the backfile. For example, if we chose the word 'cattle' as the accepted term, we couldn't change older records that might have used 'cow' or 'bovine.' We also decided to allow our indexers to override the system. If an indexer enters a term that's not on the list, a message pops up asking if that's the term the indexer really wants to use. We track these to see if the new terms should be added to our main thesaurus."
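For readers who think in code, the override workflow Ison describes can be sketched in a few lines of Python. This is only an illustration, not CABI's production system: the `PREFERRED` table, the `IndexingSession` class, and the `confirm` callback are hypothetical names, and mapping the variants "cow" and "bovine" to "cattle" for new entries is an assumption about how a thesaurus of this kind typically behaves.

```python
from collections import Counter

# Hypothetical mini-thesaurus: each known term maps to its accepted descriptor.
# The variant mappings are an assumption made for illustration only.
PREFERRED = {"cattle": "cattle", "cow": "cattle", "bovine": "cattle"}


class IndexingSession:
    """Checks indexer terms against a thesaurus and tracks overrides."""

    def __init__(self, thesaurus):
        self.thesaurus = thesaurus
        # Off-list terms the indexer insisted on, with how often each was used.
        self.candidates = Counter()

    def add_term(self, term, confirm=lambda t: False):
        """Return the descriptor to store, or None if the term is rejected.

        `confirm` stands in for the pop-up message: it receives the
        off-list term and returns True if the indexer really wants it.
        """
        key = term.strip().lower()
        if key in self.thesaurus:
            return self.thesaurus[key]
        if confirm(key):
            # Override accepted: record it for later thesaurus review.
            self.candidates[key] += 1
            return key
        return None
```

An indexer who types "Cow" would get "cattle" stored; one who types an unlisted term and confirms it would have the term kept as entered and counted in `candidates`, which plays the role of the list CABI reviews when deciding what to add to the main thesaurus.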

The CABI databases have become models of detailed indexing, with separate fields for descriptors, organism descriptors (this is where you'll find "cattle"), broader terms (often the scientific names are included here), and CABICODES. No wonder, since CABI worked with the Food and Agriculture Organization and the U.S. National Agricultural Library to create authority files. CABI is so proud of its controlled vocabularies that it is providing an indexing toolkit as a standalone product. If you're in a non-profit organization, the toolkit is free.

Last year, CABI introduced the third generation of its production system. In addition to creating a cleaner version of the journals and the database, this system will allow for a thorough cleanup of the backfile. 


The Web opens up new possibilities for CABI, particularly in community building. Specialty sites include leisure and tourism, agricultural biotechnology, veterinary medicine (http://www.animalscience.com), nutritional science, and organic farming. One advantage the Web brings is the opportunity to include more and different types of information. All the sites include recent news (the cost of reforming U.S. baseball on the leisure and tourism site and the slowing of the organic farming boom in the U.K. on the organic-research site, for example). Most include information on relevant books (CABI publishes 40-60 per year), conference proceedings and papers, links to other topical Web sites, a calendar of events, and job listings. Some of the community sites also support an online discussion area.

It should be no surprise that the abstracts subsets are also searchable on the community sites. Although much of the data included on the specialty sites is free, the abstracts require a paid subscription. Still, the cost is quite reasonable, ranging from $120 to $295 for an individual user, and there are free trials offered. What's more, CABI has been working with ingenta, Ovid, and SilverPlatter to provide full-text linking. If you search CABDirect and you or your institution subscribes to a journal included in ingenta Journals, you can link directly to the full text on that host. Those lacking a subscription can use one of the document delivery firms, such as Infotrieve or CISTI, with which CABI has a contract.

The scope of subject matter in CABI abstracts and databases is growing steadily. "Many people," says Ison, "don't realize that we cover topics, such as human nutrition, nutraceuticals, infant feeding. They think we just do agriculture and biology. It's not always obvious how some of the products developed. The Leisure & Tourism database, for example, grew out of our coverage of alternative land use."

Not only does CABI have a rich history, it's got a consistent one. As a non-profit organization, classified as a registered charity in the U.K., CABI has been able to concentrate on doing what it does best: tracking academic scientific information and making it available in electronic form. The move from static ASCII databases to interactive Web communities, while retaining its traditional presence with online hosts, makes CABI an interesting case study both of the history of the industry and its future possibilities.

