Once upon a time, a group of bright-eyed chemistry students were introduced to the mines of chemical literature, a huge and ever-growing ore which mainly resided in nicely bound learned journals and reference works. They learned, if they were lucky, how to mine and use this wonderful resource in the course of their education and future career endeavors. However, this amazing resource became less and less readily available to more and more chemists as it involved actually going into a chemistry library and cracking open those tomes of Chemical Abstracts, Beilstein’s Handbuch, and the like. Gray-beard mentors recounted how, in the good old days, a subscription to Chemical Abstracts was included in one’s membership in that guild of chemists, the American Chemical Society (ACS). Now, access to the mines was being restricted by increased costs and decreased accessibility.
Even more sinister than these trends was the growth of the resources. After World War II, the chemical literature began growing unfettered, producing additional problems besides bigger volume. The information became less and less current. Beilstein, specializing in evaluated and verified data and information on organic chemistry, was years and even decades behind in its coverage. Chemical Abstracts, with its far-flung army of volunteer abstracters, was often 2–3 years behind.
Help began to arrive in the ’60s and ’70s, largely aided by those wonderful machines, computers, and, of course, the savants who knew how to use them. Chemical Abstracts Service (CAS) began to provide computer-generated current awareness journals and services. In addition, it eased out the crew of volunteer, part-time abstracters to wonderful pastures of retirement and brought those essential functions in-house in Columbus, Ohio. Both currency and accuracy steadily improved. Beilstein improved its excellent editorial processes, became more current, and even began issuing monographs in English. Even better, the products and services of both CAS and Beilstein, as well as a host of other services from other sources, became available digitally in addition to the venerable printed products.
However, was this the end of the story, boundless information and data for all? Hardly. The majority of the masses of toiling chemists (also on the increase) had neither taken the time nor had the opportunity to learn about these wonderful aids to mining. Even while becoming computer conscious in other ways, they were ill-equipped or poorly motivated to learn how to use these wonderful new resources. Therefore, the breed of research librarians, always extant but often ignored by the chemist miners, began to grow, both in knowledge and by assimilation of a new breed of subject information specialists. These specialists not only provided intermediary access to the burgeoning and always expanding information resources but also attempted to educate their customers and clients about the value and effective use of these resources. Goaded by their sage colleagues who accused them of becoming high priests with exclusive keys to the kingdom, a few information specialists began training their chemists and engineers to do their own information access. However, the systems still had a rather high entry level of proficiency and the chemists’ bosses often objected to their spending all that time away from “real work.” Although CAS, through the STN service, kept adding files and searching tools, costs of access were also rising, probably affecting academia more than the commercial sphere.
In the mid-’90s, both CAS and the successors to the Beilstein Institute came out with truly end-user-oriented products. CAS/STN developed SciFinder, marketed first to commercial users and later to academia, initially as SciFinder Scholar and later as a merged service. However, costs remained high, even for academia with its long-standing academic discounts, and access was further limited for many with restrictions on the number of “seats” for simultaneous users. Beilstein first issued an end-user-oriented service named Crossfire, which later morphed into Reaxys under the newest owner, Elsevier. Reaxys remains a subscription service while SciFinder access has finally arrived in a web version, with subscribing organizations opting either for unlimited, flat-fee access or for payment by the task. However, the “seats” are now a thing of the past.
The impact of these systems has been profound, not only for the additional bells and whistles available to science researchers, but also for information providers. In industry, both in-house information specialists and independent consultants have found both inboxes and customer lists have decreased dramatically. This has definitely been the case for this author, who ironically has never used SciFinder (nor Reaxys) due to their arrival after he departed the corporate castle for his own humble abode 18 years ago. He, and many other veterans, continued (and still continue) to use systems such as STN for their searching. Of course, the cost of using the systems is borne by customers since “casual” or personal use is limited by dwindling resources.
However, the ACS has recently made SciFinder available to all members who care to try it. We (I’m a 51-year member) get 25 complimentary “activities” for a period of time. This availability, along with a previous offer of a month’s complimentary access to Reaxys, has allowed me to embark on a comparison study (a valuable endeavor that I dearly love but can seldom support) between the two services, which constitutes the remainder of this article. It’s still apples and oranges since the two services have differences in coverage and emphasis, but I’ll give it the old college try. In the process, I’ll also be comparing searching in SciFinder to searching in STN, the system with which I have decades of experience.
First a Look at SciFinder
After viewing some of the training materials on the SciFinder training site (cas.org/training/scifinder), I started searching. In the past, when giving training sessions or demos of online searching, I always started with an author search. My favorite example is Paul von Rague Schleyer, who was one of my professors in grad school at Princeton in the ’60s. He later moved to the University of Erlangen, Germany, and finally went into “semiretirement” at the University of Georgia, having migrated from physical organic chemistry to computational chemistry. His middle name is von Rague, and that, along with variable editorial policies, has produced several listings of articles with him as an author.
Searching the Chemical Abstracts (CA) file with STN, if one expands “Schleyer P”, an expand/display list is generated with all the possibilities that can be selected with the “S” command. SciFinder provides three options for searching: Explore References, Explore Substances, and Explore Reactions. With Explore References, the default option is Research Topic. Additional options include, inter alia, Author Name and Company Name. Searching in the author name option by filling in the boxes with Schleyer P generated a list of 21 possibilities, including the full name, combinations of initials for first and middle names, and some misspellings (Schlayer, van Rague), as well as a few listings for a P(aul) J Schleyer. Checking the select boxes and clicking Get References yields 1,505 references (some people just can’t retire; I know the feeling). My reaction to SciFinder’s performance? Just as effective as STN, more intuitive for end users, and more in line with other programs in use.
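Why does a browse list of name variants matter? A minimal sketch in Python may help. It is not how STN or SciFinder work internally; the function names (normalize, matches_author) and the sample index entries are my own assumptions for illustration. It shows that matching on surname plus first initial gathers the initial and full-name variants but misses a misspelled surname, which a searcher has to pick by eye from the list.

```python
import re

def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    lettered = re.sub(r"[^a-z ]", " ", name.lower())
    return re.sub(r"\s+", " ", lettered).strip()

def matches_author(candidate: str, surname: str, initial: str) -> bool:
    """True if the entry has exactly this surname and a first given name
    (or initial) starting with `initial`. Hypothetical helper."""
    parts = normalize(candidate).split()
    if not parts or parts[0] != surname.lower():
        return False
    return len(parts) > 1 and parts[1].startswith(initial.lower())

# Toy index entries, invented for illustration only.
index_entries = [
    "Schleyer, P. v. R.",
    "Schleyer, Paul von Rague",
    "Schleyer, P. J.",
    "Schleyer, Paul",
    "Schlayer, P. v. R.",   # misspelled surname: an exact match misses it
    "Schmidt, P.",
]

selected = [e for e in index_entries if matches_author(e, "Schleyer", "P")]
for entry in selected:
    print(entry)
```

The misspelled entry is left out of the selection, which is the practical argument for scanning the whole candidate list and checking the boxes by hand.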
Next I decided to search an aspect of resveratrol, which is a polyphenol, a bioactive compound present in wines, especially red wines. It’s also a component of Oriental traditional medicines. It has been nominated as one of the antioxidants partially responsible for the beneficial effects of the Mediterranean Diet and has had an up-and-down track record as a possible anti-aging compound. I had not seen any bioactivity or modeling studies but the structure looks like it could be an endocrine modifier, especially an estrogen mimic. Largely ignored in all the fuss about BPA, a monomer indicted as an estrogen mimic, is the fact that several to many natural products, including soy, have estrogenic properties.
Using the Research Topic option, one can type in sentence form or use key terms. Also note that the search can be restricted by publication year and to 14 document types. I chose the default, unrestricted option and entered resveratrol and endocrine, yielding 130 references (CA and MEDLINE file abstracts). Scanning a few abstracts did indicate that resveratrol does have endocrine activity. Answer sets can be analyzed (click the Analyze By button) by 12 reference attributes (corresponding to “roles” in STN searching, e.g., “preparation,” “biological study”), narrowed by refining (click the Refine button) with seven attributes, or narrowed by including further search terms (Research Topic). Since I perceive the key to endocrine mimic activity to be interactions with endocrine receptors, adding the search term receptor narrowed results to 70 references, and adding estrogen narrowed it further to 45. Adding binding (a key differential function of activity) narrowed to 16 hits, and adding competition (to try to determine comparative activity of estrogen mimics with estrogen itself) further narrowed it to two hits, which gave detailed studies on such comparisons. The hits are relevant CA abstracts sorted by descending accession number (default), author name, citing references, publication year (of original article), or title. Of course, detailed information and data would be in the original references. If one’s organization has subscription access to a source, the original source would be available directly.
To a “veteran searcher,” one used to command language searching with Boolean logic, the end-user-oriented “text sentence” method seems somewhat strange. Hints are given as to the use of prepositions in the search query (e.g., “effect of resveratrol on endocrine receptors by competitive binding”). My search strategy was by “successive fractions.” With this method, one of three primary searcher modes, one whittles a large topic progressively down to a manageable but relevant retrieval set. I wasn’t sure how many, if any, references would be found, but they would have shown up by means of the cascaded list of references retrieved by the sentence method.
Searching a compound by name as a Research Topic will retrieve references not only to the compound indexed by name but also by the CAS Registry Number (CASRN). However, Research Topic searches are run both in the CAPlus file (Chemical Abstracts) and in MEDLINE, where compounds are more likely to appear by name. As a matter of interest, duplicate references (references to original documents appearing in both CAPlus and MEDLINE) can be eliminated, either by choice or automatically.
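How might duplicate removal across the two files work in principle? The following Python sketch is only an assumption on my part, not a description of SciFinder's actual procedure. It treats two records as duplicates when they share a normalized title and publication year, and it keeps the CAPlus copy when both exist. The Record class, the matching key, and the sample titles are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    source: str   # "CAPLUS" or "MEDLINE"
    title: str
    year: int

def dedup_key(rec: Record) -> tuple:
    """Normalized title (letters and digits only, lowercase) plus year."""
    title = "".join(ch for ch in rec.title.lower() if ch.isalnum())
    return (title, rec.year)

def remove_duplicates(records, preferred="CAPLUS"):
    """Keep one record per key, preferring the `preferred` source."""
    kept = {}
    for rec in records:
        key = dedup_key(rec)
        already = kept.get(key)
        if already is None or (rec.source == preferred and already.source != preferred):
            kept[key] = rec
    return list(kept.values())

# Toy records with placeholder titles, invented for illustration only.
records = [
    Record("MEDLINE", "Example title one.", 2000),
    Record("CAPLUS", "Example Title One", 2000),
    Record("CAPLUS", "Example title two", 1999),
]

unique = remove_duplicates(records)
print(len(records) - len(unique), "duplicate removed")
```

A real system would presumably match on more robust identifiers than a title string, but the principle of one surviving record per original document is the same.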
In an alternative approach, resveratrol was searched by structure in the “Explore Substances” mode. Compounds can be searched by name, molecular formula, or structure. I’ve used STN Express, the structure drawing program for use in STN searching, and I found the structure drawing tool for SciFinder to be an improvement and more end-user-oriented. Structures can be searched as exact structures (WYSIWYG) or substructure (the structure embedded in larger structures). Several subclasses of structures can be specified if desired. Drawing resveratrol and searching as a single component retrieved 19 compounds, including the “parent” compound. Extensive details can be displayed including physical properties. To get references (i.e., CA abstracts), one can select all references or categorize with the previously mentioned “roles.” I chose all references for resveratrol and retrieved 12,395 hits. Limiting to the roles of Adverse Effects or Biological Study yielded 604 references, from which 32 duplicate references (from MEDLINE) were removed. Refining by means of endocrine or estrogen yielded 65 references. Further refining to receptor yielded 42; binding yielded 19; and competition or competing or compete yielded three. The results of the latter two sets were quite interesting and indicated that flavonoids (of which resveratrol is one) have relatively strong estrogenic activities. Unfortunately, one key paper is in Japanese, so the data itself was not that accessible.
Hit terms are highlighted in the output. The search history is available in the “breadcrumb trail” listed in a bar near the top of the displays. Conversations with the CAS Help Desk provided the knowledge that retrieval of plurals and truncated stems is automatic (with no way to inhibit). I wasn’t sure about how competition was handled, so I constructed the OR clause in the last refinement. As I can best determine, use of OR clauses in a classical Successive Fractions mode is the best way to perform such a search.
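For readers who think in code, the successive fractions strategy with OR clauses can be sketched in a few lines of Python. This is a toy model under stated assumptions, not SciFinder's or STN's search engine. The function names are mine, the four "abstracts" are invented strings, and prefix matching on words is only a crude stand-in for automatic truncation. Each step is a list of alternatives (an OR clause), and each step is applied to the answer set left by the previous one.

```python
def matches_any(text, alternatives):
    """True if any word in the text starts with any alternative term.
    Prefix matching crudely imitates automatic truncation and plurals."""
    words = text.lower().split()
    return any(word.startswith(alt) for alt in alternatives for word in words)

def successive_fractions(records, steps):
    """Narrow the set one OR clause at a time; return the final set and
    the answer-set size after each step."""
    current = list(records)
    history = [len(current)]
    for alternatives in steps:
        current = [rec for rec in current if matches_any(rec, alternatives)]
        history.append(len(current))
    return current, history

# Toy "abstracts", invented for illustration only.
abstracts = [
    "resveratrol binding to estrogen receptors in competition with estradiol",
    "resveratrol and estrogen receptor binding",
    "endocrine effects of resveratrol",
    "resveratrol content of red wines",
]

steps = [
    ["endocrine", "estrogen"],                 # OR clause
    ["receptor"],
    ["binding"],
    ["competition", "competing", "compete"],   # OR clause
]

final, history = successive_fractions(abstracts, steps)
print(history)
```

Keeping the size of each intermediate set, as the history list does, mirrors the way I watched the hit counts fall at each refinement and decided when the set was small enough to read.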
By means of “Explore Substances,” SciFinder is a valuable tool for searching chemical reactions as well as chemical compounds. For those researchers involved in chemical synthesis, such a tool is invaluable. Of course, chemistry involves far more science than just discrete chemical compounds and their preparation and reactions. Most if not all of the material world involves chemistry, and research in a number of areas is documented and accessible through SciFinder and related search programs.
Even though I haven’t conducted research at the bench for decades, it’s obvious that use of SciFinder by the bench chemist can greatly enhance efficiency and the use of information and data. The results can be not only incorporated into the chemist’s electronic workbench, but can also be shared with others in the research group or organization, tasks that were possible previously but more difficult to accomplish.
In my own case, as a semiretired chemist still interested in a number of topics, complimentary use of SciFinder will help me further investigate topics like the one outlined above. In addition, there’s one compound I tried to prepare (unsuccessfully) several times when I was still in the lab. The compound has only 15 carbon atoms and the reaction should have worked, but never did. I used a “free” search to see if it does exist. It doesn’t. C’est la vie. However, there are a few other compounds out of my past which I’d like to track with free searches.
It’s yet to be determined how ACS member users will be charged once we’ve used our 25 complimentary SciFinder “activities.” I’ll cross that bridge when I come to it.
A new version of SciFinder was released in late June, after my deadline (www.cas.org/products/scifinder/whats-new-in-scifinder; it includes a link for a guided overview presentation). However, I was able to get a preview. Several enhancements have been made including display of all types of search options available on the first page. Limiters applicable to advanced searches have been expanded. As previously, current awareness (“Keep Me Posted”) for the search results can be set up for weekly or monthly runs with the click of a button.
Ever since SciFinder became available, I’ve been asked why I didn’t use it for my consulting work. Even with all of the improvements, I believe that searching with STN provides more power for the wide range of topics that I search. A few dozen files other than CA and MEDLINE are included and a larger number of roles are available to categorize output of references for chemical compounds. Cost was another factor. Even now, with complimentary searches, I couldn’t use SciFinder for customer searching. The licensing agreement states that any searches are “only for myself and not for others or other organizations.”
Elsevier’s Reaxys: a Beilstein Reincarnation and More
Reaxys (elsevier.com/online-tools/reaxys/about; elsevier.com/online-tools/reaxys/training-support) is Elsevier’s workstation program for chemistry researchers. The original basis for the file was the Beilstein (organic chemistry) and Gmelin (inorganic chemistry) databases, which now constitute only a minority of the content. The chemical compound files also include the PubChem (NIH; pubchem.ncbi.nlm.nih.gov) and e-molecules databases (emolecules.com, a database of commercially available compounds).
A new version of Reaxys became available in April 2013 and is described online (cdn.elsevier.com/assets/pdf_file/0003/126507/Reaxys_whatsnew_2013.pdf; elsevier.com/about/press-releases/science-and-technology/reaxys-significantly-expands-its-content-and-interface-to-deliver-chemistry-information-to-researchers-across-disciplines). This is the version I used in preparing this article. The literature base has expanded from the core of 400 chemistry journals and chemical patents to more than 16,000 periodicals, including books and conference proceedings, covering a broader spectrum of fields, such as engineering, the life sciences, and environmental sciences. The emphasis of the core journals is on chemical compounds, especially their preparation, reactions, and properties.
The emphasis is broadened in the 16K periodicals added. The coverage begins in 2005, and there are plans to expand the back file accordingly. This expanded file is being implemented in stages. First, the material from 2005 to the present is being added to the file and made searchable by title and abstract words plus keywords. Later, further extraction of structures and data will be performed, and the extracted material will be searchable. Reaxys is available by subscription, primarily, I assume, to academic and commercial research organizations.
Searching under the literature button can be done for a number of data elements. I searched for Schleyer, and the selection was more complicated due to variation in entry of author names in the original literature with no standardization applied. The search yielded 612 hits that could be filtered by a number of factors, including bibliographic details and physical and chemical properties of any compounds cited. Fewer hits were anticipated due to more limited coverage of source periodicals, but that will improve with time as the date range expands.
Searching chemical compounds can be accomplished via chemical names, molecular formula, or chemical structure. The latter can be done by exact structure, substructure, or “similar” structure. Roles include Product, Starting Material, or Reagent or Catalyst. Additional optional attributes can be applied, which is especially valuable for substructure searches. Several open source structure searching programs are available, via applets, and three are built into the system, including MarvinSketch (chemaxon.com/products/marvin/marvinsketch), also used to search the PubChem file that is searched in addition to the Reaxys compound file. MarvinSketch was easy to use (once I updated my Java app), being somewhat different from, but similar to, STN Express and the SciFinder drawing tool.
Both services provide extensive listings of properties data for chemical compounds. In the case of Reaxys, the older Beilstein and Gmelin data was evaluated and peer reviewed by the respective institutes. Once that process ceased in the ’70s, the selection criteria for the material indexed from the 400 core journals were maintained but the detailed editing was not. Data categories include the usual physicochemical data and spectra, as well as Bioactivity/Ecotoxicity, which includes Pharmacological data. When searching resveratrol, 133 references were retrieved with pharmacological data, which was further limited to estrogen. Examining a reference in detail gives an original reference citation, an abstract, and keywords.
I also searched the “missing” compound from my lab past, and, as expected, it doesn’t exist in Reaxys either.
I was assisted in learning how to search Reaxys by a personal Webinar guided by Christine Flemming, product training manager for Elsevier.
Which to Use?
So now we cut to the chase. The answer to the big question depends on a number of factors. Both services cover a range of the chemical literature, and the results of searches can be effectively integrated into the workstation of the researcher and shared with other colleagues in the organization. If a researcher is primarily concerned with all aspects of chemical compounds, reactions, and properties, Reaxys may fill the bill. However, SciFinder provides similar capabilities and, even considering the expanded periodical base, covers a greater scope of original literature (including more patents). In addition, the literature entered covers more than just defined chemical compounds. The literature is not only abstracted but indexed beyond title and abstract words and keywords to provide better recall and precision. Finally, the matter of cost is literally the bottom line. If your organization subscribes to both services, it’s your choice. To independents like me, as an ACS member, SciFinder is the choice, at least until I reach the decision point of pay as I go. As stated previously, any consulting work I do for a client will be done with STN.
Systems like these will enable researchers of many stripes to perform research. As a former bench chemist (deprived of tools like these) and a veteran searcher who helped develop many of the predecessor systems through use and criticism, along with clients who were research chemists and engineers, I can well appreciate (and envy) the power now available to researchers in many scientific fields. However, I must admit that for many searches, including many done for my former clients, I prefer to use STN directly.
Buntrock, R. E., “SciFinder Redux and Related Chemical Information Developments,” Online Searcher, Vol. 37, No. 1, Jan./Feb. 2013, pp. 38–40.
Buntrock, R. E., “The Effect of the Searching Environment on Search Performance,” ONLINE, Vol. 3, No. 4, Oct. 1979, pp. 10–13.