Information Today

Vol. 21 No. 6 — June 2004

NEWSBREAKS UPDATE
The Latest on Factiva, Ingenta, Google, and More
By Paula Hane

May was busy, with several events offering excellent learning and networking opportunities: WebSearch University, Enterprise Search Summit, Streaming Media East, a NISO workshop on metadata, and a number of state library association conferences. But it was a bit quieter than usual on the news front. Library and information vendors seemed to be holding back their big announcements for two events in June: the SLA and ALA annual conferences. These two gatherings, with their huge exhibit halls and thousands of attendees, offer excellent opportunities for vendors to roll out new products, showcase technologies and applications, and meet with and entertain customers.

This year, the editors at Information Today, Inc. will provide live blog coverage of the SLA conference, as we did for Online Information last December. We will include what's new on the exhibit floor, what's hot from conference sessions, photos, and general impressions of the overall SLA experience. The "Live from Nashville" blog (http://www.infotodayblog.com), sponsored by ProQuest, will have postings from the conference June 3–10, 2004.

Factiva Update

In March, Factiva launched its iWorker Search Technology, a new algorithm-based product platform. Barbara Quint provided a look at the preview, with its new interfaces, in her March 1 NewsBreak (http://www.infotoday.com/newsbreaks/nb040301-2.shtml). The patent-pending system seamlessly matches simple keyword searches to the filtering capability embedded within Factiva's proprietary taxonomy. In addition, the search experience is personalized: the user sets preferences for a specific region and industry, which influence the relevance of the results.

The company recently introduced Factiva iWorks, a new product designed for information workers (outside the corporate library) within corporations and enterprises. Factiva iWorks lets organizations integrate the functionality of Factiva's iWorker Search Technology within any computing environment. With the new product, Factiva says it's directly addressing the needs of the information worker, who has been trained to search by typing a few keywords into a little white box on a free Web engine.

Factiva iWorks provides information workers with a current-awareness tool. It does not access the full archive of Factiva content but features just a 90-day archive of Factiva's collection of more than 6,000 continuously updated sources. The product offers integration with work flow, with multiple possible access points: in a browser toolbar, Microsoft Office 2003, or a module for a portal or intranet.

Enterprise pricing for Factiva iWorks starts at $1,600 a month for up to 50 users. Individual subscription pricing is available via registration in Microsoft Office 2003. Access costs $9.95 for 10 articles per month, or $2.95 per article. After entering a query, unregistered users get headline results and are prompted to register when they select a headline.

The company claims that more than 60 percent of Factiva's content is not available for free on the Web. The statistic comes from a 2002 white paper, "Free, Fee-Based, and Value-Added Information Services," written and edited by Mary Ellen Bates and Donna Andersen. The methodology is included in the paper (http://www.factiva.com/collateral/files/whitepaper_feevsfree_032002.pdf).

Bates recently updated the white paper, though it had not been published at press time. The findings were the same. One key comment is well understood by those of us in the industry: "The free Web, therefore, is seriously lacking in important business content, and the information that is available is difficult to access. When knowledge workers search only the free Web for information, it is likely that they will fail to turn up critical facts."

Out on the Web

While the traditional vendors were gearing up for June announcements, things were anything but quiet over the last month on the Web-search scene. News from and about Google continued to dominate. The big news was the SEC filing (finally, after months of speculation) of Google's IPO registration as well as the information revealed in the filing document about the company and its rivals. But Google also made news with a major upgrade to its Blogger software and the launch of its Google Blog (http://www.google.com/googleblog), which offers "insight into the news, technology, and culture of Google." (Puh-leeez! As if we don't hear enough about Google and the "Googleplex"!)

Google Reaches Out

Of greater interest and importance to researchers were Google's recently announced partnerships with traditional information industry companies, which continue its initiatives to include scholarly content. Ingenta, PLC, a provider of online publishing services to academic and professional publishers, announced the successful implementation of full-text indexing by Google. Ingenta joins organizations like IEEE, OCLC, and others that now have content indexed by Google.

Google had been indexing the freely available metadata on Ingenta.com, ensuring that article titles, keywords, author names, and abstracts appeared in search results for Google users. But as of March, Ingenta enabled full-text access for the crawler (the "Googlebot") so that all words in articles, not just abstracts and keywords, are indexed and searchable on Google. According to the announcement from Ingenta, after enhancing the indexing, the Ingenta.com site's usage jumped dramatically, "with Google referral traffic contributing to a record 5.4 million user sessions on Ingenta.com in April."

Not all Ingenta publishers have even been included in these initial results. Ingenta had switched on full-text crawling as a trial for a handful of publishers, including CABI Publishing, Professional Engineering Publishing, FD Communications, Inc., and American Ceramic Society, and said it will now be adding more publishers.

Google users who click on a search result are presented with an abstract page on Ingenta.com, where they are either authenticated for full-text subscriber access by virtue of IP address or user name/password, or they're offered pay-per-view access.

Ingenta senior product manager Kirsty Meddings said: "Becoming aware of Google's initiative to index more scholarly content, Ingenta saw the opportunity to increase the visibility of our publishers' material. Ingenta coordinated directly with Google to put these benefits into effect, avoiding the need for any of the publishers to become involved with the technical details. This relationship is a natural extension to Ingenta's role as intermediary between publishers and third parties."

Jumping on the Google Train

Extenza, another U.K. company, announced that Google is indexing the e-journal content (in either Adobe PDF or full-text HTML) held on its Extenza e-Publishing Services journal hosting platform. In making the announcement, the company stressed the twofold benefit of the arrangement: It helps users find that important piece of data they're seeking, and it helps publishers by driving utilization and traffic to their content, with potential revenue benefits. Extenza's customers range from society and not-for-profit publishers to commercial publishers.

If you're not familiar with it, Extenza e-Publishing Services is part of Extenza, a division of Royal Swets & Zeitlinger. Extenza not only provides conversion and hosting services for publishers but also helps librarians manage e-journal subscriptions, enables access, and delivers usage statistics. The company recently announced an alliance with ProQuest to offer a broad portfolio of e-journal and database services for publishers and libraries. The companies said the agreement delivers "a complete distribution and hosting solution for publishers, simplifies access for end users, and streamlines e-journal management for libraries."

CrossRef Update

CrossRef, a 300-member publisher trade association that provides a cross-publisher reference-linking service, announced a pilot project called CrossRef Search that enables users to search the full text of scholarly journal articles, conference proceedings, and other sources from nine leading publishers. (See Barbara Quint's NewsBreak at http://www.infotoday.com/newsbreaks/nb040503-1.shtml.) Not surprisingly, Google is supplying the search technologies, while CrossRef is providing the reference links to publisher Web sites. While Google incorporates CrossRef content connections into its general Web search engine, users who go to publisher Web sites and click on the CrossRef Search icon reach just the scholarly subset.

Separately, CrossRef announced that it now has 307 publisher members. According to the organization, CrossRef's rate of growth has nearly doubled in recent months, due to the new fee structures that took effect in January. More than 50 publishers have joined CrossRef since the start of this year. CrossRef has also signed on several new libraries and affiliates in 2004, including Nerac, a Connecticut-based research and information discovery service. In addition, Forward Linking is now live on the CrossRef system and available for testing. The service is on schedule for official launch this month.

Finding or Losing?

All of these publisher and vendor deals with Google raise the sticky issue of searching subsets versus the entire mass of indexed Web content. Will users of Google's general Web search engine really benefit? Will the scholarly articles rise high enough in search results to actually be found, or will they be buried in obscurity many thousands of results down? Placement is certainly an issue. Wouldn't it be more productive to search within slices of content?

Barbara Quint pointed out the visibility problems in Google Print, Google's own beta book search service (http://www.infotoday.com/newsbreaks/nb031222-2.shtml). She suggested a sub-domain for these book records: "One called 'Library' comes to mind."

OCLC, which has been testing the opening of WorldCat records to Google access since June 2003, has a similar problem with visibility. And the bibliographic records in WorldCat are pretty slim by Google's indexing standards. (See the NewsBreak at http://www.infotoday.com/newsbreaks/nb031027-2.shtml.)

According to a status report on the OCLC site: "Current page rankings for records are not indicative of final page rankings that will be in place when all records have been properly indexed. OCLC and Google continue to work on improving the ranking of WorldCat records."

To locate WorldCat records on Google, use the following:

• "ISBN" and ISBN number
(e.g., isbn 9630525119)

• Search term plus "find in a library"
(e.g., cats "find in a library")

• Search term plus "worldcatlibraries"
(e.g., cats "worldcatlibraries")
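For readers who want to script these lookups, the three query patterns above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the Google search address with its `q` parameter is an assumption that does not come from OCLC's instructions.

```python
from urllib.parse import urlencode

# Assumption: Google's public search address and its "q" query parameter.
GOOGLE_SEARCH = "http://www.google.com/search"


def isbn_query(isbn: str) -> str:
    """Pattern 1: the word "isbn" followed by the ISBN itself."""
    return f"isbn {isbn}"


def find_in_a_library_query(term: str) -> str:
    """Pattern 2: a search term plus the quoted phrase "find in a library"."""
    return f'{term} "find in a library"'


def worldcatlibraries_query(term: str) -> str:
    """Pattern 3: a search term plus the quoted word "worldcatlibraries"."""
    return f'{term} "worldcatlibraries"'


def search_url(query: str) -> str:
    """Encode a query string into a search URL (hypothetical helper)."""
    return GOOGLE_SEARCH + "?" + urlencode({"q": query})


if __name__ == "__main__":
    for query in (
        isbn_query("9630525119"),
        find_in_a_library_query("cats"),
        worldcatlibraries_query("cats"),
    ):
        print(query, "->", search_url(query))
```

The quoted phrases are kept inside the query so that the search engine treats them as exact matches, which is what restricts the results to WorldCat records.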

OCLC has said that it will decide this month whether to expand, continue, or discontinue the pilot project. Stay tuned for a report on this as well as commentary on the issue of scholarly content in the Google catalog.

Washingtonpost.com's Award

Despite constant media attention, Google doesn't always grab the top spot. The winners of the 2004 EPpy Awards were recently announced by Editor & Publisher and Mediaweek magazines at the Interactive Media Conference & Trade Show. Winning in the category of "Best Internet News Service [with] Over 1 Million Monthly Visitors" was washingtonpost.com. The site took the award over both Google News and FT.com. According to a posting by veteran journalist Jonathan Dube on CyberJournalist.net, not only did the audience applaud loudly, but the "real buzz" came after MarketWatch.com president and CEO Larry Kramer addressed the crowd and said he was disappointed to see Google News as a finalist in the category and that Google News "is just not journalism." Kramer reportedly emphasized that journalists have "a responsibility to provide the right filters."

Interestingly, washingtonpost.com also won in the category of "Best Internet Entertainment Service [with] Over 1 Million Monthly Visitors" and was a finalist in several other categories. Kudos to this excellent resource.

Open-Access Update

The heated debate continues in the open-access (OA) space. A good way to stay informed is with Peter Suber's Open Access News (http://www.earlham.edu/~peters/fos/fosblog.html). For a flavor of some of the ongoing discussions, see the American Scientist Open Access Forum (http://amsci-forum.amsci.org/archives/september98-forum.html). Be forewarned if you sign up for e-mail: These are very active resources.

The U.K. Parliament's Science and Technology Select Committee continued its inquiry into the pricing and availability of scientific publications. Following on his coverage in the April issue of Information Today, Richard Poynder reported in a NewsBreak on the third evidence session held on April 21 (http://www.infotoday.com/newsbreaks/nb040503-3.shtml). There was a very definite divergence of opinion. Librarians clearly stated that there was a crisis, while U.K. academics said there was not and expressed skepticism about OA publishing.

According to Poynder, the librarians expressed concerns about "excessive pricing, inflexibility over the 'bundling' of electronic journals, inequitable copyright agreements, and restrictions on long-term access to digital material." That's no surprise to those of us who are following the backlash among U.S. librarians and academics.

The Select Committee's final oral session on May 5 took evidence from U.K. research councils. (The uncorrected transcript is available at http://www.publications.parliament.uk/pa/cm200304/cmselect/cmsctech/uc399-iv/uc39902.htm.) The committee will issue its report this month, after which the U.K. government has 2 months to respond. Watch for our ongoing coverage.

Meanwhile, Thomson ISI announced that journals published in the new open-access model are beginning to affect the world of scholarly research. Of the 8,700 selected journals currently covered in Web of Science, 191 are OA journals. A study by Thomson ISI on whether OA journals perform differently from other journals in their respective fields found that there was "no discernible difference in terms of citation impact or frequency with which the journal is cited" (http://www.isinet.com/oaj).

Thomson's First-Quarter Results

Speaking of Thomson ISI, its parent, Thomson Corp., announced its first-quarter 2004 financial results. CEO Richard Harrington said the company was "off to a very solid start" for the year, reporting that revenues were up by 9 percent, though profits were down. He noted that Thomson was seeing signs of improvement in areas that had previously been weak, especially in demand for financial services. The company expects full-year 2004 revenue growth to be in the "mid-single-digit range." Let's hope this outlook holds for other companies in our industry.

In a Webcast with press and analysts, Thomson outlined the following priorities for 2004:

• Invest in high-potential market segments

• Acquire companies selectively—specifically, those with strong content to leverage in existing operations

• Pursue international growth, especially in Europe and Asia/Pacific

• Refine the front-end customer strategy, both to identify new customers and to target products and services for sub-segments of Thomson markets

• Build tailored, integrated information solutions

• Leverage assets across the organization to provide better products and operating efficiencies

For the latest industry news, check http://www.infotoday.com every Monday morning. An easier option is to sign up for our free weekly e-mail newsletter, NewsLink, which provides abstracts and links to the stories we post.


Paula J. Hane is Information Today, Inc.'s news bureau chief and editor of NewsBreaks. Her e-mail address is phane@infotoday.com.