The Latest on Search, Content, and
By Paula Hane
April and May are wonderful months. The greening, flowering, and warmer temperatures
cheer us, and sending in our tax returns provides great relief. Many folks
enjoy Earth Day activities on April 22. This year's National Library Week,
held April 18-24, marked the fourth year of the Campaign for America's
Libraries, a public education effort sponsored by ALA to promote the value
of libraries and librarians in the 21st century. To help libraries showcase
their many services that week, Thomson Gale offered them free access to promotional
resources as well as to a number of databases.
Information Today, Inc. is sponsoring three conferences May 11-12 in
New York: WebSearch University, Streaming Media East, and Enterprise Search
Summit. The last is a brand-new event targeted at those who are tasked with
implementing site-search functions within their organizations. In advance of
the summit, anyone interested in enterprise search is invited to test-drive
some leading technology solutions by visiting a unique implementation that
searches content on the ITI sites. The new Enterprise Search Center launched
on April 5 and will be available throughout this year.
Enterprise Search Update
Enterprise search continues to be hot. Recently, Endeca announced the availability
of Endeca ProFind 4.1, the latest version of its enterprise search and navigation
platform. Endeca also teamed up with Stratify, a provider of unstructured data-management
software, and Taxonomy Warehouse (part of Synapse Corp.), a provider of industry-specific
taxonomies, to bolster its capabilities for managing and adding structure to
traditionally unstructured documents and content, such as e-mail, Word files,
PowerPoint presentations, Adobe PDF files, etc.
Speaking of taxonomies, ProQuest Information and Learning announced that
Convera (formerly Excalibur Technologies) will use ProQuest's taxonomy to classify
and organize information in its proprietary RetrievalWare software products.
RetrievalWare solutions provide searching across more than 200 forms of text,
video, image, and audio information in more than 45 languages.
Entopia, Inc. recently launched a new Software Development Kit that will
allow application developers to integrate Entopia K-Bus into any application.
The company says K-Bus offers "information discovery" functions, including
enterprise search, expertise identification, content visualization, social
networks analysis, and content connectivity.
In a recent Forrester report, "The Future of Enterprise Search," principal
analyst Paul Sonderegger said: "Search will mature beyond helping people find
what they're looking for to helping people understand what they've found. The
key is making the most of structure in content and creating it where it doesn't
While enterprise search will continue to be a major growth area, it seems
likely we'll also see some consolidation among the companies that compete in
this space. At last fall's KMWorld & Intranets conference, more than half
of the knowledge management exhibitors identified themselves as being in the "search,
taxonomy, and classification" markets. There may be some key technology acquisitions
as the enterprise search companies work quickly to integrate all the functionality
that customers demand, including entity extraction, advanced linguistic technologies,
taxonomies, and classification.
More or Less?
Thomson Gale said it has finished loading more than 500,000 backfile investment
reports into the Investext Plus database. This new content, added at no cost
to existing subscribers, extends the database backfile to 1982.
On the other hand, sometimes there's less content to report. Thomson Gale
also said it has been informed by the British Medical Association (BMA) that
the full text of 29 BMA health titles may no longer be offered through InfoTrac
or the Thomson Gale Resource Centers. The BMA decision was effective April
16. Gale will retain BMA data in the backfiles and will continue to abstract
and index the titles going forward.
Searchers have always had to keep up with additions and deletions in database
content. Now, a similar task confronts users of content-rich Web sites and
services. Do you really know what's included? Reports indicate that Reuters
is pulling back a lot of the free business news it has made available to some
Web sites and portals, such as Yahoo! Finance, MSN Money, Forbes.com, CBS MarketWatch.com,
and Quote.com. Headline feeds on some sites will link readers back to the Reuters.com
site. Eventually, some top information will be available by subscription only.
On the other hand, Thomson Financial is replacing Reuters content with content
from MarketWatch.com, a multimedia publisher of business news and a provider
of financial information and analytical tools. The Wall Street Journal reported
that Thomson was concerned that Reuters' material was going to Thomson's institutional
clients. Thomson Financial is teaming up with MarketWatch.com to develop a
new, tailored online news service that will be delivered via Thomson ONE. The
partnership will bring together MarketWatch's news coverage by financial journalists
and Thomson Financial's proprietary content and market analytics to create
a service that's focused on real-time market, industry, and U.S. company news.
The Thomson news service will be available exclusively to Thomson ONE customers
as well as to clients of Thomson affiliates. The announcement said that Thomson
Financial and MarketWatch together will build an expanded staff and invest
resources in a journalistic effort that's 100-percent committed to Thomson
ONE news content.
A recent newsletter from research and advisory firm Outsell, Inc. summed
it up: "Suddenly, the advantage Reuters, Bloomberg, and Dow Jones had over
Thomson by owning news-gathering organizations to complement their financial
information coverage is eroded, further fueling the already hot battle among
the four companies for supremacy in the large and content-dependent institutional
financial market sector. If Thomson and MarketWatch.com prove they can work
well together and the partnership terms and pricing are not cockeyed, this
could give Thomson a sliver of advantage in some bake-offs for accounts."
So it's probably not a coincidence that Dialog, a Thomson company, announced
that real-time financial and business news and analysis produced by CBS MarketWatch
is now available through four of its services: Dialog NewsRoom, Dialog NewsEdge,
Dialog NewsEdge Live, and NewsEdge Insight.
It seems impossible to have a month go by without some news from Google.
The search giant recently modified and enhanced its home page and results pages,
added personalization features and Web alerts, and, most significantly, launched
a free e-mail service.
Google Personalized Web Search and Google Web Alerts, both debuting on Google
Labs, are designed to let searchers specify what interests them and receive
customized results. Google Personalized Web Search uses preferences to deliver
results. Searchers can control their level of personalization using a slider
and see the results change dynamically as the level changes.
Google Web Alerts provides automatic updates for Web users. After specifying
keywords they want to track, users can receive daily or weekly e-mails with
links to new Web page results plus top Google News stories that are related
to each query. Users can still choose to receive only News Alerts. In addition,
Google News now features images in search results and displays thumbnail images
of photos that relate to news stories.
Google also announced that it's testing a preview release of Gmail, a free
search-based Web mail service with 1 gigabyte (!) of storage capacity per user
(http://gmail.google.com). Built on the Google search engine, Gmail can quickly
recall any message an account owner has ever sent or received, thus eliminating
the need to file messages in order to retrieve them. Gmail automatically groups
e-mail and all replies together in the proper context. The service also includes
textual ads matched to the content of the displayed e-mail.
Rich Wiggins reported on the news and ensuing buzz about Google's April Fools'
Day announcement of Gmail (http://www.infotoday.com/newsbreaks/nb040405-1.shtml).
The nature and timing of the announcement caused initial doubts about Gmail's
authenticity, but over the next few days the coverage focused on the privacy
issues raised by the targeted ads.
Google Gets Flak
Within a week, the World Privacy Forum and 27 other privacy and civil liberties
organizations had written a letter asking Google to suspend its Gmail service
until the privacy issues are adequately addressed. The letter also asked Google
to clarify its written information policies regarding data retention and data
sharing among its business units (http://www.privacyrights.org/ar/GmailLetter.htm).
The organizations voiced concern that scanning confidential e-mail to insert
third-party ad content violates the implicit trust of an e-mail service provider.
The scanning creates lower expectations of privacy in the e-mail medium and
may establish dangerous precedents. Other concerns include the unlimited period
for data retention that Google's current policies allow and the potential for
unintended secondary uses of the information Gmail will collect and store.
Then, Sen. Liz Figueroa, D-Calif., who called Gmail a "Faustian bargain," said
she would introduce legislation to block the service. Some media outlets said
the privacy issues were overblown. At press time, there were conflicting reports
about whether Google was considering changes to Gmail to placate the privacy
concerns. In The Wall Street Journal, Google co-founder Sergey Brin
said that the idea was "being batted about."
Interestingly, Wiggins noted that the limited testing of Gmail by Google
staff and invited friends means outsiders aren't experiencing it firsthand.
He wrote, "It occurs to me that Google made a huge mistake by failing to let
members of the press try out Gmail."
Speaking of potential privacy problems, at press time, Amazon had just rolled
out the beta version of its new A9 search engine. Its most prominent feature
is providing a user's search history. More on this news to come.
Though other search engine news was somewhat eclipsed by all the Googling,
there were some other important developments. Yahoo! introduced Yahoo! News
Search 2.0, which now lets users search more than 7,000 global news sources
in 35 languages, a significant enhancement over the previous Yahoo! News. Other
improvements include a new related search feature that offers suggestions for
refining queries, more frequent crawls to update the news, and sorting of results
by relevance or date.
Finally, if you didn't get enough chuckles on April Fools' Day or you just
need a break from serious news, I recommend a recent piece in The Onion, a
satirical weekly publication (http://www.theonion.com/news/index.php?issue=4014&n=1).
Rich Wiggins pointed out an article titled "Yahoo! Launches Soul-Search Engine." Written
like a formal news article, the piece supposedly details Yahoo!'s latest foray
into the competitive search market. Here's a wonderful "quote" from Yahoo!
CEO Terry Semel: "Capable of navigating the billions of thoughts, experiences,
and emotions that make up the human psyche, the new Yahoo! soul-search engine
helps users find what's deep inside them quickly and easily. All those long,
difficult nights of pondering your place in this world are a thing of the past." Hmmm,
I wonder how far-fetched this really is?
Meanwhile, specialized search engines, which offer more focused results and
specialized content and features than the big guys of search, continue to fill
important niches. This is the notion of narrowcasting: narrowing a search
to a specific industry or topic. Whether you're looking for legal, biomedical,
or scientific information, searching content that has been editorially chosen
and is often not reachable by a general search engine will provide faster access
to better results.
A classic example is GlobalSpec, the specialized online resource for engineering.
It recently launched a new interface; added more powerful search functionality;
and introduced a specialized search engine it's calling The Engineering Web,
which it says provides "engineering context and relevancy" as well as access
to hidden Web resources. The engine searches more than 100,000 engineering
and technical Web sites and provides searching of specialized content the company
says is not available on any other engine: application notes, patents, material
properties, and standards.
In addition to the Web resources, GlobalSpec's proprietary SpecSearch allows
users to search by specification more than 60 million parts in 1 million product
families from more than 10,000 supplier catalogs. The company built both the
search technology and taxonomy and now has 1 million registered users. (See
the NewsBreak at http://www.infotoday.com/newsbreaks/nb040329-1.shtml.)
Since engineering is so information-intensive, I guess it's not surprising
that this field continues to draw resource-development initiatives. Elsevier
Engineering Information announced the launch of Referex Engineering, a specialized
electronic reference product hosted on the Engineering Village 2 platform.
Referex Engineering draws on more than 300 of Elsevier's book titles to provide
engineering professionals with a fully searchable reference database that offers
both breadth and depth.
The company says that Referex Engineering is designed on a concept of "layering
content" to create the breadth and focus that researchers, professional engineers,
and academics require. By layering broad-based handbooks, professional reference
works, and how-to guides with specialized monographs and scholarly texts, Referex
Engineering has created a foundation of information that allows searchers to
quickly find solutions to their reference needs.
Another new research tool allows biomedical and life science researchers
to search the MEDLINE database more productively and efficiently. Vivísimo's
ClusterMed organizes the long list of results returned by PubMed into hierarchical
folders with meaningful categories. This allows researchers to home in on the
most relevant results quickly. Vivísimo developed proprietary biomedical
knowledgebases and algorithms for the sophisticated text processing of the
PubMed records. ClusterMed is licensed to companies on a yearly subscription
for local server installation. A demonstration site is available at http://vivisimo.com/clustermed.
(See the NewsBreak at http://www.infotoday.com/newsbreaks/nb040405-2.shtml.)
Content in (Ad) Context
Finally, one of the more interesting commentaries I've read recently is by
analyst John Blossom of Shore Communications, who wrote about how non-publishers
are monetizing content. Wal-Mart, Procter & Gamble, and Ford are among
the companies that are providing their own private contexts for print and online
content and are launching their own publications for customers. Now, mass-media
publishers are finding themselves under the gun to hang on to desirable shelf
space and Web clicks. Basically, advertisers are creating their own contexts
in which to place ads. Librarian alert! The need to educate readers about bias
issues is rising to a new level of urgency.
For the latest industry news, check http://www.infotoday.com every Monday
morning. An easier option is to sign up for our free weekly e-mail newsletter,
NewsLink, which provides abstracts and links to the stories we post.
Paula J. Hane is Information Today, Inc.'s news bureau chief
and editor of NewsBreaks. Her e-mail address is email@example.com.