ONLINE, September 2000
Copyright © 2000 Information Today, Inc.
The past few months have seen a flurry of announcements of major size increases from several search engines. It seems that the industry as a whole is making renewed efforts to keep up with the growth of the Web.
Excite launched an updated and changed interface known as Excite Precision Search (EPS). The new emphasis is on the "clean and simple" approach with more direct access to its search engine results. Results from other databases are now under the headings of Category Search, News Search, Photo Search, and AV Search. Relevance ranking now puts more weight on link analysis, especially from established sites. A couple of search features are gone: the suggested searches and the ability to find related Web pages using a "more like this" option. The language limits were removed from the main search screen and are only available on the advanced search. Unless another language limit is used, only English-language pages are searched. There is no ability to search in all languages.
Google has been extremely busy since the last update. They have added up to three news headlines to the top of their search results. The cached pages are now called "cached" again. They announced a significant increase in size, claiming a more than one-billion-record database. However, of that, only 500 million are fully indexed Web pages while another 560 million records are simply URLs that Google has found linked from other pages, but has not visited and indexed. So it is really an effective size of half-a-billion indexed Web pages, which is still a significant increase over its previous size.
Google continues to cluster results by site, displaying up to two pages per site on its main search results. However, if a search retrieves fewer than one thousand clustered hits and the searcher pages down to the last page of results, clicking on the "repeat the search with the omitted results included" link will provide results that are not clustered by site.
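Clustering by site can be pictured with a short sketch. This is only an illustration of the general idea, not Google's actual method: the function name cluster_by_site, the max_per_site parameter, and the example.com URLs are all hypothetical, and "site" is assumed here to mean the hostname in the URL.

```python
from collections import defaultdict
from urllib.parse import urlparse


def cluster_by_site(urls, max_per_site=2):
    """Split ranked URLs into (shown, omitted), keeping at most
    max_per_site results per hostname in their original rank order."""
    seen = defaultdict(int)  # hostname -> number of results already shown
    shown, omitted = [], []
    for url in urls:
        host = urlparse(url).netloc.lower()
        if seen[host] < max_per_site:
            seen[host] += 1
            shown.append(url)
        else:
            omitted.append(url)
    return shown, omitted


# Hypothetical ranked result list: three hits from one site, one from another.
results = [
    "http://example.com/a",
    "http://example.com/b",
    "http://example.com/c",
    "http://example.org/x",
]
shown, omitted = cluster_by_site(results)
```

In this sketch, repeating the search "with the omitted results included" would amount to displaying shown and omitted together, which is the unclustered list.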
HotBot has finally added the ability to uncluster results on its advanced search page. Look for the new option that disables its "Best Page Only Filter." Initially, turning the filter on or off seemed to make no difference, but that problem appears to have been corrected. HotBot's GEN3 implementation seems inconsistent, with results varying from day to day. They may still be deciding whether or not to implement the full GEN3 database.
Inktomi acquired Ultraseek, a site search and intranet search engine, from its creators at Go.com. Ultraseek was an Infoseek product from the days before Disney bought Infoseek and integrated it into Go.com.
Inktomi GEN3, the 500-million-record Inktomi database, has become available through three Inktomi partners: iWon, Snap, and HotBot. These three will only retrieve records from the 300-plus-million portion if the number of hits found in the smaller 110-million portion is below a certain number.
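The two-tier behavior described here can be sketched as a simple fallback. This is a minimal illustration under stated assumptions, not Inktomi's implementation: the function tiered_search, the min_hits threshold, and the substring matching are all hypothetical stand-ins, and the actual cutoff is not published.

```python
def tiered_search(query, primary, secondary, min_hits=10):
    """Search the smaller primary tier first; consult the larger
    secondary tier only when the primary tier yields too few hits."""
    # Naive substring match stands in for a real index lookup (assumption).
    hits = [doc for doc in primary if query in doc]
    if len(hits) < min_hits:
        hits += [doc for doc in secondary if query in doc]
    return hits


# Hypothetical toy tiers standing in for the two portions of the database.
primary = ["web search engines", "search tips"]
secondary = ["search history", "cooking"]

few = tiered_search("search", primary, secondary, min_hits=3)   # falls back
many = tiered_search("search", primary, secondary, min_hits=2)  # primary only
```

One consequence for searchers is that a common term may never reach the larger portion, while a rare term is more likely to draw on the full database.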
iWon clusters results on its main search page and will only display one page per site, with no option to see the other pages. Using iWon's advanced search, the results are not clustered by site, so many more hits can be found using the advanced search. iWon has also added several additional databases. LookSmart supplies its directory, replacing the Inktomi directory engine. Other results now come from Direct Hit, RealNames, and Fact City.
LookSmart has redesigned its home page to emphasize its partner sites. It also announced an agreement with Gale to load the full text of magazine articles and make that content available through some of its partners. That should occur sometime this summer.
Lycos switched from using its own crawler-built search engine database to the Fast database for results on both its main search page and the advanced search. Some of the international versions of Lycos still use the older Lycos database. The main search screen continues to first display results from the Open Directory.
WebTop from Bright Station also announced reaching the 500-million-record milestone. However, since WebTop does not index the full text on each of those Web pages, their effective size is considerably less than the other major search engines.
Yahoo! has dropped Inktomi as its back-end search engine for delivering "Web page" results and is now using Google instead. The Yahoo! version of Google is using a smaller version of the Google database and does not find as many pages as the main Google site. It also clusters results by site and will only initially show one page per site.
The Numbers: The July statistical comparison of search engines at Search Engine Showdown (http://searchengineshowdown.com/) puts iWon and Google at the top of the size comparisons, demonstrating the difference that their half-billion-record databases make. However, AltaVista, Fast, and Northern Light were not as far behind in actual results found as might have been expected.
Greg R. Notess (firstname.lastname@example.org; http://www.notess.com) is a reference librarian at Montana State University.
Comments? Email letters to the Editor to email@example.com.