[Internet Search Engine Update]

ONLINE, March 2000
Copyright © 2000 Information Today, Inc.


AltaVista has been busy adding more portal content over the past few months. It also added another international mirror, AltaVista UK at http://www.altavista.co.uk. CMGI, AltaVista's parent company, has announced plans for an IPO for AltaVista. An AltaVista IPO has been in the works several times before, but this time it appears much more likely that it will actually happen. If it does go through, the infusion of cash should allow for some interesting improvements. Only time will tell how much development emphasis will be on the search engine side of the business and how much will be on portal content and partnerships.

Ask Jeeves is being sued for patent infringement by two MIT professors.

Dogpile, one of the Go2Net multiple search engines, has added Google to the list of search engines that it queries.

Fast has greatly expanded the size of its database and added advanced search features. In December, it released a database that claimed 210 million URLs. Then, mid-January saw the release of a 300 million record database--the largest a search engine has ever offered. That is an increase of roughly 90 million records in a little over a month. At the same time, the increased size has not caused any obvious degradation in general search speed.

Prior to the January launch, each time Fast debuted a new larger database, the collection gradually shrank as duplicates were removed. The absence of a refresh spider contributed to the shrinkage. With the January launch, the company states that a refresh spider will be used, so the database should hold its own and even grow slightly while Fast prepares its next major size increase. The company plans to offer a 400 million record database by April 2000.

On the search features side, Fast's main site at All the Web, http://www.alltheweb.com, saw the addition of an advanced search page in December. While command Boolean and nested operators are not available, the advanced search does offer menu-driven Boolean, field searching, and language and domain limits. Several boxes are available for adding additional terms or phrases connected to the search as "Should Include," "Must Include," or "Must Not Include." It has 25 language limits, and although only one language at a time can be selected, it defaults to searching all languages. A domain filter gives the option to limit results to one or more particular domains as well as another option to exclude specific domains. Newly available field searches are title, link, URL, and linked text (which works like the anchor field on AltaVista). The advanced search has display options for 10, 25, 50, 75, or 100 results at a time.
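The three menu choices described above can be read as a simple filter. What follows is only a minimal sketch of that reading, not Fast's actual implementation: the names MenuQuery, matches, and search are hypothetical, plain substring matching stands in for real indexing, and treating "Should Include" terms as optional terms that only affect ranking is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class MenuQuery:
    """Hypothetical container for the three menu choices."""
    should_include: list = field(default_factory=list)
    must_include: list = field(default_factory=list)
    must_not_include: list = field(default_factory=list)


def _has(term, text):
    # Case-insensitive substring test; a real engine would use an index.
    return term.lower() in text.lower()


def matches(query, text):
    """True if text has every 'must' term and none of the 'must not' terms."""
    if any(_has(t, text) for t in query.must_not_include):
        return False
    return all(_has(t, text) for t in query.must_include)


def search(query, docs):
    """Return matching docs, ranked by how many 'should' terms they contain.

    Assumption: 'Should Include' terms never exclude a document.
    """
    hits = [d for d in docs if matches(query, d)]
    return sorted(
        hits,
        key=lambda d: sum(_has(t, d) for t in query.should_include),
        reverse=True,
    )


docs = [
    "Fast search engine database",
    "AltaVista portal news",
    "Fast portal search",
]
strict = MenuQuery(must_include=["search"], must_not_include=["portal"])
ranked = MenuQuery(must_include=["fast"], should_include=["portal"])
strict_hits = search(strict, docs)
ranked_hits = search(ranked, docs)
```

Here strict_hits keeps only the document that contains "search" but not "portal," while ranked_hits keeps both Fast documents and lists the one mentioning "portal" first.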

Go (Infoseek) has an advanced search form, linked as "search options," which offers its field searching and limits through drop-down menus. It also offers display options for 10, 20, 25, or 50 records at a time, although the option for 50 only gives 25. Go no longer supports the alt: field search, which used to allow the searching of text within an image alternate text tag. Go will also continue to use its logo for a while longer.

Google has added RealNames hits to its results. These appear on the same line as their first regular result with a superscript RN next to them. Try a search on IBM for an example of how it is displayed. Google also has added a specialty search engine for Apple and Macintosh computer topics. Back in November, Google started putting advertisements on its site. Rather than the typical banner ads or various standard ad buttons, the ads were simply formatted text and were displayed near the top of the results page. More recently, these have disappeared, so Google is ad-free again, at least for now.

HotBot has a new beta version up and running at http://beta.hotbot.com. At first look, there is little difference between the beta and the regular version, but the fine print at the bottom shows that HotBot is using technology from E-Cyc, http://www.e-cyc.com/--apparently in place of similar technology from LexiQuest. Both LexiQuest and E-Cyc supply linguistic software that provides related terms. On HotBot, these related and more specific terms show up on the search results page near the top under the heading of Refine Your Search.

Lycos has made a major investment in Fast Search and Transfer. Lycos has been using Fast's MP3, FTP, and Rich Media (sounds, images, and videos) databases for some time. In mid-January, Lycos Pro (its advanced search) switched over to a Fast interface and the 300 million record Fast database. While it has yet to release an official announcement, a search on Lycos Pro finds exactly the same hits in the same order that a search on Fast at All the Web retrieves. In addition, there is a note at the bottom stating "Portions powered by Fast."

MetaCrawler has been relaunched by Go2Net with a new interface and the addition of music and auctions metasearch options. It also now includes Google in its group of Web search engines.

MSN Web Search has switched back to an Inktomi database. After announcing a switch from Inktomi to AltaVista in January 1999 and then finally making the change in September, Microsoft did an about-face in December and replaced AltaVista with Inktomi. The MSN advanced search even added some features not yet available on other Inktomi search engines, such as the ability to turn site clustering on or off and the ability to sort "equally relevant results" by date, depth, and title.

Northern Light passed the 200 million mark for URLs in its database of Web pages toward the end of November 1999.

SpeechBot, an experimental audio search engine from Compaq, is available at http://speechbot.research.compaq.com. It indexes popular U.S. radio shows. Instead of using the more labor-intensive approach of first creating transcripts and then indexing them, matched to the appropriate spot in an audio file, SpeechBot aims to automate the process by using speech recognition technology to create searchable text words. The automated approach does not match the actual words spoken exactly and is not as reliable as indexed transcripts, but the technology does offer a good starting point for automating the process as well as offering searchable access to radio shows that are not indexed elsewhere.
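The general idea of turning recognized words into searchable text tied to a spot in the audio can be sketched as a small inverted index. This is an illustration only, not SpeechBot's actual design: the function names, the show identifiers, and the (show, seconds, word) input format are all assumptions, and the speech recognition step itself is taken as given.

```python
from collections import defaultdict


def build_index(recognized):
    """Map each recognized word to the places it was heard.

    recognized: iterable of (show_id, seconds, word) tuples, assumed to be
    the output of some speech recognizer (hypothetical format).
    """
    index = defaultdict(list)
    for show_id, seconds, word in recognized:
        index[word.lower()].append((show_id, seconds))
    return index


def lookup(index, term):
    """Return (show_id, seconds) offsets where the term was recognized."""
    return sorted(index.get(term.lower(), []))


# Made-up recognizer output for two hypothetical shows.
recognized = [
    ("show-a", 12.5, "Search"),
    ("show-a", 13.0, "engine"),
    ("show-b", 4.0, "search"),
]
index = build_index(recognized)
offsets = lookup(index, "search")
```

Each hit points back to a show and a time offset, so a listener could jump to the spot in the audio. Any recognition errors simply become wrong or missing index entries, which is why this approach is less reliable than indexed transcripts.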

Oingo has moved out of beta. It now offers its "meaning-based search" technology royalty-free to portals and other content providers. It remains to be seen who may take up Oingo on the offer. Meanwhile, its site continues to use the Open Directory and the AltaVista database to showcase its product.

Simpli.com offers another technology for narrowing search terms to the proper context. It is currently available in beta for public view on its site. In an interesting user interface twist, a searcher enters a term and then the drop-down menu to the right offers the alternate choices for context.

Webtop from Dialog (http://www.webtop.com/) is another approach to a general Web search engine. It offers several different search options, such as copy and paste and drag and drop for grabbing a block of text to use as the search string. Output also features links to companies and news. The general Web database is relatively small, and results do not display basic information like date, URL, or file size. More surprisingly for a Dialog product, it does not even give a count for how many hits it retrieved.

WholeWeb.net is undertaking a project that will compete directly with Fast's rate of growth. It has no public search site at this point, but it has announced the ambitious goal of indexing one billion Web pages by June 30, 2000. Its demonstration database allows the user to determine the weight of search terms and, in the advanced search, which ranking criteria to use.

The Numbers: The statistical comparison of search engines at Search Engine Showdown (http://SearchEngineShowdown.com) run on November 29 found Northern Light back in the lead and surpassing the 200 million record milestone. Also, at that time, the dead link analysis found Fast and AltaVista with the most dead links. A special supplement on January 12 moved Fast back into first place after it launched its 300 million record database, followed by Northern Light and AltaVista. But keep an eye out for WholeWeb.net...


Greg R. Notess (greg@notess.com; http://www.notess.com) is a reference librarian at Montana State University.

Comments? Email letters to the Editor to editor@infotoday.com.
