






Online Magazine
Vol. 27 No. 1 — Jan/Feb 2003
DEPARTMENTS
Internet Search Engine Update
by Greg R. Notess
Reference Librarian, Montana State University

Internet Search Engine Update goes up on the Web at http://www.onlinemag.net as soon as it is written, approximately one month before the print issue mails to subscribers.

AlltheWeb now uses a keyword-in-context (KWIC) display in its search results. It also has a new field search command, site:. This is an easier-to-remember version of the older url.host: and url.domain: field searches and can be used for both top-level domains and regular domains. For example, either site:edu or site:company.com can be used, and either can be combined with other search terms. AlltheWeb has also announced that its site is now fully XHTML and CSS compliant.
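
For readers unfamiliar with the term, a keyword-in-context display shows each hit's search term surrounded by a little of the text around it. The sketch below is only a minimal illustration of that general idea in Python; the function name kwic and the width parameter are hypothetical, and nothing here reflects how AlltheWeb or any other engine actually builds its result snippets.

```python
import re

def kwic(text, keyword, width=30):
    """Return one snippet per occurrence of keyword, with `width`
    characters of context on each side (hypothetical helper)."""
    snippets = []
    # Case-insensitive literal match; re.escape guards special characters.
    for m in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        start = max(0, m.start() - width)
        end = min(len(text), m.end() + width)
        # Mark truncation on either side with an ellipsis.
        prefix = "..." if start > 0 else ""
        suffix = "..." if end < len(text) else ""
        snippets.append(prefix + text[start:end] + suffix)
    return snippets

if __name__ == "__main__":
    sentence = "The quick brown fox jumps over the lazy dog"
    for snippet in kwic(sentence, "fox", width=6):
        print(snippet)
```

A real engine would likely break snippets on word boundaries and highlight the matched term; this sketch simply cuts at a fixed character count.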

AltaVista has made some major updates. It now includes indexed PDF files, joining Google and FAST as the third search engine to offer access to these information-rich files. Searchers can use the filetype:pdf syntax or the advanced search page to limit to PDFs. The main page and logo have been redesigned. The site has fewer ads, having removed pop-up and pop-under ads in August, as well as the graphic banner ad from its home page. It plans to increase the freshness of the database by refreshing about half of the results that users retrieve on a roughly daily basis. It has increased the size of the database slightly to a bit under 1 billion Web pages and 250 million images. Its international focus has expanded with the introduction of Prisma suggestion technology for French, German, Italian, and Spanish and the expansion of the News search to German.

BoardReader is a search engine that searches Web-based discussion forums, which are often not indexed by other search engines. BoardReader accepts phrase searching and truncation with an asterisk. Results include a cached copy, the date, and the number of replies.

GigaBlast, a new search engine launched last summer, is now offering a site search product and has launched a Swedish/Scandinavian version at www.gigablast.nu. While the Swedish version uses the same database, it adds a Swedish pages limit. There is also more attention to the design of the site, but the advanced search does not have as many options.

Google now claims to provide access to over 3 billion Web documents. Researchers [http://cyber.law.harvard.edu/filtering/google/] have also discovered that the www.google.fr and www.google.de international versions have excluded certain Web sites to avoid legal problems with laws in those countries. Two new country domains have been added—Poland and Thailand.

Inktomi has sold its enterprise search software (formerly known as Ultraseek) to Verity, leaving Inktomi to focus almost exclusively on Web searching. It has also launched a new database that it claims includes 3 billion records, added spell checking, changed to a keyword-in-context (KWIC) display for some records, greatly increased the freshness of the database, and aggressively removed dead links. It has introduced a relevance technology to help provide better results for ambiguous terms such as york or mexico, so that top-ranked results will not be for new york or new mexico.

MyWay is a new portal from the Excite Networks. MyWay.com boasts that it has no banners or pop-ups. The portal content is similar to that at Excite and iWon, and the search engine and directory come from Google and Google's version of the Open Directory. It is one of the few Google partners to include the cached links in the results.

Teoma has improved its phrase searching so that it now does exact matches, and it has added an OR operator that must be entered in all uppercase letters. Without user-specified nesting, a simple x y OR z is processed as (x AND y) OR z. Teoma has added a spell check feature, in beta, for common English words but not proper names. It has updated the database and expanded it by 60 percent to about 350 million records. It now uses site collapsing so that only the first two hits per domain are listed, with others under a "More results from" link. The results now use a keyword-in-context (KWIC) display, and stop words are searched if they occur within a phrase search. It has added field searches using the prefixes intitle:, inurl:, and site:. An advanced search page should be available soon to make these even easier to use.
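
The grouping Teoma applies to an un-nested query can be pictured with a short Python sketch. This is only an illustration of the precedence described above, under the assumption that adjacent terms are implicitly ANDed and that only an uppercase OR counts as an operator; the function names group_query and render are hypothetical, quoted phrases are not handled, and this is not Teoma's actual parser.

```python
def group_query(query):
    """Split a query on uppercase OR; the whitespace-separated terms
    inside each piece are treated as implicitly ANDed (assumption)."""
    groups, current = [], []
    for token in query.split():
        if token == "OR":          # lowercase "or" stays an ordinary term
            if current:
                groups.append(current)
            current = []
        else:
            current.append(token)
    if current:
        groups.append(current)
    return groups

def render(groups):
    """Write the grouping out with explicit AND/OR and parentheses."""
    parts = []
    for group in groups:
        joined = " AND ".join(group)
        parts.append(f"({joined})" if len(group) > 1 else joined)
    return " OR ".join(parts)

if __name__ == "__main__":
    print(render(group_query("x y OR z")))
```

Searchers who want x AND (y OR z) instead would need to supply the nesting themselves, as the column notes.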

The Wayback Machine has launched a "document compare" feature that uses DocuComp technology to compare two historical Web pages and highlight the differences. Look for the "Compare Archive Pages" in tiny print in the upper right hand corner after the search box on a search results page to try out this feature.

Yahoo! finally announced a renewal with Google for search engine results. While the "Powered by Google" logo is gone from the top, the results actually rely more heavily on Google than previously. A few directory category matches and sponsor matches come first, but then comes a new section labeled "Web Matches." This replaces the old "Web Sites," which were entries from the Yahoo! directory, and the "Web Pages," which were from Google. The new "Web Matches" mix the two, putting them in Google relevance order. Those items in the directory will use the directory summary and title rather than Google's and have a small red arrow that links to the category. The advanced search has also changed significantly. It now looks much more like the Google advanced search. A direct link to the Yahoo! directory itself is now available [dir.yahoo.com].


Greg R. Notess (greg@notess.com; www.notess.com) is a reference librarian at Montana State University and founder of SearchEngineShowdown.com.

Comments? Email the editor at marydee@infotoday.com

