Magazines > Searcher > November/December 2004
Vol. 12 No. 10 — Nov/Dec 2004
FEATURE
Open WorldCat Pilot: A User's Perspective
by Nancy O'Neill

OCLC's Open WorldCat Pilot [http://www.oclc.org/worldcat/pilot/] "is an initiative that integrates library records into popular Internet search sites and tests the effectiveness of the Web in guiding users to library-owned materials. The goal of the pilot: to make libraries more visible to Web users and more accessible from the sites where many people begin to look for information."1

The project aims to "open" WorldCat records to present and potential library users through the familiar Web search engines Google and Yahoo! Search. Enabling Web users to locate materials they need quickly and easily in libraries near them will promote library use and reinforce the value of libraries. Ultimately even people who don't often use libraries may come to consider libraries as a first source of information. If you believe in libraries, as OCLC obviously does, what's not to love about Open WorldCat?

For the Open WorldCat Pilot, OCLC extracted a 2-million-record subset from the 55 million records in the WorldCat database. OCLC selected the items most frequently cataloged by libraries; specifically, they selected records with one hundred or more libraries listed as holding the item.2 It is important to note that the pilot uses "limited fields" of the records. Gary Price asked in ResourceShelf, "Why doesn't OCLC make subject headings viewable and hyperlinked?"3 Perhaps that is a problem the search engines could help solve. Limiting the fields included could also make it difficult to distinguish between formats like VHS and DVD.

According to the Open WorldCat Pilot Quick Facts, "WorldCat records began to display within Google search results in December 2003 and within Yahoo! search results in May 2004. Inbound links from Open WorldCat search results have grown from 39,000 in February 2004 to more than 1 million in the first half of June 2004."4

Approximately 12,000 libraries are participating in the project, including the academic, public, and school libraries originally included automatically, plus state, federal, and special libraries that have asked to join. Libraries may choose to opt out of the program by notifying their OCLC regional service provider or by completing an online form on the FAQ page. Libraries not part of the pilot but that contribute records or holdings information to WorldCat may complete an online form to participate. OCLC cooperatives invite libraries that do not contribute their cataloging records to WorldCat to join the pilot by joining the OCLC cooperative.

There are several ways of searching Open WorldCat items in Google and Yahoo!. Searching works the same in both engines, with two exceptions discussed below. The most intuitive approach is to use the phrase "find in a library" plus the title of the item or the subject to be searched: find in a library: da vinci code. Alternatively, you can search using the phrase "worldcat libraries" and the title of the item or the subject searched: worldcat libraries social ecology. With enough promotion, the phrase "worldcat libraries" could someday become a recognizable way for the casual user to search, as well as a source of great brand recognition for OCLC. The phrase "worldcatlibraries" works equally well, although Google persists in asking "Did you mean worldcat libraries?" and Yahoo! returns the search as "Find in a Library." OCLC also suggests using "wcpa," a string that appears in the URLs of all records retrieved, e.g., wcpa da vinci code. It's not exactly intuitive, and it does not work in Yahoo!. The remaining search, "site:worldcatlibraries.org [title]," is described at length below. It too does not work well in Yahoo!.
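The search approaches listed above can be collected in a few lines of Python. This is only an illustrative sketch: the function name `build_queries` and the dictionary keys are invented here, and the strings simply mirror the phrases quoted in this article rather than any official OCLC, Google, or Yahoo! interface.

```python
# Illustrative sketch only: the phrases come from the article; the function
# name and dictionary keys are hypothetical labels chosen for this example.
SITE = "worldcatlibraries.org"


def build_queries(terms: str) -> dict:
    """Return one query string per search approach described in the article."""
    terms = " ".join(terms.split())  # collapse stray whitespace
    return {
        "find_in_a_library": f"find in a library: {terms}",
        "worldcat_libraries": f"worldcat libraries {terms}",
        "worldcatlibraries": f"worldcatlibraries {terms}",
        "wcpa": f"wcpa {terms}",
        "site": f"site:{SITE} {terms}",
    }


if __name__ == "__main__":
    # Print each variant so it can be pasted into a search box.
    for name, query in build_queries("da vinci code").items():
        print(f"{name:20} {query}")
```

Adding an author's surname to the terms, as in "war in a time of peace and halberstam", narrows the "site:" variant in the same way as the searches described later in the article.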

Ericka McDonald, manager of OCLC WorldCat end-user services, was extremely helpful in answering questions about the project. McDonald provided much of the information that follows.

Authors may be searched by linking the name to a title or subject, e.g., worldcat libraries murder in montparnasse and greenwood or worldcat libraries basketball and lee. Google retrieves WorldCat records for both these searches, but Yahoo! retrieves only the subject search for basketball and lee. OCLC recommends using the search syntax "site:worldcatlibraries.org [author name]" to locate author records. Searching site:worldcatlibraries.org dennis lehane did retrieve several records for Dennis Lehane (author of Mystic River and other novels) in Google, but retrieved only one record in Yahoo!. Using the other recommended searches combined with the author name "Dennis Lehane" located WorldCat records in both Google and Yahoo! but only for one title and, in one case, for two different titles in two different formats. The "site:" search used in Google retrieves several records for different titles by Lehane, but the "site:" search in Yahoo! retrieves only one record for one of Lehane's works. Generally the "site:" search in Yahoo! retrieved only one record for anything.

To locate several records for the same item, OCLC suggests using "site:worldcatlibraries.org [title]," e.g., site:worldcatlibraries.org war in a time of peace and halberstam. The "site:" search is designed to retrieve all the records WorldCat has for a particular title and therefore increases the chance of finding an item record that lists the user's local library. This syntax limits the search to records harvested from OCLC's server. In practice, the "site:worldcatlibraries.org" search works well in Google, but does not work well in Yahoo!. Also, the "site:" search does not necessarily improve the chance of locating a particular item record in which the local library appears. Searching worldcat libraries silas marner in Google retrieves WorldCat records for the book as the first two retrievals, but neither record listed Santa Monica Public Library as having the book. Searching site:worldcatlibraries.org silas marner retrieved 69 WorldCat records for various editions of the work, as well as books about the work. Record 30 on page three of the results listed the Santa Monica Public Library. Searching site:worldcatlibraries.org silas marner in Yahoo! retrieved only one WorldCat record.

OCLC plans to use Functional Requirements for Bibliographic Records (FRBR) to change the records that they make available for harvesting; this should make the "site:" search syntax unnecessary. FRBR provides a new way of defining relationships between bibliographic items, their creators, and their subjects. It embodies the basic laws of cataloging and offers ways to further develop and enrich existing catalogs. According to McDonald, it creates a work level record that incorporates various versions and editions of items. The aim is to allow users to locate whatever version of an item a local library owns, an essential improvement. The search engines will not have to support or implement FRBR, just continue to harvest the content the same way they presently do. Deb Bendig, manager of Discovery View of WorldCat, says that OCLC is trying various approaches to employ FRBR in WorldCat but has not yet set a target date for its implementation. Additional information about FRBR is available on the OCLC Research Projects Web site at http://www.oclc.org/research/projects/frbr/default.htm.

Even though a "site:" search may be successful, it does not necessarily improve one's chances of finding the desired item in the local library. I persevered to locate Santa Monica's record for Silas Marner because I was testing the system; the casual searcher would probably give up after one page of records. We'll see if FRBR solves the problem.

Web users can search for just about anything: books, magazines or journals, videos, compact discs. Searching for a specific format, such as DVD, can be frustrating due to inconsistency in retrieval. "Find in library" did not retrieve a Wild Strawberries DVD record in Google, but "worldcat libraries" did: worldcat libraries wild strawberries dvd. Both find in library wild strawberries dvd and worldcat libraries wild strawberries dvd retrieved the WorldCat record in Yahoo!. I had the same results searching for a Rear Window DVD: find in library rear window dvd did not work in Google but did work in Yahoo!, and worldcat libraries rear window dvd worked in both search engines. I located the DVDs of Wild Strawberries and Rear Window without too much difficulty in both Google and Yahoo!, but I could not retrieve a WorldCat record for a Citizen Kane DVD using either search engine, even though records in the full WorldCat database indicated it would have met the criteria of the pilot. The first WorldCat record listed 600 libraries holding the Citizen Kane DVD, the second record listed 250 libraries with the item, and the third 177 libraries with the item. So why didn't it appear in my search?

McDonald provided no answer as to why I could not find the DVD, but she suggested that I search for it using site:worldcatlibraries.org citizen kane visual material. That search strategy worked beautifully, retrieving 18 WorldCat records. The only problem is that users have to know that the term "visual material" covers records for VHS or DVD. And who would ever think to search that way? The question remains: Why can I locate other DVDs without using an arcane search syntax, but not this one?

Curiosity about finding almost no DVD holdings for my own Santa Monica Public Library led to an embarrassing discovery. Santa Monica Public Library stopped entering its DVD holdings in WorldCat when it outsourced provision of DVD records to a DVD vendor. As troubling as that may be for a Santa Monica Library client, it may reflect a practice followed by many libraries that outsource some or all of their cataloging. When Santa Monica obtained DVD cataloging records from a vendor, instead of cataloging them in-house, we apparently had no simple way to upload the vendor records into WorldCat. Santa Monica's Technical Processing Department explained that re-entering the records in WorldCat using CatMe amounts to recreating the entire record, a process too labor intensive to be cost-effective.

Fortunately, Santa Monica has since resumed in-house cataloging of audiovisual materials. However, many libraries outsource at least some of their cataloging and, if they find that they cannot easily upload the vendor records, the consumer will be denied the ability to locate items in popular formats. As libraries respond to the need to become more cost-effective, it seems that OCLC, equally cost-conscious, may not have provided the technology used by many online catalog vendors to make loading vendor records easy.

I put the question to Cynthia Whitacre, of the OCLC Cataloging Partners Program, who told me that OCLC has more than one avenue to work with vendors on supplying catalog records for easy uploading into WorldCat. Two of these are Cataloging Partners [http://www.oclc.org/catalogingpartners/partners/default.htm], a recent program, and PromptCat [http://www.oclc.org/promptcat/about/vendors/], an established program. Whitacre says that OCLC actively works with vendors to assure that records can be added to WorldCat, and I admit that the vendor partner list [http://www.oclc.org/promptcat/about/vendors/] is pretty impressive. Coincidentally, Santa Monica's former DVD vendor recently signed up with OCLC.

According to information on the Open WorldCat Pilot page, "In most cases, the Open WorldCat pilot will provide users with detailed library information in as little as two clicks."5 Using either Google or Yahoo!, enter a simple search string "find in a library" plus the title of the item. OCLC says that the WorldCat record should appear as the first hit, and it usually does.

Click #1 takes users to a page where they can enter their ZIP code to locate the nearest library that has the item. Click #2 retrieves the list of libraries in or around that ZIP code. If the local library's catalog is linked, click #3 takes users either to a library Web site or, in the best cases, directly to the library's online catalog record for the item searched. In the sample searches provided by OCLC, a search for the title Benjamin Franklin: An American Life took only two clicks to reach ZIP code 90401 and a Santa Monica Public Library appearance in the list of libraries; click #3 retrieved my library's online catalog record. I think I'm in love!

Another sample search provided by OCLC, The Da Vinci Code, using the same ZIP code, does not list Santa Monica as one of the nearby libraries even though Santa Monica actually has multiple copies of the book. A Newsbreak6 by Searcher editor Barbara Quint on Open WorldCat led to this article when she called me to ask why. This particular title is an anomaly. Santa Monica's record for the book is not in WorldCat although it should be. Finding no record for Santa Monica, the user is directed to the next closest ZIP codes — Beverly Hills Public Library, where click #3 goes directly to the online catalog record, and El Segundo Public Library, where click #3 takes you to the online catalog search screen but not to the individual record. Instead, you must perform the title search again. Of the other libraries listed, click #3 takes the user to the El Camino College catalog login page, the Harvard-Westlake Upper School catalog search page, the Woodbury University catalog search page, and a broken link for UCLA that goes nowhere. Obviously OCLC can't change the way various online catalogs operate, but these varied entry points are inconsistent and inconvenient.

Most consumers want to go directly to the item record in a local library catalog. So why the inconsistencies? According to McDonald, the hotlinked library name should always take the user to the library OPAC. In most cases, it drops the user at the home page for the catalog, where the user has to re-enter the search. In some cases, the link takes the user to the record for the item in the OPAC. The ability to form this "deep link" depends on a couple of factors: (1) how the library has configured the link in its FirstSearch Administrative module and (2) whether the OPAC supports deep linking. OCLC is trying to get libraries to configure deep links in FirstSearch. To do this, many libraries will need to turn on this capability in the OPACs or ask the local system vendor to do it. This is a high priority for OCLC, and it is working closely with member libraries and local system vendors to improve these links.
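The difference between a deep link and a plain catalog link can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the function name `catalog_link`, the idea of a per-library URL template with a `{title}` placeholder, and the example.org addresses are hypothetical, and none of it describes how FirstSearch or any particular OPAC actually stores or builds its links.

```python
from typing import Optional
from urllib.parse import quote_plus


def catalog_link(home_url: str, deep_link_template: Optional[str], title: str) -> str:
    """Return a deep link when a template is configured, else the catalog home page.

    `deep_link_template` is a hypothetical per-library setting containing a
    `{title}` placeholder; None models a library (or OPAC) without deep linking.
    """
    if deep_link_template:
        # URL-encode the title so spaces and punctuation survive in the query string.
        return deep_link_template.format(title=quote_plus(title))
    # Without a template the user lands on the home page and must search again.
    return home_url


if __name__ == "__main__":
    home = "https://catalog.example.org/"  # placeholder address
    template = "https://catalog.example.org/search?title={title}"  # placeholder pattern
    print(catalog_link(home, template, "silas marner"))
    print(catalog_link(home, None, "silas marner"))
```

The second call shows the fallback case the article complains about: the user arrives at the catalog's front door and has to re-enter the search.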

Migell Acosta, Santa Monica's Principal Librarian for Information Management and an OCLC Members Council Delegate, set up Santa Monica's linkage. He believes that click #3 would take users directly to the item record in most online catalogs, if OCLC provided better directions to libraries on setting up the Open WorldCat Pilot linkage. Gale Group's InfoTrac offers linkage to library catalogs and provides excellent examples of how to set up that linkage. Acosta emphasized that OCLC is working to make the technical end easy for libraries, with pilots being created to work out problems. OK, I'm convinced.

But as a consumer I still find it annoying to land on a library catalog search screen and have to re-enter a search. Even more frustrating is landing on the library Web site, then struggling to locate the online catalog link, before having to re-enter the search. If Open WorldCat frustrates customers, it certainly won't help attract them to libraries.

Unfortunately, the record retrieved for The Da Vinci Code did not suggest the Los Angeles Public Library, even though Santa Monica's 90401 ZIP code is surrounded by neighboring Los Angeles Library branches. According to the Pilot FAQs, "Initially OCLC is using the postal code for the street address associated with each OCLC institution symbol.... The postal code entered by the user does not have to exactly match the library postal code; concentric radiuses of geographic proximity are employed to locate libraries near the postal code. These radiuses are 20 kilometers (12 miles), 50 kilometers (31 miles), 100 kilometers (62 miles), "region" and "worldwide." If at least 10 libraries are not found within the radius, the search expands out to the next radius."7 The Central Los Angeles Public Library, the institution associated with the OCLC institution code, is about 15 miles from Santa Monica with a 90071 ZIP code; that could explain the omission. Nevertheless LAPL branches are much closer to Santa Monica than Beverly Hills or El Segundo, and those branches will all have copies of The Da Vinci Code.
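The radius rule quoted from the FAQ can be sketched in Python to show why a library 15 miles (about 24 kilometers) away might drop out of the list. This is a rough model under stated assumptions, not OCLC's implementation: distances are supplied directly instead of being derived from postal codes, the function name `nearby_libraries` is invented, and because the FAQ gives no numeric definition for "region" or "worldwide," the sketch folds them into a single catch-all tier.

```python
# Numeric radii and the minimum count are taken from the FAQ passage quoted above.
RADII_KM = (20, 50, 100)
MIN_LIBRARIES = 10


def nearby_libraries(distances_km: dict) -> tuple:
    """Return (tier label, library names sorted nearest first).

    `distances_km` maps a library name to its distance from the user's
    postal code; computing that distance is outside this sketch.
    """
    ranked = sorted(distances_km, key=distances_km.get)
    for radius in RADII_KM:
        within = [name for name in ranked if distances_km[name] <= radius]
        if len(within) >= MIN_LIBRARIES:
            return f"{radius} km", within
    # Assumption: "region" and "worldwide" are merged into one fallback tier.
    return "region/worldwide", ranked


if __name__ == "__main__":
    # Ten hypothetical libraries within 20 km, plus one about 24 km away.
    libraries = {f"Library {i}": float(i) for i in range(1, 11)}
    libraries["Central Library"] = 24.0
    label, names = nearby_libraries(libraries)
    print(label, names)
```

With ten libraries already inside the 20-kilometer radius, the search never expands to 50 kilometers, so the hypothetical "Central Library" at 24 kilometers is left out even if its branches are closer than anything on the list.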

In another inconsistency, searching The Da Vinci Code in the 90024 ZIP code for West Los Angeles retrieves a Los Angeles Public Library record, but searching the same title for the 90025 ZIP code for West Los Angeles does not retrieve Los Angeles Public Library entries. Interestingly enough, the West Los Angeles Regional Branch of the Los Angeles Public Library is actually located in the 90025 ZIP code. Local users may know that the ZIP codes are adjacent and recognize the location of the Regional Branch. On the other hand, L.A. is a big county with lots of ZIP codes. I doubt explanations of the esoteric algorithm for linking ZIP codes and libraries will do much to relieve user frustration. McDonald assured me that OCLC has received feedback on this problem and is working on the ZIP code recognition program. I should mention that search results may also retrieve libraries that are nearby but are not open to everyone. If the searcher can't take out the item located at, for example, a local university, how satisfied will the user be? And how likely to try Open WorldCat a second time?

Search results can be inconsistent as well. The following search phrases usually worked: "find in library," "worldcat libraries," "worldcatlibraries," and "wcpa." (Most consumers would not use "wcpa" unless directed to do so, but a few might notice that it forms part of the http://www.worldcatlibraries.org/wcpa/ URL in all WorldCat records.) In an informal and unscientific test, I used all four approaches for a variety of materials in both Google and Yahoo!. I tried to search as a library user rather than as a librarian, so I sometimes avoided adding terms I thought would produce better results. Rather than use the prescribed search "find in a library: [title]," I shortened it to "find in library [title]." Most users have been repeatedly chastised by search engines for using common words like "a"; not many users will think to add a colon after "find in a library." I searched a variety of document formats, although I expected results would be less satisfactory because of the variety of ways libraries catalog periodicals and non-print materials. Still, users search for both print and non-print materials, so the project should encompass all formats. The generally successful searches for DVDs came as a pleasant surprise.

My Unscientific Tests

Here are the ground rules I followed when searching:

• I used the same 14 searches in Google and Yahoo!.

• I used the same Santa Monica Public Library ZIP code (90401).

• I checked that each title searched has a record in WorldCat and that the record lists more than 100 libraries holding the title.

• I checked that Santa Monica Public Library has a record for the item in WorldCat and that the record is one with more than 100 libraries listed.

• I used titles most likely to be purchased by many libraries (with two exceptions).

• I refreshed the screen between each search in each search engine.

• I looked at only the first page of the results.

I did search for two books that were not in the Santa Monica Library collection but recently had been requested by Santa Monica clients. Clients searching for more esoteric titles will find WorldCat's extensive listings extremely useful. These clients tend to be a bit more sophisticated about library collections. They usually know if their local library might not have such materials and will turn to a Web search.

For a complete breakdown of the 14 separate searches I did and to see how Google and Yahoo! results compared, go to this URL on the Information Today, Inc. Web site: https://www.infotoday.com/searcher/nov04.oneill.shtml. A caveat provided by the WorldCat Pilot page applies: "Please note that Web search-engine content is dynamic, so your results may vary."8

The search results were mostly satisfactory. The most glaring problem is the fact that the record(s) retrieved are not always those that show the holdings of the local library. Obviously, libraries enter records for various iterations of an item, and those items become different records. Searching Open WorldCat retrieves only a few records for an item, not all of them. OCLC recognizes this as a problem.

McDonald also introduced me to the search syntax mentioned above that retrieves WorldCat records for several different iterations of a title. In Google, search "site:worldcatlibraries.org [title]." Average Google users are unlikely to use this search syntax unless given specific directions, and it may retrieve more records than the user wishes to check. Searching The Da Vinci Code sample in Google as site:worldcatlibraries.org da vinci code produces two pages limited to WorldCat records that include three records for the book (one a Spanish translation); five records for the audio book; and five records for books about the book. Santa Monica Public Library is listed on the record for the Spanish language version and on two of the records for the book as a subject. A title search for David Halberstam's War in a Time of Peace as site:worldcatlibraries.org war in a time of peace retrieves about 185 WorldCat records that are variations on the title. Adding the author's name produces a perfect search. Searching site:worldcatlibraries.org war in a time of peace and halberstam retrieves only two records, both WorldCat records for the book. One of the records lists the Santa Monica Public Library as having the item.

According to McDonald, OCLC received such positive feedback on the pilot, originally scheduled to end in June, that it will extend it into the fall. However, she added that OCLC is already working on details for transitioning Open WorldCat from a pilot into a permanent membership benefit.

Chip Nilges, OCLC Director of Content Services, elaborated on Open WorldCat's future. OCLC's time frame for going into production is October/November. The pricing model will be part of a library's subscription to WorldCat on FirstSearch. Nilges explained that they consider it another way to access WorldCat: OCLC supports access via Z39.50, FirstSearch, and now a variety of open Web partners. OCLC intends to make members' collections visible and available to information seekers, from library portals and on the open Web. Does the pricing model mean that clients of libraries that subscribe to FirstSearch, but do not make it available to the public, will have access to Open WorldCat through the search engine partners? "If your library has subscription access to WorldCat on FirstSearch, its holdings will display in the search engine partners. We're treating this as a feature of FS WorldCat and will fund it through standard price increases for that service, just as we do other enhancements. Of course, all of this is new to us, as well, so we'll need to keep an eye on traffic and other expenses over time," says Nilges.

Hats Off

Overall, the Google searches were more successful than Yahoo! searches. The "site:worldcatlibraries.org" search retrieved only one WorldCat record in Yahoo!, and the "wcpa" search doesn't work there. But Yahoo! does retrieve WorldCat records when the appropriate search syntax is omitted. (See the "Life Without Open WorldCat" sidebar.)

In the long term, OCLC wants to increase the amount of content available; increase the number of partners by including other search engines, booksellers, and sites dedicated to books; enable interlibrary loan requests through remote user authentication; and develop new user statistics and configuration tools for libraries.

The Open WorldCat project might advance faster if the search engines would put more effort into dealing with non-Web content. Both Google9 and Yahoo!10 seem interested in opening up new content avenues, and both have the research and development staff to deal with mechanisms for better searching of original non-Web content. But neither seems willing to take one obvious step and open up a home page tab for library material. Such a tab might reach beyond OCLC to open access movement sources, government archive collections, bibliographic indexing and abstracting services, and so on.

Grumble as we may, OCLC's Open WorldCat Pilot has the potential to achieve its goals and more. It may not yet have earned a standing ovation for its performance, but let's give a rousing cheer for the initiative — a special "hats off" to Google and Yahoo! as our new library partners — and encourage OCLC to move from pilot to permanent.

Life Without Open WorldCat, Or "Real People" Searching

In the December 2, 2003, ResourceShelf, Gary Price posed this question: "Where will a typical Open WorldCat record appear on a results page based on an average user query (2.4 words)?"

Good question. I experimented in both Google and Yahoo! by searching some of the items I used to test the Open WorldCat pilot, but without using the suggested search syntax. (Caveat: These results are based on one search per item. Since search engine rankings change continually, results will probably vary from search to search.) A search for da vinci code located no WorldCat records in the first 20 pages of Google results. But wait! The same search in Yahoo! showed the WorldCat record as the fifth item on the first page of results. Searching atkins for life (a title suggested by the WorldCat pilot) produced no WorldCat record in 20 pages of Yahoo! results, and the Google search fared no better.

OK, how about a DVD search? Searching wild strawberries dvd in Google located no WorldCat records in 20 pages of results, but Yahoo! provided the WorldCat record as number 68 on the fourth page of results. A patient searcher might get that far ... maybe. The DVD for Rear Window searched as rear window dvd turned up as a WorldCat record in Yahoo! on page nine as record number 173. No luck in 20 pages of Google results.

Two journals searched as new england journal of medicine and architectural digest produced the same mixed results. Yahoo! returned the WorldCat record for New England Journal of Medicine as the 31st item on page two of the results, but Google returned no WorldCat record in 20 pages of results. The Architectural Digest search found no WorldCat record in 20 pages of Google results, but Yahoo! returned the WorldCat record as number 68 on page four of the results.

It's not impossible to locate a WorldCat record in Yahoo! with "an average user query (2.4 words)," but most searchers will simply not persist past the first three pages of results, if that. What we want are search results that pop to the top on every search even when the user doesn't use a special syntax.

Now when were Google and Yahoo! Search going to put up that "LIBRARY" tab on their home pages again?

 

 


Footnotes

1 Open WorldCat Pilot: Using WorldCat to increase the visibility of libraries on the Web [http://www.oclc.org/worldcat/pilot/].

2 Ibid.

3 Price, Gary, "Web Search — Yahoo!," NewsBreaks: Two Million Open Worldcat Records Hit the Yahoo! Database, ResourceShelf, Wednesday, July 7, 2004 [http://www.resourceshelf.com/2004_07_01_resourceshelf_archive.html].

4 Quick facts about the Open WorldCat pilot [http://www.oclc.org/worldcat/pilot/facts/default.htm].

5 How the Open WorldCat pilot works [http://www.oclc.org/worldcat/pilot/how/default.htm].

6 Quint, Barbara, "Yahoo! Search Joins OCLC Open WorldCat Project," InfoToday Newsbreaks, July 6, 2004 [https://www.infotoday.com/newsbreaks/nb040706-2.shtml].

7 Open WorldCat Pilot: Frequently Asked Questions [http://www.oclc.org/worldcat/pilot/faq/default.htm#link11].

8 Open WorldCat Pilot: How It Works [http://www.oclc.org/worldcat/pilot/how/default.htm].

9 Zeitchik, Steven, "Google looks to add book content," Publishers Weekly, November 2003, p. 3. InfoTrac OneFile Gale Group Databases. Santa Monica Public Lib., CA 11 August 2004 [http://www.infotrac.galegroup.com].

10 Webb, Cynthia, "Yahoo! Search will roll out Content Acquisition Program," The America's Intelligence Wire, 3 March 2004. InfoTrac OneFile Gale Group Databases. Santa Monica Public Lib., CA 11 August 2004 [http://www.infotrac.galegroup.com].

 

 


 
