Using Qualitative Assessment Protocols to Evaluate a New Library Discovery Service
by Sarah Hartman-Caverly, Crystal Knapp, Michael Lamagna, and Rachel Perzchowski
Located in the Learning Commons, Delaware County Community College’s library services unit supports the research and pedagogical needs of 9,000 full-time equivalent students and 141 full-time and 725 adjunct/part-time faculty. Library services provides access to electronic resources via approximately 60 databases and the library online public access catalog (OPAC). We adopted EBSCO Discovery Service (EDS) in September 2012, implemented and tested it in December 2012, and deployed a beta release in June 2013.
One of the questions that we faced during implementation was how to qualitatively evaluate the usefulness of the discovery service relative to the library’s OPAC and individual databases. While usability testing of web-scale discovery systems is well documented,¹ we were interested in evaluating the accuracy, comprehensiveness, and usefulness of the discovery service by comparing its results sets to those produced by our OPAC and by the native interfaces of its constituent databases.
To meet these assessment goals, the electronic resources manager developed four testing protocols: Basic Search, Advanced Search, Catalog Search Comparison, and Database Search Comparison. Each protocol provides step-by-step instructions for librarians and Learning Commons information desk staff to follow while executing searches in the discovery service, applying limits and filters, and evaluating the relevance and appropriateness of the results. The Catalog Search Comparison and Database Search Comparison protocols were designed specifically to provide a basis for comparison between the results from the discovery service and the results from the native OPAC or database search interfaces.
This approach to assessment leveraged the expertise of librarians as subject specialists with deep knowledge of the research needs of our patrons, as well as the experience of the information desk staff in providing triage reference services. The use of testing protocols enabled us to involve many stakeholders and to scale up the testing process in order to produce a large sample of results in a short time frame. Finally, the use of these four testing protocols provided an organized and standardized format to evaluate the new discovery service.
Basic Search and Advanced Search Protocols
The Basic Search and Advanced Search protocols were used to determine whether the discovery service was producing and ranking results as expected. The Basic Search protocol documents which index the participant used for the initial search: Keyword, Title, or Author. The Advanced Search protocol requires participants to use a Boolean search string for the initial search. Beyond these initial search options, the protocols are the same. As for the evaluation of search results, the protocols require an examination of only the first two pages of results, based on the assumption that most end users will not investigate beyond the second page.
First, the number of results of each source type (academic journal, book, reference, etc.) was recorded, and the participants were asked to note whether this distribution of results was relevant to the search. The next two steps tested the search limiters that patrons use most often: Full Text and Catalog Only. After applying the Catalog Only limit, participants confirmed that the link to the catalog item resolved correctly to the respective item record. Lastly, participants applied two limits or filters of their choice to the original search and noted any discrepancies in the results.
The information desk staff and reference librarians participated in the Basic Search and Advanced Search protocols. Staff at the information desk provide triage reference services by assisting patrons with known-item searching, so they used the Basic Search protocol. Reference librarians handle a wide variety of queries, so they used both the Basic Search and Advanced Search protocols. It was beneficial to have both the staff and librarians complete the Basic Search and Advanced Search protocols, since these groups have different but complementary experiences assisting patrons. The staff and librarians are familiar with the subjects and materials that are most commonly requested by our patrons, and they are familiar with patrons’ search behaviors. This familiarity allowed participants to evaluate the discovery service results based on expectations from experience with these search techniques. Participating in testing also gave staff and librarians the opportunity to familiarize themselves with the new interface before assisting patrons. Through their testing, we were able to ensure that the results for common searches would be relevant and accurate.
Catalog Search Comparison Protocol
The Catalog Search Comparison protocol was designed to evaluate the results of the Catalog Only search limiter of the discovery service and compare the search results to those from the OPAC’s native interface. The Catalog Search Comparison protocol was adapted to account for differences in the search options between EDS and VTLS’ OPAC, Chamo. For both interfaces, participants were asked to record their search term(s) and note whether they were using the Advanced Search or the Basic Search option. Participants were also asked to note whether they applied any search field limits, such as Title or Author, and to note whether they selected a Boolean option to “find all terms” or “find any terms.”
In examining the search results screens, participants were asked to note the total number of search results that each interface produced and to compare the results and the relevancy rankings for the first two pages of results. For any discrepancies between the search result sets, participants noted the item titles and the media types. Additionally, they were asked to comment on the metadata shown on the discovery service search results page, including item location names, link text, URLs, item availability, and other descriptors.
Reference librarians, the library technology specialist, and the electronic resources manager all participated in the Catalog Search Comparison protocol. As with the other search protocols, this particular protocol served to familiarize the reference librarians with the discovery system and to help them become adept at knowing when to use it versus the native catalog in their instruction and reference interactions. For the electronic resources manager and library technology specialist, this protocol served as a quality control test to ensure that the catalog data was accurately represented in the final discovery service.
Application of the Catalog Search Comparison protocol quickly brought to light significant problems with the catalog dataset within the discovery service. Because of the testing done via this search protocol, the electronic resources manager was able to identify and resolve these issues before the full release of the service. Additionally, working through the Catalog Search Comparison protocol gave librarians and staff confidence that the relevancy ranking of the search results in the discovery service was comparable or superior to that of the native catalog interface.
Database Search Comparison Protocol
The final protocol was designed to determine whether the new discovery service generates and ranks results as expected compared to directly searching the native databases that it indexes. As part of the Database Search Comparison protocol, librarians limit searches in the discovery service to a specific database and then perform the same search in the native database interface. Ultimately, this was to test the abstracting and indexing (A&I) function of the discovery service when displaying results from existing EBSCO databases and content provided from other e-resource vendors.
Participants performed searches through the discovery service, noting whether they were using the Basic Search or the Advanced Search interface. They were also asked to note if they ran a Keyword, Title, or Author search. Finally, the complete search string was noted to ensure reproducibility of the test. Once the search was performed, participants selected an individual database from the content provider limit found in the discovery service interface. The same search was then performed in the native database, again noting whether the search was performed through the Basic Search or Advanced Search interface. The total number of results for the database-limited search through the discovery service was compared to the number of results from the native database interface. After searches were performed in the discovery service interface and in the native database interface, participants compared the first two pages of results to note similarities in resource titles, source types, and relevancy rankings. The purpose of this portion of the protocol was to determine whether patrons would receive different search results or relevancy rankings based on which interface they were using to find sources.
The next portion of this protocol asked participants to note the title and resource type, as well as the presence or absence of a link to the full-text e-resource for results that appeared only in the discovery service. The final section of this testing protocol asked for librarians to examine detailed database records for some results to identify missing or confusing metadata fields in the discovery service. Librarians were invited to note any additional observations based on their experiences searching the discovery service and the native database interface.
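Participants made these comparisons by hand. Purely as an illustration of the logic, the comparison of the first two pages of results could be sketched in Python as follows. The `Result` class, the `compare_first_pages` function, and the title normalization are hypothetical and were not part of the protocols; matching on normalized titles is an assumption that would miss records whose titles differ between interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One entry from a results page (hypothetical structure)."""
    title: str
    source_type: str
    has_full_text_link: bool = False


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivially different titles match."""
    return " ".join(title.lower().split())


def compare_first_pages(discovery: list, native: list) -> dict:
    """Compare two ranked result lists (e.g., the first two pages of each).

    Returns the titles found in both lists, the results unique to each
    interface, and the rank shift of each shared title. A positive shift
    means the title appears later in the native list than in the
    discovery service list.
    """
    d_titles = [normalize(r.title) for r in discovery]
    n_titles = [normalize(r.title) for r in native]
    d_set, n_set = set(d_titles), set(n_titles)

    shared = [t for t in d_titles if t in n_set]
    rank_shift = {t: n_titles.index(t) - d_titles.index(t) for t in shared}

    return {
        "shared": shared,
        "only_in_discovery": [r for r in discovery if normalize(r.title) not in n_set],
        "only_in_native": [r for r in native if normalize(r.title) not in d_set],
        "rank_shift": rank_shift,
    }
```

The `only_in_discovery` list corresponds to the results for which the protocol asks participants to record the title, the resource type, and whether a full-text link is present.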
Librarians were selected to participate in the Database Search Comparison protocol for a number of reasons. Because of their experiences working with patrons at the information desk and through library instruction, librarians have an understanding of how students are using the existing databases, as well as what topics students are researching as part of course assignments. Librarians also understand the various search strategies that patrons use when searching for information and what interface features are needed to support those strategies. Finally, librarians understand the importance of A&I, as well as metadata issues that librarians could encounter when searching in the discovery service versus the native database interfaces.
Results and Discussion
A total of 65 test protocols were completed during a 4-week period: 14 Advanced Searches, 20 Basic Searches, 17 Catalog Search Comparisons, and 14 Database Search Comparisons. Six librarians, three adjunct librarians, four information desk staff, the library technology specialist, and the electronic resources manager completed protocols for a total of 15 participants. The majority of the completed protocols documented positive search experiences with the discovery service, in which relevant results sets were returned for the test search. Some Basic Search and Advanced Search protocols revealed issues related to the relevancy ranking of search results within the discovery service, particularly when search terms were too general to yield a highly relevant set of results from the broad scope of content which it indexes. Applying the Full Text limiter revealed some coverage metadata and linking problems: Some results linked only to abstracts; some did not link directly to the article; and some linked to the journal, but they required a login or subscription. Librarians also identified instances in which the distribution of source type results was skewed and instances in which a certain source type (for instance, a reference entry) was anticipated but absent from the results set. These discoveries led to considerations as to which information needs are best met by the discovery service and when it might still be appropriate to search the library OPAC or individual databases directly. Applying limits and filters in the discovery service to refine the original search string provided opportunities for librarians to reflect on whether these tools influenced the results set as intended; furthermore, experimenting with limits and filters gave librarians insight into the effectiveness of different search strategies in the discovery service, which will inform their reference and instruction work.
The most valuable discovery from the Catalog Search Comparison protocols was the fact that only 75% of our MARC records were successfully ingested and indexed in the discovery service; prior to testing, we received no indication that there was a problem with the catalog file ingestion process. Through trial and error, EBSCO was able to identify a character compatibility problem in our MARCXML file that terminated the catalog data ingestion process. Once this character was corrected in the MARCXML file, the entire file was processed, and all subsequent update files were ingested and indexed in full. Additionally, several item records that are intentionally masked in the native OPAC interface were displaying as results in the discovery service because they had been inadvertently included in the MARCXML export. Comments from the Catalog Search Comparison protocols revealed that participants expected to find item information that was not included with the initial data import. The Catalog Search Comparison protocol also uncovered a numeric shelf location code that was not translated into its “human readable” form in the discovery service, as well as some other usability concerns. These issues with the catalog data export, ingestion, and metadata display were subsequently corrected. There were also positive discoveries; for instance, the availability display of catalog items in the discovery service was tested and found to update accurately within minutes of a circulation transaction.
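We do not identify here which character caused the ingestion failure, so the following is only a sketch of one kind of check that could be run on a MARCXML export before it is sent to a vendor. It assumes, hypothetically, that the offending character falls outside the set permitted by the XML 1.0 specification; the function name `find_invalid_xml_chars` is illustrative and is not part of any EBSCO or VTLS tool.

```python
import re

# Matches any character NOT permitted by the XML 1.0 "Char" production:
# tab, line feed, carriage return, and the ranges U+0020-U+D7FF,
# U+E000-U+FFFD, and U+10000-U+10FFFF are allowed; everything else is not.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text: str) -> list:
    """Return (line_number, column, code_point) for each disallowed character.

    Lines are split on "\n" only; str.splitlines() would treat some control
    characters as line breaks and hide them from the scan.
    """
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for match in _INVALID_XML_CHARS.finditer(line):
            code_point = f"U+{ord(match.group()):04X}"
            hits.append((line_no, match.start() + 1, code_point))
    return hits
```

A scan like this only locates suspect characters. It would not catch encoding mismatches or characters that are valid XML but unsupported by a particular ingestion process.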
The Database Search Comparison protocol produced interesting results. While the EDS and native EBSCO database interfaces produced consistent results in the majority of tests, there was one case in which there was a notable difference between the two search result sets. A search for “post-traumatic stress disorder” limited to full-text articles from 2008 to 2013 from the Psychology and Behavioral Sciences Collection produced 1,608 results through the discovery service, but it produced only 1,580 results through the native database search interface for that collection. Both searches were conducted through the Advanced Search screen. The librarian did not note any difference between the first two pages of results from the two interfaces. This test result was an outlier among the comparisons with native EBSCO database interfaces.
Comparisons between EDS and the native interfaces of non-EBSCO content providers produced a range of observations. While EBSCO notes its data-sharing partnerships and agreements with other content providers, there appeared to be some issues when comparing the search results. Particularly of note was the difference between the number of search results in EDS and the number in the native Credo Reference interface. For instance, a search on the term “Guantanamo” produced 25 results through the discovery service and 298 through the native Credo Reference interface. This difference in search result numbers was found consistently with Credo Reference.
The Database Search Comparison protocols shed additional light on circumstances in which searching an individual database provides more useful results than those produced by the discovery service. These include instances in which an individual database provides specialized search tools (for example, the “Any author is nurse” limit in EBSCO’s Cumulative Index to Nursing and Allied Health Literature; CINAHL) and instances in which a search term (for instance, the author name “John Green”) is general enough to produce volumes of irrelevant results in an interdisciplinary environment such as the discovery service but produces a relevant results set in a subject-specific database. Librarians can use this assessment of the discovery service alongside the traditional research tools of the library OPAC and article databases in order to develop pedagogical strategies for demonstrating the discovery service in information literacy classes. The assessment will also be useful for helping students decide what research tool is best suited to their information needs in reference transactions.
Testing Protocols as a Solution for Qualitative Discovery Service Assessment
The testing protocols provided a meaningful way to engage subject librarians and information desk staff in testing and evaluating the discovery service implementation. Librarians appreciated the balance struck in the protocols between requiring a structured, step-by-step search path and having the freedom to select their own keywords, limits, and filters. Asking librarians to evaluate the search results sets for relevance, noting both false positives (irrelevant results) and false negatives (results that they expected but that were not present in the set), proved to be an effective way to harness their subject expertise, deep knowledge of our electronic resource collection, and knowledge of patrons’ research habits and information needs. The experience they gained using the discovery service as a result of the testing protocols will inform the ways in which they introduce this tool to students in the classroom and in reference interactions. The protocols have also become an unintended pedagogical tool: one librarian is considering adapting the Advanced Search protocol into an in-class activity for information literacy sessions.
The testing protocols were not only useful for identifying problems with our discovery service implementation, but they were also useful for reporting those problems to the vendor. The protocols provided a standard, structured format for recording the search experience with EDS, which allowed the electronic resources manager to identify at what point problems arose in the search process. Even the simple fact that the completed protocols were dated provided useful information for data forensics, as the vendor could use the protocol date to determine the discovery service system state (what databases were enabled for indexing and what version of the catalog export was indexed, etc.) at the time a given problem manifested. The protocols could also be used to test and verify solutions, since the electronic resources manager could recreate a search experience step by step and look for the desired changes to the interface or search results set.
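Our protocols were completed as forms, not as software. As a rough sketch of why a standard, dated format helps with this kind of troubleshooting, the fields described above could be modeled as a simple record. The `ProtocolRecord` class, its field names, and the two helper functions are hypothetical; only the kinds of information they hold (protocol type, date, search interface, search string, limits, result counts, comments) come from the protocols themselves.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class ProtocolRecord:
    """One completed testing protocol (hypothetical field names)."""
    protocol: str                 # e.g., "Basic Search", "Catalog Search Comparison"
    completed_on: date            # lets the vendor match a problem to a system state
    participant_role: str         # e.g., "librarian", "information desk staff"
    interface: str                # "Basic" or "Advanced"
    search_string: str            # recorded in full so the test can be reproduced
    limiters: list = field(default_factory=list)
    total_results: Optional[int] = None
    notes: str = ""


def tally_by_protocol(records: list) -> Counter:
    """Count completed protocols of each type."""
    return Counter(r.protocol for r in records)


def records_in_window(records: list, start: date, end: date) -> list:
    """Return the protocols completed between start and end, inclusive."""
    return [r for r in records if start <= r.completed_on <= end]
```

Because every record carries the same fields, results from many participants can be pooled, and a problem report can be narrowed to the protocols completed while a particular catalog export or database configuration was in place.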
Structured testing protocols are useful in coordinating the qualitative assessment of a newly implemented web-scale library discovery service. Standardizing data collection enabled the electronic resources manager to involve more stakeholders in the testing and evaluating process without compromising the analysis of the results; this, in turn, generated a greater volume of useful testing data. While test results generally showed that the discovery service produced relevant search results sets, the protocols did uncover problems with catalog data indexing and display, as well as with discrepancies between the discovery service search results and those generated by searching the native databases. The protocols became a useful tool for discussing these problems with the vendor and testing solutions as they were implemented. The testing experience was also invaluable in providing librarians and information desk staff with insights into how they can incorporate the discovery service into their interactions with patrons. The use of structured test protocols is an effective method for involving a diverse array of stakeholders in qualitative assessment of library discovery systems.
1. See, for instance:
Comeaux, D. J. (2012). “Usability Testing of a Web-Scale Discovery System at an Academic Library.” College & Undergraduate Libraries, 19(2–4), 189–206. DOI: 10.1080/10691316.2012.695671
Fagan, J. Condit. (2012). “Usability Test Results for a Discovery Tool in an Academic Library.” Information Technology and Libraries, 31(1), 83–112.
Gallaway, T., and Hines, M. (2012). “Competitive Usability and the Catalogue: A Process for Justification and Selection of a Next-Generation Catalogue or Web-Scale Discovery System.” Library Trends, 61(1), 173–185.
Hessel, H., and Fransen, J. (2012). “Resource Discovery: Comparative Survey Results on Two Catalog Interfaces.” Information Technology and Libraries, 31(2), 21–44.
Majors, R. (2012). “Comparative User Experiences of Next-Generation Catalogue Interfaces.” Library Trends, 61(1), 186–207.
Williams, S. C., and Foster, A. K. (2011). “Promise Fulfilled? An EBSCO Discovery Service Usability Study.” Journal of Web Librarianship, 5(3), 179–198. DOI: 10.1080/19322909.2011.597590