Searchers like to find things. Librarians and information professionals may be unique in that they enjoy the searching process, not just the finding bit. As Roy Tennant wrote in Library Journal in 2001, paraphrasing Herb White, “Only librarians like to search; everyone else likes to find” (roytennant.com/column/?fetch=data/77.xml). But most librarians and information professionals do like to find the perfect citation, book, statistic, or other piece of information for their clients and themselves.
But what happens when we don’t find anything, when we get zero results, a null set? We can be so focused on finding one or more perfect answers that it is easy to forget that, for some searches, an empty result set is exactly what the user wants, or that “no results” can lead to a more refined question. There are several strategic approaches to the null set, including checking the search statement, evaluating the database scope, and accepting zero results as success.
One common cause of the null set is an error in search terms. A simple misspelling can lead to zero results. Many library databases do not include the autocorrect feature common to web search engines. The first potential error to check, therefore, is misspelling. Google and other search engines are teaching us that, as long as we enter something close to the common query, we will get either autosuggestions or even “corrected” search results. That conditioning has had an impact on searchers of other databases.
With autosuggest, as a searcher types, matching queries appear that can be chosen or at least can help correct spelling. The best implemented autosuggestions, at least on retail sites, will give matching suggestions based on available inventory. With more general web searches, the suggestions can range from exactly what is needed to ones that are humorously wrong. Sometimes autosuggestions can even be distractions, causing the searcher to browse results for a different topic. However, while the autosuggest can make it easy to fix search query errors for popular searches, for precise searches, autosuggestion can lead to an inaccurate result set as it tries to find a more popular search query than the one entered.
While autosuggest helps with accurate spelling or phrasing, autocorrect takes it a step further. Web search engines will autocorrect common misspellings in a query and post a note at the top that they are “Showing results for” the corrected version while giving an option (in a smaller font) to search it as entered. Most of the time, these autocorrections are shown with the message at the top but sometimes are done automatically with no notice to the searcher, especially on a mobile device.
Given the way Google and Bing autocorrect and autosuggest, when searching a database with those features, a savvy searcher should be aware of when these features need to be turned off or disabled. Expecting zero or few results but getting more? Be sure to check if some unintended autocorrection, synonymization, or grammatical variant results were returned.
Try searching a simple typo such as interperonal spacing at Google, Bing, Yahoo, or even DuckDuckGo. The result sets all default to the correction of interpersonal spacing with a notification at the top of the results about the correction. Trying the same search in Web of Science, EBSCOhost Academic Search Complete, and ProQuest Dissertations & Theses Global results in zero or one result (which includes the typo). All three give the suggestion of the correct spelling but not the autocorrected results.
When a library database does give some matches, like a number of library discovery tools do for this search, it can be especially confusing for patrons and librarians alike. Running interperonal spacing in a Primo discovery instance gives a handful of results, many of which have the correct interpersonal spacing spelling visible in the results. These are displayed because the mistaken spelling occurs somewhere in the record, but that spot is not visible in the brief results. Nor is any recommended spelling correction displayed for this search. An unwary searcher could mistakenly assume that the displayed results are the entirety of matches available rather than the several hundred matches for a correctly spelled query. Having made that mistake a few times, I tend to remind myself and my students to be sure to double-check query spelling and query words more closely when using library databases.
When searching library databases that lack the helpful autosuggestions or autocorrect, use search engines to help find the correct spelling and popular usage. Just open a new tab or window, go to Google, and start entering the query word. Then copy and paste the correct autosuggestion back into the library database search box. Of course not all the autosuggestions are correct, but I tend to find better and more accurate suggestions from the web search engines than from the library databases that do have autosuggest. Especially for names of people, products, and organizations, the web suggestions can be more accurate because the engines are looking at the most common usage on websites, blogs, and social media sites and from officially designated accounts.
Suggested search terms on library databases draw on a much more limited set than the web search behemoths do. So entering the search hary potter in a Primo search box results in the suggestion “Did you mean: mary potter?” That could be a helpful suggestion for someone looking for information on the painter, the founder of a religious order, or any of the other people with that name, but it does not lead to the popular books about Harry Potter.
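The effect of a limited suggestion vocabulary can be sketched in a few lines of Python. This is only an illustration built on the standard library's difflib module. The two vocabularies and the suggest function are hypothetical, and nothing here describes how Primo or any web search engine actually generates its suggestions.

```python
from difflib import get_close_matches

# Hypothetical vocabularies: what a small catalog might know about versus
# what a web-scale query log might know about. Neither is real data.
SMALL_CATALOG = ["mary potter", "beatrix potter", "dennis potter"]
WEB_SCALE = SMALL_CATALOG + ["harry potter"]


def suggest(query, vocabulary):
    """Return the closest known phrase to the query, or None if nothing is close."""
    matches = get_close_matches(query.lower(), vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else None


# Same typo, different vocabularies, different "Did you mean" suggestions.
print(suggest("hary potter", SMALL_CATALOG))
print(suggest("hary potter", WEB_SCALE))
```

With the smaller vocabulary, the closest available phrase is mary potter. Once harry potter is among the known phrases, it becomes the better match. The suggestion can only ever be as good as the set of terms it draws on.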
EBSCOhost gives the expected suggestion on the hary potter search, but it has its own problems with suggestions. The interperonal spacing search entered into the EBSCOhost Mental Measurements Yearbook database suggests the correct interpersonal spacing query. Unfortunately, in that database, it results in zero hits as a phrase, even if you search the correct spelling in full text. Here, the lack of synonymization can hurt, since searching synonyms such as personal space or proxemic does give some results on the topic. So zero results should also inspire consideration of different search terms in addition to checking for spelling issues.
The more accustomed searchers become to entering approximate spellings and still getting accurate and useful results from web searches, the more surprising it can be when another database fails to do so. As we get ever more used to the way Google delivers results and handles queries, searchers should be increasingly wary about search assumptions when using library and other databases that work differently.
Google continues to use the vast size of its databases and the billions of searches per day to study searcher behavior. Google tweaks the results algorithms so that they better match the searcher’s intent without requiring users to choose the correction. Other smaller databases usually do not have the resources to do that. Whether or not the other databases claim to be “like Google” (often meaning only that they have a single search box), expect the actual search processing to be quite different. In any search, an empty results set should encourage a searcher to double-check the query spelling and to also consider the scope of the database itself.
Scoping the Scope
The general perspective about Google also impacts the evaluation of the scope of search results. Many think that Google knows all and searches all information. Consequently, for those who make that assumption, searching elsewhere is not needed. Information professionals know that every database has its limitations, even Google. Understanding and evaluating what content may not be covered and how indexing practices vary can explain a null search results set.
Many databases try to be comprehensive within their scopes. Thus library discovery services aim to search entire library collections. Google aims to index all non-spam web content. Since neither Google nor discovery search engines are comprehensive, consider how the actual database scope has impacted the zero results set.
Assuming the query was entered correctly, searchers can assess whether the database scope even includes the content sought. Sometimes it obviously does not. Trying to find an article from a citation and accidentally clicking on a database such as IBISWorld or Counseling and Psychotherapy Transcripts? You should get a null set for a result since the database scope does not include article content. Searching for a novel in Knovel? Good luck. Trying to find chemical reactions in ERIC? Unlikely.
Yet even beyond the obvious scoping for the correct type of content within a search system, a null set of results can lead to questioning other aspects of scope. When using advanced search features such as controlled vocabulary and other fielded searching, a null set could be a sign that the indexing practice has changed. Searching PubMed for the diagnosis of Hereditary Breast and Ovarian Cancer Syndrome as a major MeSH heading gives no results prior to 2012. Digging into MeSH shows that the heading was added only in 2012, which is why no results show up earlier. Then a follow-up text search for "Hereditary Breast and Ovarian Cancer Syndrome" OR "hboc syndrome" can be used to find the earlier articles. In this case, a search that brings back no results, or fewer than expected, can suggest that the database structure may not support the specific search statement. Or, as in the example above, the language may have changed across time.
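The two-step PubMed strategy can be written out as explicit search statements. The short Python sketch below only assembles query strings and does not contact PubMed. The function names are hypothetical, the 1900 start year is an arbitrary placeholder, and the choice of the [tiab] (title/abstract) tag for the follow-up text search is an assumption, since other text fields would also work.

```python
def major_mesh_query(heading, start_year=None, end_year=None):
    """Build a PubMed query for a heading as a MeSH Major Topic, optionally date-limited."""
    query = f'"{heading}"[Majr]'
    if start_year is not None and end_year is not None:
        query += f" AND {start_year}:{end_year}[dp]"
    return query


def text_fallback_query(phrases, start_year, end_year):
    """Build a title/abstract phrase search for the years before the heading existed."""
    ored = " OR ".join(f'"{phrase}"[tiab]' for phrase in phrases)
    return f"({ored}) AND {start_year}:{end_year}[dp]"


heading = "Hereditary Breast and Ovarian Cancer Syndrome"

# Step 1: the controlled vocabulary search, which finds nothing before 2012.
print(major_mesh_query(heading))

# Step 2: a free-text search limited to the years the heading does not cover.
print(text_fallback_query([heading, "hboc syndrome"], 1900, 2011))
```

Keeping the two statements separate makes it clear which results come from indexing and which come from the authors' own wording.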
Beyond field searching syntax changes and the resulting limitations, another aspect of database scope these days is the extent to which the information content is indexed. Google typically indexes the full text of webpages. Google Scholar usually indexes the full text of the articles and patents in its database.
Library databases such as PubMed, PsycInfo, and Compendex index only the metadata for articles (citations, abstracts, and index words) and do not include every word in the full text of the articles. Other databases may have only limited or no textual data beyond the metadata (video, image, and audio databases, for example). Knowing the scope of what textual data is indexed can help expand a search query to get beyond no results.
Web image and video databases often end up giving matches that may have only one or some of the search terms associated with the result. Try a search with two unrelated terms such as chopin flycatcher in Google Images and see if you do not find some images for only one term or the other (and many that seem completely unrelated to both). While a null set might be the expected (and even desired) result, the imprecision of image searching makes that difficult.
The Wanted Null
Certain types of search requests more obviously are hoping to end in zero results. Patent searching is a good example. When searching for prior art, ideally nothing else already patented will be found that exactly duplicates the new patent. In many areas of scholarship, researchers hope to find no examples of their exact research proposal. In a similar way, doctoral students aim to find a topic that no one else has researched and written up previously. The trademark searcher hopes to find no brand or similar trademark already in use. In all these cases, the aim is to try for a comprehensive targeted search and find nothing.
These types of searches are often followed up with other searches to find tangential information. The patent searcher seeks related patents to demonstrate how the prior art differs. The researchers and doctoral students search for similar studies for the literature review. So the null set is wanted, but it then becomes a starting point for further searching.
In other situations, the null set can help lead to a better question. For the student who wants to write a paper on the impact of climate change on a hyperlocal geographic region (and the requirements are to use peer-reviewed academic articles), a search with no results can lead to a discussion of how to broaden the topic to a more appropriate level for the assignment. The market researchers seeking information on the market size for a not-yet-released product should find no results for that product and can then be guided to searching for the market on related products.
In the library environment, when a known item title search gives zero results, it implies that it is not available in the local collection. Then the search pivots to finding where the document can be obtained, through interlibrary loan, purchase, or other means. The null set is a very useful result when it is accurately reporting lack of access.
Database producers often give the impression that offering some results is always better than none. That is what first led to the web search engines’ implementation of autocorrect. Other databases will give results by OR’ing the search terms when nothing is found with a default AND processing. Primo occasionally offers the message “Your initial search resulted in few or no results. The results below were found by expanding your search.”
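The AND-then-OR behavior is easy to picture with a toy example. The Python sketch below is a simplified illustration under stated assumptions, not the processing used by Primo or any other product. The three records and the search function are invented for the purpose.

```python
# Hypothetical three-record collection; the titles are invented for illustration.
RECORDS = {
    1: "chopin nocturnes for piano",
    2: "flycatcher migration in north america",
    3: "personal space and proxemics",
}


def search(query):
    """Return (matching record ids, expanded flag) for a whitespace-separated query."""
    terms = query.lower().split()
    if not terms:
        return [], False
    word_sets = {rid: set(text.split()) for rid, text in RECORDS.items()}
    # Default processing: every term must appear in the record (AND).
    strict = [rid for rid, words in word_sets.items() if all(t in words for t in terms)]
    if strict:
        return strict, False
    # Fallback: any single term is enough (OR). The flag records that the
    # search was expanded so the interface can say so.
    loose = [rid for rid, words in word_sets.items() if any(t in words for t in terms)]
    return loose, bool(loose)


print(search("personal space"))      # strict AND match, no expansion
print(search("chopin flycatcher"))   # no record has both terms, so OR kicks in
print(search("knovel"))              # a true null set
```

Returning the flag alongside the results is the important design choice here. Without it, a searcher who expected and wanted a null set has no way to tell that the displayed records match only part of the query.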
Perhaps we contribute to the negative sentiment around null sets with our own language. How often for a null set do we say that no results are a “failed search” or that the search “failed” to find any results? Maybe for some null sets, it is time to say that the search “succeeded” in finding no results.