The Cloud Catalog: One Catalog to Serve Them All

As a whole, public libraries are the single largest supplier of books in the U.S. No single other outlet can compete with public libraries—not Amazon, not Barnes & Noble, not Walmart or Costco, not all your local bookstores. But you’d never know it to look at us on the web. Type Kate Atkinson’s recent book A God in Ruins (or virtually any other title you want) into Google, for example, and records for Amazon and Barnes & Noble pop right up within the first page of results, along with hits on the author’s and publisher’s websites and dozens of reviews. But although most public libraries carry this book, no library site is anywhere to be found among the first pages of results. For the average reader looking for this title, the library never even shows up as an option, much less the best option, for getting the book at the best price.

Our websites don’t measure up very well. While the new book “discovery” sites such as Goodreads and others attract millions of readers each month, even our largest public libraries fail to attract a fraction of that traffic. For example, while Goodreads ranks 67th among the most visited websites in the U.S., with 21.4 million unique visitors per month from the U.S. and 47.6 million from the world as a whole, OCLC’s WorldCat—our largest collective catalog and perhaps the closest thing we have to Goodreads—was ranked 3,748 of all websites in the U.S. and attracted just 487,884 visitors in April 2015, which is less than 3% of the traffic going to Goodreads. (See Figure 1 on page 41.)

So how do institutions that supply nearly half of all the books read in the U.S. end up so invisible on the web? Well, it’s not from a lack of customers, that’s for sure. According to the IMLS (Institute of Museum and Library Services), collectively, public libraries in the United States had 170,911,488 registered members in 2012, the most recent data available. That number is more than half the total population of the U.S. and almost six times the number of members on Goodreads. Although the IMLS does not keep data on “virtual library visits,” we can wager that the vast majority of those registered borrowers go to the library catalog—either from home or in the library—to find books, check out books, and place holds on titles in circulation.

Those are pretty impressive stats, and if all of those people were searching one library catalog, it would be enough to make libraries the largest book site on the web—by a wide margin. The problem is, of course, that all of these people are not searching one catalog; they are all searching for many of the same books in thousands of different catalogs, each maintained separately—and at great expense—by the more than 9,000 public libraries in the U.S. Worse still, all these library catalogs are embedded in library automation systems completely isolated from the web search engines. The catalogs all use an archaic MARC record format developed a half-century ago in the 1960s, a format totally unsuited for modern web technologies. Little wonder, then, that libraries are so invisible on the web, even though we serve a much larger group of readers than anybody else in the book market.

The solution is obvious—ditch those 9,000 old, outmoded library catalogs and funnel all of our readers through one great catalog built on the web. If we could get everybody to participate, such a catalog could provide information on the more than 891 million books and other materials held by public libraries in the United States alone.

Building a brand new catalog on the web would give us an opportunity to start afresh. It would allow us to take all of the records and great information about books and authors we librarians and the book industry have been building on and around our catalogs for the past couple of hundred years and combine it with the rich content, social, and delivery functions on the internet to create the most comprehensive source of book information and discovery on the web. Bigger than Goodreads, bigger than Amazon, bigger than anything that currently exists, it should become the reader’s first choice when looking for a book or information about a book, when sharing a book with a friend or reading group, or when looking to get hold of a book, whether by borrowing, buying, or free downloading.

The Specs

So what will this brand new catalog/book discovery service look like? Well, first of all, it will need a name, so let’s call it the Cloud Catalog, at least for the duration of this article. If something better comes up, we can always change it later.

Secondly, while we certainly know enough to spec out a beginning set of features, the Cloud Catalog would always be under development, just like everything else on the web. If we really could get 170 million people searching in one place, we wouldn’t just have a catalog—we’d have a community—actually more like a metropolis—of people interested in books, for which all kinds of services could be developed. We could easily incorporate new sources of content and functions as they became available. The Cloud Catalog would be a dynamic resource capable of changing and developing and adding new content and services as it grew.

That said, there is a fundamental set of characteristics the Catalog must include in order to succeed.

It Has to Be One Catalog

The Cloud Catalog must bring together readers currently searching in thousands of separate libraries and funnel them into a single site. The Catalog would contain records for each library’s holdings, and readers could access the Catalog off their library’s website, just as they do today. But instead of being directed to the local ILS, they would go to the Cloud Catalog shared by all participating libraries. Even though the Catalog would include hundreds of millions of records and be shared by millions of patrons, the default view would show only the local library’s holdings, plus possibly the hundred thousand public domain downloadable books now available. Readers would not have to search through the whole of it just to find out what was on the shelf at their local library or what their computers could reach in seconds. However, if they didn’t find what they wanted locally, they could easily expand their search to the entire catalog.
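To make the default-view idea concrete, here is a minimal sketch of how a search might be scoped to the reader's home library and then widened on request. Everything in it is an assumption for illustration: the Record fields, the library codes, and the search function are hypothetical, not part of any existing system.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One shared catalog record (hypothetical structure)."""
    title: str
    # Hypothetical codes for the libraries that hold a copy.
    held_by: set[str] = field(default_factory=set)
    # True when a free public domain download exists.
    public_domain_download: bool = False


def search(records, query, home_library, expand=False):
    """Match titles; show the local view unless the reader expands the search."""
    q = query.lower()
    hits = [r for r in records if q in r.title.lower()]
    if expand:
        # The reader asked to see the entire shared catalog.
        return hits
    # Default view: local holdings plus public domain downloads.
    return [
        r for r in hits
        if home_library in r.held_by or r.public_domain_download
    ]


catalog = [
    Record("A God in Ruins", {"orlando", "seattle"}),
    Record("Adventures of Huckleberry Finn", {"seattle"}, public_domain_download=True),
    Record("Night at the Fiestas", {"boston"}),
]

local_hits = search(catalog, "night", "orlando")
all_hits = search(catalog, "night", "orlando", expand=True)
```

In this sketch the Orlando reader sees nothing for "night" in the local view, but finds the Boston copy after expanding the search.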

Aggregating readers onto a single website is the fundamental concept behind the Cloud Catalog. By pulling together our readers and having them search in a single location, libraries become the largest book site on the web, garnering all of the power and opportunities that kind of an audience can provide. Without it, our patrons are dissipated over thousands of different websites, and libraries remain invisible on the web.

It Doesn’t Have to Serve Them All, But It Must Serve a Whole Lot of Them

It would be nice to launch the Cloud Catalog with all 9,046 public libraries and 170 million library members in the U.S., but it would be unrealistic to count on that kind of participation. All we would really need to get started is enough readers to attract attention and make sure the social functions worked well. Goodreads is doing just that with fewer than 15 million users, or less than 10% of all library members in the U.S.

It Must Work Like a Catalog Overlay or Discovery Layer

We wouldn’t want to replace 9,000 library management systems, much less destroy the library automation market, so the Cloud Catalog would be built as a “discovery layer.” It would use the “catalog overlay” technology already tested and proven by BiblioCommons, Encore, EDS, and others to link into the underlying local library management systems to get availability and status information. It would allow patrons to place holds, request items from other libraries, and perform other actions requiring the involvement of the local library system.

It Must Be Fully Customizable

Librarians should be able to set their own “look and feel,” turn on and off search functions and other features, and otherwise configure the thing to work the way they and their patrons want. However, if the customization includes content enhancements of value to readers everywhere (reviews, bibliographic links, etc.), the Cloud Catalog would support sharing with the entire user community.

It Can’t Just Be About Library Books

It must be a one-stop shop for books. Besides information on all titles held by public and academic libraries, it would carry information on all forthcoming books, information on commercially available titles not held by libraries, and information on electronic and audio formats available free or for sale. In short, the Cloud Catalog should be the one source readers go to first when they want to find a book, regardless of who has it, what its format is, or whether it is in-print, out-of-print, or not yet published.

It Must Allow Readers to Get Books Any Way They Want

If the user’s library owned a title also available commercially, the patron would be offered the option to buy as well as borrow. If the library didn’t have a copy, the patron would be offered the option to request it from another library, recommend that the library purchase it, or purchase it for himself. The purchase option should allow the reader to choose from a variety of vendors, including Amazon, Barnes & Noble, and independents. Finally, if an ebook version of a title were available, the Catalog should link to it and allow the reader to download it, whether from the library’s own collection or another repository such as the HathiTrust or Project Gutenberg or even an offering from a commercial supplier.
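The rules in the paragraph above can be sketched as a small decision function. This is only an illustration under simple assumptions: the three yes/no inputs and the option labels are hypothetical, and a real service would also have to pick vendors and repositories.

```python
def fulfillment_options(owned_locally, for_sale, ebook_available):
    """List the ways a reader could get a title, in the order offered."""
    options = []
    if owned_locally:
        options.append("borrow")
    else:
        # No local copy: look to other libraries or suggest a purchase.
        options.append("request from another library")
        options.append("recommend for purchase")
    if for_sale:
        # Buying is offered whether or not the library owns a copy.
        options.append("buy")
    if ebook_available:
        options.append("download ebook")
    return options
```

For a title the library owns that is also sold commercially, the function offers both borrowing and buying, which is the behavior the paragraph describes.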

It Must Be Social

The Cloud Catalog would feature all the social elements made popular by the new “book discovery” sites. Patrons could review titles with reviews aggregated across all the libraries using the Catalog. Because those reviews would draw from a population of 170 million people or more, you can rest assured that books in the Cloud Catalog would be heavily reviewed. Readers could catalog their own collections and create shelves to group their titles into whatever categories they liked—To Be Read, Have Read, Not Worth My While, etc. Readers could share those shelves and other details of their reading interests with friends and others they permitted to “follow” them. Readers could choose to export some or all of this information to other social media sites. The Cloud Catalog would have all of the software needed for online book discussion groups, whether managed by interested readers or facilitated by librarians. Last, but not least, the huge audiences in the Cloud Catalog would be a strong magnet for authors wanting to promote their books and connect with their readers through live author chats, free giveaways, and other communication and promotional channels pioneered by commercial book sites.

It Must Be Easy to Use

The Cloud Catalog could be divided into various views, each showing a different section of the bibliographic universe, depending on what the reader wanted. Readers might start out with a view of their local library holdings—the default view linked from the library website, but also linking to formats and versions available elsewhere. For example, a search on the library’s Huckleberry Finn would also provide links to versions and formats available elsewhere, such as ebook formats available from the HathiTrust, Project Gutenberg, Google Books, etc. An ebook tab would let users see an ebook view of the catalog with titles available from their libraries, plus those available for free from the HathiTrust and Project Gutenberg, as well as commercial titles available from Amazon, Google, and Apple. If a search in local library holdings doesn’t turn up a title, the user just clicks on a tab at the top of the search results to see what is available in other branches, or on another tab to see what is available from other libraries in the consortium or other libraries in the world. Click on the Forthcoming Books tab to get a view of what is due to be published in the next 3–6 months with the option for patrons to tell their librarians what they would like the library to order.

It Must Link the Libraries’ Physical Collections With the Rest of the Book World

Current library catalogs provide information on the titles the library has, but tell us nothing about related items we don’t own. The Cloud Catalog would include information on almost everything available, whether the library had it or not. Using a traditional library catalog, the library would actually have to own all the titles in a series for the patron to find them all, but with the Cloud Catalog, all the patron would have to do is scan the ISBN on the book (or type in an author/title search) and the record for the title would pop up. Then the user would simply click on the Series link to go to a complete list of all the titles in the series, including forthcoming titles not yet published.

Likewise, if a patron discovered an author, she would click on the Author link to bring up a brief biography along with a complete bibliography of everything the author had written. The Cloud Catalog could also help people find different editions and formats of titles the library might not have, e.g., public domain downloadable titles. The same principle would apply to finding books on related subjects or finding books on recommended lists or award-winning titles.

It Must Have One—And Only One—Good Bibliographic Record Per Title

The Cloud Catalog would include one record per title, and all the libraries owning a copy of that book would share that record. When records were improved or enhanced, the improved version would be immediately available to everybody. The efforts of thousands of library staff managing records in local systems would be redirected to the Cloud Catalog, where improvements made would benefit everyone. We could “FRBRize” the catalog by creating records for works that link various editions and formats. We could link the data in our records to other relevant information on the web. In a catalog of this size, we could build and maintain sophisticated authority records critical for helping users distinguish one author, place, or subject from another. Libraries could customize the Cloud Catalog records if they chose by adding content that would only appear in their view of the Catalog.

None of These Records Would Be in MARC

Unlike current catalog technologies built on MARC records and tied to local library automation systems, the Cloud Catalog would be built on the web using record structures such as BIBFRAME or Schema or others that are easily discoverable by search engines and with data structures that allow for easy linking with related information scattered all over the web.
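As a rough illustration of what a web-friendly record could look like, the sketch below builds a small description of a book using the Schema.org Book type and serializes it as JSON-LD, a form search engines can read. The helper function is hypothetical, and the handful of properties shown (name, author, inLanguage) is a deliberately tiny subset of what a real record would carry.

```python
import json


def to_json_ld(title, author_name, language="en"):
    """Serialize a minimal Schema.org Book description as JSON-LD."""
    record = {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": title,
        "author": {"@type": "Person", "name": author_name},
        "inLanguage": language,
    }
    return json.dumps(record, indent=2)


print(to_json_ld("A God in Ruins", "Kate Atkinson"))
```

Because the author is its own typed entity rather than a text string, the same structure could later point to a shared author record, which is the kind of linking a MARC record tied to a local system does not make easy.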

It Must Support Many Different Types of Records

The Cloud Catalog would follow the lead of Goodreads, Amazon, and other web-scale catalogs and include comprehensive records for series, authors, fictional characters, records for books set in particular locations, and a variety of other lists and book data that would better help our readers find what they sought—regardless of whether their libraries had them or not—and help them to see how those titles fit into the wider bibliographic world.

It Must Allow Librarians to Curate Records

Who will create and maintain all these wonderful new records and book data in the Cloud Catalog? We would. One great thing about a catalog that would serve all the public libraries in the U.S. is that not only would you get our 170,000,000 readers, you’d also get the 47,000 librarians and 90,000 other staff who work in public libraries across the U.S. to help take care of it. That’s more than 137,000 educated, dedicated, and experienced people … most of whom love books, and many of whom would like nothing better than to curate series, author, or other book data in areas that interested them. Wikipedia and others have already developed the infrastructure that would allow librarians and library staff to collectively curate a catalog. The people at Goodreads and Open Library have already applied that technology to large-scale book catalogs. So the people are there, the mechanism is there. We only need to create some standard templates for the various types of records to ensure some consistency across the results and hire a few editors to watch over our work. We would finally have a place where we could showcase the knowledge and value of librarians and library staff on a very grand scale. And since there’s nothing more dedicated than a dedicated reader, as Goodreads proves, we might include patrons in our curating process.

It Must Allow Us to Simplify Acquisitions and Cataloging

Acquisitions librarians would search the catalog for titles they wanted to purchase. When a librarian found a record for a book she wanted to buy, she would simply click a button on the record to indicate that the library was purchasing a copy, and the record would then automatically show up as “on order” in the library’s view of the Catalog. No cataloging required. Meanwhile, the Catalog would automatically forward the order to the library’s vendor of choice or allow the order to be downloaded into the acquisitions system in the library’s ILS. When the book arrived, a small stub record would be automatically downloaded into the local ILS so the library could circulate it. A process that now requires several staff to find, copy, upload, and modify records, along with a variety of other convoluted procedures, could be reduced to one click of a button by the acquisitions librarian.
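The one-click workflow above might be sketched as follows. This is a simplified model under stated assumptions: the LibraryView class, its field names, and the status labels are hypothetical, and the vendor queue and stub list stand in for real connections to a book vendor and a local ILS.

```python
from dataclasses import dataclass, field


@dataclass
class LibraryView:
    """One library's view of the shared catalog (hypothetical model)."""
    library: str
    status: dict[str, str] = field(default_factory=dict)   # record id -> status
    vendor_queue: list[str] = field(default_factory=list)  # orders awaiting transmission
    ils_stubs: list[dict] = field(default_factory=list)    # stub records sent to the ILS

    def order(self, record_id):
        """One click: mark the shared record as on order and queue the order."""
        self.status[record_id] = "on order"
        self.vendor_queue.append(record_id)

    def receive(self, record_id, title):
        """On arrival: send a small stub record to the ILS so the item can circulate."""
        self.status[record_id] = "available"
        self.ils_stubs.append({"id": record_id, "title": title})


view = LibraryView("seattle")
view.order("rec-001")
status_after_order = view.status["rec-001"]
view.receive("rec-001", "A Season in Taos")
```

Note that nothing here creates or edits a bibliographic record; the library only attaches its own status to a record that already exists in the shared catalog.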

In Time, It Should Be Global

It would be a waste to develop something the size and scope of the Cloud Catalog just to use in the United States when readers all over the world have the same needs and the same problems. So the Cloud Catalog should be available in multiple languages and support multiple national bibliographies. The staffs of libraries in different countries would need to work to build out their sections of the Catalog, but they all could build on the same platform, and readers could extend their searches across the world.

It Might Not Be as Difficult as It Sounds

All of this sounds a little daunting, doesn’t it? But remember what Daniel Burnham, the famous Chicago architect, said: “Make no little plans; they have no magic to stir men’s blood … ” And clearly, this is no little plan. But, in truth, nearly every element of the Cloud Catalog has already been developed and tested and is in active use somewhere on the web.

OCLC, Google, Amazon/Goodreads, and Open Library have all developed massive book catalogs that could provide the bibliographic foundation for the Cloud Catalog. Books in Print, the major book wholesalers, and the publishers themselves can provide data on forthcoming books and titles libraries don’t have, as well as supplemental content to flesh out library records. Curated author, series, subject, and other specialized records already exist in a number of catalogs, including Amazon, Goodreads, and Open Library. Google has developed algorithms that can, on-the-fly, pull together basic information on a number of well-known authors. Wikipedia provides the model to create and maintain records collectively.

The social elements of the catalog have been fully developed by Goodreads and many of its competitors, as well as by a number of vendors specializing in the library market, including BiblioCommons, ChiliFresh, and others. Web-compatible bibliographic record structures are being developed by the Library of Congress and Zepheira through their BIBFRAME initiative, and by OCLC using the Schema approach. Catalogs that co-locate multiple editions of a work are commonplace. OCLC is trying to do the same thing in WorldCat, not always as successfully—but then it has more records to deal with. Finally, many book sites offer readers multiple ways to get a title, including the options of buying the title from Amazon or any of 12 other online stores, downloading an electronic version for purchase or for free with a public domain book, or finding it in a library. OCLC also offers multiple fulfillment options, including purchasing or borrowing or requesting an interlibrary loan as part of its WorldShare Management System, as do other library system vendors.

The tools are all there on the library side as well. The catalog overlay or discovery layer technologies the Cloud Catalog would use to connect with local systems are available from several vendors including BiblioCommons, Ebsco, OCLC, ProQuest, and others. So are all the tools to allow the Cloud Catalog to “ingest” library catalog and holdings data and those that allow for live, real-time interaction between the Catalog and the underlying local library management systems. Electronic acquisitions systems that allow librarians to transmit orders directly to publishers and book vendors from the Cloud Catalog are available from a variety of library automation vendors—or could be created from scratch using standard electronic ordering protocols. The concept of using the discovery layer as a cataloging utility was pioneered by OCLC and has now been adopted by a number of other vendors, including Intota and Encore.

Some of the technologies and content we’d need—both on the retail and the library side—are freely available, linking our print records with their equivalent electronic editions. Book retailers would hardly complain if links in our records start sending our readers in their direction. Likewise, the concept of “librarian crowdsourced” author, series, subject, and other records is just an idea that we are free to borrow. A lot of other components of the Cloud Catalog fall into that category—concepts free for the taking that we would need to flesh out.

Some of what we would need, e.g., the book and holdings records, is proprietary, however, and owned by others, who will want compensation for its use. Rather than re-creating those millions of records from scratch—a task that makes cleaning the Augean Stables look like child’s play—a much better idea would be to match our records with those of a great book catalog already on the web such as Amazon/Goodreads, Google Books, or even OCLC’s WorldCat.

No matter which existing catalog we choose, the owner of the data will want something in return. Luckily, we have a great bargaining chip, namely the 170 million generally well-educated, book-buying, and content-consuming readers we could bring to the table. The demographic we represent is something nobody—not even Amazon or Google—can ignore. We could use it to develop partnerships and drive bargains advantageous to both libraries and our readers. And this may be the time to re-examine any stiff-backed “principles” that reject advertising revenues.

What’s in It for the Libraries?

Now, I hear some of you saying, “Nice idea, but how are you ever going to get 9,000 libraries to participate? And who’s going to pay for this thing anyway?” First, you would hope that many libraries would choose to participate simply because the new Cloud Catalog would give their readers a much more comprehensive, useful, and interactive tool than what they have now and because it has the potential to put libraries in the center of the book world on the web—where we truly belong. But the Cloud Catalog could also offer libraries real cost savings.

First, the Cloud Catalog would feature book covers and reviews and other content which libraries currently pay commercial vendors to add to their local catalogs. Secondly, libraries would no longer need to subscribe to a bibliographic utility or pay book vendors to deliver MARC records. Finally, and most importantly, the Cloud Catalog could eliminate much of the work—and the costs—of library technical services departments. Maintaining the local catalog would largely be a matter of keeping it in sync with the Cloud, most of which could happen automatically. For the few local titles that might not be in the Cloud Catalog, we’d have a record template that any staff member could use to add a book, after which paid editors would review it to make sure it conformed to standards. In fact, there would be almost no more need for library technical services departments and all their attendant costs. The Cloud Catalog would assume those functions. As for all the “librarian curation,” this would fall to the rest of the library staff. The technical management of the catalog and its records—of the type that takes place in library technical services departments—would be handled by a small number of Cloud Catalog managers and editors—at a substantial savings to local libraries.

And building and maintaining the Cloud Catalog might not cost that much. According to CrunchBase—a website that tracks such things—it took only $2.8 million to develop and launch Goodreads. If you divided that cost equally among the 9,082 public libraries in the United States, you’d get a total cost of $308.30 per library—a sum that even the smallest of us should be able to afford. Of course, the Cloud Catalog would require some complexity that Goodreads didn’t have to worry about, such as the discovery layer technology that would allow the Catalog to interact with the underlying library automation systems, an acquisitions system that would allow libraries to place orders with vendors, and a lot more records to manage. However, many of these components have already been developed for open source systems such as VuFind and Evergreen and might easily be incorporated. So, even with those additions, the Cloud Catalog should not be that much more expensive than Goodreads.
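The per-library figure is simple division, and it is worth seeing how it shifts if fewer libraries sign on. The short sketch below reproduces the calculation with the two numbers quoted above; the function name and the participation rates are illustrative assumptions, not projections.

```python
def cost_per_library(total_cost, libraries):
    """Split a one-time development cost evenly among participating libraries."""
    return total_cost / libraries


GOODREADS_LAUNCH_COST = 2_800_000  # dollars, the CrunchBase figure quoted above
US_PUBLIC_LIBRARIES = 9_082        # the library count quoted above

full_share = cost_per_library(GOODREADS_LAUNCH_COST, US_PUBLIC_LIBRARIES)
print(f"All libraries participate: ${full_share:,.2f} each")

# Hypothetical lower participation rates, for comparison only.
for rate in (0.5, 0.25, 0.1):
    participants = round(US_PUBLIC_LIBRARIES * rate)
    share = cost_per_library(GOODREADS_LAUNCH_COST, participants)
    print(f"{rate:.0%} participate ({participants:,} libraries): ${share:,.2f} each")
```

An even split is the simplest possible model. A real cost-sharing formula would more likely scale with library size or budget, which would lower the share for the smallest libraries further.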

There’s one other factor we need to consider in calculating costs. Unlike the traditional library catalog (which some have defined as a black hole that sucks money), the Cloud Catalog would have significant potential to generate revenue. With potentially more than half the population of the U.S. using the site, referral fees from book sales from the Catalog could be significant. The Cloud Catalog could also attract the same advertisers and publishers that use Goodreads to promote their titles now. While anti-advertising librarians would have the option to shut it off, with the audience size we are talking about, those revenues would be nothing to sneeze at.

Largely eliminating the costs of technical services departments, dumping bibliographic utilities, and cutting payments for book covers and reviews would probably give you enough savings to cover the cost of the Cloud Catalog, with plenty of dollars to spare. Add the potential revenues generated from sales and advertising, and libraries could actually look forward to making money off the Catalog, rather than the other way around. Of course, any library that wants to maintain its own catalog could load its data into the Cloud Catalog and still maintain the local catalog that came with its ILS.

Who’s Going to Do It?

Who precisely are the “we” I speak of as doing this job? The simple, unequivocal answer is we librarians and our libraries. Certainly, commercial firms can contribute to the project. We will need to buy some of their technology and services. We may consult with them and make use of their records. But in the long run, it must be we librarians who own it and control it on behalf of our patrons and ourselves. You can’t imagine Amazon outsourcing its catalog to some outside vendor, can you? Not a chance. And neither should we, despite the fact we’ve been doing exactly that for the past 50 years.

There are a lot of major players who might be interested in helping us to build the Cloud Catalog. The obvious first choice would be OCLC. It has already put together a massive catalog in WorldCat; it has exposed much of the data in that catalog to the web (although WorldCat records continue to be scarce or nonexistent in web search results); it has included records for forthcoming books in WorldCat; and it has developed linked data formats and other pieces we would need. Of course, there’s still a lot OCLC lacks too. Most importantly, it has not aggregated 170 million library users in a single place. Each library that uses WorldCat uses it as a local catalog, not as the collective resource envisioned by the Cloud Catalog. WorldCat has very limited social features—and very few take advantage of those that exist. It uses standard bibliographic records, meager when compared to Goodreads, Amazon, or other commercial catalogs. As a result of these and other factors, WorldCat is not well-used. So while OCLC can certainly bring a lot to the table, WorldCat would still need a lot of development work to get us where we need to be. However, if OCLC were interested, it already has a lot going for it.

Of course, it’s possible that OCLC might not want to play. It has its own particular interests, which it may decide do not lie in this direction. In that case, there are other options … options that would just involve additional work. Google, for example, has a huge catalog of all the library works it’s scanned (full text too), plus all the content it’s compiled on forthcoming and in-print titles for its Google Play catalog. And Google just might be interested in helping to develop a book catalog that could provide effective competition to Amazon and Goodreads.

Speaking of Amazon and Goodreads, the pair have developed a massive catalog that attracts millions of people every month. Goodreads has richly developed and heavily used social features that make the crowd curation of catalog records work effectively. Each of these partners would come to us with their own motives, so we would need to find ways to work with them, while preserving our own autonomy and the interests of our patrons.

There are also organizations that could be potential funding partners. One good potential prospect could be the big publishers themselves. The group would clearly be interested in anything that might provide significant competition for Amazon/Goodreads. Some have spent a significant amount of time and money developing their own offerings with little success. So, perhaps they would be interested in a Cloud Catalog that has the potential to direct millions of readers to a variety of retailers to buy their books—and not just one whose name begins with A. There are a wide variety of other potential funders. Successful authors such as James Patterson who would like to see a little diversity in the marketplace have been making grants to independent bookstores and others to help achieve that mission. Then there are the more traditional funders including IMLS and other government agencies, as well as the major foundations.

Finally, remember that we don’t necessarily need to ask for much here. Building the Cloud Catalog does not require a huge organization or gobs of resources. All it took was a handful of people and a few million dollars to get Goodreads off the ground. We can do the same, if we need to.

We have the people, we have the resources, we have the tools, and we have the money. It’s time we picked them up and started building something with them. What are we waiting for?

The views expressed herein are those of the author alone and do not necessarily reflect those of LSSI or the libraries it manages.

Living in a Cloud Catalog World

If libraries could pull this off, we would be looking at a very different literary landscape than the one we know today—in the not too distant future. Consider the following scenarios:

A woman from Orlando might log on to see what her friend in Boston is reading (the two met in one of the Cloud Catalog’s online book clubs) and discover a title by a brand new author, Kirstin Valdez Quade, called Night at the Fiestas. The Catalog shows her library has it, but there are already 15 holds on it, so she clicks on the Buy It link and downloads the Kindle edition for $2.99. Her library gets a percentage of the sale.
A teen in Arcadia, N.Y., is browsing books in the fiction section of her library and discovers Crown of Midnight by Sarah Maas. She scans the bar code with her phone to bring up the Cloud Catalog record for the title. It is highly rated, with more than 500 reviews, and she also learns it is the second book in the Throne of Glass series. She clicks on a link to bring up a series record jointly maintained by three YA librarians in Wisconsin, Ohio, and Tennessee. She wants to start from the beginning—Throne of Glass. Luckily her library has it, so she puts it on hold. Next she wants to see what else the author has written, so she clicks on a link to bring up a Cloud Catalog author record maintained by a couple of librarians in California. There she finds not only a bibliography of everything Maas has written but also a note that she will be coming out with a new series later this year called A Court of Thorns and Roses. She clicks to follow the author, so she’ll be notified when the title is published—regardless of whether her library decides to purchase it.
A flamenco guitarist from Des Moines, Iowa, is looking for a book on early Spanish music called The Music of Ancient Arabia and Spain, Being la Musica de las Cantigas written in the 1920s by Julian Ribera. He looks in the Cloud Catalog and finds a reprint of an edition translated by Eleanor Hague and Marion Leffingwell that he would like to get his hands on. The Cloud Catalog shows several libraries have it, but it is also available for sale from a variety of sellers. He thinks he will want to add this one to his personal library, so he clicks to purchase a copy, and the library gets a cut of the purchase. He then clicks on Ribera’s name to bring up his Cloud Catalog author record. There he finds a brief biography of Ribera along with a complete bibliography of his works maintained by a couple of music librarians at UCLA. He finds several other titles of interest.
An acquisitions librarian in Seattle logs in to begin her morning work. She’s just finished a self-published title called A Season in Taos by a brand new author, Kilsby Knowles. Unlike many of these titles, this one was actually pretty good, so she adds a brief review to the Cloud Catalog’s The Best from Indie Authors—a specially curated list of self-published titles maintained by thousands of librarians across the country. Next she begins to work through the 35 titles on her acquisitions list. She finds detailed records for all the titles in the Cloud Catalog and automatically “catalogs” them as she clicks to order each. When she’s finished, her order is electronically transmitted to the vendor of her choice. Meanwhile, the records for the titles she has selected now show up as “on order” in her library’s view of the catalog, and a brief MARC record is transmitted to the underlying ILS to allow for holds and circulation. As a result of this automation, her library has largely closed its technical services department since going over to the Cloud Catalog. The four staff who used to work there have been shifted over to public service to help handle the increasing demand for customer service that has been occurring since the Cloud Catalog made the library the “go-to” place for books.

These are just a few of the things you could do with a catalog that listed most of the books in the world and served more than 170 million readers. There are thousands of other ways to use it. We have everything we need to implement it, and the only thing we have to lose is our current “also-ran” standing in the world of books.

What Could Go Wrong? Getting It Right

One criticism sure to arise is that if we made a catalog that showed our patrons all the books available in the U.S. (and perhaps beyond), they would just want us to get them, and the resulting ILL costs would bankrupt libraries.

First off, we make a mistake if we think of this thing as “just a big library catalog” and of libraries as the only possible suppliers. To truly serve the needs of our users and compete effectively with commercial interests, this catalog must be a comprehensive book discovery and fulfillment service that allows our patrons to find any book that exists and get it in any way they want. Borrowing it from a library should just be one possible way of getting it. The “Borrow it from your library” button need not be available for all titles, and the “Buy it” button would not show up on titles not commercially available. And, of course, libraries concerned with the cost of ILL could impose fees for that service, just as many do today. If our catalog has helped a patron find the book he wants and provided all the options for getting it at the best possible price, then we have added value and done our job—regardless of whether that particular title was available to borrow free from the library or not.
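For readers who think in code, the button logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields, the function name, and the interlibrary-loan label are hypothetical, invented here and not drawn from any existing ILS or discovery product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TitleRecord:
    """Hypothetical Cloud Catalog record; every field name is an assumption."""
    title: str
    held_by_patron_library: bool      # the patron's own library owns a copy
    held_by_other_libraries: bool     # obtainable only through ILL
    commercially_available: bool      # at least one seller lists the title
    ill_fee: Optional[float] = None   # fee the library charges for ILL, if any


def fulfillment_options(record: TitleRecord, library_offers_ill: bool = True) -> List[str]:
    """Return the button labels to show a patron for one title."""
    options = []
    if record.held_by_patron_library:
        options.append("Borrow it from your library")
    elif record.held_by_other_libraries and library_offers_ill:
        # A library worried about ILL costs can attach a fee or switch ILL off.
        label = "Request it through interlibrary loan"
        if record.ill_fee:
            label += f" (${record.ill_fee:.2f} fee)"
        options.append(label)
    if record.commercially_available:
        # "Buy it" is hidden for titles that no seller carries.
        options.append("Buy it")
    return options


if __name__ == "__main__":
    rare = TitleRecord("An out-of-print title", False, True, True, ill_fee=5.0)
    print(fulfillment_options(rare))
```

The point of the sketch is that borrowing is just one branch among several; each library decides for itself whether, and at what price, the interlibrary loan branch is offered.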

Secondly, during the past few years, Amazon and others in the online marketplace have managed to significantly reduce the cost of getting many books. An ARL/RLG ILL cost study back in 1993 put the total cost of an ILL transaction at an ARL library at $29.55, which is about what it costs to buy a new trade title today. And it often took several weeks for the material to arrive. However, if you look on Amazon and other commercial suppliers today, it is not uncommon to find books for as little as $0.01 plus a couple of bucks for shipping … and the title often arrives in as little as 48 hours.

Libraries in Northern California looked at the disparity between ILL costs and delivery times and commercial availability, and asked, “Why bother with ILL?” So they created a program called Zip Books, where libraries actually purchase books on Amazon and have them shipped directly to the patron’s home address in place of traditional ILL. When done reading the book, the patron returns the item to the library. At that point the library decides whether to add it to its collection or give it to the Friends book sale. A more detailed description of the program is available at califa.org/wp-content/uploads/2013/10/Zip-Books-Kickoff-webinar.pdf. This is just one example of how libraries can take advantage of new commercial developments to help reduce costs.

Finally, a catalog that showed all the titles a library had on order, like the one described here, would have the potential to significantly reduce the library’s acquisition expense for popular titles by sharing some costs with the patrons. For example, back in the early 2000s, Riverside County Library piloted a program called FirstReads. Librarians would put up a short list of popular titles they were interested in acquiring several months in advance of publication. For a small, tax-deductible donation (which was $8 in Riverside, but could be anything the library wanted), a patron could sign up to read the book first. As with Zip Books, the library supplier would ship the book directly to the patron’s address when it was published, and the patron would return it to the library once done reading it. The patrons would get the book for much less than they would have paid Amazon or a bookstore, still get to read it when it first came out, and get a tax deduction for their donation. Meanwhile, the library would get a book it was going to order anyway with some or all of its purchase price covered by the patron’s donation. The FirstReads pilot in Riverside ultimately failed, but only because librarians found it too difficult to keep the special FirstReads selection lists updated with new material. In the Cloud Catalog, however, all library acquisitions would automatically show up as “on order” and become eligible for a FirstReads or FirstReads-like program without the librarians having to do anything. This would make the logistics much simpler and the chances of success much greater.

Programs such as Zip Books and FirstReads would be optional and are only two examples of the many kinds of products and services that could be developed in a catalog (or rather, a platform) of the size and scope we envision here. But the key takeaway is that, far from bankrupting libraries, having one catalog to serve us all might offer us ways to reduce some of our expenses and actually increase our materials budgets, which we desperately need.

Getting There

A number of existing products and services have already accomplished large pieces of what has been laid out here or are trying to achieve some of the same objectives through slightly different means.

Trove

trove.nla.gov.au

Truly an Australian National Bibliography and then some, Trove contains nearly 434 million items, including 20 million books held by Australian libraries, digitized newspapers, journal articles, diaries, pictures, and even archived websites. People using the website are directed to catalogs of libraries that hold the item or can have material brought up directly if it is available online. Book records offer borrow, buy (from multiple suppliers), or access online options. Trove has instituted crowd curation of some bibliographic records, and the public is invited to help clean up copies of its digitized newspapers. It even has some limited social features. All of this was developed by the National Library of Australia (Are you listening, LC?) with a lot of help from participating libraries and other content suppliers. Trove is heavily used: in 2015 it was averaging more than 1.6 million visits per month.

What it lacks: Trove’s website is a great supplement to a library catalog but does not replace one. As a result, searching is a two-step process, with patrons first conducting a search in Trove and then linking into the library catalog to see if an item is available and to request it. Record curation functions are still pretty rudimentary. It does not offer the acquisitions and cataloging functions envisioned for the Cloud Catalog. It does not aggregate library users on a single platform the way Goodreads and others have done and the Cloud Catalog would do, so the audience is not as large as it could be, but with 1.6 million visits per month, Trove is not doing badly.

WorldCat and Related Products

worldcat.org

The largest union catalog in the world includes records for more than 347 million works with more than 2.2 billion holdings from 72,000 libraries all around the globe (oclc.org/worldcat/catalog.en.html). Though it includes some social features, they are not well used. Holdings updates are not dynamic, so WorldCat does not always provide a true picture of what titles are available where. Though it does not include curated records, OCLC itself does a lot to try to enhance the records. Records do include a borrow or buy function from multiple vendors and will link to an ebook version when available. OCLC has built several complementary products around WorldCat, including WorldShare Management Services (its ILS), which includes records for forthcoming titles that can be used to catalog material much as we have envisioned for the Cloud Catalog. OCLC also has WorldCat Discovery Services, which uses WorldCat as the discovery layer, with direct real-time links to the subscribing library’s ILS. Finally, WorldCat has exposed its data to web search engines for a number of years now, most recently using the schema.org vocabulary. Unfortunately, it is still rare to see a WorldCat record or any library record anywhere within the first pages of search results.

What it lacks: Although OCLC calls WorldCat “the World’s Largest Library Catalog,” it actually functions more like “the world’s largest referral service.” Patrons who find their way to WorldCat must then click on another link to see if their library actually has the item and if it’s on the shelf. Links between library holdings and WorldCat are often incomplete and out of date, because not all libraries submit their holdings, and those that are included are not updated on a real-time basis. Finally, there is the lack of usage. OCLC’s idea has been to expose library bibliographic records to the web in the hopes that the search engines would find the records and put “find in a library” links on major search engines and book sites such as Google, Yahoo, Goodreads, etc. This would then drive traffic to WorldCat and thence to the libraries themselves. As the web traffic chart shows, the strategy does not seem to have worked out. Goodreads, with a much smaller catalog, attracts 100 times the traffic. It may be time for OCLC to rethink its strategy.

Libhub

libhub.org

Libhub is a project of Zepheira, a company working with LC to roll out the new BibFrame data model to replace MARC. The Libhub initiative aims to increase the visibility of library data on the web by publishing library bibliographic records in the BibFrame format and then aggregating them at the Libhub site on the web where search engines (and, hopefully, library patrons) can find them. The theory is that search engines like things with lots of links in them and the dense cross-linking in BibFrame records will attract search engines and cause those library records to rank near the top in search engine results.

What it lacks: While Libhub may work fine for unique items where the library is the only place that holds the item, it does not work so well for more generic searches or for searches on more common titles. Just publishing bibliographic records to the web may not increase library visibility, especially when you consider that OCLC has published WorldCat data to the web for a number of years now with similarly unimpressive results. And anyway, all you get is a link back to a record in the library’s catalog. You don’t get any of the richness of the interaction you find on popular book sites such as Goodreads, any of the cost-saving advantages of doing acquisitions and cataloging from a well-curated union catalog such as that described here, or any of the options for getting a book by buying, borrowing, or downloading it that are available in Trove, WorldCat, and elsewhere.

Commercial Discovery Services

A number of vendors have developed discovery services or catalog overlays where search and other catalog functions are conducted in a separate online catalog or discovery layer. These services were originally developed for academic libraries to allow them to better integrate their article databases and other e-resources with their physical collections. More recently, BiblioCommons has adapted the model for public libraries with the intent of improving the patron search experience by adding social features, ebook integration, and other features not supported by the traditional ILS. All discovery services maintain real-time links to the library’s underlying ILS to display current data on availability and to allow patrons to place holds, download ebooks, manage their accounts, and perform other routine ILS functions. Unlike Trove and WorldCat, where patrons must first search a union catalog and then click a link to (maybe) find the item in their library, discovery services are fully integrated with the local ILS, and patrons have no idea they are actually working with two different systems. This is the same model we suggest for the Cloud Catalog. Some vendors have also begun to add acquisitions, license management, and cataloging functions to their discovery layers.

What they lack: Although these discovery services collectively serve millions of readers, none have taken the opportunity to aggregate those users across all of the libraries they serve. The exception is BiblioCommons, which does aggregate patron reviews, tags, and some social features across all its libraries, but in general, discovery services treat each library using them as its own silo with its own records and its own patrons. Also, because each library works with its own set of records, you don’t get the efficiencies you could get from sharing bibliographic records, as in the Cloud Catalog, nor the opportunity for librarians to collectively curate those records. Finally, each of these discovery services is owned and run by a commercial concern and subject to all the vicissitudes of commercial concerns. They can always be bought out, go bankrupt, or simply decide to go in a different direction, and there’s little we libraries can do about it. So while the commercial sector has produced some wonderful tools we will need, the catalog itself is something we must own ourselves.