
Magazines > Computers in Libraries > April 2009


Vol. 29 No. 4 — April 2009
Diving Into the Blue: A Look at Michigan’s Repositories
by Senovia Guevara

Libraries are constantly evolving, changing what users and the wider information environment expect of them. More than just being an authoritative source for research, expertise, and quality content in the form of books and journals, the information center is evolving into a more expansive role. New services that preserve and produce content are expanding the role of the library outside of the narrow, traditional community audience. With growing information technology awareness and expertise among library staff shaping new initiatives, libraries have extended their impact by creating institutional and collaborative online repository projects.

In the book Convergence and Collaboration of Campus Information Services, editors Brian L. Hawkins and Patricia Battin note that, “It has become clear that traditional notions of libraries and information technology organizations are no longer intellectually and economically sustainable. New interrelationships and organizational structures will be necessary to manage, finance and coordinate the choices and opportunities made possible by digital information resources.” Similarly, Abby Smith, who defined the likely role for a research library in the 21st century in a Council on Library and Information Resources report, says:

In its local role, the library will be optimized to meet the needs of its campus community. The library is likely to provide repository infrastructure for stewardship of university-based information assets. Most of those assets will support pedagogy, administration, student life, alumni affairs, and other things vital to the school. A much smaller portion of them will support research. Research will be a far more global phenomenon than local institutions can support on their own. In its networked role, the library will be able to support research and dissemination to the extent that it is tightly networked into the increasing cluster of inter-institutional collaborations that enable the creation and use of scholarly content. These collaborations will be key elements of research cyber infrastructure, an infrastructure that will be a research-and-dissemination platform. In the magic phrase of the digital era, it “will scale,” be ubiquitous, and support a variety of scholarly domains, from astronomy to nanobiology, archaeology to urban design. The next-generation research library must be firmly embedded in that infrastructure because that will be the platform to which scholars will gain access on their laptop library.  

There are many examples of libraries that have actively embraced the role that Hawkins, Battin, and Smith envisioned by focusing themselves on developing innovative institutional and collaborative repositories. The University of California’s eScholarship Repository is an excellent example of a repository that allows for the wide dissemination of quality material to users, spreading the impact of the library, the scholar, the work, and the institution itself.

The University of Michigan (UM) Library, where I serve as an information resources assistant, has two such major repositories in place—Deep Blue and HathiTrust. Each project is fairly new, in existence for only a few years. James Ottaviani is the librarian who works with the Deep Blue institutional repository, and Jeremy York works with HathiTrust. Both librarians were interviewed in order to learn more about the projects’ backgrounds and to understand their impact on the community.

A Look at Deep Blue

Deep Blue is a DSpace institutional repository that collects and stores content produced by the UM community. The project has been in existence for 5 years, with almost 3 years in production. The project began by incorporating other collections of content from various sources that were online, including material from the Digital Library Production Service. Deep Blue brought this variety of content under one umbrella. Currently, the material in the repository totals just under 45,000 entries, with much of the material coming from repeat depositors, and the repository averages one or two submissions daily. There are no limitations on what can be deposited in Deep Blue, as long as the item can be digitized and stored. But the library does provide a set of guidelines that is meant to diminish problems with the deposits.

An additional benefit of the repository to the Michigan community is that the library can act as an agent for the author, offering assistance with interpreting author contracts. Ottaviani noted that more than half of the authors don’t know if they have the right to deposit their work in Deep Blue because they don’t fully understand the author agreements. To assist them, the library has a copyright specialist and an on-staff attorney who can review contracts and inform the authors on whether their material can be deposited into the repository. In addition, the library can act as an agent for the author in contacting and asking the specific publisher whether the material can be submitted.

Ottaviani concluded by noting that, depending on the publisher that published the author’s material, the work may automatically show up in Deep Blue as well. This comes from university library agreements with publishers that empower the library to act as an agent for the authors in the UM community. Ottaviani stated that, “It is the goal to get works out and read by the public, and Deep Blue serves this purpose.” To work toward this goal, the library is building into its content contracts the right to deposit Michigan-authored works into Deep Blue and is working to secure rights to works published under more-restrictive agreements.

HathiTrust: The Elephant That Never Forgets

With an interest in digital access, Jeremy York’s first job out of UM’s library program was with the scholarly publishing office at the university as a web developer. The HathiTrust project was initiated about the same time as York’s employment; York was later hired to create the initial website for the project and to investigate new initiatives. Currently, he coordinates between library technology offices and other groups working with the project.

The precursor to HathiTrust was the Michigan Digitization Project (MDP), which started in 2003, serving as the foundation of what later would be known as HathiTrust. Because of the Google contract, the UM library was allowed to share Google materials with other libraries. In 2007, the Committee on Institutional Cooperation (CIC) made an agreement to start a digital repository, and HathiTrust was born. York emphasized that the HathiTrust repository is about preservation, with a commitment to preserving content over the long term. In fact, the name Hathi is derived from the Hindi word for elephant, a symbol of strength, permanence, memory, and stability, all characteristics at the heart of the preservation aspect of the trust.

York noted that the library “has the capability to enter 350,000 volumes a month into the repository, but that number has declined recently to approximately 60,000 volumes as it catches up with Google’s digitization process.” The library has already added content from the University of Wisconsin and will soon work with the University of California to add items from its collection. Although there is none as of yet, international partners are welcome to join the project in order to expand the collection and the impact of the repository.

When it comes to Google, York noted that the terms of the Google settlement will open up materials in HathiTrust to nonconsumptive research; such a large body of text has not been opened up to this sort of research before. When it comes to older books, it is hoped at some point that “digital copies in HathiTrust will be certified as preservation copies and multiple libraries will no longer need to hold copies of the same books.” HathiTrust can assist with duplication and storage problems since “libraries will be able to access copies of books through HathiTrust.” Libraries are already able to ingest bibliographic records from HathiTrust into their own cataloging records, further broadening access to these materials. The project is good for UM since, York noted, the university is one of the founding members of the trust; that puts Michigan in a position to lead the effort along with other partners. It also raises the profile of the university since the project has strong roots at the school.

When it comes to items scanned by Google, York noted that restrictions on the books limit the sharing of content. Although the library can share content with other libraries, it cannot provide the optical character recognition (OCR) text of the books to individuals, even if the content is in the public domain. If the texts had been self-scanned, some of the limitations would not be in place, and the library would be freer to share its content. However, York said that the partnership with Google has allowed a tremendous amount of material to be available to the public, and much more (including in-copyright works) will be available as part of the settlement. Libraries will be able to sign up for a subscription to view copyrighted material that has been digitized by Google partners, but a subscription won’t be needed to access public domain materials. (HathiTrust database rights information is available on the project’s website.)

HathiTrust is unique in the sheer volume of materials it contains, currently more than 2.5 million volumes from 25 partners. York noted that HathiTrust’s collection will also be distinct from other digital repositories since it is starting a Trustworthy Repositories Audit & Certification process (TRAC certification), distinguishing HathiTrust as a trusted repository. He added that as a preservation initiative, the project is distinguished because of the level of access provided to the content. Future plans for the trust include applying next-gen cataloging and making the site mobile-friendly. Services are currently available to users with disabilities to make the content accessible to them. As for other institutions, the library is working on an API that can be used to customize the project with features dictated by the institution, allowing the institution to lay its customization over the HathiTrust repository.

York revealed that there has been some disagreement among libraries concerning the subscription needed for content access from HathiTrust. He stated that, “Although there are indications that subscriptions will be reasonably priced, some have been wary of the agreement with Google, noting that there is no guarantee that the costs for subscriptions would be or stay low. The outcome of negotiations around subscription prices for Google partners and others is yet to be determined.” In conclusion, York noted that the HathiTrust repository is a library project for and by libraries. The breadth and scope of such a project would not have been conceivable earlier on, and the successful partnerships established with Google promote the overall mission of libraries—the widest access and use of materials possible.


In conclusion, online repositories that support the free distribution of scholarly works are important to strengthening a library’s relationships and furthering its impact in the digital world. As students and researchers choose to forgo the expertise of the on-site library for the online environment, the repository offers a chance for libraries to continue their mission by staying relevant and remaining a benefit to their community.

Senovia Guevara is an information resources assistant at the Hatcher Graduate Library at the University of Michigan.
