Computers in Libraries, Vol. 35 No. 3, April 2015

The California Light and Sound Collection: Preserving Our Media Heritage
by Richard P. Hulser

“We must save our audiovisual heritage before it is too late; analog recordings are threatened by fragile physical condition, format obsolescence and the lack of playback equipment.”
—Barclay Ogden, California Preservation Program

While most digital preservation and access work has focused on digitizing printed materials, an initiative is under way in California to capture audiovisual content and make it accessible in such a way that even libraries, museums, and archives with limited resources can participate.

The California Light and Sound collection is the outgrowth of the California Preservation Program’s California Audiovisual Preservation Project (CAVPP). The collection already contains many locally significant oral histories and amateur films that intimately document everyday life in 20th-century America.

The ultimate goal for the project is to provide public access to media content through the Internet Archive for teaching, research, and study. The task is being accomplished by partnering with libraries, museums, and archives throughout the state to build the digital collection.

Selection and Digitization

CAVPP plays the lead role in helping participating partner organizations conserve and preserve their audiovisual collections according to best practices for the archiving and preservation of moving image and sound formats. It also established a low-cost and practical workflow for helping partner organizations efficiently digitize key media artifacts. CAVPP coordinates all digitization activities with the vendor doing the digitization work and helps the participating institution throughout the process.

Identifying objects for digitization—The institutional partner first assesses its audiovisual collection to determine what items to nominate for digitization. For institutions with large collections, CAVPP recommends using a tool called CALIPR, designed by the California Preservation Program office expressly for institutions without experts on staff to assess the preservation needs of larger paper-based and audiovisual collections.

An institution doesn’t have to have a large number of recordings in its collection to participate. CAVPP wants to preserve locally important recordings (such as those found in the history rooms of public libraries or historical societies), so smaller organizations may preserve a handful of recordings at a time. Partners are able to nominate individual recordings or whole collections. In a round of nominations in 2014, partners were asked for up to 100 recordings. If all 100 cannot be funded in one submission round, the items can be nominated later as long as funding is available to continue the project. In my library’s most recent submission, we nominated and had accepted about 13 items (roughly a third of what we nominated in the first round).

Nominating process—After the collection has been assessed, the partner nominates recordings by creating records in CAVPP’s CONTENTdm (from OCLC) account, which CAVPP uses to manage all nominations. The nomination process requires that potential participants provide certain metadata (such as a main or supplied title, media type, name of the holding institution, date created, a copyright statement, and a statement as to why the recording is significant to California or local history).
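The required metadata listed above can be pictured as a simple record with a completeness check. This is a minimal sketch only: the class name, field names, and sample values are hypothetical and do not reflect CAVPP’s actual CONTENTdm schema.

```python
from dataclasses import dataclass, fields


@dataclass
class NominationRecord:
    """Hypothetical nomination record; field names are illustrative only."""
    title: str                   # main or supplied title
    media_type: str
    holding_institution: str
    date_created: str
    copyright_statement: str
    significance_statement: str  # why it matters to California or local history


def missing_fields(record):
    """Return the names of fields that are empty or whitespace-only."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Sample values are invented for illustration.
example = NominationRecord(
    title="Unknown",
    media_type="16mm film",
    holding_institution="Example Historical Society",
    date_created="Unknown",
    copyright_statement="",
    significance_statement="Documents a local parade.",
)

print(missing_fields(example))
```

In this sketch a placeholder such as "Unknown" counts as filled in, while a blank copyright statement is flagged for follow-up before the nomination is submitted.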

The copyright statement can be particularly challenging, especially if there is little or no documentation as to the origin or ownership of the item. A recording may have been donated to an institution, but if the documentation doesn’t clearly indicate that distribution rights were transferred to the institution or that the recording is in the public domain, the recording may not be eligible for posting to the Internet Archive. In that case, it wouldn’t be eligible for digitization under the terms of the grant.

Once CAVPP has reviewed and approved nominated items, the original recordings are sent to CAVPP for processing. CAVPP adds administrative metadata to the record and sends recordings to the vendor for digitization.

Digitization process—Based on current practices in the media preservation field and with input from participating partner archives, a list of technical specs was created as the default output format for various file types. Contributing partners can also request special additional output formats at a nominal cost.

The vendor performs several steps in the digitization process, including first photographing the medium and its container, and then inspecting, prepping, and transferring the recording according to the CAVPP specs. Treatment is done to the recording only if necessary, and CAVPP always checks with a partner before proceeding with any restoration. Technical metadata about the transfer is compiled and recorded.

Outsourcing the digitization work was found to be a cost-effective approach for tackling a large volume of materials and a wide range of formats. To optimize quality control, CAVPP prefers working with labs that can handle all audiovisual formats. This not only saves shipping costs but also ensures that the appropriate standards and procedures are applied to all recordings. To this end, CAVPP has mostly worked with the vendor MediaPreserve, located outside of Pittsburgh, although it is currently trying out other vendors as well. CAVPP has also worked with in-state vendors, depending on their specialty.

Quality assurance process—Following digitization, CAVPP performs a quality assurance check on the digital files. It checks the technical specifications of each file state, both the preservation master and the compressed access version. Sound and image quality are checked at the beginning, middle, and end of a recording (approximately 10% of a 30-minute file). Metadata is verified, and the content is checked to ensure it matches the title. In some cases, reviewers at this point suggest alternative titles to more accurately reflect the content if the original title was estimated or listed as “unknown.” Once the files are checked and uploaded to the Internet Archive, CAVPP sends an email to the partner notifying it that the files are online and ready for review.
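To make the spot-check arithmetic concrete, here is a minimal sketch of how the beginning, middle, and end samples might be laid out. It assumes three equal windows that together cover 10% of the running time; the function name and the equal split are assumptions, not CAVPP’s documented procedure.

```python
def spot_check_windows(duration_s, fraction=0.10):
    """Return (start, end) ranges in seconds at the beginning, middle, and end.

    Assumes the sampled fraction is split into three equal windows.
    """
    window = duration_s * fraction / 3
    mid_start = (duration_s - window) / 2  # centers the middle window
    return [
        (0.0, window),
        (mid_start, mid_start + window),
        (duration_s - window, float(duration_s)),
    ]


# A 30-minute file: about 3 minutes reviewed in total, roughly 1 minute per window.
windows = spot_check_windows(30 * 60)
for start, end in windows:
    print(f"{start / 60:.1f} min to {end / 60:.1f} min")
```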

The last stage of the process is for the partner to check the quality of the digitized recording posted on the Internet Archive. Partners are asked to check files within 30 days after the recordings are online. This helps CAVPP assess the sound and image quality of the transfers in order to report back potential issues to the digitization vendor in a timely manner.

CAVPP provides detailed information and examples to help partner institutions perform the necessary quality-check steps. This part of the process takes a bit of time for the partner and includes watching and/or listening to the entire digitized recording to ensure that what was meant to be digitized was digitized and that the result is acceptable, barring any inherent problems with the original item. A student intern or volunteer can be invaluable for a first or second pass at examining the digital content.

After files and metadata are approved by CAVPP and the partner institution, the vendor returns the original materials to the partner’s archive along with a hard disk drive containing archive versions of the digitized items. If the partner wants to keep the hard disk drive, that cost will be added to the invoice. Otherwise, the partner is expected to download the files and ship the hard drive back to the vendor.

Preservation and Access

Once digitization is complete, the focus naturally shifts to preservation and access. In addition to the web storage and public access provided at the Internet Archive, CAVPP maintains an offline depository of master files and encourages participating libraries to do the same.

The size of digital audiovisual files and their storage costs are formidable. CAVPP estimates that preservation video masters average 102GB, and access video files average 1GB, assuming a running time of 60 minutes. CAVPP currently needs 103TB to store 1,000 moving image and sound recordings in both file states. While online storage prices are in flux as drive density continues to increase, CAVPP indicates that current prices for online storage are in the range of $1,000 per TB a year, which makes the recurring annual cost of storing the California Light and Sound collection masters online unsustainable. Therefore, CAVPP implemented an affordable, expandable storage solution using a hybrid model of offline storage for preservation files on LTO (Linear Tape-Open, an open-format tape storage technology). Between 2010 and 2014, CAVPP stored 3,000 files offline, representing 216TB-plus of data on LTO at a cost of $16,140 (or about $5 a recording).
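The figures above can be checked with a few lines of arithmetic. This sketch uses only the numbers quoted in the article; the annual online total it computes is implied by those numbers rather than stated by CAVPP, and decimal terabytes (1TB = 1,000GB) are assumed because that is consistent with the 103TB figure.

```python
PRESERVATION_MASTER_GB = 102     # average preservation master, 60-minute video
ACCESS_FILE_GB = 1               # average access file
ONLINE_COST_PER_TB_YEAR = 1_000  # dollars, the article's approximate price
GB_PER_TB = 1_000                # assumption: decimal terabytes


def online_storage(recordings):
    """Return (terabytes needed, annual online cost in dollars)."""
    per_recording_gb = PRESERVATION_MASTER_GB + ACCESS_FILE_GB
    total_tb = recordings * per_recording_gb / GB_PER_TB
    return total_tb, total_tb * ONLINE_COST_PER_TB_YEAR


tb_needed, annual_online_cost = online_storage(1_000)

# Offline LTO storage as reported for 2010 through 2014.
lto_cost_per_recording = 16_140 / 3_000

print(f"{tb_needed:.0f} TB, ${annual_online_cost:,.0f} a year online")
print(f"${lto_cost_per_recording:.2f} per recording on LTO")
```

At these prices, 1,000 recordings would cost about $103,000 every year online, against a cumulative $16,140 for 3,000 files on tape.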

In accordance with the digital archival principle of redundancy, each partner is encouraged to store at least one copy of all file states per digital object. Currently, 73% of CAVPP’s active partners store copies of their files on hard disk drives (HDDs), RAID, or their own servers.

To get an idea of storage costs for a partner, the average storage needed for 12 recordings is about 1.23TB, assuming all are moving images with maximum running time. With a 2TB hard drive (costing about $250), the storage costs for 12 recordings would be about $155 (for HDD storage media and shipping) or approximately $13 per recording.
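A partner can run the same kind of estimate for its own batch. The sketch below assumes the 102GB and 1GB averages quoted earlier for each recording and takes the article’s $155 media-and-shipping figure as given; the function and variable names are hypothetical.

```python
GB_PER_RECORDING = 102 + 1  # preservation master plus access file


def partner_estimate(recordings, media_and_shipping_usd):
    """Return (storage needed in TB, cost per recording in dollars)."""
    storage_tb = recordings * GB_PER_RECORDING / 1_000
    return storage_tb, media_and_shipping_usd / recordings


storage_tb, cost_per_recording = partner_estimate(12, 155)
print(f"{storage_tb:.3f} TB, ${cost_per_recording:.2f} per recording")
```

Twelve recordings come to 1.236TB, which fits on a single 2TB drive, and the per-recording cost works out to just under $13.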

Morals of the Story

CAVPP’s goal may have been to save and provide access to California’s significant, at-risk historical sound and moving image recordings, but it has achieved more than that. Surveys of its institutional partners have confirmed that the CAVPP collaborative model is effective in helping them address their preservation needs and provide access to audiovisual materials even with limited or no funding. The project demonstrated how to streamline the preservation workflow by establishing standards and stimulating reviews of current standards and practices, and it has inspired institutions to address the needs of other recordings outside the project’s regional scope.

And through the broader visibility of the Internet Archive, partners are using their digital archive materials as a marketing tool to promote their collection beyond the brick-and-mortar of their institutions. The files they have produced for this effort can also serve as a proof of concept for other digitization projects and for recruitment of potential donors to support other digitization efforts. CAVPP’s initiative has demonstrated an effective and affordable way for organizations to collaborate in digitizing content that broadens an understanding of local history that would otherwise be lost.

California Audiovisual Preservation Project (CAVPP)


Chart courtesy of the California Preservation Program

Since 2010, when it received development funding from the California State Library, CAVPP’s goal has been to build a new research resource: the California Light and Sound collection of digitized film, video, and audio recordings related to California history.

The project—involving 84 organizations across the state of California—addresses the numerous challenges of identifying, digitally reformatting, and preserving archival audiovisual content and delivering it to the general public. The project’s mission is to capture and make accessible this content before it is lost due to deterioration, rapidly advancing format obsolescence, and lack of playback equipment.

Most of the materials selected for digitization are unique, unpublished recording masters. They feature a variety of content, including interviews and oral histories, speeches, performance art, home movies, television and radio political ads, newsreels, and events (such as museum exhibit openings and documentaries). There are more than 3,000 recordings already online, with another 606 titles planned to be released by the end of 2015.

Nomination and Selection Criteria

In order for material to be digitized as part of the program, items must comply with the following nomination criteria:

• Historical significance: contributes to an understanding of the history of California and its people, ideally featuring widely known names and events

• Archival importance: unique, a master recording or “best available” version or representation of the content

• Risk of loss: due to physical condition (Is the item damaged or deteriorating?) or endangered due to format obsolescence

• Primary source: an item that was never published commercially and may be the only copy

• Right to digitize: items in the public domain or for which the intellectual property rights are held or secured by the contributing institution

• Provenance: cataloging or other metadata that describes the content

• Demand: materials often requested by patrons

These criteria help the CAVPP partners narrow their focus and determine urgency of action, as well as help prioritize recordings most in need and most important (a combination of curatorial judgment and physical assessment).

And the Nominees Are …

Getting information that describes an item can sometimes be challenging. In these examples, labels on the containers or the reel, or even writing on the film leader may be all that is available. There may be additional documentation from when the item was created or donated. In some cases, there may be documentation with descriptions of the content from viewings years ago and earlier attempts to catalog the items.

Metadata can be on the container or even on the film itself.

Videotape of bison in Red Rock Canyon

16mm film of monkeys arriving at the San Diego Zoo

CAVPP provides guidelines for the creation of basic descriptive and rights metadata if records don’t exist. When there is not enough information, the term “Unknown” can be a placeholder in the initial record. Later on, once the item is digitized, viewed, or listened to, more information can be added.

Film of coral trees along the street in Los Angeles

20th-century media artifacts come in a variety of formats.

The Medium Is Not the Message

Audiovisual formats exist in a wide variety. Some examples encountered in the project include lacquer discs/acetates; LP records; wire; 1/2" or 1/4" audiotapes; mini-cassettes; CD-R; DVD; DAT; 8mm or Super 8mm and 16mm; 22mm and 35mm film; videotape in 2", 1", 1/2", or 3/4" formats; 1/2" reel-to-reel videotape; and a variety of Betacam formats.

CAVPP preserves motion picture recordings using digitization rather than traditional film-to-film preservation processes because the gains in access afforded by digitization justify the cost of preserving recordings that otherwise would be lost.

What to Check For

In checking the results of the digitization effort, partners are asked to do the following:

Confirm that image and sound quality are adequate for patron use; ideally, check the recording in its entirety. Is the file true to the original source? Is the image too dark or too light? Are skin tones acceptable? Is the sound consistent, or does it speed up and slow down?

Confirm that content corresponds to descriptive metadata. Is this recording actually what we thought it was?

Check for evidence of an incomplete recording. Is there missing or duplicate content? Are parts out of order?

Check the metadata. Are there notes or questions from the technicians or other reviewers that need to be addressed?

The author would like to thank Barclay Ogden, Pamela Jean Vadakan, Kristin Lipska, and everyone at the CAVPP office in Berkeley, Calif., for providing information for this article. He would also like to thank his volunteers, particularly Licia Maria Hurst for her many hours dealing with the metadata.

More information about CAVPP can be found on the project’s websites.

Richard P. Hulser, M.L.S., M.Ed., is the chief librarian at the Natural History Museum of Los Angeles County and project lead for his institution’s participation in the California Audiovisual Preservation Project (CAVPP). He manages the library and institutional archives for the museum family of institutions. Hulser focuses on the strategic use of advanced technologies to enhance information services, while rekindling the value of library and archive collections and diminishing their perception as useless warehouses of yesterday’s information.