Weaving the Past into the Present by Digitizing Local History
Kay Schlumpf and Rob Zschernitz
Digitization is in the news these days. Google’s doing it, and Microsoft is too. Some may think you need to be a gigantic corporation or huge university to tackle this technology. But in reality, institutions of all sizes can jump into the fray. Here at the North Suburban Library System (NSLS), a multitype consortium in Wheeling, Ill., we coordinate Digital Past (www.digitalpast.org), a centralized local history digitization initiative made up of institutions of all sizes with collections of all types.
Our responsibilities include the day-to-day running, maintenance, and development of the initiative. Kay, as project manager/Digital Past coordinator, organizes training and support, assists with marketing, and serves as the public face of Digital Past, including doing demonstrations, presentations, and recruiting. Rob, as systems engineer, handles the technological infrastructure of Digital Past, including server design and maintenance, backups, updates, patches, configuration, and development.
Weaving History and Technology: Why Digitize?
With Digital Past, there are various reasons to digitize, but the main intent of most participants is to reach out to the communities they serve. Think about the many cultural institutions you’re familiar with. Are they open when you’re available? Do you have to wait for specialized help to access the collection? Are the collections even accessible to the general public?
We’ve heard many stories over the years of fantastic collections that are locked up in a leaky basement or an off-site storage area. What good is this history if no one can access it? If you’re lucky and the collection is open to the public, what are the hours? Some historical societies we know are open about 4 hours a week, and you can bet it’s at times that aren’t convenient to most folks.
Imagine taking these issues out of the equation altogether. Putting historical artifacts on the Web immediately increases their accessibility—no walls, no specialized help, no limiting hours. From what we’ve found, expanding the collection’s presence in this way has an amazing snowball effect: Renewed community interest leads to more support in the form of donations and new volunteers. Then, more items become available. People are often hesitant to give away their family treasures, but many are willing to loan them to an institution for digitization. A good solid base builds confidence in your institution to make these precious items available to a larger audience.
Increased access isn’t the only reason to digitize. Even though digitization is not preservation—it doesn’t mean you can toss the originals or replace your microform—many of our participants do use it as a way to preserve their items and to keep them from being handled. We have many fragile old documents, photos, and books in the Digital Past collections. These items have been digitized, often transcribed, and cataloged, then put away in archival storage where they’ll rarely need to see the light of day again. School kids, community members, and genealogists, as well as serious researchers, can now access these items any time they want, no gloves required.
Designing the Pattern: The Pilot Project
In 1998, when digitization initiatives were few and far between and the Web was still in its toddler stages, a small team at NSLS was looking ahead. It applied for a grant from the Illinois State Library to create a pilot project for digitizing member libraries’ small local history collections. As with all our initiatives, NSLS’s practice is to “go with the willing.” We were thrilled to have 14 public libraries and one school library volunteer to come on board.
In 1998, scanning documents and photos certainly wasn’t as commonplace as it is today. You’ll recall that computers were still pretty slow and the Pentium processor was still rather new. Scanners were big, bulky, and typically quite expensive. People didn’t know what type of hardware they’d need to do this scanning. To help with that initially, NSLS used grant funds to purchase a complete digitization workstation for each of the inaugural members of the project. The workstations included custom-built PCs, a high-end Agfa or a large-format Epson flatbed scanner (depending on the member’s need), and a monitor.
Although Northwestern University’s Collaboratory Project provided grant-funded training to the libraries and system staff, the team still had to figure out how best to mount all the data we’d be getting. We did some research and quickly learned that even though there were image-management systems out there, none of them were designed to capture the types of data that we’d be collecting in local history archives.
Building the Base: A Platform for Expansion
Striving to always be leaders in technology, NSLS already had seven Web servers online and serving out Web pages by 1999. At that time, the technology group had been playing with putting databases online and accessing them via CGI programming written in Perl. This new digitization program was the perfect place to start building a Web site in which people could mount their images and the associated metadata into the database on our server. At that time, standards were nonexistent. We based our metadata on the Dublin Core Metadata Initiative, which was just becoming known. Even with that base, we still found that there was a wide range of acceptable metadata, especially when we migrated to a new database system. So we standardized all subject headings to Library of Congress, which is still our standard today.
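To illustrate what a Dublin Core based record with standardized subject headings can look like, here is a minimal Python sketch. The field names follow a few Dublin Core elements (title, creator, publisher, date, subject). The class name, the sample values, and the tiny set of authorized headings are hypothetical and are not taken from the actual Digital Past schema.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for an authority list. Real Library of Congress
# Subject Headings would be checked against an authority file.
AUTHORIZED_SUBJECTS = {"Railroad stations", "Postcards", "Watches"}


@dataclass
class DublinCoreRecord:
    """A small subset of Dublin Core elements for one digitized item."""
    title: str
    creator: str = ""
    publisher: str = ""
    date: str = ""
    subjects: list = field(default_factory=list)


def unauthorized_subjects(record):
    """Return the subject headings that are not in the authorized list."""
    return [s for s in record.subjects if s not in AUTHORIZED_SUBJECTS]


# Hypothetical sample record with one heading that needs standardizing.
record = DublinCoreRecord(
    title="Depot, looking north",
    date="1910",
    subjects=["Railroad stations", "Train depots"],
)
print(unauthorized_subjects(record))
```

A check like this, run before loading, is one way a cataloger could spot headings that drift from the agreed vocabulary.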
Within the first year, mostly because of the sheer amount of data we were receiving, Digital Past was rapidly outgrowing the Web server that it was sharing with other projects. By then, NSLS had added a few more servers to its network, and it was time to add a new, dedicated server for Digital Past. Meanwhile, we were loading more and more content, and the first round of database development work that we had done was showing its shortcomings.
Weaving the Basket: Supporting Growth
Not wanting to reinvent the wheel, we formed a new team and did extensive research on what others were doing and what, if any, applications were out there to handle this type of initiative. Again, we found that since we were still at the forefront of this emerging technology, the marketplace lacked software as feature-rich as the initiative required. To accommodate its growing popularity, the project was given the formal name “Digital Past,” and we registered the digitalpast.org domain.
Through winter 2000 and spring 2001, we decided that we’d keep the current Digital Past running while we completely redesigned, redeveloped, and rebuilt a new Digital Past from the ground up. All preconceived ideas were thrown to the wind, and we went in with a clean slate to develop the best product we could.
One of the issues we encountered was that libraries lacked the means to store and serve out larger archival images. In response, we designed the infrastructure to accommodate a large, expanding storage capacity. The database became so intricate and housed so much data that the team designed a relational database structure using Microsoft Access in order to plan for future growth.
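As a rough picture of what a relational structure for this kind of data might look like, the sketch below uses Python's built-in sqlite3 module rather than Microsoft Access. The table names, column names, and sample rows are assumptions made for illustration: each item belongs to a collection, and each collection belongs to a participating institution.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE participant (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE collection (
    id             INTEGER PRIMARY KEY,
    participant_id INTEGER NOT NULL REFERENCES participant(id),
    name           TEXT NOT NULL
);
CREATE TABLE item (
    id            INTEGER PRIMARY KEY,
    collection_id INTEGER NOT NULL REFERENCES collection(id),
    title         TEXT NOT NULL,
    decade        TEXT
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO participant VALUES (1, 'Example Public Library')")
conn.execute("INSERT INTO collection VALUES (1, 1, 'Postcards')")
conn.executemany(
    "INSERT INTO item VALUES (?, ?, ?, ?)",
    [(1, 1, "Depot", "1910s"), (2, 1, "Main Street", "1920s")],
)

# Join the three tables to list each item with its collection and owner.
rows = conn.execute("""
SELECT item.title, collection.name, participant.name
FROM item
JOIN collection  ON collection.id  = item.collection_id
JOIN participant ON participant.id = collection.participant_id
ORDER BY item.title
""").fetchall()
print(rows)
```

Keeping participants and collections in their own tables means a name is stored once and referenced by ID, which is the usual reason to choose a relational layout as a collection grows.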
This new Digital Past also offered many additional features. We created a simple search that allowed for subject, keyword, and proper-name searching. Advanced search added the capacity for Boolean qualifiers and searching on several specific fields. Search results came back in a display that included the thumbnail, title, collection the item was in, date, and contributing participant. Clicking on the thumbnail brought up the full image with the following metadata: title, creator, publisher, place where published, participant, and date/decade. Clicking on the title brought up the full metadata record. We created additional search and management tools for the participants. We also developed browsing by participant, subject, proper name, collection, decade, and class. In addition, we unveiled the concept of exhibits, which are similar to interpretive museum exhibits but are Web-based.
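The difference between a simple search and an advanced search with Boolean qualifiers can be sketched in a few lines of Python. This is an illustration only, not the code that ran on Digital Past: the function names, the field names, and the sample records are all hypothetical.

```python
def matches(record, field, term):
    """Case-insensitive substring match on one metadata field."""
    return term.lower() in str(record.get(field, "")).lower()


def simple_search(records, term, fields=("title", "subject", "proper_name")):
    """Return records where the term appears in any of the given fields."""
    return [r for r in records if any(matches(r, f, term) for f in fields)]


def advanced_search(records, must=(), any_of=(), must_not=()):
    """Boolean search. Each argument is a sequence of (field, term) pairs:
    must = AND, any_of = OR, must_not = NOT."""
    results = []
    for r in records:
        if not all(matches(r, f, t) for f, t in must):
            continue
        if any_of and not any(matches(r, f, t) for f, t in any_of):
            continue
        if any(matches(r, f, t) for f, t in must_not):
            continue
        results.append(r)
    return results


# Hypothetical sample records.
records = [
    {"title": "Main Street parade", "subject": "Parades",
     "proper_name": "Smith, John", "decade": "1920s"},
    {"title": "Library opening", "subject": "Public libraries",
     "proper_name": "", "decade": "1950s"},
    {"title": "Parade float", "subject": "Parades",
     "proper_name": "", "decade": "1950s"},
]

print([r["title"] for r in simple_search(records, "parade")])
print([r["title"] for r in advanced_search(
    records,
    must=[("subject", "parades")],
    must_not=[("decade", "1920")],
)])
```

The simple search casts a wide net over a few fields, while the advanced search lets a researcher narrow by specific fields, for example parades that are not from the 1920s.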
The new Digital Past was completely different from the original. The original version ran under Solaris, using Perl and MySQL as the database back end, and was served out by a Netscape Web server. The new version went live in April 2001, running on Windows 2000 with a completely redesigned interface and logo. The code was all written in ASP and had a Microsoft Access database to power all the data manipulation. Under this new version of Digital Past, participants uploaded their images via FTP to our server and had an elegant, online, form-driven process for entering the metadata for each record.
While we were making all these changes, Digital Past’s popularity was exploding. Family Tree Magazine named it one of its Sites of the Day in August 2001. At the same time, digitization was getting very popular. Due to the increased awareness, Digital Past users and participants started providing more and more helpful feedback. At this time, the Microsoft Access database was migrated to the existing dedicated Microsoft SQL Server to improve performance and to secure the data.
Reshaping the Basket
In spring 2004, with technology ever-changing, we were again itching to keep Digital Past on the cutting edge. As the database grew, our tech staff had optimized the ASP version to run more efficiently. We still needed to increase performance, so we separated the data and imagery from the server that ran the presentation layer of Digital Past. We moved the data to a Linux server with a drive array attached to it. Also, because we had put standards in place for archival-quality images, we had to re-engineer the backup and disaster-recovery procedures to keep the program at a sustainable cost. Looking for ways to improve, we checked out DiMeMa’s CONTENTdm digital collection management software, a product that also originated in 2001 (and was acquired by OCLC in 2006). After extensive research and testing by our team, we decided that this was the wave of the future, and we needed to get involved.
With our more than 34,000 records, migrating to anything new was going to be a monumental task. Because we wanted to be cautious, we assembled a Digital Past migration team to specifically deal with setting up CONTENTdm. This team would also handle training issues and would plan out a seamless migration from the proprietary system that we built to the new standards-driven system. This team consisted of the two of us plus three others: Ian Baaske (developer/database administrator), Debbie Baaske (metadata/cataloging expert), and Dawne Tortorella (consultant/training expert).
In summer 2004, we started planning and staging the migration. By early fall, we were ready to do some final testing. Through the testing, we found that some of the features that users really loved (such as the browsing functions, the exhibits, searching by participant, and some of their collection management tools) were not going to be natively possible with the new system. So the team developed a few custom applications to bring the enhanced features back into Digital Past. We ran the previous and CONTENTdm versions of Digital Past side by side from November 2004 through February 2005 to allow users to have an easy transition. During that time, no one was allowed to update the previous version. Digital Past officially went live with CONTENTdm on Feb. 14, 2005.
When we started Digital Past, best practices and standards for digitizing were nonexistent. We asked participants to resize their display or Web images to fit on the screen. We also decided to give thumbnails a set height. We weren’t really sure what to do with archive images, so some participants had them and some didn’t. Now, following the best practices set forth by the Illinois State Library, the participants create three image files: archive, Web, and thumbnail.
The archive or master image is the highest resolution, 300–600 dpi. These images are stored at the holding institution and help to eliminate rescanning and handling of the items at a later date. Our policy is to save these as TIFF files with no compression. We drop Web or display images to 100–150 dpi JPEG, which serves two purposes: quicker loading over slower connections and protection from unauthorized use. An image at 100–150 dpi is going to print out OK for a kid’s school report or genealogical research, but if the image were to be taken to outside print firms, they would usually require at least 300 dpi. This has been a concern especially for historical societies that partner with our libraries but make most of their income from the authorized use of their images. The OCLC CONTENTdm software we now use automatically creates thumbnails.
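The arithmetic behind these resolution choices is easy to work through. The Python sketch below assumes a hypothetical 4 x 6 inch photograph and uses the upper end of each range mentioned above. The function names are illustrative only.

```python
def pixel_size(width_in, height_in, dpi):
    """Pixel dimensions of an original scanned at the given resolution."""
    return round(width_in * dpi), round(height_in * dpi)


def print_size_inches(width_px, height_px, print_dpi):
    """Largest print, in inches, that an image supports at print_dpi."""
    return width_px / print_dpi, height_px / print_dpi


# Hypothetical 4 x 6 inch photograph.
archive = pixel_size(4, 6, 600)   # archive master at 600 dpi
web = pixel_size(4, 6, 150)       # Web display copy at 150 dpi

print(archive)                        # (2400, 3600)
print(web)                            # (600, 900)
print(print_size_inches(*web, 300))   # (2.0, 3.0)
```

Under these assumptions the Web copy carries one-sixteenth of the archive master's pixels, so it loads far faster, and at a print shop's 300 dpi it would yield only a 2 x 3 inch print.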
With the Web 2.0 revolution happening, we’re always trying to find ways to enhance the online experience of Digital Past. We developed a few more plug-ins to our CONTENTdm installation with the most recent upgrade. When we launched a redesigned home page in August 2006, it sported a random image feature, an RSS feed to keep track of the latest additions, and browsing by participating institution, proper name, organization, and city.
Tucking In the Ends
Of course, there are always drawbacks. We have to keep in mind that digitization is a labor-intensive process and that all these files take up a lot of space on the server. Commercial software often isn’t within a small institution’s budget, not to mention the hardware and people to manage it. For our members, Digital Past was the answer—a centralized, consortium-based program. NSLS maintains the server and provides training, troubleshooting assistance, and staffing for it. We also maintain a digitization lab with equipment and hands-on help that’s available for the participants’ use free of charge.
Recently, we also began hosting the Digital Past Users Group, which helps facilitate idea-sharing and information dissemination. For their end of the bargain, participants are responsible for digitizing and cataloging their own items. They are also responsible for getting permission to display the item online and allowing use that may be requested by authors, producers, or people who just want a good quality copy of an item. Often a staff member or two are given the project, and some have been lucky enough to attract technically savvy volunteers who help with the work. Both NSLS and the participants promote Digital Past in a variety of ways, including at farmers’ markets and genealogy conferences and with speeches to local groups.
We’ve been thrilled to see Digital Past’s use skyrocket in the last year. This invites the question: Who uses it? Well, just about anyone who might be interested in our geographic area; products created here, such as the Curt Teich postcards or Elgin watches; or special topical strengths we have. We know that it’s used by schoolchildren for their local history studies and history fairs as well as by historians, genealogists, public-TV producers, authors, university classes, folks with a tie to the area, and others worldwide. We have had inquiries from as far away as Japan, Poland, and France. Participants have been contacted by various magazines, book authors, and TV producers for further information about items found on Digital Past.
Since its humble beginning, Digital Past has now grown into a large centralized collaboration that consists of 32 libraries and a museum as its primary contributors. The current list of participants includes 26 public libraries, two special libraries, an academic library, a school library, a local museum, and two libraries outside the NSLS service area. These participants then partner with other local cultural institutions such as historical societies. At one point, we required that non-NSLS members partner with a library in order to join, but we have expanded the opportunity to nonmembers, such as libraries outside of the NSLS service area, local historical societies, and museums. There’s a long list of other institutions of all types that are thinking about joining as well, so we anticipate growth in the coming year.
In fall 2006, Digital Past reached a milestone of 50,000 items on its server. There’s always something new being added. We’re constantly looking to the future and trying to develop new ways to weave the past into the present by offering the data in new and fresh ways. By doing this, we’re able to show the world a bit of local history, demonstrate the importance of digitization, and showcase the fact that libraries are at the forefront of technology.