What to Expect When You’re Digitizing: A Primer for the Solo Digital Librarian
by Jane Monson
Starting a digitization program can be a daunting prospect for any library, whether it is of the large, well-funded variety or the small, shoestring-budget kind. The former may have the luxury of costly commercial software, dedicated programmers, and multiple librarians of various specializations, all of which can smooth the process. But what if these things are out of your reach? What if you are a “lone wolf” digital librarian—project manager, collection developer, metadata creator, and web designer, all rolled into one—with a limited budget to boot? You may fear you are doomed to spend years toiling away with little to show for yourself. However, with creativity, flexibility, and a willingness to reach out for help, you can be well on your way to launching your digital collections within a year—despite unforeseen roadblocks that you may encounter along the way.
My own experience as a new librarian has centered upon grappling with these issues. In graduate school, I had the opportunity to work in the digital services department of a library within a major academic research institution. There I received mentoring from a team of librarians experienced in digitization and was able to get my feet wet planning and executing projects for the school’s already-established digital library. This afforded me an excellent, if rather one-sided, view of the digital library world. Upon graduation, I accepted a position as digital projects librarian at Truman State University, a small public liberal arts university in Kirksville, Mo. I was tasked with kick-starting the library’s digitization efforts, beginning with the creation of an online repository of unique and rare materials from Special Collections. While well-prepared in theory, I slowly began to feel overwhelmed by the complexity of the task and unexpected obstacles that I encountered. As the project’s completion was pushed further into the future, it sometimes seemed as if it might never get off the ground.
One year later, the digital library is still under development. But significant progress has been made, and the light at the end of the tunnel is clearly visible. What seemed like major hurdles turned out to be surmountable and yielded lessons that may be helpful for other librarians who find themselves in a similar position, whether or not they have previous experience with digitization.
Lesson 1: Accept Your Limitations
An important first step is to realistically assess your skills. Realize and accept that you can’t do everything yourself, although it might be tempting to try. When I was new to the job, I had a somewhat inflated sense of my abilities—aside from hiring a student worker to assist with scanning, I was determined to prove to myself and my colleagues that I could construct the digital collections website more or less on my own. This might have been feasible had our library possessed the funds to purchase CONTENTdm or a similar digital collections management system that offers an out-of-the-box solution and technical support. Unfortunately, this was not the case, as the annual licensing fee was beyond what the library’s rapidly shrinking budget could support. (As a public institution, our finances were being stretched ever thinner due to the Great Recession and resulting state budget cuts.)
To deliver the collections online, the open source Greenstone software was chosen instead, due to its low price (it’s free) and highly configurable interface. While it offers a bare-bones front end upon installation, Greenstone can be customized to provide a sophisticated user experience similar to that of CONTENTdm. However, it requires a certain level of programming skill that I quickly realized I didn’t possess—I had some experience, but I was no web developer. After months of struggling, I was forced to face the fact that this particular task was outside the scope of my current abilities. The realization was a relief and freed me up to focus on finding a person more suited to the job. Admitting this limitation and seeking outside help from the beginning would have saved a lot of time and frustration. Regardless of what your specific strengths or weaknesses are, focus on project management as your primary task and don’t hesitate to delegate authority in areas in which you are not an expert. In my case, it meant seeking the services of a consulting firm specializing in Greenstone customization.
Lesson 2: Free Software Isn’t Really Free
The experience of wrestling with Greenstone made one truth painfully obvious: open source software, while free of upfront fees, comes with its own set of behind-the-scenes expenses. These costs include time spent installing and configuring the program, which will almost certainly require the attention of a system administrator and/or programmer. Any future system upgrades or migrations will also be your responsibility. Dedicated server space for both testing and production is also necessary for collections that aren’t hosted by a commercial service.
Another point to remember is that when a system offers more flexibility, more work is required to get it to do exactly what you want. Fedora Commons is a good example of this—it’s digital object repository software that can be customized for just about any purpose, but trade-offs include a steeper learning curve and the need to install a separate user interface program.
When choosing a content delivery system for your digital collections, careful consideration should be given to the technical infrastructure and resources available at your library. Open source software offers many benefits, including greater control over the appearance and functionality of your collections. But keep in mind that if money is a concern, in some cases the time and personnel required to set up an open source system may approach the cost of purchasing a commercial license. Of course, the absence of annual licensing fees will often make open source software a better investment over time. Spend time researching your options and considering these factors before selecting a system to deliver your online collections.
Lesson 3: Seek Outside Funding Sources
The issue of cost brings us to our next lesson, that of funding your project. In a perfect world, your institution would allocate generous funds to cover all necessary expenses. In reality this is rarely the case, and creative budgeting is essential. Your first plan of attack should be to aggressively pursue outside sources of funding. Investigate state and federal grant opportunities, as they are available for digitization initiatives. Your state library is a good place to start, as it will likely have Library Services and Technology Act funds to disburse for the creation of digital collections. Organizations embarking on their first digital project may be given priority, making these grants a particularly good way to get your digitization program off the ground. Federal funding opportunities from the Institute of Museum and Library Services, the National Endowment for the Humanities, and the National Science Foundation are also worth exploring, although these tend to be geared more toward innovative or experimental undertakings and may be less likely to apply to a first-time project.
Spend some time looking into nonprofit funding sources as well. In the arts, humanities, and sciences, professional associations often have funds available for a broad array of projects that are related to their area of study. For example, my library has been exploring the possibility of digitizing the field notes of an important anthropologist with ties to the university. If this project moves forward, my research has turned up state, national, and international anthropological and archaeological associations that would be possible sources of funding for processing and/or digitizing the materials. Organizations not directly tied to the library world may be interested in your project if you can demonstrate its relevancy to their funding priorities.
Lesson 4: Start Small and Simple
The anthropology field notes mentioned previously were the first items I planned to digitize upon starting my new job. It was a collection that promised to be of great interest to those in the archaeology community, and library staff members were eager to pursue the project. A couple of months of planning and discussion later, however, it became clear that it would not be the best collection with which to inaugurate our digital library. For one thing, the materials had not yet been donated to the library and were still in the anthropologist’s possession. This posed potential problems related to intellectual property and copyright issues that could be tricky to work around. Furthermore, the collection was huge and would be very time-consuming to scan. It would also require the attention of an archivist to organize and catalog the materials before the digitization process could even begin.
Based on these issues, the project was placed on the back burner for future development. A smaller (but equally interesting) set of mid-19th-century letters, which had been housed in Special Collections for many years, was digitized instead. This proved to be more manageable, and as the first collection functioned as a sort of test run, it was much easier to go back and correct mistakes later. Over the course of a year, a number of small collections have been digitized, including the images on glass lantern slides and historic photographs of local one-room schoolhouses. Had the year been devoted to the anthropology collection instead, it would likely still be incomplete as of this writing. It would have been nice to launch our digital library with the exhaustive archive of a prestigious scientist, but going that route would have prolonged our timetable considerably and created unnecessary frustration. Starting out with smaller, established collections, rather than ambitious and unwieldy ones, is a good way for a solo digital librarian to build confidence and see tangible progress in a shorter amount of time.
Lesson 5: Learn to Juggle, and Establish a Backlog
In planning your digital collections, it can be a good idea to avoid proceeding in a completely linear fashion. Juggling multiple projects at once may seem like a less organized way to go, but this approach has its benefits. The iterative nature of digital project management, particularly if it is an unfamiliar undertaking, means that decisions made in the process of completing one project will often influence the development of other projects. Of course you will want to begin with a set of standards to adhere to, but fine-tuning a metadata schema or file specification along the way is to some degree unavoidable. Changes made and lessons learned while processing one collection may cause you to rethink how to approach another, so completing them one at a time can lessen your chances of encountering the full range of issues. This can mean going back and correcting entire collections in retrospect. Having your hand in multiple projects at once may make it easier to identify problems and change course midstream if necessary.
By the same token, you never know when unforeseen challenges will throw you off your plan, in which case you will need something to fall back on. In my case, a mold outbreak caused our Special Collections department to close for an entire semester, cutting off access to new materials for digitization. Luckily, I had just started scanning a sizeable archive of monographs from the now-defunct Central Wesleyan College, a German Methodist teachers college and theological seminary. The archive included yearbooks and course catalogues spanning more than 50 years. In this situation, having a large collection saved me, as it created a backlog of scanning to carry my student worker through until Special Collections re-opened. This scenario was admittedly a freak occurrence, but it demonstrates that having a surfeit of materials selected and waiting to be digitized is not necessarily a bad thing.
Lesson 6: Get Creative About Collaboration
At the same time, the sheer amount of material to be dealt with can be overwhelming. This is compounded by any aspects that are out of the ordinary or beyond the expertise of you or your colleagues. For example, the college yearbooks and catalogues mentioned previously contained hundreds of pages of German text interspersed with English. Not only that, but much of the German was in Fraktur script, which can be difficult for even German speakers to read. The scanned page images for this collection were to be accompanied online by OCR-generated full-text transcriptions, but this proved to be a challenge: Neither the OCR software nor the student workers were able to transcribe the German text. This left the option of providing transcripts only for the English portion of the texts, which seemed a shame, because the collection appealed to researchers in large part due to its bilingual nature, and I wanted to avoid privileging the English content over the German.
My supervisor possessed adequate knowledge of German to perform the transcription himself, but as both department head and acting library co-director, he was far too busy to devote the necessary time to the task. However, he did come up with a creative solution to the problem. Using connections with faculty in the university’s German department, he proposed a scenario in which students taking advanced German classes would transcribe the text as a graded class assignment, an idea that was welcomed by the German faculty. If you work at an academic library, consulting faculty members can also be a valuable way to determine the scholarly significance of materials you are considering for digitization. In fact, whether you work at a library, historical society, museum, or other type of institution, developing partnerships with other local organizations can mean the difference between a successful project and one that doesn’t make it past the idea stage. Don’t hesitate to seek out help from those beyond the library world with useful skill sets.
Lesson 7: Realize That Progress Will Be Slower Than You Expect
Finally, as you embark upon your first digitization projects, don’t get discouraged if things don’t advance as quickly as you’d like. Particularly for the lone wolf, low-budget librarian, progress can seem slow. At a conference, I once heard a wise librarian say that whatever your anticipated time frame is when starting a new digital initiative, you should automatically double it. My experience has proven this to be the case. Digitization is definitely an area where it pays to get it right the first time, so be deliberate and take your time. At the same time, there are few errors that can’t be corrected, so don’t be too afraid of making mistakes either.
One year into our new digitization program, my library has a solid groundwork laid, with thousands of image files and metadata records created and future collections waiting in the wings. I would have liked for them to be accessible online by now, but I am obliged to wait patiently for grant funding to come through so we can proceed with hiring a consultant to customize Greenstone for us. My hope is to launch the digital library next summer. But I have learned not to be disheartened if this doesn’t happen. For the solo digital librarian, patience is a supreme virtue.