Computers in Libraries
Vol. 21, No. 5 • May 2001 

• FEATURE • 
Digitization: Is It Worth It?
by Stuart D. Lee

Now that everyone is experimenting with digitization projects, it's time to ignore the hype and weigh the facts.
Every now and then, it's cathartic to take a long, hard look at what you have achieved, and more importantly at the various activities that have eaten up so much of your time. This, I would suggest, is extremely worthwhile, though you must be prepared to face some awkward questions. Writing this article presents me in many ways with such an exercise. Having spent the last 10 years working in computer services at Oxford University, and the past 4 years concentrating specifically on the digital imaging of rare manuscripts, I have set myself the starkest of questions to answer: "Is digitization worth it?"

Bearing in mind all the hype (not to mention the money) that surrounds digital imaging (which I would suggest is the most common form of digitization we all encounter), I am beginning to feel like the small child pointing out the Emperor's nakedness at the parade by even suggesting such a question. After all, digitization must be worthwhile, surely? Everyone's doing it, from the largest national libraries to the smallest institutions; and if nothing else, just look at all the lovely pictures that we can download now. Ah, but remember the story of "The Emperor's New Clothes." Just because the crowd seemingly agrees on something does not necessarily make it correct. Every now and then we need someone to stop the parade and to point out some obvious facts, or at least get us to question what we are doing. As I said before, it's cathartic.

Before proceeding, I think it is a good idea to pin down exactly what I mean by "digitization." A strict definition might be the conversion of analog media to digital form (hence the fact that in many books "conversion" and "digitization" are synonymous). The original media or source material might be printed text or images, but we should never forget that it could also include audio and video (that is, time-based media). Nevertheless, a glance at the main projects that have taken place within the library sector quickly reveals that for most people, digitization equates to digital imaging: that is to say, the creation of a still digital facsimile of a source item, such as a rare manuscript, photograph, slide, journal, painting, monograph, exam paper, and so on.

OK then, now that we are clear as to what I mean by digitization, is it worth doing? I suspect that if you ask anyone who has actually worked on a project for a reasonable length of time, the answer could well be a categorical "no," as he would remember all of the things that went wrong, the systems that crashed, the images that needed to be recaptured, the external vendors that failed to deliver, and so on. 

Yet such a subjective answer is pretty unhelpful. To fully answer the question, we need to be as objective as possible and to look also at the benefits that such projects offer, and pretty quickly we find ourselves steering toward such ideas as "value for money." I would suggest, though, that this in turn is problematic, because in most cases the costs and benefits are very difficult to estimate in terms of dollars and cents, or pounds and pennies. Yet the terminology that surrounds such an approach is appropriate. In other words we should look at the cost of digitization, the benefits derived from it, and then, most importantly, compare the act of digitization with other possible scenarios.

So, digitization then—is it worth it?
 

Weighing Some Cost Issues
Numerous digitization projects have outlined the costs they encountered, and you can also look to the charges of commercial digitization agencies to get some idea of the figures we are dealing with. But whenever I have attempted to look at these, I have run into two main difficulties: 1) the number of caveats and variables that have to be taken into account, and that will ultimately dictate the total costs, is bewildering; and 2) there is the underlying suspicion that these figures do not really reflect the true cost of the project. In my recent book on digital imaging,1 I adopted the approach of looking at the unit costs of digitizing a particular item (i.e., a page or photograph), derived from the reports of various projects and initiatives, and presented this as an average. See the chart entitled "Sample Costs for Digitization."

Here we can see that the unit cost of digitizing a single image is listed, but the variations (the different columns) are due to the original format of the source document and the specifications used for conversion (the higher the "dpi" or "dots per inch," the better the quality of the image, for example). However, this is only part of the story. Reports and studies following from completed digitization projects consistently note that the cost of conversion often accounts for as little as one-third of the costs of the entire project.2 When you fully account for such things as assembling the source material, clearing copyright, setting up the machines, checking the quality of the output, post-editing, cataloging the item, delivering the item, managing the project, and so on, the real unit cost of digitizing an item could be three to four times the figures listed above. In other words, the real cost of digitizing and delivering a printed black-and-white, letter-sized page at 300 dpi 1-bit, listed at 18 cents, could be 54 cents or more.
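The overhead arithmetic can be sketched in a few lines of Python. This is only an illustration: the 18-cent conversion figure is the one used in this article's worked example for a black-and-white page, the multipliers of 3 and 4 come from the "three to four times" estimate, and the function and variable names are assumptions introduced here.

```python
def full_unit_cost_cents(conversion_cents: int, multiplier: int) -> int:
    """Scale a per-item conversion cost to cover the whole project.

    The multiplier stands in for everything beyond capture: copyright
    clearance, quality checks, cataloging, delivery, and management.
    """
    return conversion_cents * multiplier


# Listed conversion cost for one black-and-white page (assumed: 18 cents).
LISTED_CONVERSION_CENTS = 18

low_estimate = full_unit_cost_cents(LISTED_CONVERSION_CENTS, 3)
high_estimate = full_unit_cost_cents(LISTED_CONVERSION_CENTS, 4)

print(f"Full cost per page: {low_estimate} to {high_estimate} cents")
```

Working in whole cents avoids rounding surprises; a real budget would replace the single multiplier with itemized overheads.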
 

What Are the Benefits?
So what are the benefits of digitizing? This has been explored in depth elsewhere, but in summary the listed advantages offered by digitization tend to come under the headings of increasing access, preservation, and meeting strategic goals (i.e., raising the profile of the institution running the project, and so on). The first, allowing increased access to the object, is the most-often-cited benefit of digitization. An electronic facsimile of a page, for example, can be theoretically copied and distributed ad infinitum without any degradation in quality (if correct standards are maintained). More importantly, a single copy can be mounted on a server (most commonly a Web server), and this can be viewed and downloaded by a large number of users (possibly in the region of hundreds of thousands), simultaneously, and from any location in the world (assuming appropriate access restrictions and server technology). 

The clear advantage of such a system is that it liberates the document (albeit a facsimile) from the constraints of traditional access methods. Take, for example, a digital image of a folio from a rare manuscript. Traditionally users may only be allowed access to the original item if they have an appropriate reader's card, and a good and validated reason. Most importantly they would have to physically go to the manuscript itself, which may involve travel, time, and/or money. However, if a digital facsimile of the folio were mounted on the Web, for example, and made freely accessible, suddenly everyone could look at the image from the comfort of their own home, office, or school. This example also leads us to the second-cited advantage of digitization, namely preservation. Although the preservation of digital objects is a discussion in itself, the above scenario does imply that the original item might be handled less, or at least that the curators would have an extra reason for restricting access to the original manuscript.
 

Let's Make Comparisons
Now let's analyze digitization as an alternative to other possible actions. The simplest example would be to take an item, consider the costs and benefits of digitizing it, and then compare this with not digitizing it. The most easily quantified measure is the effect on access. (I would argue, for example, that it is extremely difficult to measure the fulfillment of strategic goals, or to quantify the loss of an item if it is not preserved.) With access we can draw up a simple example based on usage statistics.

Let's take an article (for which copyright has been cleared) that is 10 pages long, printed in black and white with no graphics. Our table suggests that the unit cost of digitizing this would be $1.80 (10 pages at 18 cents each), but our real costs would be more in the region of three times that amount, or $5.40. Even so, for this amount we would now have a digital facsimile of the article that we could place on the Web and make available to our readers. Now let's look at what would happen if we did not digitize the item, and kept to the traditional ways of making this available to more than one person at a time. Here the only obvious alternative (outside of taking out secondary subscriptions, that is) would be to photocopy the article. Even discounting the time that this would take the staff to do, we could say that a full copy of the item would cost in the region of 30 cents to photocopy. Now let us look at the two scenarios together. In short, it is clear that by the 19th use of the article on the Web site, we are already saving money over traditional reproduction (19 photocopies would cost $5.70). In other words then, if the technology is all in place to digitize the item and deliver it across networks, and if the item is in reasonable demand, in most cases, digitization and delivery of the source item by electronic means quickly becomes cost-effective.
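The break-even point in this comparison can be checked with a short Python sketch. It uses only the figures given above ($5.40 to digitize and deliver the article, 30 cents per photocopy) and, like the example itself, ignores staff time and any ongoing server costs. The function name is an assumption made for illustration.

```python
def break_even_use(digitization_cents: int, photocopy_cents: int) -> int:
    """Return the first use at which digitizing is strictly cheaper.

    Digitization is treated as a one-off cost, while photocopying is
    paid again for every use of the article.
    """
    # Smallest n such that n * photocopy_cents > digitization_cents.
    return digitization_cents // photocopy_cents + 1


DIGITIZATION_CENTS = 540  # 10 pages at 18 cents each, times three
PHOTOCOPY_CENTS = 30      # one full photocopy of the article

first_saving_use = break_even_use(DIGITIZATION_CENTS, PHOTOCOPY_CENTS)
print(f"Digitization is cheaper from use number {first_saving_use}")
```

At 18 uses the two options cost the same ($5.40), so the saving begins with the 19th use, as stated above.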

Where it becomes slightly more contentious is to compare digitization (the creation of an item) with acquisition (purchasing an item). To put it another way, many libraries or institutions have to face the choice between launching a digitization project or using the funds toward other activities such as traditional collection development. I would argue that because of the perceived "prestige" of having a digitization project, libraries and institutions have often chosen to embark on such a venture without ever asking whether they (or more importantly their readers) were receiving better value from this, as opposed to purchasing a traditional resource. In other words, "digitization" has often enjoyed a privileged position in strategic thinking, one it might not have held if the issue had been judged more fairly.

Again we need to present alternative scenarios for comparison, and here I have to refer to my knowledge of the market in the U.K., and in particular in higher education. In Britain, one of the most important resources for the education sector is the Web of Science, which contains the various ISI citation indexes. This is used in nearly every subject area by researchers day in, day out. Due to the national funding of universities in the U.K., an institution can take out an annual subscription to the Web of Science for around £9,000 (about $14,500). It's true that this is a recurrent cost, but even so, average statistics suggest that a U.K. university would register around 21,000 user sessions a year for this product (and each session, of course, could be hiding several successful searches).

If, however, we took the same amount of money and put it into digitization we can begin to see where this example is taking us. To do this, let's take an extreme example—the rare, priceless manuscript, which will need to be digitized at the highest possible quality. Looking at our cost table, each folio could cost in the region of $4.82 to capture. Even a small manuscript therefore, of around 200 folios, would cost around $964 to digitize, and when it comes to delivering this to the reader (and taking into account all the extra costs), this could amount to nearly $3,000 per volume. Now let's compare the two scenarios. If we had $14,500 to spend, then we could either digitize and deliver five rare manuscripts at the highest possible quality, or take out a year's subscription to the Web of Science. The question we should be asking, therefore, is which would be the most useful to the readers? On the one hand we would be presenting a beautifully crafted Web site, full of wondrous images, but in reality this would probably only be of interest to a small group of scholars. On the other hand we could have access to a major research tool covering all subject areas. In this extreme example then, the "worth" of digitization would come under some suspicion.
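To make this comparison concrete, the following Python sketch puts both options on a cost-per-use footing. The dollar figures and session count are those quoted in the text. The threefold overhead multiplier is an assumption, chosen because it reproduces the "nearly $3,000" figure, and the variable names are illustrative only.

```python
# All money is held in whole cents to avoid floating-point rounding.
BUDGET_CENTS = 1_450_000         # about $14,500 for one year's subscription
SESSIONS_PER_YEAR = 21_000       # average user sessions quoted in the text
FOLIO_CAPTURE_CENTS = 482        # $4.82 per folio at the highest quality
FOLIOS_PER_MANUSCRIPT = 200
OVERHEAD_MULTIPLIER = 3          # assumption: full cost is three times capture

capture_cents = FOLIO_CAPTURE_CENTS * FOLIOS_PER_MANUSCRIPT
delivered_cents = capture_cents * OVERHEAD_MULTIPLIER

# How many manuscripts the subscription budget would pay for.
manuscripts_affordable = BUDGET_CENTS // delivered_cents

# Subscription cost per session, in cents.
cents_per_session = BUDGET_CENTS / SESSIONS_PER_YEAR

# Uses one manuscript needs before its cost per use falls to the
# subscription's cost per session (ceiling division on integers).
uses_to_match = -(-delivered_cents * SESSIONS_PER_YEAR // BUDGET_CENTS)

print(f"Capture cost per manuscript: ${capture_cents / 100:,.2f}")
print(f"Delivered cost per manuscript: ${delivered_cents / 100:,.2f}")
print(f"Manuscripts affordable: {manuscripts_affordable}")
print(f"Subscription cost per session: {cents_per_session:.1f} cents")
print(f"Uses needed per manuscript to match: {uses_to_match}")
```

On these assumptions, each digitized manuscript would need several thousand uses to match the subscription's value per session, which is the point of the extreme example.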

Of course, this is an exaggerated case. If we were taking an item that was a lot cheaper to capture and would be in much greater demand (examination papers, for example) then quite quickly the balance shifts back. Furthermore, we should never forget that any digitization brings the hidden benefits of new equipment and raised skill levels. On the other hand, it is undoubtedly true that many of the digitization projects undertaken so far concentrate on rare or unique items of great aesthetic value that, outside of a few subjects, are of little consequence to most readers. The question is, could this money have been better spent elsewhere?
 

What's the Final Answer?
I do not have an answer, to be honest, apart from the fact that each case should be treated separately. But I would like to raise the need for, at the very least, opening up this debate. When we look at the value of a digitization project we must not simply compare it with the alternative scenario of not digitizing the item; instead we must also look to other areas of collection development that may yield better results for our readers. In the foreword to my book, I termed the 1990s the "decade of digitization" and indeed this is true, and in many ways it is a cause for celebration. Yet we are all now older, more experienced, and wiser in the ways of digitization. We are now, therefore, in a position to evaluate its benefits in real terms. Above all we should not forget that our primary aim is to meet the requirements of the readers and to provide them with the resources they really need to use.
 
 
 

Stuart D. Lee is head of the Learning Technologies Group at the University of Oxford and deputy manager for the Humanities Computing Unit. He has a Ph.D. in medieval literature and is a member of Oxford's English faculty where he lectures on electronic publishing. He managed the digitization project of the manuscripts of the poet Wilfred Owen (http://info.ox.ac.uk/jtap) and was the main researcher on the Mellon-funded study into digitizing the collections at Oxford (http://www.bodley.ox.ac.uk/scoping). He has recently finished a book entitled Digital Imaging: A Practical Handbook, which is available from Neal-Schuman Publishers in the U.S. or The Library Association in the U.K. His e-mail address is stuart.lee@oucs.ox.ac.uk.
 


Footnotes

1. Lee, Stuart. Digital Imaging: A Practical Handbook. New York: Neal-Schuman Publishers, Inc., 2001.

2. Puglia, Steven. "The Costs of Digital Imaging Projects." RLG DigiNews (http://www.rlg.org/preserv/diginews), vol. 3, no. 5, Oct. 15, 1999.

© 2001