Every now and then, it's cathartic
to take a long, hard look at what you have achieved, and more importantly
at the various activities that have eaten up so much of your time. This,
I would suggest, is extremely worthwhile, though you must be prepared to
face some awkward questions. Writing this article presents me in many ways
with such an exercise. Having spent the last 10 years working in computer
services at Oxford University, and the past 4 years concentrating specifically
on the digital imaging of rare manuscripts, I have set myself the starkest
of questions to answer: "Is digitization worth it?"
Bearing in mind all the
hype (not to mention the money) that surrounds digital imaging (which
I would suggest is the most common form of digitization we all encounter),
I am beginning to feel like the small child pointing out the Emperor's
nakedness at the parade by even suggesting such a question. After all,
digitization must be worthwhile, surely? Everyone's doing it, from the
largest national libraries to the smallest institutions; and if nothing
else, just look at all the lovely pictures that we can download now. Ah,
but remember the story of "The Emperor's New Clothes." Just because the
crowd seemingly agrees on something does not necessarily make it correct.
Every now and then we need someone to stop the parade and to point out
some obvious facts, or at least get us to question what we are doing. As
I said before, it's cathartic.
Before proceeding, I think
it is a good idea to pin down exactly what I mean by "digitization." A
strict definition might be the conversion of analog media to digital form
(hence the fact that in many books "conversion" and "digitization" are
synonymous). The original media or source material might be printed text
or images, but we should never forget that it could also include audio
and video (that is, time-based media). Nevertheless, a quick glance at the main
projects that have taken place within the library sector quickly reveals
that for most people, digitization equates to digital imaging—that is
to say, the creation of a still digital facsimile of a source item, such
as a rare manuscript, photograph, slide, journal, painting, monograph,
exam paper, and so on.
OK then, now that we are clear
as to what I mean by digitization, is it worth doing? I suspect that if
you ask anyone who has actually worked on a project for a reasonable length
of time, the answer could well be a categorical "no," as they would remember
all of the things that went wrong, the systems that crashed, the images
that needed to be recaptured, the external vendors that failed to deliver,
and so on.
Yet such a subjective answer
is pretty unhelpful. To fully answer the question, we need to be as objective
as possible and to look also at the benefits that such projects offer,
and pretty quickly we find ourselves steering toward such ideas as "value
for money." I would suggest, though, that this in turn is problematic,
because in most cases the costs and benefits are very difficult to estimate
in terms of dollars and cents, or pounds and pennies. Yet the terminology
that surrounds such an approach is appropriate. In other words we should
look at the cost of digitization, the benefits derived from it, and then,
most importantly, compare the act of digitization with other possible scenarios.
So, digitization then—is
it worth it?
Weighing Some Cost Issues
Numerous digitization projects
have outlined the costs they encountered, and you can also look to the
charges of commercial digitization agencies to get some idea of the figures
we are dealing with. But whenever I have attempted to look at these, I
have run into two main difficulties: 1) the number of caveats and variables
that have to be taken into account, and that will ultimately dictate the
total costs, is bewildering; and 2) there is the underlying suspicion that these
really do not reflect the true cost of the project. In my recent book on
digital imaging, I adopted the approach of looking at the unit costs of digitizing a particular
item (i.e., a page or photograph), derived from the reports of various
projects and initiatives, and presented this as an average. See the chart
entitled "Sample Costs for Digitization."
Here we can see that the
unit cost of digitizing a single image is listed, but the variations (the
different columns) are due to the original format of the source document
and the specifications used for conversion (the higher the "dpi" or "dots
per inch," the better the quality of the image, for example). However, this is
only part of the story. Reports and studies following from completed digitization
projects consistently note that the cost of conversion often only accounts
for as little as one-third of the costs of the entire project.2
When you fully account for such things as assembling the source material,
clearing copyright, setting up the machines, checking the quality of the
output, post-editing, cataloging the item, delivering the item, managing
the project, and so on, the real unit cost of digitizing an item could
be three to four times the figures listed above. In other words, the real
cost of digitizing and delivering a printed black-and-white, letter-sized
page at 300 dpi 1-bit could be as high as 54 cents.
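To make the multiplier explicit, here is a minimal sketch of the arithmetic, not a costing model. It assumes that capture is a fixed share of the whole project cost: the one-third share comes from the studies cited above, while the one-quarter share is only my reading of the "four times" end of the range. The 18-cent capture cost is the figure for a black-and-white page used in this article, and the function name is hypothetical.

```python
from fractions import Fraction


def whole_project_unit_cost(capture_cents: int, capture_share: Fraction) -> Fraction:
    """Return the full per-item cost, given the share of it spent on capture."""
    if not 0 < capture_share <= 1:
        raise ValueError("capture_share must be greater than 0 and at most 1")
    # If capture is one-third of the total, the total is capture divided by 1/3.
    return capture_cents / capture_share


# 18 cents: capture cost per black-and-white page, as used in this article.
low = whole_project_unit_cost(18, Fraction(1, 3))   # capture is one-third of the total
high = whole_project_unit_cost(18, Fraction(1, 4))  # assumed: capture is one-quarter

print(f"{float(low):.0f} to {float(high):.0f} cents per page")
```

The lower figure is the 54 cents quoted above; the upper one shows how quickly the total grows as capture shrinks to a smaller share of the budget.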
What Are the Benefits?
So what are the benefits
of digitizing? This has been explored in depth elsewhere, but in summary
the listed advantages offered by digitization tend to come under the headings
of increasing access, preservation, and meeting strategic goals (i.e.,
raising the profile of the institution running the project, and so on).
The first, allowing increased access to the object, is the most-often-cited
benefit of digitization. An electronic facsimile of a page, for example,
can be theoretically copied and distributed ad infinitum without any degradation
in quality (if correct standards are maintained). More importantly, a single
copy can be mounted on a server (most commonly a Web server), and this
can be viewed and downloaded by a large number of users (possibly in the
region of hundreds of thousands), simultaneously, and from any location
in the world (assuming appropriate access restrictions and server technology).
The clear advantage of such
a system is that it liberates the document (albeit a facsimile) from the
constraints of traditional access methods. Take, for example, a digital
image of a folio from a rare manuscript. Traditionally users may only be
allowed access to the original item if they have an appropriate reader's
card, and a good and validated reason. Most importantly they would have
to physically go to the manuscript itself, which may involve travel, time,
and/or money. However, if a digital facsimile of the folio were mounted on
the Web, for example, and made freely accessible, suddenly everyone could
look at the image from the comfort of home, office, or school.
This example also leads us to the second-cited advantage of digitization,
namely preservation. Although the preservation of digital objects is a
discussion in itself, the above scenario does imply that the original item
might be handled less, or at least that the curators would have an extra
reason for restricting access to the print manuscript.
Let's Make Comparisons
Now let's analyze digitization
as an alternative to other possible actions. The simplest example would
be to take an item, consider the costs and benefits of digitizing it, and
then compare this with not digitizing it. The easiest effect to quantify
is the effect on access. (I would argue, for example, that
it is extremely difficult to measure the fulfillment of strategic goals,
or to quantify the loss of an item if it is not preserved.) With access
we can draw up a simple example based on usage statistics.
Let's take an article (for
which copyright has been cleared) that is 10 pages long, printed in black
and white with no graphics. Our table suggests that the unit cost of digitizing
this would be $1.80 (10 pages at 18 cents each), but our real costs
would be more in the region of three times that amount, or $5.40. Even
so, for this amount we would now have a digital facsimile of the article
that we could place on the Web and make available to our readers. Now let's
look at what would happen if we did not digitize the item, and kept to
the traditional ways of making this available to more than one person at
a time. Here the only obvious alternative (outside of taking out secondary
subscriptions, that is) would be to photocopy the article. Even discounting
the time that this would take the staff to do, we could say that a full
copy of the item would cost in the region of 30 cents to photocopy. Now
let us look at the two scenarios together. In short, it is clear that by
the 19th use of the article on the Web site, we are already saving money
over traditional reproduction (19 photocopies would cost $5.70). In other
words then, if the technology is all in place to digitize the item and
deliver it across networks, and if the item is in reasonable demand, in
most cases, digitization and delivery of the source item by electronic
means quickly becomes cost-effective.
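The break-even point in this example can be checked with a few lines of arithmetic. This is a rough sketch under the same assumptions as the text: the delivery technology is already in place, so each extra use of the Web copy costs nothing, and staff time for photocopying is ignored. The function and constant names are hypothetical, and amounts are kept in whole cents to avoid rounding surprises.

```python
def break_even_uses(digitize_cents: int, photocopy_cents: int) -> int:
    """Return the first use at which cumulative photocopying costs exceed
    the one-off cost of digitizing and delivering the item."""
    if photocopy_cents <= 0:
        raise ValueError("photocopy_cents must be positive")
    # Smallest n such that n * photocopy_cents > digitize_cents.
    return digitize_cents // photocopy_cents + 1


PAGES = 10                   # length of the article
CAPTURE_CENTS_PER_PAGE = 18  # unit capture cost from the cost table
OVERHEAD_MULTIPLIER = 3      # real cost is roughly three times the capture cost
PHOTOCOPY_CENTS = 30         # cost of one full photocopy of the article

digitize_cents = PAGES * CAPTURE_CENTS_PER_PAGE * OVERHEAD_MULTIPLIER
uses = break_even_uses(digitize_cents, PHOTOCOPY_CENTS)

print(f"Digitizing costs ${digitize_cents / 100:.2f}; "
      f"photocopying overtakes it at use {uses}")
```

Changing the multiplier to 4, or the photocopy cost to include staff time, moves the break-even point, which is why the figure depends so heavily on demand for the item.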
Where it becomes slightly
more contentious is to compare digitization (the creation of an item) with
acquisition (purchasing an item). To put it another way, many libraries
or institutions have to face the choice between launching a digitization
project or using the funds toward other activities such as traditional
collection development. I would argue that because of the perceived "prestige"
of having a digitization project, libraries and institutions have often
chosen to embark on such a venture without ever asking whether they (or
more importantly their readers) were receiving better value from this,
as opposed to purchasing a traditional resource. In other words, "digitization"
has often enjoyed a privileged position in strategic thinking, one it
might not have held if the issue had been judged more fairly.
Again we need to present
alternative scenarios for comparison, and here I have to refer to my knowledge
of the market in the U.K., and in particular in higher education. In Britain,
one of the most important resources for the education sector is the Web
of Science, which contains the various ISI citation indexes. This is used
in nearly every subject area by researchers day in, day out. Due to the
national funding of universities in the U.K., an institution can take out
an annual subscription to the Web of Science for around £9,000 (about
$14,500). It's true that this is a recurrent cost, but even so, average statistics
suggest that a U.K. university would register around 21,000 user sessions
a year for this product (and each session, of course, could be hiding several
If, however, we took the same
amount of money and put it into digitization we can begin to see where
this example is taking us. To do this, let's take an extreme example—the
rare, priceless manuscript, which will need to be digitized at the highest
possible quality. Looking at our cost table, each folio could cost in the
region of $4.82 to capture. Even a small manuscript therefore, of around
200 folios, would cost around $964 to digitize, and when it comes to delivering
this to the reader (and taking into account all the extra costs), this
could amount to nearly $3,000 per volume. Now let's compare the two scenarios.
If we had $14,500 to spend, then we could either digitize and deliver five
rare manuscripts at the highest possible quality, or take out a year's
subscription to the Web of Science. The question we should be asking, therefore,
is which would be the most useful to the readers? On the one hand we would
be presenting a beautifully crafted Web site, full of wondrous images,
but in reality this would probably only be of interest to a small group
of scholars. On the other hand we could have access to a major research
tool covering all subject areas. In this extreme example then, the "worth"
of digitization would come under some suspicion.
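The comparison can be laid out as a short calculation. This is a sketch of the figures quoted above rather than a general model. The threefold overhead multiplier is an assumption carried over from the earlier cost discussion, the cost per user session is simply the subscription price divided by the quoted number of sessions, and all variable names are hypothetical.

```python
FOLIO_CAPTURE_CENTS = 482      # highest-quality capture cost per folio
FOLIOS = 200                   # a small manuscript
OVERHEAD_MULTIPLIER = 3        # assumption: delivery and extras triple the cost
BUDGET_CENTS = 14_500 * 100    # one year's Web of Science subscription, in cents
SESSIONS_PER_YEAR = 21_000     # average user sessions at a U.K. university

capture_cents = FOLIO_CAPTURE_CENTS * FOLIOS
volume_cents = capture_cents * OVERHEAD_MULTIPLIER

# How many whole manuscripts the same budget would digitize and deliver.
manuscripts = BUDGET_CENTS // volume_cents

# What each session of the subscribed service costs, in cents.
cents_per_session = BUDGET_CENTS / SESSIONS_PER_YEAR

print(f"Capture per manuscript:   ${capture_cents / 100:,.0f}")
print(f"Delivered per manuscript: ${volume_cents / 100:,.0f}")
print(f"Manuscripts per budget:   {manuscripts}")
print(f"Subscription per session: {cents_per_session:.0f} cents")
```

Set against well under a dollar per session for the subscription, each digitized manuscript would need a very large readership to compete on access alone.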
Of course, this is an exaggerated
case. If we were taking an item that was a lot cheaper to capture and would
be in much greater demand (examination papers, for example) then quite
quickly the balance shifts back. Furthermore, we should never forget that
any digitization brings the hidden results of new equipment and raised
skill levels. On the other hand though, it is undoubtedly true that many
of the digitization projects that have been undertaken so far do concentrate
on rare or unique items of undoubted aesthetic value, but outside of a
few subjects, these items are of little consequence to most readers. The question
is, Could this money have been better spent elsewhere?
What's the Final Answer?
I do not have an answer,
to be honest, apart from the fact that each case should be treated separately.
But I would like to raise the need for, at the very least, opening up this
debate. When we look at the value of a digitization project we must not
simply compare it with the alternative scenario of not digitizing the item;
but instead we must also look to other areas of collection development
that may yield better results for our readers. In the foreword to my book,
I termed the 1990s the "decade of digitization" and indeed this is true,
and in many ways it is a cause for celebration. Yet we are all now older,
more experienced, and wiser in the ways of digitization. We are now, therefore,
in a position to really evaluate its benefits in real terms. Above all
we should not forget that our primary aim is to meet the requirements
of the readers and to provide them with the resources they really need.
Stuart D. Lee is head
of the Learning Technologies Group at the University of Oxford and deputy
manager for the Humanities Computing Unit. He has a Ph.D. in medieval literature
and is a member of Oxford's English faculty where he lectures on electronic
publishing. He managed the digitization project of the manuscripts of the
poet Wilfred Owen (http://info.ox.ac.uk/jtap)
and was the main researcher on the Mellon-funded study into digitizing
the collections at Oxford (http://www.bodley.ox.ac.uk/scoping).
He has recently finished a book entitled Digital
Imaging: A Practical Handbook, which is available from Neal-Schuman
Publishers in the U.S. or The Library Association in the U.K. His e-mail
address is email@example.com.
1. Lee, Stuart. Digital
Imaging: A Practical Handbook. New York: Neal-Schuman Publishers, Inc.,
2. Puglia, Steven. "The
Costs of Digital Imaging Projects." RLG DigiNews (http://www.rlg.org/preserv/diginews),
vol. 3, no. 5, Oct. 15, 1999.