Supporting Digital Humanities: The Basics

Digital humanities: What might support for it look like in academic libraries? Our professional literature frequently stresses the role of librarians as collaborators on digital humanities projects and addresses how librarians can train faculty, students, and other librarians in digital humanities tools. However, there is more to supporting emerging digital methods in the humanities than workshops or projects, particularly for academic libraries without the infrastructure (above all, the personnel) to create or collaborate on larger-scale digital humanities initiatives.

This article attempts to outline some common digital humanities resources from the perspective of library public services. The intent is not to be comprehensive, but to help librarians get started with support for more digitally driven humanities research.

Digital humanities and texts

I have not always been involved with digital humanities. The first time I heard of anything resembling digital humanities was when a graduate student came to our reference desk asking for help locating texts that could be used to create a corpus for analysis. Because the request was historical in nature and I was the history librarian, I was brought into the conversation.

The student’s request was fundamentally a request for primary sources, although it had the added twist that these sources had to be available digitally and, ideally, as plain text. If students and faculty at your institution are interested in digital humanities, they are likely in a similar situation: They need digital primary source material to work on. Helping researchers locate primary source material in any discipline should be familiar to public services librarians, and numerous library guides to digital humanities already point to online resources. However, the questions librarians should be asking about online resources are different from the ones we typically ask when evaluating a resource:

How amenable to digital analysis are these resources?
Would a researcher be able to get plain text files out of the database?
Can researchers download all the files they want, or are there limits on how much can be downloaded at a time?

By recognizing that digital humanities researchers need to do more than simply read online sources and that, in most cases, these researchers will want plain text files, librarians can help them identify resources that would be more, rather than less, useful to their work.

Digital humanities researchers will often need to be pointed to online resources beyond those to which libraries subscribe. There are innumerable digital projects these days, and librarians are best situated to help researchers locate material on the internet. Some readers will be familiar with the Library of Congress’ Chronicling America project (chroniclingamerica.loc.gov), an invaluable source for getting full-text access to pre-1923 U.S. newspapers.

They may be less familiar with the fact that the project allows anyone to access the metadata and full text in JSON or XML formats to facilitate anything from computationally pulling out place names from article texts (known as geoparsing) to tracking the re-publication of texts across different newspapers (chroniclingamerica.loc.gov/about/api has more information). Many of the states that participate in Chronicling America have their own, separate digital newspaper projects that use the same software (and thus facilitate digital work in the same way) but have additional newspaper content available. Be sure to check for these state projects when researchers are focused on a single state.
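To make the idea of tracking re-publication concrete, here is a minimal sketch of one common approach: break each text into overlapping runs of words (often called shingles) and measure how many runs two texts share. This is an illustration only, not the method used by Chronicling America or any particular project. The function names and the sample snippets are hypothetical, and the snippets stand in for full text a researcher would have downloaded.

```python
import re

def shingles(text, n=5):
    """Return the set of overlapping n-word sequences in a text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def reuse_score(text_a, text_b, n=5):
    """Share of n-word sequences the two texts have in common (0.0 to 1.0)."""
    a, b = shingles(text_a, n), shingles(text_b, n)
    if not a or not b:
        return 0.0  # a text shorter than n words cannot be compared
    return len(a & b) / len(a | b)

# Hypothetical snippets standing in for text from three newspaper pages.
page_one = "The river rose rapidly last night and carried away the old mill bridge."
page_two = ("We learn that the river rose rapidly last night and carried away "
            "the old mill bridge near town.")
page_three = "Prices for wheat and corn held steady at the market this week."

print(round(reuse_score(page_one, page_two), 2))    # high: likely a reprint
print(round(reuse_score(page_one, page_three), 2))  # no shared passages
```

Real newspaper text contains OCR errors, so exact matching like this will miss some reprints. Researchers working at scale usually need more forgiving matching, but the underlying question (how much wording do two pages share?) is the same.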

For researchers who are interested in tracking word use across time, the primary option is the Google Ngram Viewer (books.google.com/ngrams), which uses Google Books as the dataset. Ngram Viewer covers books published up to 2008. Additionally, it is worth knowing about the text corpora made available by Mark Davies, a linguistics professor at Brigham Young University (corpus.byu.edu).

The BYU corpora include U.S. soap operas, Time Magazine, Spanish-language books, and the British parliamentary debates, in addition to providing an interface to the Google Books data. The BYU resource does not provide individual texts or groups of texts for download, although some of the corpora can be downloaded for offline study for a fee (corpus.byu.edu/full-text/purchase.asp). It does support searching for word (or phrase) use across time, finding the words most commonly associated with a given keyword (known as collocates), and simple keyword-in-context displays.
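For librarians who have not met these terms before, the following sketch shows what keyword-in-context and collocates mean in practice. It is a simplified illustration, not how the BYU corpora compute them. The function names, the window size, and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())

def kwic(text, keyword, window=2):
    """List each use of keyword with `window` words of context on each side."""
    words = tokenize(text)
    keyword = keyword.lower()
    lines = []
    for i, word in enumerate(words):
        if word == keyword:
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            lines.append(f"{left} [{word}] {right}".strip())
    return lines

def collocates(text, keyword, window=2):
    """Count the words that appear within `window` words of keyword."""
    words = tokenize(text)
    keyword = keyword.lower()
    counts = Counter()
    for i, word in enumerate(words):
        if word == keyword:
            counts.update(words[max(0, i - window):i])
            counts.update(words[i + 1:i + 1 + window])
    return counts

# Hypothetical sample text.
sample_text = ("The railroad reached the town in spring. The new railroad "
               "brought trade, and the railroad brought settlers.")

for line in kwic(sample_text, "railroad"):
    print(line)
print(collocates(sample_text, "railroad").most_common(3))
```

A real analysis would normally filter out very common words such as "the" before counting collocates, since otherwise they dominate the results.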

For those working with textual sources, and particularly those who might be looking for text analysis tools for use in the classroom, the best starting point is Voyant (voyant-tools.org). Voyant allows you to upload your own text files or to enter URLs of webpages and then to view word counts, keyword-in-context, and various visualizations in a web interface (a word cloud and a graph of word use across the uploaded texts are the defaults, but there are other options). The site has enough documentation to support both novice users and novice instructors, which makes it an ideal entry-level option. Voyant’s creators have also recently published a book that offers an in-depth look at methods and pedagogy (Geoffrey Rockwell and Stéfan Sinclair, Hermeneutica: Computer-Assisted Interpretation in the Humanities, MIT Press, 2016) that you may want to have on hand.

Digital humanities and mapping

I don’t want to give the impression that digital humanities is only about text; it’s much more than that. Mapping is also an important component. You may have heard of ESRI’s ArcGIS software (arcgis.com), particularly if you work at an institution where your library, or perhaps the geography department, has licensed it. ArcGIS is often what people think about when it comes to making maps, but the software is not simple to learn. Luckily, most digital humanities mapping projects do not need something as computationally sophisticated as ArcGIS (or its open-source competitors, such as QGIS).

There are numerous options for making maps. The software you choose depends on criteria such as file type, what data the user has, and what the user wants to do with the map after it is generated. (For example, does it need to be embedded in a blog post for a course assignment?) If your campus does have an ESRI ArcGIS license, you probably have access to ArcGIS Online, which has many useful features for novice users, as well as Story Maps (storymaps.arcgis.com), which is a way to combine narrative and maps to create more visually engaging content. A free feature of ArcGIS Online is the maps gallery (arcgis.com/home/gallery.html), which is a good place to generate ideas and see what maps others have created.

Google Fusion Tables (fusiontables.google.com) is a free online option that will generate maps from spreadsheets with simple place names (such as an address or a city). Other mapping tools, such as Palladio (hdlab.stanford.edu/palladio) or Mapbox (mapbox.com), need geocoordinates in order to generate a map from a spreadsheet. How you get those coordinates depends on the scale of the data. However, because getting geocoordinates for place names/addresses is a relatively common need, there are numerous options to be found online. If your library does not have significant support for mapping/GIS services, check research guides from libraries that do.
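Before sending a researcher to a tool that requires geocoordinates, it can help to check which rows of a spreadsheet already have them. The sketch below does that for a spreadsheet saved as CSV. It is an illustration only: the column names (place, latitude, longitude) and the sample rows are hypothetical, and a real file would need its own column names passed in.

```python
import csv
import io

def split_by_coordinates(csv_text, lat_col="latitude", lon_col="longitude"):
    """Separate rows with usable coordinates from rows that still need geocoding."""
    ready, needs_geocoding = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            lat = float(row.get(lat_col) or "")
            lon = float(row.get(lon_col) or "")
            # Latitude runs from -90 to 90, longitude from -180 to 180.
            valid = -90 <= lat <= 90 and -180 <= lon <= 180
        except ValueError:
            valid = False  # blank or non-numeric value
        (ready if valid else needs_geocoding).append(row)
    return ready, needs_geocoding

# Hypothetical spreadsheet contents; the coordinates are approximate.
sample = """place,latitude,longitude
State College,40.7934,-77.8600
Harrisburg,,
Erie,420.1,-80.08
"""

ready, needs_geocoding = split_by_coordinates(sample)
print("Ready to map:", [row["place"] for row in ready])
print("Needs geocoding:", [row["place"] for row in needs_geocoding])
```

Rows with blank or out-of-range values (such as the mistyped latitude for Erie above) end up in the second list, which gives the researcher a short, concrete to-do list before any map is attempted.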

John Russell is the associate director of the Center for Humanities and Information at the Pennsylvania State University Libraries.