Top Tools for Digital Humanities Research
by Nancy K. Herther
With the upsurge of Big Data and new analytical tools, it’s not just marketers or scientists who are making use of these new resources and methods; humanities scholars are using them too, contributing to the rise of digital humanities (DH). DH seeks to use these resources and tools to interrogate existing texts and data to look for new insights and connections that were not possible before. The value of these new approaches isn’t just for academics: They are proving to be an excellent way to engage students in topics that may have, in the past, seemed complicated or even boring.
DH programs established across North America and Western Europe are developing a whole new 21st-century approach to studying our world. Here is a brief look at some of the best free, open source tools available today that can allow anyone, even you, to get started with his or her own projects.
Recently, Micki Kaufman, a doctoral student at The City University of New York (CUNY), took the entire corpus of Henry Kissinger’s correspondence from the National Security Archive and created a visual map of this information. This was done to better understand the trends and major issues, as well as the correspondence itself, during his term as Secretary of State under President Richard Nixon (blog.quantifyingkissinger.com). “As larger and larger archives of human cultural output are accumulated,” Kaufman notes, “historians are beginning to employ other tools and methods—including those developed in other fields, including computational biology and linguistics—to overcome ‘information overload’ and facilitate new historical interpretations.”
Taking on the assignment as a disability studies librarian 2 years ago, I decided to study the terminology in the field so I could better understand the growing literature and citations (see DOI: 10.1080/09687599.2014.993061). In order to locate appropriate terminology, I chose to use text analysis of words from dissertation titles, subject headings, and abstracts. With an open source program called Concordle (folk.uib.no/nfylk/concordle), I was able to quickly do an analysis of most common phrases using “disability” as a base, as well as—by frequency of mentions—find the use of “ableism” and “crip” as other key terms important for my research. For a field in its early stages, this was critical to being able to complete my study. Let me review some tools that may help you complete your studies or help your patrons complete theirs.
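For readers curious about what this kind of frequency and concordance analysis involves, here is a minimal Python sketch. It is not how Concordle itself is implemented; the function names, the stopword list, and the three sample titles are all invented for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def top_words(texts, stopwords=frozenset(), n=10):
    """Count word frequencies across a list of texts, ignoring stopwords."""
    counts = Counter()
    for text in texts:
        counts.update(w for w in tokenize(text) if w not in stopwords)
    return counts.most_common(n)

def keyword_in_context(texts, keyword, window=3):
    """Return snippets of up to `window` words on each side of `keyword`."""
    snippets = []
    for text in texts:
        words = tokenize(text)
        for i, w in enumerate(words):
            if w == keyword:
                start = max(0, i - window)
                snippets.append(" ".join(words[start:i + window + 1]))
    return snippets

# Hypothetical dissertation titles, invented for illustration only.
titles = [
    "Disability and identity in higher education",
    "Ableism in the workplace: a disability studies perspective",
    "Crip theory and disability activism",
]
stop = frozenset({"and", "in", "the", "a"})

print(top_words(titles, stop, n=3))
print(keyword_in_context(titles, "disability", window=2))
```

With a real corpus, the list of titles would be replaced by text exported from a dissertation database, and the stopword list would need to be much longer.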
A Brief Look at Key Free Tools
Want to gauge public sentiment? One easy place to start is Umigon (umigon.com), which takes tweets and processes them for sentiment while accounting for factual statements (e.g., “I hate war” will be classified as negative, and “war in Syria” will be classified as neutral). The results can then be downloaded in Excel or CSV format. You are asked to comment on the program’s accuracy to help refine the software. The war example demonstrates the weakness of sentiment analysis today, given the word limits in Twitter and the use of slang and sarcasm that prevail in this medium.
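To make the distinction between opinion and factual statements concrete, here is a toy lexicon-based classifier in Python. It is only a sketch of the general idea and does not represent Umigon's actual method; the four-word lexicon and the `classify` function are hypothetical.

```python
# Tiny hypothetical opinion lexicon; real tools rely on far larger resources.
OPINION_WORDS = {"hate": -1, "awful": -1, "love": 1, "great": 1}

def classify(text):
    """Label text positive, negative, or neutral by summing opinion-word scores."""
    words = text.lower().split()
    score = sum(OPINION_WORDS.get(w.strip(".,!?\"'"), 0) for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify("I hate war"))                # negative
print(classify("war in Syria"))              # neutral
print(classify("Oh great, another Monday"))  # positive, though likely sarcastic
```

The last line shows why sarcasm is so hard for this approach: the word "great" scores as positive regardless of the writer's intent.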
Interested in crowdsourced visualization and interpretation? Prism (prism.scholarslab.org/users/sign_in) is another free, open source visualization tool that works in your browser and represents texts based on crowdsourced interpretations. Users—many in K–12 schools—are encouraged to highlight words in a text according to various categories or facets. Individual interpretations contribute to a visualization combining the interpretations of all users of that text, revealing patterns from subjective readings. You can enter your own text to be shared for comments, or you can browse some of the existing texts.
Another interesting free collaborative tool is the crowdsourcing application called Sophie (sophie2.org/trac), developed by the Institute for the Future of the Book and the University of Southern California’s School of Cinematic Arts. This Creative Commons authoring tool enables collaboration, open reading, and publication. The software must be downloaded to your computer before you can begin. This project is noteworthy for its authoring environment, which lets users create complex networked multimedia documents; readers can then open the resulting “book” in a browser without downloading any additional software. The system allows for comment frames for adding discussions within books, the development of digital libraries compatible with the Sophie iPad app, and the creation of timelines. Importantly, Sophie includes HTML5 export for data integrating multimedia content and can be used and read on a wide variety of devices. Although the software hasn’t been updated in a few years, the techniques and visuals are still stunning.
Netlytic (netlytic.org) is a “community-supported text and social networks analyzer that can automatically summarize and discover social networks from online conversations on social media sites. It is made by researchers for researchers, no programming/API skills required.” Data can be taken from Twitter, Instagram, Facebook, YouTube, RSS feeds, or your own text files. It’s maintained by The Social Media Lab and Ted Rogers School of Management at Ryerson University. The development of up to five visualizations is free. This is a very sophisticated product, and the imaging is high-quality. Another example of where the technology is leading: 15,000 Twitter interactions were analyzed, and the resulting visual (see middle image, above) illustrates how communications among small groups (inner circle) compare to communications about single events, some of them sent to unofficial or incorrect hashtags (prmetrix.net/#!Visualizing-TwoWay-Communication-A-Case-Study-on-the-2015-‘Bell-Let’s-Talk’-Initiative/c1qn/DDC05A84-2617-4474-81B3-59ED7474F00D).
Northwestern University Knight Lab’s TimelineJS (timeline.knightlab.com) allows you to create sophisticated timelines (the format will seem familiar, since it has become a popular tool for professionals and students alike). It is an “open-source tool that enables anyone to build visually rich, interactive timelines. Beginners can create a timeline using nothing more than a Google spreadsheet … [but] experts can use their JSON skills to create custom installations, while keeping TimelineJS’s core functionality.” Not only do you get a timeline of events, but you can bring in graphic media to further document your project.
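For those who want to try the JSON route, the following Python sketch builds a small timeline file. The field names (`title`, `events`, `start_date`, `text`, `headline`, `media`) reflect the TimelineJS JSON format as I understand it, so check them against the current Knight Lab documentation before relying on them. The `make_event` helper, the events, and the example.org URL are placeholders invented for illustration.

```python
import json

def make_event(year, headline, text="", media_url=None):
    """Build one event in the shape TimelineJS's JSON format is assumed to use."""
    event = {
        "start_date": {"year": str(year)},
        "text": {"headline": headline, "text": text},
    }
    if media_url:
        # Optional media (image, video, map, etc.) attached to the event.
        event["media"] = {"url": media_url}
    return event

# Placeholder content only; replace with your own project's events.
timeline = {
    "title": {
        "text": {"headline": "Sample Timeline", "text": "Placeholder title slide."}
    },
    "events": [
        make_event(1900, "First placeholder event", "Description goes here."),
        make_event(1950, "Second placeholder event",
                   media_url="https://example.org/image.jpg"),
    ],
}

# Serialize to a JSON string that could be saved and handed to TimelineJS.
timeline_json = json.dumps(timeline, indent=2)
print(timeline_json)
```

Generating the file from a script like this is mainly useful when the events already live in a database or spreadsheet export; for a handful of entries, the Google spreadsheet template is simpler.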
This is just the tip of the proverbial iceberg. Many products are available today, and a lot are open source (dirtdirectory.org/categories/text-mining). These systems provide excellent examples of how we can better access, analyze, and comprehend voluminous text files. What about content? Today, the Internet Archive (IA; archive.org/index.php), Digital Public Library of America (DPLA; dp.la), and thousands of other organizations and groups are making texts and other materials available, including sounds and visuals. Many of these collections, especially those in the public domain, can be downloaded, making text analysis and creation something that anyone can use in his or her research.
Revisualizing Our World: The Future of DH
These tools and methodologies are now widely used in education and on the web, as researchers look at new issues while they revisit assumptions of the past. As both the field of DH and the tools continue to grow and evolve, we are bound to see even more amazing creations and in-depth reassessments of prior knowledge. Even more exciting, anyone can participate in this latest technology-enabled venture. As one anthropology student mentioned to me recently, “DH is very much like archaeology, you find you are revisiting your own assumptions and learning more all the time.” It couldn’t be easier for you to try some of these systems out for yourself, your students, or your website.
CHECK OUT THESE FREE DH TOOLS
DIGITAL HUMANITIES TOOLKITS
DH Press (digitalinnovation.unc.edu/projects/dhpress), from the University of North Carolina–Chapel Hill, provides a key toolkit that requires no technical or programming knowledge. It can be used for a wide range of projects, such as creating repositories, exhibits, maps, and multimedia.
Omeka (omeka.org), from George Mason University, provides a wonderful suite of tools that allows for the creation and sharing of scholarly collections or exhibits comprising “complex narratives … adhering to Dublin Core standards.” Neatline (neatline.org), an Omeka add-on product, allows more tools for creating maps and timelines.
Scalar (scalar.usc.edu), housed at the University of Southern California and produced with funding from The Andrew W. Mellon Foundation, supports the creation of long-form, born-digital online scholarship using a variety of media, with little required technical expertise.
Chronos Timeline (hyperstudio.mit.edu/software/chronos-timeline) is a Massachusetts Institute of Technology product that “dynamically presents historical data in a flexible online environment. Switching easily between vertical and horizontal orientations, researchers can quickly scan large numbers of events, highlight and filter events based on subject matter or tags, and recontextualize historical data.”
Historypin (historypin.org) is reached through your Facebook, Google, or Twitter account. The software allows you to join with hundreds of libraries, museums, and other institutions in sharing your photos or other materials with users around the world. You can also access these digital objects to create “glimpses of the past and build up the huge story of human history.”
QGIS (qgis.org/en), initiated in Europe, is an open source community that offers free downloads of its software and a blog for sharing, with the goal of making it “the best GIS tool in the free and open-source software (FOSS) community.”
TimelineJS (timeline.knightlab.com), from Northwestern University Knight Lab, “is an open-source tool that enables anyone to build visually rich, interactive timelines. Beginners can create a timeline using nothing more than a Google spreadsheet,” adding media from such sources as Twitter, Flickr, YouTube, Vimeo, Dailymotion, Google Maps, Wikipedia, SoundCloud, and DocumentCloud.
Concordle (folk.uib.no/nfylk/concordle) is the “Not so pretty cousin of Wordle” (wordle.net). Both of these programs offer easy one-step processes for creating word clouds. Any word in the cloud can be clicked, and snippets of neighboring text will be shown in the concordance area.
Netlytic (netlytic.org) is a community-supported text and social network analyzer (Twitter, Facebook, YouTube, Instagram, RSS feed, or text/CSV files) that automatically summarizes and discovers social networks that are apparent from online conversations on social media sites. It was created by researchers, requires no programming or API skills, and is a user-friendly way to explore and visualize publicly available data.
Palladio (hdlab.stanford.edu/palladio), from Stanford University, allows you to visualize even complex historical data. You can load tabular data (e.g., .csv, .tab, .tsv) by copying and pasting from existing spreadsheets or by dragging and dropping a file to upload it. You can also link to a file in a public Dropbox folder.
Prism (prism.scholarslab.org/users/sign_in) is a tool for crowdsourcing the interpretation of any type of textual materials. Users are invited to add their comments or interpretations by highlighting words according to different categories or facets. Each interpretation contributes to the generation of a visualization demonstrating the combined interpretation of all users.
Sophie (sophie2.org/trac), developed by the Institute for the Future of the Book and the University of Southern California, allows users to combine text, images, video, and sound easily and without any programming knowledge or training in the use of more complex tools such as Flash.
Tableau (tableau.com) is free visualization software that allows you to take a wide variety of data-based information (from spreadsheets, files, etc.) to create interactive data visualizations for mounting on the web.
Umigon (umigon.com) allows you to sift through tweets or other content in order to do sentiment analysis of the text. The product is simple and straightforward, but powerful.
Voyant Tools (voyant-tools.org) is a web-based reading and analysis environment for any digital text.
If you want to look deeper at these tools, check out DiRT, which is “a tool, service, and the most comprehensive and up-to-date collection registry of digital research tools for scholarly use.” This website covers the entire field of DH, including comparing resources such as CMSs, music optical character recognition (OCR), statistical analysis packages, and mind-mapping software.
Nancy K. Herther (email@example.com) is a librarian for American studies, anthropology, and sociology at the University of Minnesota–Twin Cities campus.