What’s next for the digital humanities? And how might they be part of changing our collective futures? As universities across the country reach the end of this unprecedented year, Public Books and Digital Humanities at MIT present a four-part series examining the role of the digital in the life of scholars and societies. In this moment of global reckoning around issues of virological, ecological, historical, and moral concern, some of the field’s top thinkers here ask how digital media and methods continue to challenge, harm, sustain, and liberate—and they show how investigations into the relationship between the digital and the human have only just begun.
The word indifference used to mean “a thing or matter of no essential importance.”1 In my field—the digital humanities—we tend to sell our work to administrators and students by stressing the value of STEM skills and data-driven research, in contrast to the tendency in the humanities at large to stress what is without price. This affects our research priorities and the choices our students make; we apply for grants, we advise students, by presenting a case for which white-hot fields—artificial intelligence? cultural analytics? quantum computing?—provide the clearest road map to the future.2 This always entails prognostication and projection. How can we use the present to predict the future? Based on our understanding of the present, how can we seize on the tools and research that will be most useful in the future? Based on our understanding of the future, how can we pick out the tools and research that will be most useful for us in the present?
A wiser essay might push back against the utilitarian idea that everything should be judged by what it adds to the marketplace. But what the heck—let’s embrace that idea.
Here are three stories about endeavors that were “useless,” and yet whose applications helped to shape the digital world. As this history suggests, nobody knows what will be useful in the future. That’s why an indifference, to use the old phrase, can be an engine of innovation. And this, in turn, is why we so often find humanistic activities in the seeds and roots of STEM.
We don’t know what will turn out to be useful. But we do know that the future will have its roots in the humanities, which is a self-interested reason to give the humanities a future.
In the 1940s, the linguist George Kingsley Zipf set his graduate students to what was surely a tedious and unwelcome task. He asked them to count the number of times every discrete word appears in James Joyce’s novel Ulysses. They found an unusual concordance to help them with this work: the Word Index to James Joyce’s Ulysses (1937), created by a team of linguists under the direction of Miles Hanley.3
You would be hard-pressed, I think, to imagine a book that is, on its face, more useless than the Word Index to James Joyce’s “Ulysses”, or a project more useless than the word-counting task of Zipf’s graduate students. This he foisted on his underlings because he wanted to study natural language quantitatively and imagined that Ulysses, famous for its streams of consciousness, might document the flow of natural language with special fidelity.4
Using the numbers his students compiled, Zipf graphed each word’s frequency (the number of times it appears in the novel) against its ranking (whether it is, for instance, the first, tenth, or hundredth most frequently used word in the novel). He discovered that the frequency and ranking of the words in Ulysses have an inverse relationship: if the most frequently used word appears n times, the second most frequently used word can be expected to appear roughly n/2 times, the third roughly n/3 times, and so forth.
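For readers who want to see the shape of the counting task, here is a minimal sketch in Python of the tabulation that fell to Zipf’s students: count each discrete word, rank the words by frequency, and multiply rank by frequency to see whether the product holds roughly steady. The function name rank_frequency, the simple tokenizer, and the toy sentence are illustrative assumptions, not a reconstruction of Zipf’s or Hanley’s actual procedure.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, frequency, rank * frequency) rows, most frequent word first."""
    # Lowercase the text and treat runs of letters or apostrophes as words.
    # This tokenizer is a simplifying assumption, not Zipf's or Hanley's method.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    rows = []
    # most_common() orders words by descending count; ties keep first-seen order.
    for rank, (word, freq) in enumerate(counts.most_common(), start=1):
        rows.append((rank, word, freq, rank * freq))
    return rows


if __name__ == "__main__":
    # A toy sentence, far too short to exhibit the law; substitute a full novel.
    sample = "the dog and the cat and the bird saw the dog"
    for rank, word, freq, product in rank_frequency(sample):
        print(rank, word, freq, product)
```

On a sentence this short the products will not settle around a constant; the pattern Zipf described emerges only in a large body of text, such as a full novel.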
Zipf had unintentionally discovered a law. Today it’s called Zipf’s law, and researchers have found it to have applications in all kinds of information ecosystems, from library science to financial trading.5 In the information age, it has helped us to understand databases, internet traffic, video streaming, chat rooms, and social networks. In fact, the most famous rules concerning social networks—that the value of a social network is proportional to the square of the number of its nodes; that nodes in a social network have dramatically uneven importance—derive their reasoning from Zipf’s law.6
We owe our knowledge of this law to a linguist’s idle inquiry into Ulysses; to his willingness to read a novel dramatically against the usual rules, for no reason except for his interest in the way people talk.
The second story: In 1916, a wealthy eccentric hired Elizebeth Smith, a recently graduated English major who lived in Chicago, to study William Shakespeare’s First Folios. Smith’s patron wanted her to find hidden clues in the plays that would prove that the plays were written not by Shakespeare, but by Sir Francis Bacon.
Shakespeare wrote Shakespeare’s plays. No serious scholar doubts this. One might think, then, that Smith was on a fool’s errand. The verdict of history, however, has been that her work was far from a waste of time. Her efforts as a historian trying to answer a question about Shakespeare and Sir Francis Bacon obliged her to become an expert in cryptography, because Bacon himself was an expert in cryptography. Smith worked, with others her patron hired for the same task, at a facility called Riverbank, which became a school for researching the history of codes and ciphers.
During the First World War, Riverbank scholars trained American military personnel to create ciphers. For part of the war, as Jason Fagone notes in a marvelous book on Smith, the Riverbank team “did all of the codebreaking for every part of the U.S. government: for the State Department, the War Department (army), the navy, and the Department of Justice.”7 In 1921, Smith and William Friedman, a Riverbank scholar she had recently married, started a new career in Washington, DC, working for the government.8 They helped to shape the methods of emerging spy and law-enforcement agencies like the Federal Bureau of Investigation, the Office of Strategic Services (later the Central Intelligence Agency), and the National Security Agency.
The Friedmans’ heirs today are the cryptanalysts and cypherpunks who build and crack cryptographic codes. Was studying Shakespeare a waste of their time?
Our third story concerns Edmond Locard, a French investigator in the early 20th century.9 Locard admired the Sherlock Holmes stories of Sir Arthur Conan Doyle; today, Locard is often called the father of modern forensics. In this way, detective stories helped to inspire the modern science of detection.
The originator of the detective genre was the 19th-century writer Edgar Allan Poe.10 The literature scholar Paul Grimstad argues persuasively that Poe started workshopping the genre in an article he wrote in 1836 about a chess-playing machine that he suspected was a hoax—as in fact it was. That piece got him started on a way of thinking about thinking that led to his writing the first “analytic detective story.”11
In this chain of influence, who was wasting their time? Was Poe wasting his time when he wrote what readers might have seen as a rather niche and self-indulgent article about a Victorian artificial-intelligence system? Or was he wasting his time when, intrigued by the literary possibilities of his article’s gimmick—putting deductive and inductive reasoning on display—he wrote a locked-door mystery starring the detective Monsieur Dupin? Was Conan Doyle wasting his time when, drawing on Poe’s example, he wrote the Holmes stories? Was Locard wasting his time when he read the Holmes stories and thought about how a science of detection might work in real life?
The forensic professionals who examine security logs, network traffic, and hard drives work in a technological field that has its origins in literary writing. To adapt a phrase from Robert Darnton, literature does not merely reflect on history; literature causes history.12
Researchers in the digital humanities often explain the field to outsiders in terms of either the humanities discovering tech (algorithms perform a “distant reading” of Hamlet) or tech discovering the humanities (sociological analysis reveals algorithms to be saturated with human bias). However, I have often seen the field facilitate a different phenomenon: technological tools and norms and methods rediscovering the humanistic world from which they first sprang.
The arts drive tech so often that it should be a cliché. The art of weaving provided technical inspiration to Charles Babbage, creator of the Difference Engine and the Analytical Engine, and Ada Lovelace, whose algorithm for the Analytical Engine has earned her recognition as the first computer programmer.13 The philosopher Ted Nelson’s experimental, cutup writings about literary hypertext, which links information in sprawling networks rather than in “rigid hierarchies and matrices,” gave Tim Berners-Lee a toolkit for the design of the World Wide Web.14 Punk saved the internet by instilling in the young people who became internet insiders in the 1980s and 1990s a do-it-yourself ethos, a distrust of authority, and a general rebelliousness toward the utility-maximizing engines of the establishment, which together define hacker culture even today; it helped the hacker ethic to catch on outside the privileged spheres of Stanford and MIT. Low-art ephemera, like records and zines, shaped the high-tech world for decades—even if, as so often happens, the establishment eventually co-opted the counterculture that opposed it.15
I would love to challenge the idea that a field is valuable in proportion to the number of dollars it adds to the economy. But sometimes it’s worth entertaining this flawed premise, because it provides a chance to show that the activities that commentaries on STEM and the humanities tend to discount as useless have been powerful drivers of the economy and shapers of the modern world.
Tech is not separate from poetry and politics and other (as a programmer might claim they are) indifferences; it is merely forgetful of them. The digital humanities helps us to see how they work together.
1. Oxford English Dictionary, 2nd ed. (1989), s.v. “indifference,” entry xi.
2. On their first visit to office hours, my students often tell me they want to study “artificial intelligence and” x, with x being medicine or whatever field they want to have a career in.
3. Miles L. Hanley et al., Word Index to James Joyce’s “Ulysses” (University of Wisconsin Press, 1937). Even at the time of publication, researchers saw this book as an odd duck. One reviewer wrote in 1937, “Out-of-the-way adventures in research have an unholy fascination for me. The latest is the Word Index to James Joyce’s ‘Ulysses’ by Professor Miles L. Hanley of the University of Wisconsin.” Untitled book review, Colophon, vol. 2, no. 4 (1937), p. 622. As an aside, the book’s acknowledgments section shows the heroic labors of women credited as amanuenses. Hanley writes, “By far the largest amount of individual work was done by Miss Theresa Fein, who supervised most of the alphabetizing, the proofreading, and the verification of references. She was principally responsible for the typewritten copy from which the final stencils were made; and, with the help of others, she also prepared Appendix I” (xi).
4. The edition of Ulysses that Hanley used (Random House) contained 260,430 words in total. Hanley’s book indexed 29,899 discrete words, meaning that if you gather all the instances of “the” into one “the,” all the instances of “dog” into one “dog,” and so forth, you end up with 29,899 words. Zipf’s students counted the number of times each of those 29,899 discrete words appears.
5. Zipf’s law says that in any index we might make of a text, corpus, or body of information, we will always find that the frequency of a given item multiplied by the rank will yield a constant. In other words, the frequency and the rank of the items in that body of information have an inverse relationship. Let’s say the body of information is a university library; the frequency of a given title is how often readers withdraw that title, and the rank of the same title is whether it is the number one title most withdrawn, the number two title most withdrawn, etc. The frequency and rank of the titles have an inverse relationship: for any library, we can expect that 20 percent of the titles account for 80 percent of withdrawals, and that 80 percent of the titles account for 20 percent of withdrawals.
6. See, for example, Lada Adamic and Bernardo Huberman, “Zipf’s Law and the Internet,” Glottometrics, vol. 3 (2002); and Alexander Saichev, Yannick Malevergne, and Didier Sornette, Theory of Zipf’s Law and Beyond (Springer, 2010).
7. This group also wrote foundational cryptological texts. “Between 1917 and 1920,” writes Fagone, Riverbank’s press published “eight pamphlets,” largely authored by Elizebeth Smith Friedman and William Friedman, “that described new kinds of codebreaking strategies. These were little books with unassuming titles on plain white covers. Today they are considered to be the foundation stones of the modern science of cryptology.” The Woman Who Smashed Codes: A True Story of Love, Spies, and the Unlikely Heroine Who Outwitted America’s Enemies (HarperCollins, 2017), p. 77.
8. Smith took Friedman’s name upon their marriage.
9. Locard’s Exchange Principle—“Every contact leaves a trace”—remains the basis of forensic science today. As Matthew Kirschenbaum notes, this principle “is more, not less, true in the delicate reaches of computer systems.” Mechanisms: New Media and the Forensic Imagination (MIT Press, 2012), p. 49.
10. The first detective story on record is Poe’s “The Murders in the Rue Morgue” (1841).
11. Poe argued that a human was hiding inside the machine and directing its moves, which was in fact the case. As Grimstad notes, Poe’s presentation of this argument is an example of inferential reasoning, worked up, in part, for literary purposes: he believed that readers would enjoy following him along this inferential chain of reasoning about what would be required for an automaton to engage in deductive thought. To summarize in modern terms: with deductive reasoning, as with classical programming, one takes data and rules and uses them to find results, whereas with inductive reasoning, as with machine learning, one takes data and results—how a chess machine plays—and uses them to find rules. A machine in Poe’s time could engage in deductive reasoning, in the sense that it could solve mathematical problems, but it could not engage in the kind of inductive reasoning that Poe displays in his essay. “What a machine must be able to do in order to play chess,” says Grimstad, “is the very thing Poe is doing in making inferences based on observation alone.” Experience and Experiential Writing: Literary Pragmatism from Emerson to the Jameses (Oxford University Press, 2013), p. 45.
12. Robert Darnton, “What Is the History of Books?” Daedalus, vol. 111, no. 3 (1982), p. 81.
13. Famously, Lovelace compared the Analytical Engine’s operations to “the principle which Jacquard devised for regulating, by means of punched cards, the most complicated patterns in the fabrication of brocaded stuffs.” She added, “We may say most aptly that the Analytical Engine weaves algebraical patterns just as the Jacquard-loom weaves flowers and leaves.” Ada Lovelace, “Translator’s Notes to M. Menabrea’s Memoir,” in Scientific Memoirs, by Luigi Federico Menabrea, vol. 3 (1843), p. 712.
14. See, for example, Ted Nelson, Literary Machines (Mindful, 1981); and Tim Berners-Lee and Mark Fischetti, Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor (HarperCollins, 1999), pp. 3–6.
15. See Elyse Graham, A Unified Theory of Cats on the Internet (Stanford University Press, 2020).