Why an Age of Machine Learning Needs the Humanities

It isn’t easy to be a citizen in 2018. We are told to watch out for bots and biased search engines, but skepticism about new media can also make us easy prey for old-fashioned propaganda. Donald Trump notoriously explains away critique by appealing to this skepticism, claiming that most news sources are “fake” and that Google searches are biased against him.

If democracy depends on informed citizens, democracy is in trouble. This is a moment of crisis for many institutions, including higher education, especially in disciplines such as English, philosophy, and history, which promise to prepare students as citizens. To prepare students for a world where information is filtered by computers, we will need a stronger alliance between the humanities and math. This alliance has two reciprocal parts: cultural criticism of the mathematical models shaping our world, and mathematical inquiry about culture.

Traditional humanistic skills remain important, of course: we still need to scrutinize assumptions and evaluate arguments. But the challenges that confront 21st-century citizens are not always arguments that come one by one to be evaluated. Information is more likely to come in cascades, guided both by networks of friends and by statistical models that anticipate our preferences. Evaluating sources one by one won’t necessarily tell us whether these computational and social systems are giving us a biased picture. Instead, we need to think about samples and models—in other words, about math. Mathematics may once have seemed a specialized scientific tool. But in the 21st century, culture and politics are increasingly pervaded by the automated form of statistical inference called “machine learning.” Students who don’t understand it will struggle to understand everyday life.

And it is easy to misunderstand machine learning. To begin with, many popular books and articles on the topic encourage readers to argue about the danger of algorithms, which is not even the right word to be debating. An algorithm is just a recipe—a series of steps to follow in solving a problem. “Separate the eggs, then beat the whites” is an algorithm. Since 20th-century conversation about computers was deeply shaped by the truism that computers can only follow human instructions, algorithm became a powerful buzzword. Whenever computers are used to do something, journalists say it has been done “by algorithms”—which may technically be true. But the innovations that have allowed computing to reshape everyday life in recent decades should not really be understood as a mass of new algorithms. We have moved toward a system where computers are controlled in a less explicit way. Instead of manually writing algorithms that directly govern a computer’s decisions, we often ask computers to write their own instructions by modeling the problem to be solved.

Filtering spam out of email, for instance, is a poorly defined task. Undesirable email comes in many different shapes, and it would be hard to write an algorithm that could catch them all. A more flexible approach begins by collecting examples of messages that human readers have rejected, along with messages they approved. Then we ask the computer to write its own instructions, by observing differences between the two groups. For instance, the computer might draw up a list of words that are common in rejected email (free, offer, c1alis) and measure the relative probability of seeing them in rejected or accepted messages. Then it could use this description of spam—this “statistical model” of the concept—to filter incoming email. Since words like free and offer appear in many legitimate contexts, the model won’t be a streamlined algorithm that simply rejects messages containing certain words. It rather adds up evidence from a host of clues—each individually ambiguous—to estimate the probability that a message is spam.
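The add-up-the-evidence logic described above can be sketched as a tiny probabilistic classifier. This is only an illustration, not a production filter: the words, counts, and smoothing choices below are all invented for the example.

```python
import math

# Invented toy counts: how often each word appeared in 100 rejected
# (spam) and 100 accepted (ham) training messages.
spam_counts = {"free": 60, "offer": 45, "meeting": 5, "report": 3}
ham_counts  = {"free": 10, "offer": 8,  "meeting": 40, "report": 35}
N_SPAM, N_HAM = 100, 100

def word_log_odds(word):
    """Log ratio of P(word | spam) to P(word | ham), with add-one
    smoothing so unseen words contribute no strong evidence."""
    p_spam = (spam_counts.get(word, 0) + 1) / (N_SPAM + 2)
    p_ham  = (ham_counts.get(word, 0) + 1) / (N_HAM + 2)
    return math.log(p_spam / p_ham)

def spam_probability(message, prior_spam=0.5):
    """Add up evidence from each word; no single clue decides alone."""
    score = math.log(prior_spam / (1 - prior_spam))
    for word in message.lower().split():
        score += word_log_odds(word)
    return 1 / (1 + math.exp(-score))  # convert log-odds to probability

print(round(spam_probability("free offer free"), 3))  # high probability
print(round(spam_probability("meeting report"), 3))   # low probability
```

Notice that no single word triggers rejection: “free” shifts the odds toward spam, “meeting” shifts them away, and the verdict emerges only from the combined evidence.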

Instead of giving computers explicit instructions, this approach, called “machine learning,” asks them to grasp fuzzy patterns implicit in evidence they are shown. The patterns are fuzzy, in part, because computers often learn from human behavior, which doesn’t follow strict rules. But the need for imprecision comes, more fundamentally, from a mathematical insight about learning. To learn a language, toddlers have to generalize from specific examples (a familiar tabby cat) to a looser category (cat or animal). This requires a subtraction of detail, since animals aren’t always tabby-colored, don’t always purr, and so on. We may not be conscious that learning requires subtraction, since forgetting details comes naturally to human beings. But computers find it easy to remember details, so if we want them to grasp general patterns, we have to explicitly tell them to condense a long list of emails (or animals) into a usefully fuzzy model. The success of machine learning depends both on gathering data and on condensing it, but the second, subtractive step is the part statisticians call “learning.”
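The subtractive step can be made concrete with a toy sketch. Everything here is invented for illustration: a “memorizer” that keeps every detail recognizes only exact past examples, while a model that condenses its examples down to shared features can recognize something new.

```python
# Each training example is a set of observed features, colors included.
training_cats = [
    {"fur", "purrs", "whiskers", "tabby"},
    {"fur", "purrs", "whiskers", "black"},
    {"fur", "purrs", "whiskers", "white"},
]

def memorizer(animal):
    """Remembers every detail: recognizes only exact past examples."""
    return animal in training_cats

def condensed_model(animal, threshold=2):
    """Keeps only features shared by all examples (coloring is
    forgotten), then asks how many of them an unseen animal matches."""
    shared = set.intersection(*training_cats)  # {"fur","purrs","whiskers"}
    return len(animal & shared) >= threshold

new_cat = {"fur", "purrs", "whiskers", "calico"}  # unseen coloring
print(memorizer(new_cat))        # False: too much remembered detail
print(condensed_model(new_cat))  # True: the fuzzy category still matches
```

Forgetting the color is exactly the “subtraction of detail” described above: the condensed model is usefully fuzzy precisely because it retains less than the original examples contained.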

Machine learning increasingly shapes human culture: the votes we cast, the shows we watch, the words we type on Facebook all become food for models of human behavior, which in turn shape what we see online. Since this cycle can amplify existing biases, any critique of contemporary culture needs to include a critique of machine learning.1 But to prepare students for this new world, we need to do more than wag our fingers and warn them that algorithms are problematic. Generalized suspicion about technology doesn’t necessarily help people understand the media—as presidential attacks on biased search engines and fake news have recently made clear. Telling students that new technologies are out to fool them can just make them hungry for an easy cure—a so-called “red pill.” (In fact, danah boyd has argued that attempts to teach media literacy often backfire in exactly this way.2) To be appropriately wary, without succumbing to paranoia, students need to understand both the limits and the valid applications of technology. Humanists can contribute to both halves of this educational project, because we’re already familiar with one central application of machine learning—the task of modeling fuzzy, changeable patterns implicit in human behavior. That’s also a central goal of the humanities.

This may sound like a bizarre claim, if we believe the stereotype that represents math as alien to history and literature. But humanists have always been more flexible than the stereotype implies: economic historians, for instance, often use numbers. Cultural historians haven’t done so in the past, because the simple quantitative methods available in the 20th century genuinely couldn’t do much to illuminate culture. We can’t write a simple algorithm to recognize a literary genre, for instance, because most genres lack crisp definitions. Humility on this topic is hard-earned: dozens of 20th-century critics spent years of their lives trying to define “science fiction,” before conceding that the phrase has meant different things at different times.3 Scholars have reluctantly abandoned the quest for an essential feature that unifies genres, in order to acknowledge that genres are loose family resemblances, organized by a host of overlapping features, and changing their meaning from one decade to the next.4

Concepts of that kind may sound slippery and unscientific. But machine learning can also be slippery and unscientific.5 Remember that we resorted to machine learning because we couldn’t invent a simple, universal definition of spam. Instead, we had to draw on the tacit knowledge of human readers who had rejected email for a range of reasons. A model based on this sort of evidence will never be stable. It will have to be updated every few years, as old scams die out and new ones emerge. In short, recent warnings about biased algorithms understate machine learning’s real limitation—which runs deeper than any accidental bias. Models shaped by examples of human behavior are necessarily models of a particular cultural context. Timeless objectivity is not something they could ever provide; it’s not what human culture provides.

This means that helping students understand the strengths and limitations of historical knowledge can also be a way of helping them understand the strengths and limitations of machine learning. In fact, the fuzzy, context-specific models produced by machine learning have a lot in common with the family resemblances historians glimpse in culture. (For instance, both patterns tend to be defined by many overlapping clues, not one essential feature.) So, it shouldn’t be surprising that machine learning is turning out to be useful for cultural history. Hoyt Long and Richard Jean So have used it to trace the diffusion of a haiku style in modernist poetry.6 Katherine Bode’s A World of Fiction compares different models to explain how Australian and American writers diverged from British tradition in the 19th century.7 Andrew Piper’s Enumerations even uses these methods to tease out insights about the special appeal of windows for introverted heroines.8

Not every student needs to care about these specific examples, but every student will need some hands-on experience with the statistical models newly central to our culture. And every student does need to understand that these models are limited by the same contextual provisos that make historians and literary critics so cautious. Tech leaders who argue that machine learning is more objective than other knowledge cannot be trusted. But we should just as fiercely distrust political leaders who use the perspectival complexity of the internet to imply that real knowledge is impossible, everything is fake, and we can only fall back on affinity and prejudice. It is possible to build real knowledge by comparing perspectives from different social contexts. Historians have long known how. As our knowledge about the present is increasingly filtered through statistical models aimed at specific target markets, we may need the same comparative strategies to understand our own lives. More fundamentally, we need to realize that historians’ traditions of caution and relativism are not alien to the sparkling new world of computers.


Bridges of this kind between the humanities and sciences could soon have enormous importance. But universities aren’t yet explaining them to students. Instead, we have usually tried to fit the cultural changes associated with technology into some existing disciplinary box. “Digital humanities,” for instance, can easily become an inward-looking name for conflicts within history and English departments. Those conflicts matter more for professors than they do for students; what students really need are new alliances between disciplines. The participants in such an alliance needn’t lose their separate identities, but they do need to shake hands. Students might learn how to use machine learning mostly in quantitative disciplines—in information science, or in the new interdisciplinary field of data science. They might turn to the humanities in order to apply new methods to cultural questions, or in order to reflect on the history and social implications of the methods themselves.

New connections between disciplines won’t displace the traditional strengths of history, philosophy, art, and literature, but the emerging assumption that humanists and scientists are working on a shared educational project still represents a massive change. For much of the 20th century, these parts of the university saw themselves as (civil) antagonists. The sciences taught you how to clone T. Rex, as one familiar poster has it. The humanities taught you why you shouldn’t. It’s a clever story but one we need to get beyond. The real monster in our world is not a dinosaur we could avoid creating. It is human history, already broken loose, already ravenous and hard to predict. To understand it—to understand ourselves—we will need numbers as well as words. Humanists have a lot to contribute to this struggle, since we know the monster’s past and understand its slipperiness better than anyone else. Healthy skepticism about numbers is one thing we bring to the table, but we have much more than skepticism to offer. We can also join forces with science, to show students that statistical inference and historical interpretation are allied, intertwined parts of a life committed to understanding.

I leave you with recommendations for some further reading. This article has presented machine learning not just as a new technology but as a turning point in recent intellectual history. For a book-length exploration of this perspective, I suggest Adrian Mackenzie, Machine Learners: Archaeology of a Data Practice (2017). For hands-on experience, the best place to start is often an introductory course in “data science”: the materials for Data 8, at UC Berkeley, are available online. A lot can also be learned from computational and social scientists who reflect on the history of their own fields. The first few pages of Leo Breiman’s “Statistical Modeling: The Two Cultures” give a good brief history of the philosophical tensions associated with machine learning. A valuable critical take can be found in an article by danah boyd and Kate Crawford called “Critical Questions for Big Data.” The consequences of machine learning for the humanities in particular are still being hashed out: consult the works footnoted in this article, or my own forthcoming book Distant Horizons (2019).


This article was commissioned by Richard Jean So.

  1. See, for instance, Cathy O’Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (Crown, 2016).
  2. On the connection between media critique and “red pill” paranoia, see danah boyd, “You Think You Want Media Literacy … Do You?,” Points, March 9, 2018.
  3. John Rieder, “On Defining SF, or Not: Genre Theory, SF, and History,” Science Fiction Studies, vol. 37, no. 2 (2010).
  4. Ted Underwood, “The Life Cycles of Genres,” Journal of Cultural Analytics, May 23, 2016.
  5. A leading historian of machine learning has stressed that its emergence required “the creation of new epistemic virtues … at times in conflict with long-held views of statistical rigor.” Matthew Jones, “Querying the Archive: Data Mining from Apriori to PageRank,” Science in the Archives: Pasts, Presents, Futures, edited by Lorraine Daston (University of Chicago Press, 2017), pp. 311–3.
  6. Hoyt Long and Richard Jean So, “Literary Pattern Recognition: Modernism Between Close Reading and Machine Learning,” Critical Inquiry, vol. 42, no. 2 (2016).
  7. Katherine Bode, A World of Fiction: Digital Collections and the Future of Literary History (University of Michigan Press, 2018), pp. 157–97.
  8. Andrew Piper, Enumerations: Data and Literary Study (University of Chicago Press, 2018), pp. 138–43.
Featured image: Robot Sonata (2018). Photograph by Franck V. / Unsplash