The first thing that happens, when a literary historian starts using computers to think about literature, is that the object of study changes. Not just the tool; the object itself. “The objects studied by contemporary historians” have this peculiarity, Krzysztof Pomian observed some time ago, that “no one has ever seen them, and no one could ever have seen them […] because they have no equivalent within lived experience.” He was thinking of things like demographic evolution and literacy rates, and it’s true, no one can have a “lived experience” of these “invisible objects,” as he also calls them; our objects are different of course, they are literary ones, but they too have no equivalent within the usual experience of literature.
So what are they like, these objects we study in the Literary Lab? They are things like—this:
This image comes from our recent collective pamphlet, “Style at the Scale of the Sentence,” and the full argument can be found on the Literary Lab’s website. Here, let me just say that the chart correlates a certain number of words, in green and gold, with four types of clauses, indicated by the red lines, that are particularly significant in 19th-century novels; we spent quite a few hours trying to understand the logic behind this distribution, and others like it. These are the objects we study. Or this:
The red segments at the bottom express the declining presence of loud speaking verbs, hence the “silencing” of the English novel that 21-year-old Holst Katsma discovered in our database. This is what our objects are like. And no one had ever seen them, because they exist on a different scale from that at which we typically experience literature: one that is simultaneously much bigger and much smaller than the usual: three thousand novels, and a handful of words for loudness; or, as in:
The 11 different literary genres represented by the colored word strips, and the 20-odd verb forms indicated by the black vectors. No one experiences literature as a scatterplot of verb forms and genres. Reading a novel; watching a play; listening to a ballad: this is the lived experience of literature. And instead, here literature is de-composed into its extremes; but this radical reduction also allows us to see a relationship between the very small and the very large that would otherwise remain hidden: how crucial the passive past simple is for the rhetoric of Gothic novels, for instance, or progressive tenses for the Bildungsroman. And it’s not just a matter of “seeing” the relationship; you can work on it: change the variables, use adjectives instead of verbs to test if they differentiate genres better; exclude function words or include them; you can conduct small experiments with historical evidence. This says something important about the new object of study: it is not something we have found somewhere (in an archive, say); it’s something we have constructed for a specific purpose; it’s not a given, it’s the result of a new practice. A new type of work that, before the advent of digital corpora and tools, was simply unimaginable.
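To make the idea of "small experiments" a little more concrete, here is a minimal sketch of what changing the variables can look like in practice. It is not the Literary Lab's actual procedure: the two "genres" are invented toy sentences, the word lists (`VERB_FORMS`, `ADJECTIVES`, `FUNCTION_WORDS`) are hypothetical, and the `separation` score (mean distance between genre profiles) is just one simple stand-in for asking which features differentiate genres better.

```python
from collections import Counter
from itertools import combinations
import math
import re

# Toy stand-in for a corpus: invented sentences, NOT real data.
TOY_CORPUS = {
    "gothic": "the door was opened and the dark castle was seen and a cold voice was heard",
    "bildungsroman": "he was walking and she was learning and the young man was growing happy",
}

# Hypothetical feature sets; swapping one for another is the "experiment".
VERB_FORMS = ["was", "opened", "seen", "heard", "walking", "learning", "growing"]
ADJECTIVES = ["dark", "cold", "young", "happy"]
FUNCTION_WORDS = frozenset({"the", "and", "a", "he", "she"})


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def profile(text, features, exclude=frozenset()):
    """Relative frequency of each feature among the tokens that are kept."""
    tokens = [t for t in tokenize(text) if t not in exclude]
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[f] / total if total else 0.0 for f in features]


def separation(corpus, features, exclude=frozenset()):
    """Mean Euclidean distance between every pair of genre profiles."""
    profiles = [profile(text, features, exclude) for text in corpus.values()]
    pairs = list(combinations(profiles, 2))
    if not pairs:  # fewer than two genres: nothing to compare
        return 0.0
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Same corpus, three different constructions of the object.
    print("verb forms:", separation(TOY_CORPUS, VERB_FORMS))
    print("adjectives:", separation(TOY_CORPUS, ADJECTIVES))
    print("verb forms, function words excluded:",
          separation(TOY_CORPUS, VERB_FORMS, exclude=FUNCTION_WORDS))
```

Each call builds a slightly different object out of the same texts. On a real corpus the interesting part would be comparing the resulting scores and then going back to the sentences behind them.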
Which brings me to a question I have often been asked, and rightly so: will the humanities of the digital age lose what has so powerfully characterized them—the experience of reading a book from beginning to end? And, I don’t want to answer for the humanities in general, but for those of us in digital literary studies the answer has to be, Yes: reading a book from beginning to end loses its centrality, because it no longer constitutes the foundation of knowledge. Our objects are much bigger than a book, or much smaller than a book, and in fact usually both things at once; but they’re almost never a book. The pact with the digital has a price, which is this drastic loss of “measure.” Books are so human-sized; now that right size is gone. We’re not happy about the loss; but it seems to be a necessary consequence of the new approach.
Now, let me be clear about this: it does not mean that literary critics, let alone readers in general, shouldn’t read books any more. Reading is one of the greatest pleasures of life; it would be insane to give it up. What is at stake is not reading, it’s the continuity between the experience of reading a book and the production of knowledge. That’s the point. I read a lot of books; but when I work in the Literary Lab they’re not the basis of my work. The “lived experience” of literature no longer morphs into knowledge, as in Ricoeur’s great formula of the “hermeneutic of listening,” where understanding consists in hearing what the text has to say. In our work we don’t listen, we ask questions; and we ask them of large corpora, not of individual texts. It’s a completely different epistemology.
Do we not read at all, then? Well, not exactly. You may have noticed a crazy outlier at the top of the third chart. Each of the strips indicates a set of two hundred narrative sentences from various novels, and that one, from the early chapters of Middlemarch, was so extreme that we of course took those two hundred sentences and read them very, very carefully. The question is, were we thereby reading Middlemarch? I don’t think so. The sentences came from Middlemarch, yes, but they couldn’t be “read” the way one reads a novel, because they were not continuous with each other; rather, they formed a series only on the basis of a grammatical peculiarity we wanted to investigate. No one could have ever “seen” them together while reading Middlemarch. We were studying Middlemarch, then, but not by reading it.
The objects have changed, and the scale has changed, and the type of work, and of knowledge, and the relationship to reading. And this of course raises all sorts of other questions: are the old and the new type of knowledge—in conflict? Complementary? Independent of each other? And the study of these new objects—what exactly has it achieved? Has it achieved anything? But, for today, this is enough.