The Selfie-Taker and the Dictionary-Maker

Soon after John Simpson was hired as an editorial assistant at the Oxford English Dictionary in 1976, he took on the task of documenting new meanings of old words. When was queen first used to describe mattress sizes and women’s plus-size clothing? Was its use for homosexuals originally an Australian invention? How and when did Louis Pasteur’s aerobic work its way into a plural noun for a kind of vigorous exercise? Such investigations require patience and tenacity. As detection goes, they’re less redolent of Sherlock Holmes than of Inspector Lestrade.

Simpson’s engaging memoir of his 37 years at the OED describes a period of unprecedented change. For the few centuries prior to his tenure, the dictionary had been the most conservative of all bookish genres. In 1978, the linguist Adam Makkai wrote that “nothing significantly new has happened in lexicography since Gutenberg invented the printing of books.”1 You could take exception to the Gutenberg part of that, but it was fair to say that the form of the English-language dictionary had changed very little since Samuel Johnson’s time. Students in a college English class need preparation before they can tackle Johnson’s Juvenalian satires or his apologue Rasselas, but a page of Johnson’s Dictionary strikes them as transparently familiar, despite some changes in the meanings of the words themselves. The form of the dictionary seems natural and immutable, reflecting the structure of the language in the way the periodic table reflects the atomic structure of the elements.

The process of dictionary-making, too, was largely unchanged between Johnson’s time and the beginning of Simpson’s career. Johnson’s famous definition of a lexicographer as “a harmless drudge” was a bit of humblebragging avant la lettre, in anticipation of the acclamation the Dictionary would win him.2 But whenever we celebrate a lexicographer, it’s in tribute to a kind of exalted scut work. The iconic photograph of the OED’s first editor, James Murray, shows him standing in his scriptorium, with his white beard and John Knox cap, examining a citation slip in front of shelves that held hundreds of thousands of other slips that had been submitted by the Dictionary’s volunteer readers and patiently assembled by Murray and his children.

The OED—originally titled the New English Dictionary on Historical Principles (1884–1928)—was conceived as a record of the origins and development of the English lexicon, in the hope, in the words of the project’s intellectual champion Archbishop Richard Chenevix Trench, that one “could scarcely follow upon one of its significant words, without having unawares a lesson in English history as well.”3 The advent of digital tools and corpora has transformed that enterprise, but paradoxically calls into question some of its fundamental assumptions.

Early on Simpson’s watch, the text of the print OED was typed by roughly 150 keyboarders in Florida and proofread by 50 freelancers in Oxford. Undertaken to facilitate the editing and production of a second edition in 1989, the dictionary’s “computerization,” as one said at the time, had important collateral effects. For one, the dictionary was now searchable, not just as a text but as a structured database, so that queries could be restricted by etymology, date, part of speech, and so on. You could track the growth of Hindi borrowings from the 16th century to their peak in the first half of the 19th century (a “timeline” feature in the online dictionary enables you to display the results as a bar graph). Or as I had call to do recently, you can ask for a list of all the French expressions that entered the language during Jane Austen’s lifetime (à la carte, bonhomie, boudoir …).

The dictionary was made available on CD-ROM soon after, severing the iconic connection between a dictionary’s physical size and its compass that runs from Johnson’s Folio edition—“a great solid square-built edifice,” Carlyle called it—to the literally monumental Webster’s Third. True, you can still buy the 20-volume print version of the second edition for $1,045, but what would you use it for, other than home decor?

In the early 21st century, when it was made available online, the OED changed its business model to a subscription service. The new version is described as the third edition, but an “edition” of an online reference work like the OED can always be expanded and revised. A list of new words is announced every quarter, many of them hot off the Twitter feed: the last batch includes bralette, YouTuber, and tombstoning (of a surfboard, bobbing nose-up).

The editors have been slower to update the meat-and-potatoes vocabulary of public life—the “significant words” that Trench was focused on, whose OED entries Raymond Williams drew on in writing his 1976 book Keywords, which has proven to be the most influential work of derivative scholarship to which the dictionary has given rise. Elite got its last full update in 1891, demagogue in 1895. On the face of things, it’s an awkward disparity: nobody would go to the OED to discover the meaning of bralette or YouTuber, whereas the world is crying out right now for a distillation of the contemporary meaning of falsehood (last updated 1894). That shift in emphasis might reflect the editors’ new sense of the dictionary’s cultural role—though it may be no more than lexicographical FOMO (added in 2015).

Dictionaries are still trying to work out the implications of life in a digital world. With unlimited capacity, there’s no longer any reason to limit citations for each sense of a word to a mere handful per century or to present them in such truncated form. The online OED now allows the reader to click on citations from Shakespeare and Milton to get the extended passage they’re drawn from, and readers can easily go online to do the same with citations from other writers. Online dictionaries like Wordnik already use algorithms to construct citation lists on the fly; at the limit, you could think of an online dictionary as simply a lexicographical web interface. Not that we could dispense with the OED’s lucid definitions, but as Murray suggested, “Quotations will tell the full meaning of a word, if one has enough of them.”4

The advent of online historical corpora has also altered the lexicographer’s method. Word sleuthery has become a game that anyone with access to a search engine can play. It’s not hard to find examples that antedate the OED’s earliest citations for words, particularly in the modern period. The first use of Ms. listed in the second edition was from 1949; the Wall Street Journal’s language columnist Ben Zimmer tracked it back to 1901.


As Simpson notes, the breadth of online corpora has redressed the dictionary’s overreliance on the canonical writers who have traditionally been seen as the founts of linguistic innovation, for no better reason than that they were the only writers the dictionary’s early readers looked at. James Joyce is listed as the first user of some 575 terms in the second edition, but earlier references have since been found for 40 percent of these—in reality, Simpson says, Joyce is better thought of as an “avid magpie.”

While corpus data has improved the OED’s accuracy, scope, and usefulness, its availability can also cloud one of the dictionary’s premises, the identification of a word’s “origin” with its first occurrence in the language. That originalism is embedded in the historical method that has always driven the detective work of its editors. But the call of the first cuckoo doesn’t necessarily signal the coming of spring. Words often make their first appearance long before they enter the wider vocabulary, and corpus tools now enable us to date their changing frequency. It may be of passing interest that Carlyle was the first writer to use propaganda to refer to information disseminated to promote a particular cause, but the word was rare until World War I, when it passed into “the vocabulary of peasants and ditch diggers,” as one contemporary described it.5 The current edition of the OED dates lifestyle from an article in Mind in 1915, but its frequency increased a hundredfold between 1965 and 1975. The dictionary labels selfie as originally Australian in the light of its occurrence in a 2002 contribution to an Australian online tech forum, but there’s no good reason to connect that use to the word’s explosive growth in the US and Britain in 2013 and 2014.

Such observations force us to reconceive the lexicon itself. The linguist Dwight Bolinger once said, “Words are not things, but activities.”6 That point of view doesn’t really invalidate the dictionary’s emphasis on origins, but does denaturalize it, with consequences that are hard to foresee. The one thing we know is that the OED and other dictionaries will be very different kinds of things 40 years from now, and Simpson can take some of the credit for that change.

