Culture industries increasingly use our data to sell us their products. It’s time to use their data to study them. To that end, we created the Post45 Data Collective, an open-access site that peer reviews and publishes literary and cultural data. A partnership between the Data Collective and Public Books, the series Hacking the Culture Industries brings you data-driven essays that change how we understand audiobooks, bestselling books, streaming music, video games, influential literary institutions such as the New York Times and the New Yorker, and more. Together, they show a new way of understanding how culture is made, and how we can make it better.
—Laura McGrath and Dan Sinykin
In 1983, William Blatty—author of The Exorcist—sued the New York Times.1 His lawsuit alleged that the Times had incorrectly excluded his latest novel, Legion (a sequel to The Exorcist), from its bestseller list—the coveted ranking that purports to show the books that have sold the most copies that week in the United States. According to Blatty’s lawyers, Legion had sold enough copies to warrant a spot on the list, so its absence was due to negligence or fraud, for which Blatty was entitled to compensation. The Times countered with what might sound like a surprising admission: the bestseller list is not mathematically objective; it is editorial content, which is protected by the First Amendment. The court ruled in favor of the New York Times.
The Blatty case draws attention to a fundamental truth about bestseller lists, one that often gets forgotten amid the drama of their weekly publication: they are not a neutral window into what the public is really reading. Rather, they reflect editorial decisions about how and what to count. Changes on the list might reflect changes in counting procedure, rather than changes in the market. Despite their lack of neutrality—or, perhaps, because of it—these editorial and counting decisions can have a big effect on which books and authors get the honor of appearing on the list; in turn, they shape the public’s perception of what it is reading and what it should consider reading next.
In this piece, I want to explore one way such decisions have affected the Times list over its almost 90-year publication history: the separation of sales by book format (hardcover, paperback). In the 1950s and 1960s, the fact that the Times exclusively publicized hardcover sales meant that some of the most popular novelists of the time rarely appeared on the list, because they made most of their sales in paperback. Today, the Times publishes distinct lists for different formats, and the content of these lists often reflects status hierarchies associated with different genres and communities of readers.
It turns out, then, that “bestseller” is a more complicated category than you might at first think. Though its name seems to refer to something very straightforward, there are all sorts of weird historical factors and counting choices that affect whether a book might make the cut. Given the influence of the Times list, it’s worth examining the effects of the choices made when assembling it, and what they can tell us about the kinds of information about books we consider valuable.
The occasion for this analysis is the recent publication of a dataset I compiled that records every book that made it onto the New York Times hardcover fiction bestseller list between 1931 and 2020. The dataset allows one to ask aggregate questions about the history of popular literature in the United States. For example, the following visualization shows which genres have appeared on the list most frequently (among those bestsellers for which a library record could be identified).2 As can be seen, over nearly a century, the lists’ two biggest genres are “historical” and “detective and mystery,” by a fairly large margin.
The distribution of these genres over time isn’t static, however. Some fell out of popularity, while others became more popular. For instance, historical fiction has declined in popularity; its peak was in the 1940s (when about 24 percent of novels in the data were historical), but the genre then dropped to a low in the 2000s (when it made up only 10 percent of listed novels). Meanwhile, “thriller and suspense” fiction and “detective and mystery” fiction have become much more prominent, with particularly rapid growth between 1980 and 2000.
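Percentages like these come from a simple tally: group titles by decade, then divide each genre's count by the decade's total. The sketch below shows one way to do that. The records are invented for illustration, and it assumes one genre label per title, which the real dataset may not guarantee.

```python
from collections import Counter, defaultdict

# Hypothetical records: (year a title reached the list, genre label).
# These are made-up rows, not entries from the actual dataset.
records = [
    (1943, "historical"),
    (1946, "historical"),
    (1948, "detective and mystery"),
    (1949, "romance"),
    (2003, "thriller and suspense"),
    (2005, "detective and mystery"),
    (2007, "historical"),
    (2008, "thriller and suspense"),
]

def genre_share_by_decade(rows):
    """Return {decade: {genre: fraction of that decade's titles}}."""
    counts = defaultdict(Counter)
    for year, genre in rows:
        decade = year // 10 * 10  # e.g. 1943 -> 1940
        counts[decade][genre] += 1
    shares = {}
    for decade, genre_counts in counts.items():
        total = sum(genre_counts.values())
        shares[decade] = {g: n / total for g, n in genre_counts.items()}
    return shares

shares = genre_share_by_decade(records)
for decade in sorted(shares):
    # Print the historical-fiction share for each decade in the sample.
    print(decade, round(shares[decade].get("historical", 0.0), 2))
```

Because each share is relative to the decade's own total, a genre can lose share even if its raw count of titles stays flat while the list as a whole grows.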
What changed? To answer this question, it’s important to understand what exactly this particular list tracks: specifically, it is a list of hardcover bestsellers. It does not consider paperback sales, either trade or mass market. Importantly, the Times did not begin regular weekly paperback coverage until 1976 (although it began irregular publication of a monthly paperback list in 1965).3
In publishing history, the distinction between paperback and hardcover formats is also a distinction between markets. In the early days of the New York Times list, hardcover novels were primarily sold at bookstores, which catered to relatively wealthier and more urban customers. Paperbacks, meanwhile, were sold at a variety of different outlets, including newsstands, drugstores, and, eventually, supermarkets. These books were accessible to a much wider audience, due to both their more affordable prices and their greater geographic availability.
But, for decades, those more accessible books were not tracked by the New York Times. The fact that this list exclusively tracks hardcover sales at bookstores means it necessarily won’t reflect the popularity of other books: that is, those that sold in large numbers as paperbacks at nontraditional outlets.
In some cases, this format distinction doesn’t change the overall picture of the list. After all, books are often published in multiple formats, and a book that sells well in hardcover often also sells well when republished in paperback. Where this distinction begins to matter is in the case of novels that never received a hardcover printing, or in the case of those that sold much better (or worse) in one format than in the other. Here, the list’s omissions become provocative.
In the 1940s and 1950s, the early years of mass-market paperback publishing, the “paperback originals” of publishers like Avon and Fawcett were mostly works of genre fiction. Some of the most popular mass-market genre writers of that period are conspicuously absent from the Times lists of that era. For instance, detective novelists Mickey Spillane and Erle Stanley Gardner (of Perry Mason fame) are almost entirely missing, even though, in raw numbers, they were two of the most widely purchased authors between 1940 and 1960.4 In other words, the comparative absence of mystery and thriller bestsellers before 1980 reflects, in part, the fact that the biggest authors in these genres weren’t even being counted.
How different would the history of the list appear if it accounted for sales in different formats and at different outlets? Answering this question precisely is tricky: no comparable source on paperback bestsellers exists before the late 1960s, and the Times mass-market paperback list has only been digitized for the years between 2008 and 2017.
But, at least for those nine years, the relation between hardcover and mass-market paperback bestsellers is informative. The first thing to note is that there are a lot of books that only appear on one list or the other. Of the total 3,257 books to appear on either list, only 712—a mere 22 percent or so—appear on both. So, any account of popular literature using only one list is necessarily working with a restricted sample.
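The overlap figure is a set calculation: books on both lists divided by books on either list. Here is a minimal sketch, assuming each book can be identified by a (title, author) pair. That key is my assumption for illustration; the sample books are invented, and only the totals 712 and 3,257 come from the text.

```python
# Hypothetical sample: each book is identified by a (title, author) pair.
hardcover = {("Book A", "Author 1"), ("Book B", "Author 2"), ("Book C", "Author 3")}
mass_market = {("Book C", "Author 3"), ("Book D", "Author 4")}

def overlap_share(list_a, list_b):
    """Return (books on both lists, books on either list, share on both)."""
    both = list_a & list_b      # set intersection: appears on both lists
    either = list_a | list_b    # set union: appears on at least one list
    return len(both), len(either), len(both) / len(either)

n_both, n_either, share = overlap_share(hardcover, mass_market)
print(n_both, n_either, share)

# The same ratio computed from the totals reported in the text.
reported_share = 712 / 3257
print(f"{reported_share:.1%}")
```

In practice, matching books across lists is the hard part, since the same title may be recorded with slightly different spellings or author credits on each list.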
To get a sense of the “taste” of each list, consider the table below. It shows the top authors, by number of appearances, for each list. Importantly, this table is restricted to authors who only ever appeared on one of the two lists in that nine-year period. Such a restriction can help clarify what sort of books are considered by publishers to be “right” for each format.
| Hardcover | Mass-Market Paperback |
| --- | --- |
| Anthony Doerr | Robyn Carr |
| Kathryn Stockett | Lora Leigh |
| Kristin Hannah | Lynsay Sands |
| Liane Moriarty | Catherine Anderson |
| Donna Tartt | Julia Quinn |
| Paula McLain | Heather Graham |
| Mitch Albom | Gena Showalter |
Authors who appear only on the hardcover list write in a variety of genres, but they generally occupy a more “prestigious” place in the culture industry. For example, Anthony Doerr won a Pulitzer, while Kathryn Stockett’s The Help was adapted into a film that was nominated for an Academy Award. Meanwhile, the top authors who only made it onto the mass-market list are, notably, all women. All are primarily authors of romance, though they span its many subgenres: erotic, historical, paranormal. These are genres that, even today, don’t always get a hardcover printing.
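A ranking of list-exclusive authors can be built by dropping every author who shows up on the other list and counting what remains. The sketch below assumes each list is stored as one author name per weekly appearance. That layout and the placeholder names are assumptions for illustration, not the structure of the actual data.

```python
from collections import Counter

# Hypothetical data: one entry per weekly appearance on each list.
hardcover = ["Author A", "Author A", "Author B", "Author C", "Author C", "Author C"]
mass_market = ["Author C", "Author D", "Author D", "Author E"]

def top_exclusive(own, other, n=7):
    """Rank authors who appear in `own` but never in `other`, by appearances."""
    other_authors = set(other)
    counts = Counter(a for a in own if a not in other_authors)
    return counts.most_common(n)

hardcover_only = top_exclusive(hardcover, mass_market)
mass_market_only = top_exclusive(mass_market, hardcover)
print(hardcover_only)
print(mass_market_only)
```

Note that "Author C" has the most hardcover appearances in this sample but is excluded from both rankings, because the filter removes anyone who appears on both lists.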
But, despite the sharp divisions suggested by the table above, the two lists aren’t completely unrelated. Authors who appear many times on one list tend to also appear many times on the other list, at least if they made it onto both lists at least once.5 When one considers the authors in this category—those who show up on both lists—one begins to understand why mysteries and thrillers might have become more prominent among hardcover bestsellers between 1980 and 2000.
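One way to measure whether appearances on one list track appearances on the other is to compare a Pearson correlation (on raw counts) with a Spearman correlation (on ranks). The sketch below implements both in plain Python. The counts are invented, and this is an illustration of the two measures rather than the procedure used for the figures in this essay.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical appearance counts per author; the last author is an outlier.
hardcover_counts = [1, 2, 3, 4, 100]
mass_market_counts = [2, 1, 4, 3, 90]

r = pearson(hardcover_counts, mass_market_counts)
rho = spearman(hardcover_counts, mass_market_counts)
print(round(r, 3), round(rho, 3))
```

In this toy sample a single heavily listed author pulls the Pearson value close to 1, while the rank-based Spearman value is lower. That gap is why reporting both is useful when a few authors dominate the counts.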
Here are the authors who most commonly appear on both lists between 2008 and 2017:
George R. R. Martin; Debbie Macomber; John Grisham
David Baldacci; Nicholas Sparks; Charlaine Harris
Danielle Steel; Stieg Larsson; Nora Roberts
Janet Evanovich; James Patterson; Stephen King
Lee Child; John Sandford; Maxine Paetro
Unlike the hardcover-exclusive authors, these are invariably genre-fiction authors, including many authors of thrillers (Grisham, Baldacci, Patterson, Child) and crime or mystery novels (Harris, Larsson, Evanovich, Paetro). In publishing history, this type of author (one who writes genre fiction with major sales in both formats) was becoming increasingly important in the late 1970s and early 1980s. This was a period in which publishers were becoming more aggressive about adapting mass-market strategies for their hardcover titles.
There were many reasons for this, but one of them had to do with changes on the distribution side of the industry. The spread of chain stores like B. Dalton and Waldenbooks, along with the sale of books at nontraditional outlets like supermarkets, was helping to make it possible to sell hardcover novels at paperback scale. Put another way, it was now possible to sell more higher-priced books to more readers.
This also meant that hardcover books were being sold to a larger, more geographically diverse readership. As Richard Snyder, then president of Simon and Schuster, put it in 1980: “The chains serve a different community of book readers from any that the book business has ever had before. … The minute you get into the suburbs, where ninety per cent of the chain stores are located, you serve the customers, mainly women, the way you would serve them in a drugstore or a supermarket.”6 Drugstores and supermarkets: traditional outlets for mass-market paperbacks.7
As part of this shift, publishers became more likely to give a hardcover printing to authors and genres that might previously have been paperback exclusives. This included some authors of genre fiction listed above, such as Danielle Steel.
By 1990, many industry veterans felt that hardcover publishing had completely changed. As the vice president of one publishing house put it, “At the beginning of the 1980s, you could not imagine first printings of 500,000 [hardcovers]. … The fact that 1.5 million people walk into a bookstore and pay $20—and up—for a hardcover book is mind-boggling.”8
In other words, some portion of the dramatic rise in the number of bestselling mysteries and thrillers after 1980 is probably due to changes in distribution, rather than popular taste.9 Spillane, Gardner, and other authors of crime fiction were already popular in the ’40s and ’50s, but—due to segmentations in the market—they were only sold at certain outlets and in certain formats. Today, you can find a James Patterson thriller almost anywhere and in any format: paperback at Walgreens, hardcover at Barnes and Noble, e-book on Amazon.
Meanwhile, the status of mass-market romance today is perhaps comparable to that of thrillers in the 1940s and ’50s. If it weren’t for the fact that the Times now publishes a separate mass-market list, some of these authors wouldn’t appear on bestseller lists at all (and even this mass-market Times list has recently been demoted from a weekly to a monthly publication schedule). This says more about formatting practices in the publishing industry than it does about the popularity of these authors.
So, what, then, is a bestseller? It seems like the answer should be simple—it’s just a book that sold the best! But, as we’ve seen, the truth turns out to be more complicated. Since novels are published in many different formats and sold at many different kinds of stores, decisions must be made about how and what to count. This is not to say that bestseller lists are arbitrary, or that they can’t be trusted. Rather, it’s just to point out that editorial decisions may favor some books over others.
How we count reflects what we want to know, or at least what the Times thinks we want to know. Whether the Times is right about this might depend on whom you ask.
1. Blatty v. New York Times Company, Supreme Court of California (1986).
2. Specifically, among those titles that could be matched to a record in HathiTrust or WorldCat. This comprises 54 percent of titles that hit the list between 1931 and 2009. After 2009, HathiTrust coverage drops off.
3. For more on the history of the list’s construction, see Laura Miller, “The Bestseller List as Marketing Tool and Historical Fiction,” Book History, vol. 3 (2000).
4. Spillane made it onto the list once, with his 1952 novel Kiss Me, Deadly. The book’s hardcover sales were meager, barely enough to put it at #11. In paperback, meanwhile, it sold millions. Gardner, on the other hand, never once appears, despite the fact that his novels regularly received first printings of over half a million copies. For comparison, contemporary estimates suggest that 10,000 copies sold in a week is enough to put a book on the lower rungs of the bestseller list. Notably, before 1940, the Times occasionally published an alternative bestseller list, drawn from numbers taken straight from book distributor Baker and Taylor rather than from bookstore sales. Gardner appeared on that list multiple times, probably because distributors also sold books to nontraditional outlets.
5. The two values have a Pearson correlation of ~0.66 and a Spearman rank correlation of ~0.54. The difference between the two values reflects the skew of the data: some authors appeared many more times on one list or the other than average. It’s worth noting that this correlation disappears when one considers both lists in full, since most authors only appear on one list (and thus have no appearances on the other). The Pearson correlation on the full data is 0.61, but the Spearman correlation is 0.02, suggesting that the Pearson is driven almost entirely by outliers.
6. Thomas Whiteside, “Onward and Upward with the Arts—The Blockbuster Complex, II,” New Yorker, October 6, 1980, pp. 136, 138, quoted in Janice Radway, Reading the Romance: Women, Patriarchy, and Popular Culture (University of North Carolina Press, 1984), pp. 37–38.
7. It’s also worth noting that which bookstores get counted will have an effect on which books make an appearance. As readers might have noticed, the top authors listed above are exclusively white. This lack of racial diversity among top sellers will not be equally true of all bookstores. For this reason, Essence magazine used to publish an alternative bestseller list derived from sales at independently owned Black bookstores. This list featured mostly Black authors, many of whom never appear on the Times list. For more details on this bestseller list, see Jacinta Saffold’s The Essence Book Project.
8. Elizabeth Mehren, “The Decade of the Mass-Market Hardcover,” Los Angeles Times, December 31, 1989.
9. Of course, some of the shift is probably attributable to changes in taste. But disentangling changes in taste from changes in distribution is beyond the scope of this piece.