Justice for “Data Janitors”

Lilly Irani

Photograph by John Marino / Wikimedia Commons

January 15, 2015 — Science fiction author Arthur C. Clarke famously declared that any sufficiently advanced technology becomes indistinguishable from magic. Today, the likes of Google, Amazon, and Facebook appear determined to sell us the dream that machines—drones, self-driving cars, and one-click shopping services—can almost miraculously fulfill users’ desires. But what is at stake in hiding the delivery people, stockroom workers, content moderators, and call center operators laboring to produce the automated experience?

Between 2003 and 2007, I worked as a “user experience designer” at Google, a company celebrated for being a creative, perk-filled information factory. Once, as I worked in my cubicle, an Oprah camera crew rolled by, filming my stickered laptop and whiteboards covered with scribbles. Every so often, company founders Larry Page and Sergey Brin would lead policy makers, including Bill Clinton and Colin Powell, around the campus, conjuring visions of Silicon plenty and showing off the engineers, designers, and product managers who would build them.

Another Google presented itself after hours, at the edges of campus, in the marginalia of product talks, and beyond journalists’ and policy makers’ view. Watching a “tech talk,” I saw an engineer present a machine for turning the pages of rare books under a scanning camera. The patented machine housed a worker who flipped the pages in time to a rhythm-regulated soundtrack. Later, I worked on an advertising project to partially automate the process of sanitizing Google ad results. Indian workers I never met checked ads to filter out porn, alcohol, and violence. The partial automation reduced their work but could not replace it completely.

These moderators and scan workers never showed up in the lavish, celebrated spaces where Googlers drank, ate, and brainstormed. They didn’t ride the Google shuttle, eat the Google food, or attend beer-filled all-hands Friday meetings. In fact, Google’s abundantly productive, nonhierarchical, and playful workplace seemed to rely on hidden layers of human data work: subcontractors who were off the books, out of sight, and safely away from both central campus and technological entrepreneurship’s gleaming promise of job creation.

The human-fueled automations I saw at Google are also largely out of sight in current international debates about the relationship between digital technology and the future of work. Will technology produce new jobs, new industries, and new forms of comparative advantage? Or will technology take away jobs and concentrate wealth among those who own the machines? As the US and Europe grapple with austerity policies, the threat to employment is deepened by the boom in machine learning, robotics, and drones, as two recent books show.

Erik Brynjolfsson and Andrew McAfee’s The Second Machine Age offers a tour through corporate visions of spectacular automation, and well-meaning advice about how to stave off the future inequality such visions might generate. Simon Head’s Mindless redirects our gaze: business automation has already spread far and wide, he demonstrates, and its dire effects on workers’ wages and souls lie in the here and now. In examining computerization futures—whether in terms of abundance or of alienation—neither book recognizes the profits and pleasures of pretending technology is magic. That magic always relies on invisible labors.

The Second Machine Age opens with a roller coaster ride through the promised future of artificial intelligence (AI) and its threat to replace human labor. Once incredulous about AI’s reach, the authors have become AI faithful, and here they march out one killer demo after another. They open the book with a ride in Google’s self-driving car. The vehicle drives them around the Mountain View campus and surrounding roads with nary a bump or a hard brake. Google has named the driving software Chauffeur. Chauffeur symbolically promises middle-class drivers relief from drudgery and suggests automated access to the lifestyles of the rich and famous. Chauffeur represents just one of legion labor-replacing automations around the corner, the authors prophesy. Why hire a living person when a learning robot can work quickly, quietly, and without breaks, demands, or opinions?

There are lots of reasons managers won’t fully automate workforces. The cost of research and development (R&D) for automations like Chauffeur can be staggering. Far from simply realizing a dream of GoogleLabs, Google’s Chauffeur project builds on decades and untold millions of dollars of military investment in research on autonomous cars. The military’s on-the-ground autonomous warfare initiatives have produced and shaped an ecology of careers, techniques, infrastructures—including that of Chauffeur leader Sebastian Thrun. Complex automation, then, is not something many companies can develop on their own, even with dedicated effort.

But could AI and robotics companies do the R&D and sell automation technologies to other companies? To the extent that companies constrain their operations to meet robots’ limitations, yes. But even still, human labor is necessary to configure, calibrate, and adjust automation technologies to adapt to a changing world, whether those changes are a differently shaped product or a bird that flies into the factory. Brynjolfsson and McAfee lump together a wide swath of AI applications and predict that the successes among them portend the more general expansion of automated work. But in doing so they overlook the enormous amounts of behind-the-scenes, domain-specific labor that makes AI possible in the first place. Google’s self-driving car doesn’t simply go anywhere its passengers please. For this car to drive “itself,” a human worker has to drive around, scan, and map the car’s world—including everything from curb heights to intersection angles. Machine-learning algorithms that partially automate data processing still need to be trained for every new form, or every new kind of topic the algorithm might deal with. Other robots profiled in The Second Machine Age will learn the movements of shop floor workers and then replace them, until the next tune-up or calibration is necessary. Such work of alignment is not a bug—it is the condition of possibility for keeping humans and automation working in the same world. The Second Machine Age leans heavily on the accounts of corporate executives promising fantastic new horizons of tech profit, but it’s undeniable that for those pursuing customers and venture capital for automation, there’s good money to be had in hiding these headaches.

This care for and feeding of artificial intelligence suggests a much bigger oversight in Brynjolfsson and McAfee’s argument. Automation doesn’t replace labor. It displaces it. Historian Ruth Schwartz Cowan famously showed how the invention of the washing machine mainly increased the standards of cleanliness domestic workers (paid and unpaid) had to meet. Shoshana Zuboff’s 1988 book In the Age of the Smart Machine described how factory automation created text-based labor, displacing workers who smelled and felt wood pulp with those who could read screens and meters and tend to the machines. Hamid Ekbia and Bonnie Nardi call these managerially advantageous human-machine configurations “heteromation.”

The emergence of the digital microwork industry to tend artificial intelligence shows how labor displacement generates new kinds of work. As technology enterprises attempt to expand the scope of culture they mediate, they have had to grapple with new kinds of language, images, sounds, and sensor data. These are the kinds of data that flood Facebook, YouTube, and mobile phones—data that digital microworkers are then called on to process and classify. Such microworkers might support algorithms by generating “training data” to teach algorithms to pattern-match like a human in a certain domain. They might also simply process large volumes of cultural data to prepare it to be processed in other ways. These cultural data workers sit at computer terminals, transcribing small audio clips, putting unstructured text into structured database fields, and “content moderating” dick pics and beheadings out of your Facebook feed and Google advertisements.

Computers do not wield the cultural fluencies necessary to interpret this kind of material; but people do. This is the hidden labor that enables companies like Google to develop products around AI, machine learning, and big data. The New York Times calls this “janitor work,” labeling it the hurdle, rather than the enabling condition, of our big data futures. The second machine age doesn’t like to admit it needs help.

These cultural data workers are central to the political economy of computing, from the free labor of AOL chat room moderators to the organized but invisible labor of paid content moderators. Since the early 2000s, Google has relied on data workers to tune and train its algorithms. The company constantly refines its search algorithms in a war for higher rankings with other search optimizers and spammers. How do Google engineers figure out if their new algorithm produces high-quality results? They have to rely on workers called “raters”—contractors often working from home—to judge the search result pages and rate them; workers can label resulting pages as “vital,” “useful,” “slightly relevant,” or even “maybe spam.” Google engineers then feed these worker-generated ratings back into their algorithm so the algorithm can learn to see more like the rating workers.
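The feedback loop described here can be made concrete with a small sketch. It is only an illustration: the four labels come from the account above, but the numeric scale, the function name, and the sample URLs are assumptions invented for this example, not a description of Google's actual pipeline.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical numeric values for the rater labels quoted above;
# the scale Google actually uses is not described in the text.
LABEL_SCORES = {
    "vital": 3.0,
    "useful": 2.0,
    "slightly relevant": 1.0,
    "maybe spam": 0.0,
}

def aggregate_ratings(ratings):
    """Average the human labels given to each (query, url) pair."""
    by_pair = defaultdict(list)
    for query, url, label in ratings:
        by_pair[(query, url)].append(LABEL_SCORES[label])
    return {pair: mean(scores) for pair, scores in by_pair.items()}

# Made-up judgments from three raters.
ratings = [
    ("binders", "example.com/office-supplies", "useful"),
    ("binders", "example.com/office-supplies", "vital"),
    ("binders", "example.com/cheap-pills", "maybe spam"),
]

# These averaged scores are the kind of target a ranking
# algorithm could then be trained to reproduce.
targets = aggregate_ratings(ratings)
```

Every number the algorithm learns from in this sketch began as a person's judgment about a page.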

Twitter also relies on cultural data workers to help its search engine cope with breaking news and instances of sudden linguistic change. Think of when Mitt Romney impressed the nation with his diversity strategies, including “binders full of women.” The gaffe set Twitter afire with satire, but Twitter’s algorithms didn’t know the difference between “binders full of women” and binders on sale at Office Depot for 99 cents. Such differences matter for Twitter’s search results, for its ad placements, and for its famed trend detection. Because terms like these spike and go away very quickly, Twitter engineers can’t train algorithms fast enough. To fill the gap, Twitter deploys an army of cultural data workers to sort and classify tweets in real time.

In response to demand for this cultural labor, several companies have sprung up to match workers to engineers who need them. The most famous of these is Amazon Mechanical Turk (AMT), the data work clearinghouse that Twitter relied on to power the search engine described above. Amazon launched AMT to allow programmers to issue data processing calls directly from their computer code. Rather than other code answering the call, thousands of workers wait at their computers, ready to perform cognitive piecework on demand. AMT workers choose among tasks like transcription, content moderation, and image classification, getting paid per piece of data processed. They might work for 10 employers in the span of a day. Amazon sends the data work to the employer, structures the market that enables the microcontracts, and, should the employer choose to pay, transmits payment to the worker.
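A toy model may help show what it means for a person, rather than other code, to answer a programmer's call. The sketch below is not Amazon's API; the class, method names, task text, and reward are all hypothetical, chosen only to mirror the steps just described: post a task, have a worker complete it, and leave the decision to pay with the employer.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Task:
    prompt: str
    reward_cents: int
    answer: Optional[str] = None
    paid: bool = False

@dataclass
class PieceworkMarket:
    """Toy stand-in for a microwork platform; not Amazon's actual API."""
    open_tasks: List[Task] = field(default_factory=list)

    def post(self, prompt: str, reward_cents: int) -> Task:
        # The employer's code issues a request, as if calling a function.
        task = Task(prompt, reward_cents)
        self.open_tasks.append(task)
        return task

    def work(self, task: Task, answer: str) -> None:
        # A human worker, not other code, supplies the answer.
        task.answer = answer
        self.open_tasks.remove(task)

    def review(self, task: Task, approve: bool) -> int:
        # The employer alone decides whether the worker is paid.
        task.paid = approve
        return task.reward_cents if approve else 0

market = PieceworkMarket()
task = market.post("Transcribe this 10-second audio clip", reward_cents=5)
market.work(task, "hello world")
earned = market.review(task, approve=False)  # work delivered, payment refused
```

In this sketch the employer keeps the transcription whether or not it approves payment, which is the asymmetry that matters for the working conditions these platforms allow.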

Work conditions for these data workers are what “the market,” or workers, will tolerate. As contractors, AMT workers are excluded from the protections of minimum-wage laws. Amazon also allows employers to decide whether or not they want to pay. The intention is to let employers set standards. The effect is that unscrupulous AMT users steal wages. Although workers share information to avoid these thieves, they report that Amazon will very rarely step in to arbitrate disputes when an employer and worker disagree about work quality or where the fault lies for bad work.

New companies, and even a subfield of computer science, have sprung up to develop research and applications around algorithmically managed “human computation.” History repeats itself. Originally, “computers,” until their calculations were automated in the mid-20th century, were women. Today’s hierarchy of data labor echoes older gendered, classed, and raced technology hierarchies. What’s new is the way AMT and similar crowdsourcing platforms democratize outsourcing to any employer with a computer and credit card.

Such necessary and low-paid data work has no place in The Second Machine Age. McAfee and Brynjolfsson readily admit that AI will not replace all jobs. They quote Steven Pinker: “The main lesson of thirty-five years of AI research is that the hard problems are easy and the easy problems are hard. … [I]t will be the stock analysts and petrochemical engineers and parole board members who are in danger of being replaced by machines. The gardeners, receptionists, and cooks are secure in their jobs for decades to come.” First, this all depends on how one defines “secure,” since those irreplaceable workers often earn little and work on precarious terms. Second, McAfee and Brynjolfsson ignore the labor of cultural data workers, as if algorithms trained, tuned, and augmented themselves, like magic.



Undazzled by AI fantasies, journalist Simon Head looks closely at the semi-automated work we’ve been living with for decades. In particular, Head details how computing business systems (CBSs) have shaped work conditions at warehouses, banks, and call centers. CBSs are massive networked data management systems that underpin the operations of most very large financialized organizations, from Goldman Sachs to public universities. The systems are built, sold, and maintained by high-tech behemoths, some famous, some that operate below the radar: IBM, PeopleSoft, SalesForce, and SAP, just to name a few. Head, citing Brynjolfsson and McAfee’s earlier research, points out that enterprise resource planning systems—systems that automate and control organizational data processing—comprised 75 percent of all US corporate IT investment in 2001.

These routine technologies rarely make the nightly news but, unheralded, they allow managers tight control over workers. Managers use algorithms to steer employee workflows. They can track workers’ typing at their keyboards and their movements through body-worn GPS. They can monitor fulfillment rates or success at sales and cut workers who cannot meet targets. By manipulating information screens, managers never have to confront workers, who might push back, or observe workers’ circumstances.

Head describes Amazon’s warehouses as a prime example of such grueling, semi-automated management. Amazon’s algorithms take incoming orders and develop scripts to direct a worker around the warehouse. The worker has to follow the script, gathering items into carts and meeting travel times set at management whim. As with AMT, employers set the script and workers have to meet it or leave. Warehouse workers are hired on as temps, so management can let go those who cannot keep the pace: older workers, sick workers, or just tired workers.

Call center workers, ticket agents, and delivery people all work under similar scripts and under comparable surveillance. The CBS-enabled workplace has become the factory floor of the service economy. CBS control, Head argues, forces people into skill-stripped jobs that satisfy corporate thirst for transparency and control, heightening the effects of earlier strategies of scientific management and assembly line factory organization. The scripts allow service workers, like their warehouse counterparts, little latitude to express judgment, creatively problem-solve, or leverage their built-up, on-the-job wisdom. Instead, managers retain control so they can swap one worker for another and keep wages low.

CBSs organize the systems we all have to live with as consumers, enabling the production of complex financial instruments, insurance plans, and health care systems—systems that shape our lives but operate in mysterious, seemingly nonnegotiable ways. Head even argues that the financial crisis is a consequence of CBSs run amok. Banks like Goldman Sachs script their bankers to sell financial products—complex informational commodities—at a speed and scale that prohibits individual, human judgment from interrupting fast capitalism’s flow.

For Head, a better digital workplace is possible. The Treuhand workshop in Chemnitz, Germany, presents one possibility. The shop uses advanced machining systems to manufacture components, but strong trade unions facilitate worker control over their labor. Managers send specifications to workers trained in craft-apprentice traditions and those workers decide how to use machine tools to design the component. Managers check for quality only just before shipment. The Treuhand workers augment their craft with technology without falling under managerial microcontrol from a distance. We might call this approach specify and enable.

Head then compares the German shop with two others that use the same technologies: Caterpillar in Peoria, Illinois, and John Deere in Waterloo, Iowa. In the American shops, managers act as engineers, specifying both parts and process in detail. They plan how to machine a component. Machinists then execute the plan. Little collaboration transpires between machinists, who know the machines intimately, and the managers who detail the work. When machining the materials inevitably reveals design problems, the distance between workers and managers stymies resolution. With their command-and-control structure, the American workshops do not meet the design and quality standards German companies achieve with empowered machinists.

Automation itself is not to blame, Head argues. The problem lies with the ways automation entrenches command-and-control relationships between managers and workers. This is a deep and valuable insight of the book. Such insight could have led Brynjolfsson and McAfee to more productive questions. They might still ask if and how automation may augment productivity, but they should also ask: What exactly is being produced? And, What is its quality? Command-and-control automation strips away “the human factor,” as Head calls it. Such a wide range of accumulated human ability and wisdom can generate more subtle kinds of value than that which top management can predict. Head’s subtitle, “why smarter machines are making dumber humans,” sounds like catchy technological determinism, and yet the book actually tells a more subtle story about machines and social power.

In fact, Head’s analysis of CBS systems answers a question that effectively haunts Brynjolfsson and McAfee’s book: what are the implications of digital technologies for the growing gap between rich and poor? The Second Machine Age notes that wage stagnation parallels another economic trend; its authors argue that owners of physical capital today keep a far larger portion of their profits than in the past five decades. (Their argument assumes but does not demonstrate that businesses more dependent on human capital such as talent or skill distribute profits more equally.) Unwilling to question free-market ideas, the authors argue that digital bounty replaces some workers’ skills; it also creates rock stars, who gather in the lion’s share of profit, by delivering their superior talent to bigger markets. This conventional explanation effaces the struggles over control, craftsmanship, and the value of worker labor that Head details so well. Such different diagnoses of inequality lead to starkly differing prescriptions.

Brynjolfsson and McAfee believe machines produce digital abundance; humans, therefore, must become entrepreneurial masters of the machines or peacefully coexist with them. This leads them to personal and policy recommendations that focus on individualized human capital enhancement. They argue, for example, that massive open online courses (MOOCs) and revamped universities can reskill soon-to-be-redundant workforces. Machines, Brynjolfsson and McAfee argue, are great at processing data but can’t generate new ideas. To race against the machine, workers should become creative entrepreneurs. Still, they argue, the digital playing field will produce irrationalities and require some forms of redistribution. Brynjolfsson and McAfee advocate for high taxes on top earners, recognizing that those who control platforms or brands often charge more because they can (part of what economists call rent seeking). More radically, the authors suggest funding basic income guarantees (BIG) to sustain lives and consumer demand during capitalism’s periodic crises.

Head, on the other hand, would improve the quality of life and work. He calls for a coalition among those squeezed by CBSs (white-collar workers, like middle managers and nonelite professionals, and low-income workers) to press for higher-paying, more highly skilled jobs. CBSs, he argues, should supplement worker expertise rather than replace it. Head calls on governments to reinvest in educational institutions that can prepare workers for these more highly valued jobs. Brynjolfsson and McAfee, by contrast, take the CEO perspective on the world, assuming that skill and talent concentrate at the top, with mediocrity increasingly predominant as one descends the ladder of success. The authors of both these books agree that the lower rungs of production are low-skill, whether by repression in Mindless or absence of need in The Second Machine Age. Neither book displays an awareness of how the likes of Google, Twitter, and Amazon already rely on low-status workers’ smarts to power the companies’ seemingly miraculous algorithms and information systems. Of course, the magic depends on a sleight of hand that obscures the humans and their minds inside the machines.

In the early years of AMT, I saw programmers excitedly explain the system to one another. They could write code as they always did, and call upon Amazon’s processors with data crunching requests. When they sent their requests to Amazon’s servers, they described how it was “like magic”: the servers answered the call. The magic, however, was Turk workers’ handiwork. With workers hidden in the technology, programmers can treat workers like bits of code and continue to think of themselves as builders, not managers. Anthropologist Lucy Suchman has argued that much technology enchants in precisely this way, by masking the labors of production. Technologies like AMT, then, do not make dumber humans, but rather channel the human factor into forms pleasurable for programmers.

The aesthetics of service magic pervade many businesses launched by technologists. Google Express (formerly Google Shopping Express) enables customers in pilot cities to click a few buttons on the web to receive their chapstick, board games, and other items from brick-and-mortar stores. A subcontractor courier, directed and paid by Google, will deliver the items to customers’ front doors. Another company, Hointer, allows mall shoppers to walk up to products, scan them, and have robots and invisible workers drop the right-sized jeans and sweaters into a dressing room. The company, started by an early Amazon employee, makes shopping without salespeople possible. In all these cases, workers are not only complementing automation; they are employed to simulate it.

What is at stake here? Google, Hointer, and AMT all work to “disintermediate” consumption, to remove obstacles between the user and her desires. With workers and distribution infrastructures out of sight, consumers can focus on and value the brand, the product, and the platform. Disintermediation liberates consumers from troublesome or awkward labor encounters and accountabilities.

The stakes of technological magic are also financial. Companies able to pose as “technology” companies can also command higher market valuations than those with lots of labor on the books. This became clear to me at a technology industry conference when a venture capitalist explained that he saw crowdsourcing companies as technology companies, rather than labor companies. Technology companies invest lots in up-front research, but then scale to serve large volumes of customers—and make large volumes of profit—with relatively little increase in operating costs. Labor-intensive companies, on the other hand, increase their labor expenditures as their revenue increases. Venture capitalists love to invest money up front and then collect massive returns. Technology companies enjoy soaring market valuations when they make their non-R&D labor force as low-cost and low-risk as possible.

The global distribution of information work reveals a racial politics to hiding the workers as well. Responding to customers’ preference to hear voices like their own, Avatar Technologies has devised software that allows overseas operators to “speak” American English; at their stations in Asia, workers click buttons that trigger pre-recorded voices. A Mexican call center capitalizes on deportees from the US to offer American-accented services to US clients. In his 2008 film, Sleep Dealer, screenwriter and director Alex Rivera deftly depicts the transnational politics of American service work. His protagonist, Memo Cruz, labors in an infomaquila where he uses a networked telepresence system to control construction robots across the US border. “We give the United States what they always wanted,” Cruz explains of his remote labors, “all the work, without the workers.”

Subcontracted work powers technological magic, just out of consumer sight and off the company’s payroll. Google page raters work for outsourcing companies. Google Express is powered by subcontracted courier companies. AMT workers are all self-employed contractors. The US Equal Employment Opportunity Commission tracks labor force diversity with the Employer Information report EEO-1. Google’s EEO-1, released reluctantly in 2013, lists only managers, technicians, sales, and professional workers. Their report suggests they employ almost no laborers, service workers, or operatives. These workers power the tech industry, yet are out of sight and out of mind in the press and policy on diversifying the tech workplace. The diversity is there. It’s just subcontracted and paid poorly.

These workers excel in doing what machines cannot. They have won the race against the machine, but they do not always even make minimum wage. Should they lean in, take MOOCs, and be more creative? Among the most active AMT workers, nearly 58% already have a bachelor’s degree or higher. (An AMT worker, clickhappier, generated this statistic by cross-tabulating data from NYU professor Panos Ipeirotis.) Many do the work because the cost of saying no is too high when bills come due. Their employers, situated in universities, start-ups, and tech companies, have more powerful forms of human capital that enable them to take advantage of AMT workers’ diverse skills and talents.

The pleasures and conveniences of human-powered technology will continue to fuel a growing market for technology’s hidden laborers. Employers, driven by profit margins and stock prices, have great incentives to keep these workers off the books and out of sight. Inside the machines, inequality will persist. Unless, that is, we discredit and challenge the industry’s hierarchies of value that grant managers and programmers rock star status and wealth, while confining data workers to a life of underpayment and insecurity.

How can workers withdraw their essential labors to demand fair work conditions? Despite the shortcomings of their analysis, Brynjolfsson and McAfee propose a weapon that could strengthen the hidden workers of the digital age: a basic income guarantee (BIG). The guarantee provides all citizens with a lump sum; the programs can make up the cost of money outlays by cutting government institutions focused on administering, disciplining, and monitoring welfare recipients—a redistribution strategy compatible with neoliberal economic theories that attempt to separate the market from the social. The state of Alaska already has such a guarantee in the form of oil income disbursements. Brynjolfsson and McAfee frame the BIG as a support for those who cannot produce value in an automated economy; we might also see it as a means to strengthen opposition to coercion. Today’s hidden data contractors often must work to maintain their livelihoods—strikes prove difficult in online work when anyone with a computer and a bill coming due can break the picket line. An income guarantee would allow workers to walk away, or at least starve the algorithms of their data until managers shape up.