Facebook Engineers: We Have No Idea Where We Keep All Your Personal Data

In March, two veteran Facebook engineers found themselves grilled about the company’s sprawling data collection operations in a hearing for the ongoing lawsuit over the mishandling of private user information stemming from the Cambridge Analytica scandal.

The hearing, a transcript of which was recently unsealed, was aimed at resolving one crucial issue: What information, precisely, does Facebook store about us, and where is it? The engineers’ response will come as little relief to those concerned with the company’s stewardship of billions of digitized lives: They don’t know.

The dispute over where Facebook stores data arose when, as part of the litigation, now in its fourth year, the court ordered Facebook to turn over information it had collected about the suit’s plaintiffs. The company complied but provided data consisting mostly of material that any user could obtain through the company’s publicly accessible “Download Your Information” tool.

Facebook contended that any data not included in this set was outside the scope of the lawsuit, ignoring the vast quantities of information the company generates through inferences, outside partnerships, and other nonpublic analysis of our habits — parts of the social media site’s inner workings that are obscure to consumers. Briefly, what we think of as “Facebook” is in fact a composite of specialized programs that work together when we upload videos, share photos, or get targeted with advertising. The social network wanted to keep data storage in those nonconsumer parts of Facebook out of court.

In 2020, the judge disagreed with the company’s contention, ruling that Facebook’s initial disclosure had indeed been too sparse and that the company must reveal data obtained through its oceanic ability to surveil people across the internet and make monetizable predictions about their next moves.

Facebook’s stonewalling has been revealing on its own, providing variations on the same theme: It has amassed so much data on so many billions of people and organized it so confusingly that full transparency is impossible on a technical level.
