To Make Fairer AI, Physicists Peer Inside Its Black Box

Physicists built the Large Hadron Collider to study the inner workings of the universe. Inside a 27-kilometer underground ring straddling the French-Swiss border, the machine smashes protons together at nearly the speed of light to produce—fleetingly—the smallest constituent building blocks of nature. Sifting through snapshots of these collisions, LHC researchers look for new particles and scrutinize known ones, including their most famous find, in 2012: the Higgs boson, whose behavior explains why other fundamental particles like electrons and quarks have mass.
Less well known is the intricate software engine that powers such discoveries. With particle collisions occurring approximately a billion times per second, the facility generates about 40 terabytes of data per second, according to LHC physicist Maurizio Pierini. Consequently, researchers have developed extensive tools for extracting signal from noise. Part of the analysis involves creating detailed simulations of the collisions, which allow physicists to lay out exactly what they know, so they can more easily spot the unexpected. These simulations require massive computing power, and researchers will accelerate their data production as they prepare to upgrade the LHC to produce around six times as much in 2027. To scale up the required simulations, Pierini is considering the tech trend of the moment: artificial intelligence.
Specifically, Pierini has begun simulating proton collisions with an AI algorithm known as a generative adversarial network, or GAN. These algorithms are known for their ability to create realistic-looking fake data. “We can even simulate the behavior of particles that we don't know exist,” says Pierini.

In fact, perhaps the most notorious application of GANs also involves producing objects that don’t exist. They are the software behind deepfakes, the fake images of human faces that have been used to forge videos portraying events that never really happened, or to glue a person’s image onto another person’s body. That’s right; the same algorithms that trolls use to make fake celebrity porn may also help uncover the universe’s deepest mysteries. Physicists have found a noble purpose for some of AI’s creepiest tools.

And this poses a moral quandary to physicists, who know that any algorithm can be both creepy and noble, depending on its context. “I'm quite convinced that AI will ultimately be either the best thing or the worst thing ever to happen to humanity, depending on how we use it,” says physicist Max Tegmark of the Massachusetts Institute of Technology.
Here’s another example of the amorality of AI algorithms: When applied to personal data, they can make racially biased mistakes. Studies have shown that commercial facial-recognition software makes mistakes with Black faces at far higher rates than with white ones. In January, a Michigan police department facial-recognition algorithm misidentified Robert Julian-Borchak Williams, a Black man, as a shoplifting suspect, resulting in possibly the first example of a wrongful arrest in the US due to faulty facial-recognition software. But physicist Brian Nord of Fermilab uses algorithms derived from facial-recognition technology to identify abnormal-looking galaxies whose light is warped by curved spacetime. Nord, who is Black, studies these galaxies for clues to why the universe is expanding ever faster.
In 2014, before Nord began using AI, he and his colleagues classified these galaxies by eye. “It was a huge endeavor to identify these objects,” says Nord. “So I started exploring other algorithms out there and accidentally came upon facial recognition.” The software worked—after all, to a computer, faces and galaxies are both just data. Yet for Nord, this raises concerns about how such work might escape from the lab: What if physicists made an algorithm for galaxies, but somebody else re-purposed it for surveillance?