Editor’s note: This is a version of an article originally published by Argonne National Laboratory.
Symmetry — displayed in areas ranging from mathematics and art, to living organisms and galaxies — is an important underlying structure in nature. It characterizes our universe and enables it to be studied and understood.
Because symmetry is such a pervasive theme in nature, physicists are especially intrigued when an object seems like it should be symmetric, but it isn’t. When scientists are confronted with these broken symmetries, it’s as if they’ve found an object with a strange reflection in the mirror.
The proton, a positively charged particle that exists at the center of every atom, displays asymmetry in its makeup. Physicists at the U.S. Department of Energy’s Argonne National Laboratory and their collaborators recently investigated the intricacies of this known broken symmetry through an experiment conducted at DOE’s Fermi National Accelerator Laboratory. The results of the experiment could shift research on the proton by reviving previously discarded theories of its inner workings.

In this graphical representation of the proton, the large spheres represent the three valence quarks, the small spheres represent the other quarks that make up the proton, and the springs represent the nuclear force holding them together. Image: Brookhaven National Laboratory.
The outcome of this experiment contradicts the conclusion of a study from the late 1990s, also performed at Fermilab. Scientists can now revisit theories to describe asymmetry in the proton that were ruled out by the old experiment.
Understanding the properties of the proton helps physicists answer some of the most fundamental questions in all of science, and by investigating the world at the smallest level, scientists are advancing technology we use every day. Studies of the proton have led to the development of proton therapy for cancer treatment, measurement of proton radiation during space travel and even understanding of star formation and the early universe.
“We were able to look at the puzzling dynamics within the proton,” said Argonne physicist Don Geesaman, “and through this experiment, nature is leading the way for concepts in older models of the proton to get a second look.”
Mismatched matter
Just as shapes can have symmetry, particles can, too. A perfect circle consists of two semicircles of the same size facing opposite directions, and each type of particle in the universe has an antiparticle of the same mass with opposite electric charge.
The building blocks of the proton include particles called quarks, and their antiparticles, called antiquarks. They come in “flavors,” such as up, down, antiup and antidown. Quarks and antiquarks are bound together inside the proton by the strong nuclear force. This force is strong enough to pull pairs of quarks and antiquarks out of nothing, and these pairs exist for a short time before annihilating each other. This “sea” of quarks and antiquarks popping in and out of existence is ever-present inside the proton.
Curiously, at any given time, there are three more quarks than antiquarks: two more up quarks than antiup quarks, and one more down quark than antidown quarks. In other words, these mismatched quarks have no antimatter counterparts. This asymmetry is the reason protons are positively charged, allowing atoms — and therefore all matter — to exist.
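The charge bookkeeping behind that statement can be checked with a few lines of Python. This is only an illustrative sketch: the quark charges (+2/3 for up, -1/3 for down, in units of the elementary charge) are standard values that the article does not state, and the numbers of sea pairs in the second example are made up.

```python
from fractions import Fraction

# Electric charges in units of the elementary charge (standard values,
# not stated in the article). An antiquark carries the opposite charge.
CHARGE = {
    "up": Fraction(2, 3),
    "down": Fraction(-1, 3),
    "antiup": Fraction(-2, 3),
    "antidown": Fraction(1, 3),
}

def net_charge(counts):
    """Total charge of a collection of quarks and antiquarks."""
    return sum(CHARGE[flavor] * n for flavor, n in counts.items())

# The unmatched quarks alone: two up and one down.
excess_only = {"up": 2, "down": 1, "antiup": 0, "antidown": 0}

# A hypothetical snapshot that also contains sea pairs. The pair counts are
# made up for illustration; each pair adds one quark and its antiquark.
with_sea = {"up": 2 + 3, "down": 1 + 4, "antiup": 3, "antidown": 4}

print(net_charge(excess_only))  # 1
print(net_charge(with_sea))     # 1
```

However many quark-antiquark pairs are present, their charges cancel, so the net charge comes entirely from the three unmatched quarks.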
“We still have an incomplete understanding of quarks in a proton and how they give rise to the proton’s properties,” said Paul Reimer, an Argonne physicist on the study. “The fleeting nature of the quark-antiquark pairs makes their presence in the protons difficult to study, but in this experiment, we detected the annihilations of the antiquarks, which gave us insight into the asymmetry.”

This photo shows the apparatus used in the experiment. The proton beams pass through each of the shown layers, with the iron wall at the end of the path in the upper right corner of the image. Photo: Fermilab
The experiment determined that there are always more antidown quarks in the proton than antiup quarks, no matter the quarks’ momentums. The significance of this result is its contradiction with the conclusion of the Fermilab experiment in the late 1990s, which suggested that at high momentums, the proton’s asymmetry reverses, meaning the antiup quarks begin to dominate antidown quarks.
“We designed the new experiment to look at these high momentums to determine if this change really occurs,” Reimer said. “We showed that there is a smooth asymmetry with no flip of the ratio between antiup and antidown quarks.”
Reconstructing annihilation
To probe the quarks and antiquarks in the proton, the scientists shot beams of protons at targets and studied the aftermath of the particle collisions. Specifically, they studied what happens after a proton from the beam hits a proton in the target.
When protons collide, quarks and antiquarks from the protons annihilate each other. Then, two new fundamental particles called muons come out of the annihilation, acting as the interaction’s signature. From these interactions, the scientists determined the ratio of antiup quarks to antidown quarks at a range of high momentums.

Graphic of quarks annihilating (left red lines), producing a photon (middle line), and producing two muons (right magenta lines). Scientists detected these muons to gain insight into the quark asymmetry of the proton. Image: Paul Reimer, Argonne National Laboratory
“We chose to measure muons because they can pass through material better than most of the other collision fragments,” Reimer said. In between the targets and their measurement devices, the team placed a five-meter-thick iron wall to stop other particles from passing through and clouding their signals.
When the muons hit the measurement devices at the end of their journey, the scientists reconstructed the quark-antiquark annihilations from the measurements, enabling them to confirm the smooth, consistent ratio of antiup quarks to antidown quarks.
A second look
“What we thought we saw in the previous experiment isn’t what happens,” said Geesaman, who was part of both the present and previous studies. “Why, though? That’s the next step.”
Theories that were rejected after they contradicted the previous experiment’s results now give a great description of the new data, and scientists can revisit them with greater confidence because of this experiment. These theories will inform further experiments on asymmetry in the proton and other particles, adding to our understanding of the theory surrounding quarks.
Clues about the nature of quarks in the proton ultimately lead to better understanding of the atomic nucleus. Understanding the nucleus can demystify properties of the atom and how different chemical elements react with each other. Proton research touches upon fields including chemistry, astronomy, cosmology and biology, leading to advances in medicine, materials science and more.
“You need experiment to lead the thinking and constrain theory, and here, we were looking for nature to give us insight into the proton’s dynamics,” Geesaman said. “It’s an interlacing cycle of experiment and theory that leads to impactful research.”
A paper on the study, “The asymmetry of antimatter in the proton,” was published in Nature on Feb. 24.
The work was performed by the SeaQuest Collaboration, which is supported in part by DOE’s Office of Nuclear Physics and the National Science Foundation.
Fermilab is supported by the Office of Science of the U.S. Department of Energy. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit science.energy.gov.
Analyzing the mountains of data generated by the Large Hadron Collider at the European laboratory CERN takes so much time that even the computers need coffee. Or rather, Coffea — Columnar Object Framework for Effective Analysis.
A package in the programming language Python, Coffea (pronounced like the stimulating beverage) speeds up the analysis of massive data sets in high-energy physics research. Although Coffea streamlines computation, the software’s primary goal is to optimize scientists’ time.
“The efficiency of a human being in producing scientific results is of course affected by the tools that you have available,” said Matteo Cremonesi, a postdoc at the U.S. Department of Energy’s Fermi National Accelerator Laboratory. “If it takes more than a day for me to get a single number out of a computation — which often happens in high-energy physics — that’s going to hamper my efficiency as a scientist.”
Frustrated by the tedious manual work they faced when writing computer code to analyze LHC data, Cremonesi and Fermilab scientist Lindsey Gray assembled a team of Fermilab researchers in 2018 to adapt cutting-edge big data techniques to solve the most challenging questions in high-energy physics. Since then, around a dozen research groups on the CMS experiment — one of the LHC’s two large general-purpose detectors — have adopted Coffea for their work.

Around a dozen research groups on the CMS experiment at the Large Hadron Collider have adopted the Coffea data analysis tool for their work. Starting from information about the particles generated in collisions, Coffea enables large statistical analyses that hone researchers’ understanding of the underlying physics, with faster run times and more efficient use of computing resources. Photo: CERN
Starting from information about the particles generated in collisions, Coffea enables large statistical analyses that hone researchers’ understanding of the underlying physics. (Data processing facilities at the LHC carry out the initial conversion of raw data into a format particle physicists can use for analysis.) A typical analysis on the current LHC data set involves processing roughly 10 billion particle events, which can add up to more than 50 terabytes of data. That’s the data equivalent of approximately 25,000 hours of streaming video on Netflix.
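As a rough sanity check on those figures, the short calculation below works out what they imply per event and per hour of video. It is a back-of-the-envelope sketch that treats the quoted numbers as exact and assumes decimal units (1 terabyte = 10^12 bytes).

```python
# Figures quoted in the article, treated as exact for this estimate.
n_events = 10e9            # roughly 10 billion collision events
total_bytes = 50e12        # 50 terabytes, assuming 1 TB = 10**12 bytes
video_hours = 25_000       # hours of streaming video

bytes_per_event = total_bytes / n_events
gb_per_video_hour = total_bytes / video_hours / 1e9

print(bytes_per_event)     # 5000.0 bytes, about 5 kB per event
print(gb_per_video_hour)   # 2.0 GB per hour of video
```

In other words, the comparison corresponds to about 5 kilobytes per collision event and about 2 gigabytes per hour of video.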
At the heart of Fermilab’s analysis tool lies a shift from a method known as event loop analysis to one called columnar analysis.
“You have a choice whether you want to iterate over each row and do an operation within the columns or if you want to iterate over the operations you’re doing and attack all the rows at once,” explained Fermilab postdoctoral researcher Nick Smith, the main developer of Coffea. “It’s sort of an order-of-operations thing.”
For example, imagine that for each row, you want to add together the numbers in three columns. In event loop analysis, you would start by adding together the three numbers in the first row. Then you would add together the three numbers in the second row, then move on to the third row, and so on. With a columnar approach, by contrast, you would start by adding the first and second columns for all the rows. Then you would add that result to the third column for all the rows.
“In both cases, the end result would be the same,” Smith said. “But there are some trade-offs you make under the hood, in the machine, that have a big impact on efficiency.”
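The three-column example can be written out in Python using the NumPy library. This is a minimal sketch of the two orderings, not Coffea code: the column names a, b and c, the function names and the numbers are all made up for illustration.

```python
import numpy as np

# Hypothetical table: four rows, three columns named a, b and c.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([10.0, 20.0, 30.0, 40.0])
c = np.array([100.0, 200.0, 300.0, 400.0])

def event_loop_sum(a, b, c):
    """Visit one row at a time and add its three numbers."""
    totals = []
    for i in range(len(a)):
        totals.append(a[i] + b[i] + c[i])
    return np.array(totals)

def columnar_sum(a, b, c):
    """Apply each addition to every row at once."""
    a_plus_b = a + b        # first operation, all rows
    return a_plus_b + c     # second operation, all rows

loop_result = event_loop_sum(a, b, c)
column_result = columnar_sum(a, b, c)
```

Both functions return the same totals. In the event loop version, the Python interpreter executes the loop body once per row. In the columnar version, each addition is a single call that NumPy carries out over the whole column in compiled code, which is where the efficiency difference comes from on large tables.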
In data sets with many rows, columnar analysis runs around 100 times faster than event loop analysis in Python. Yet prior to Coffea, particle physicists primarily used event loop analysis in their work — even for data sets with millions or billions of collisions.
The Fermilab researchers decided to pursue a columnar approach, but they faced a glaring challenge: High-energy physics data cannot easily be represented as a table with rows and columns. One particle collision might generate a slew of muons and few electrons, while the next might produce no muons and many electrons. Building on a library of Python code called Awkward Array, the team devised a way to convert the irregular, nested structure of LHC data into tables compatible with columnar analysis. Generally, each row corresponds to one collision, and each column corresponds to a property of a particle created in the collision.
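One common way to make such irregular data columnar is to store all the particle values in a single flat array and keep a separate record of how many belong to each collision. The sketch below illustrates that idea with plain NumPy. It does not use the Awkward Array or Coffea APIs, and the muon values, the 25.0 threshold and the variable names are invented for illustration.

```python
import numpy as np

# Hypothetical toy data: muon transverse momenta (GeV) for four collisions.
# Each collision produces a different number of muons, so the data is jagged.
events = [[41.2, 27.5], [], [63.0], [12.1, 9.8, 30.4]]

# Columnar layout: one flat array of values plus a count per collision.
counts = np.array([len(muons) for muons in events])
muon_pt = np.array([pt for muons in events for pt in muons])
# offsets[i]:offsets[i + 1] is the slice of muon_pt belonging to collision i.
offsets = np.concatenate(([0], np.cumsum(counts)))

# Label every muon with the index of the collision it came from.
event_index = np.repeat(np.arange(len(events)), counts)

# Per-collision sum of muon pT, computed over all muons at once.
# bincount handles collisions with zero muons correctly (their sum is 0).
pt_sum = np.bincount(event_index, weights=muon_pt, minlength=len(events))

# Select muons above a threshold, then count how many survive per collision.
passes = muon_pt > 25.0
n_selected = np.bincount(event_index[passes], minlength=len(events))
```

Every step here operates on whole arrays, so a collision with no muons and a collision with many are handled by the same operations without a Python loop over collisions.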
Coffea’s benefits extend beyond faster run times (minutes, compared with hours or days for interpreted Python code) and more efficient use of computing resources. The software takes mundane coding decisions out of the hands of the scientists, allowing them to work on a more abstract level with fewer chances to make errors.
“Researchers are not here to be programmers,” Smith said. “They’re here to be data scientists.”
Cremonesi, who searches for dark matter at CMS, was among the first researchers to use Coffea with no backup system. At first, he and the rest of the Fermilab team actively sought to persuade other groups to try the tool. Now, researchers frequently approach them asking how to apply Coffea to their own work.
Soon, Coffea’s use will expand beyond CMS. Researchers at the Institute for Research and Innovation in Software for High Energy Physics, supported by the U.S. National Science Foundation, plan to incorporate Coffea into future analysis systems for both CMS and ATLAS, the LHC’s other large general-purpose experimental detector. An upgrade to the LHC known as the High-Luminosity LHC, targeted for completion in the mid-2020s, will record about 100 times as much data, making the efficient data analysis offered by Coffea even more valuable for the LHC experiments’ international collaborators.
In the future, the Fermilab team also plans to break Coffea into several Python packages, allowing researchers to use just the pieces relevant to them. For instance, some scientists use Coffea mainly for its histogram feature, Gray said.
For the Fermilab researchers, the success of Coffea reflects a necessary shift in particle physicists’ mindset.
“Historically, the way we do science focuses a lot on the hardware component of creating an experiment,” Cremonesi said. “But we have reached an era in physics research where handling the software component of our scientific process is just as important.”
Coffea promises to bring high-energy physics into sync with recent advances in big data in other scientific fields. This cross-pollination may prove to be Coffea’s most far-reaching benefit.
“I think it’s important for us as a community in high-energy physics to think about what kind of skills we’re imparting to the people that we’re training,” Gray said. “Making sure that we as a field are pertinent to the rest of the world when it comes to data science is a good thing to do.”
U.S. participation in CMS is supported by the Department of Energy Office of Science.