Nick Smith

Fermilab operates the world's largest CMS Tier-1 facility. It provides 115 petabytes of data storage, grid-enabled CPU resources and high-capacity network connections to other centers. Photo: Reidar Hahn

Data science is one of the world’s fastest-growing industries, and as a consequence, a large ecosystem of software tools has emerged to enable data mining at ever-increasing scales. Data processing campaigns have distilled the more than 100 petabytes of raw data produced by the CMS experiment to around 10 terabytes. Even this reduced data set is still unwieldy for high-energy physics (HEP) researchers to analyze. Fermilab researchers are currently leading an effort that uses novel approaches to complete two full CMS analyses.