Fermilab feature

Sloan Digital Sky Survey receives award for early, groundbreaking work in science data management

Sloan Digital Sky Survey received the 2021 ACM SIGMOD Systems Award for its “early and influential demonstration of the power of data science to transform a scientific domain.” The award recognized the contributions of Fermilab’s Bill Boroski, Steve Kent and Brian Yanny, as well as several others, for work done from 2000 to 2008 on the database systems developed to distribute SDSS data.

The SDSS telescope.

The Association for Computing Machinery’s Special Interest Group on Management of Data recently announced the recipients of the 2021 ACM SIGMOD awards. Sloan Digital Sky Survey, an ambitious sky-mapping project that observed first light in 1998 and, after numerous upgrades, still collects large amounts of data about celestial objects today, received the 2021 ACM SIGMOD Systems Award for its “early and influential demonstration of the power of data science to transform a scientific domain.” The award recognized more than a dozen people, among them Fermilab’s Bill Boroski, Steve Kent and Brian Yanny, for work done from 2000 to 2008 on the database systems developed to distribute SDSS data. The system not only demonstrated the value of data management technology but also influenced the field by publishing real analytic workloads that have been used for testing, comparing and advancing data management systems.

Building the Catalog Archive Server

The Sloan Digital Sky Survey was a true collaborative effort, perhaps the first large astronomy project in the world to be planned and executed in such a fashion. A number of institutions, including Fermilab, developed and built the telescope and its various elements and supporting systems. It was assembled and commissioned at Apache Point Observatory in New Mexico.

When the project entered the operations phase in 2000, data was collected at the observatory and shipped to Fermilab for processing. At the time, network bandwidth between the two sites was too limited to support acceptable transfer rates. To work around this bottleneck, magnetic tapes containing each night’s observations were packaged and shipped daily via overnight express.

Once at Fermilab, data was processed and prepared for distribution. There were two distribution channels: the Data Archive Server, developed and managed at Fermilab, which provided access to raw and calibrated data in the form of ordinary data files; and the Catalog Archive Server (CAS), the subject of the 2021 SIGMOD Systems Award, which provided access to the data through a sophisticated website and a highly tuned system of databases. The CAS was developed at Johns Hopkins University in collaboration with individuals from academia, industry and Fermilab. Access to the CAS is available through the SkyServer web portal.

Bill Boroski, the SDSS Project Manager from 1998 through 2008, oversaw all aspects of day-to-day operations, including data distribution. He worked closely with the JHU team on the planning for each data release. He also served as a key liaison between JHU, Microsoft and Fermilab.

Steve Kent was responsible for survey operations, which included setting the observing strategy to ensure that SDSS goals and objectives were met. Involved in the SDSS project from the very beginning, Kent oversaw the planning and development of nearly all software associated with observing and data processing.

Brian Yanny oversaw data processing operations at Fermilab and interacted regularly with the primary development team at JHU during the development of the CAS. Yanny also possesses an intimate knowledge of the SDSS data set and played a key role in establishing data relationships.

The SkyServer was one of the earliest large-scale, publicly searchable, database-backed archives available on the web. When it went live in 2001, it served parameters on 14 million unique sky objects in an 80 GB database. The servers resided at Fermilab, due in part to the laboratory’s high-bandwidth network connections. Today’s SkyServer, in its 16th incarnation 20 years later, serves parameters and images of nearly 500 million unique objects in multi-terabyte databases from a set of university-based sites across the United States.

Truly a group effort

The 2021 ACM SIGMOD Systems Award represents the work and contributions of many individuals associated with SDSS, including a large number of people representing many areas of Fermilab, such as the Accelerator Division, Applied Physics and Superconducting Technology Division, Finance, Particle Physics Division, Core and Scientific Computing Divisions, WDRS and Directorate.

On a more somber note, one of the SDSS co-recipients did not live to see the award. Jan Vandenberg passed away in May 2021. He was a computer scientist and the system architect for all of the computing systems and web hosting services used at Johns Hopkins University for the SDSS project. He designed and maintained the systems used to support the development and commissioning of the CAS.

All award recipients received a plaque, as well as a $5,000 honorarium to be shared collectively. They all agreed to send the honorarium to the Vandenberg family.

Fermilab is supported by the Office of Science of the U.S. Department of Energy. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.