AI/ML virtual seminar: Bridging the gap between simulations and instrument data – domain adaptation for deep learning in astronomy

  • April 22, 2021, 11:00 am US/Central
  • Zoom

Speaker: Aleksandra Ćiprijanović, Fermilab

Abstract: In astronomy, high-energy physics, and other sciences, neural networks are often trained on simulation data with the prospect of being used on real instrument data. Astronomical large-scale surveys are already producing very large datasets, and machine learning will play a crucial role in enabling us to fully utilize all of the available data. Unfortunately, training a model on simulated data and then applying it to observations can lead to a substantial decrease in model accuracy on the new target dataset. In this context, simulated and telescope data represent different data domains, and for an algorithm to work in both, domain-invariant learning is necessary. In this talk we’ll cover two domain adaptation techniques, Maximum Mean Discrepancy (MMD) and Domain Adversarial Neural Networks (DANNs), which can help improve model performance and bridge the gap between two datasets. We study the problem of distinguishing between merging and non-merging galaxies in simulated data (the Illustris-1 cosmological simulation) and observational data (the Sloan Digital Sky Survey). Understanding galaxy mergers is an important step in understanding the evolution of matter in the universe, and our ability to utilize and combine knowledge from different data domains will be very important for these efforts. With further development, these techniques will allow scientists in different domains to construct machine learning models that can successfully combine knowledge from simulated and detector data, or from data originating from multiple instruments.
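
For readers unfamiliar with MMD, the sketch below illustrates the general idea: MMD measures the distance between two samples by comparing average kernel similarities within and across them. This is not the speaker's implementation. The Gaussian kernel, the bandwidth `sigma`, the function names, and the synthetic 2D "features" are all assumptions chosen for illustration; in a domain adaptation setting a quantity like this would typically be computed on network features and added to the training loss.

```python
import math
import random


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of the squared MMD between two samples of feature vectors."""

    def mean_kernel(xs, ys):
        # Average kernel value over all pairs drawn from xs and ys.
        total = sum(rbf_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2.0 * mean_kernel(source, target)
    )


# Synthetic stand-ins for features from a "simulated" and an "observed" domain.
rng = random.Random(0)
sim = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(100)]
obs_same = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(100)]
obs_shifted = [[rng.gauss(2.0, 1.0), rng.gauss(2.0, 1.0)] for _ in range(100)]

mmd_same = mmd_squared(sim, obs_same)        # samples from the same distribution
mmd_shifted = mmd_squared(sim, obs_shifted)  # samples from a shifted distribution
print(f"same distribution:    {mmd_same:.4f}")
print(f"shifted distribution: {mmd_shifted:.4f}")
```

The estimate is close to zero when both samples come from the same distribution and grows as the distributions move apart, which is why minimizing it encourages features that look alike in both domains.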

Zoom info can be found at