Frank Würthwein is the executive director of the Open Science Grid, which facilitates access to high-throughput computing in the United States. On a recent visit to Fermilab, he gave a talk on the outlook for OSG now that it has reached its 10-year anniversary.
In your recent talk at Fermilab, you described Open Science Grid as entering “teenagehood.” What do you see as OSG’s goals for the next decade?
We’ve always wanted to be open to all of science, but in the first five or six years, the focus was entirely on getting the LHC to work well. We’ve since demonstrated successfully what we had always claimed — whatever we do for the LHC will also be useful generically, because computers don’t care what the bytes are that they compute on or the programs that they run. We now have data to show that OSG benefits all of science.
The next big challenge in the adolescent years is to have the same kind of success not just in broadening across all of science, but also across all types of institutions. OSG would like to be open to anyone from small colleges to leadership-scale facilities in computing. As a crude estimate, that means gigaflops to exaflops and gigabytes to exabytes: nine orders of magnitude. Ultimately we want to manage this range of scale because we want to democratize access to computing. A PI at a small college with a bright idea should be able to consume significant resources and get great science done. They shouldn’t be limited by the fact that their institution can’t afford to buy them a huge cluster.
OSG connects researchers and institutions through high-throughput computing. How else can researchers benefit from OSG?
We see an opportunity not just to enable science by having resources and connecting things up and doing plumbing, but to produce our own data. The way OSG is used and its performance characteristics generate data that we have traditionally not made available, or even collected in a coherent fashion, and there could be a science interest in that data. We’re starting to think about how OSG can become a data creator and not just an access provider. OSG five years from now might be an interesting big-data platform for doing computer science analysis of what the sciences do. And people might be writing papers on our data.
What are OSG and Fermilab’s future plans for collaborating on computing?
One of the high-priority projects Fermilab’s Scientific Computing Division [SCD] is focusing on right now is the HEP Cloud Facility. Fermilab could be in a position where not all the hardware it needs is on the Fermilab campus. It could transparently use hardware that is here, hardware that is available anywhere on OSG, hardware at the DOE and NSF high-performance computing facilities it collaborates with, and hardware that it purchases from cloud providers, with a service-oriented structure as the interface to the facility. Not all the hardware has to be owned by or operated by people on Fermilab payroll. That maps very well onto the objectives of OSG.
SCD and OSG want to accomplish the same thing with this work. Our vision has always been to have transparent computing across the entire nation along multiple lines of policy: computing on things you own, things your friends want to share with you, things the nation wants to share with you, things where you have an allocation, and also computing that you buy. All of this should be transparent. That’s exactly the same objective that HEP Cloud has. That will be a big driver of what OSG and Fermilab are going to do together in the next several years.