Claremont Colleges Computing Infrastructure to Benefit From Second NSF Grant

A second National Science Foundation grant to build regional computing infrastructure within The Claremont Colleges has been awarded to a University of Southern California team collaborating with Claremont faculty and staff.

“This is great news for the Southern California regional research computing community! In addition to the award we won last year—Science DMZ: infrastructure to enable faster data transfers—this is a great step forward for forging the regional alliance for computational science and engineering infrastructure,” said co-principal investigator Byoung-Do (BD) Kim, associate chief research information officer and director of the Center for Advanced Research Computing at USC. A science “DMZ” is designed to allow approved data nodes to communicate with one another via very high-speed connections that bypass normal network security such as firewalls. Los Nettos, through USC, received the earlier grant to build a science DMZ for the regional network.

“Campus Cyberinfrastructure (CC) Regional Computing: Building Cyberinfrastructure to Forge a Regional Research Computing Alliance in Southern California” is a complementary grant for $999,970 to build and support high-performance computing servers that can be used by faculty and students for their research computing needs. The CC program invests in coordinated campus-level networking and cyberinfrastructure improvements, innovation, integration and engineering for science applications and distributed research projects.

“The strength of our applications is derived from our ‘science drivers,’ faculty who provide letters of support and share details about the significance of their research and how it would benefit from computational resources,” says Aashita Kesarwani, a data scientist and member of HMC’s Academic and Research Computing Services staff.

Testimonies from faculty members, including those from Harvey Mudd College (below), described challenges and opportunities related to their projects.

  • Biology professor Eliot Bush and his students have developed methods for reconstructing the history of horizontal transfer events in clades of closely related microbes. He says, “There are now hundreds of thousands of sequenced microbial genomes. One application that we have in mind for our method is to apply it systematically across the tree of life. Doing this would be aided by access to HPC, and also by fast network connections allowing us to quickly get data from NCBI and other sources.”
  • Computer science professor Geoff Kuenning oversees the SNIA IOTTA Trace Repository, a large repository of scientific data primarily hosted at Harvey Mudd College that has been providing collections of file system traces to researchers around the world since 2007 and has been cited in over 580 scientific papers. He says, “The repository currently contains 22 TB of data, and is constantly growing. It is the world’s only reliable, long-term collection of such traces, and the file systems community considers it the ‘go-to source’ for such data. The internet is used to place data in the repository and to provide it to researchers. It can take days to transfer a large contribution to HMC or for a researcher to download that data for their own use. Thus, an enhanced internet connection would be a tremendous benefit not only to the repository itself, but also to storage researchers throughout the U.S. and worldwide.”
  • Mathematics professor Susan Martonosi and her research group use large-scale optimization techniques to solve problems related to network optimization and resource allocation. This work has applications to interdiction of covert networks, such as terrorist or drug smuggling networks; contract negotiation for pediatric vaccines for low-income countries; and developing interventions to mitigate the propagation of fake news on social media. She and her researchers implement and test models and algorithms across thousands of large test cases for sensitivity analysis. “Having access to a high-performance computing infrastructure is critical to complete this work,” says Martonosi.
  • Chemistry professor Katherine Van Heuvelen and her students study how enzymes break down priority pollutants in a process known as reductive dehalogenation. They use computational chemistry methods to model the enzymatic reaction pathway and to design small molecules that can carry out these reactions in the laboratory. This work can be computationally expensive and relies on rapid data transfer between local machines and a supercomputing environment.

The infrastructure will take several years to come online and, when completed, will be shared by The Claremont Colleges and other Southern California colleges.

The grant is a continued effort by the NSF to address the growing requirements of the NSF community. “Campuses today face challenges across multiple levels of cyberinfrastructure,” states the NSF in its CC Program Solicitation, “where meeting the needs of scientific research and education goes far beyond the networking layer in capacity and services, and extends to computing, data services, secure and trustworthy systems, and especially human expertise, collaboration and knowledge-sharing. Recognition of the ‘data driven’ nature of scientific advancement and discovery has led to an increased focus in addressing the data challenges posed by the NSF research and education community.”