Integrating Distributed-Memory Machine Learning into Large-Scale HPC

Lawrence Livermore National Laboratory Computer Science, 2017-18

Liaison(s): Cyrus Harrison, Ming Jiang, Brian Gallagher, Matt Larsen
Advisor(s): Christopher Stone
Student(s): Amy Huang (PM), Evan Chrisinger, Jeb Bearer, Katelyn Barnes

Supercomputers provide the computing power for complex physics simulations, but these simulations require frequent manual adjustments to prevent run-time failures. Machine learning is a potential solution for automating this process. The LLNL clinic team is developing a machine learning model appropriate for supercomputers that can learn from the output of physics simulations as they run in real time.