CS Colloquium: “A theoretical CS lens on language modeling,” Clayton Sanford
April 3, 2026, 11 a.m.–12:15 p.m.
Location
Shanahan Center, Auditorium
320 E. Foothill Blvd.
Claremont, CA 91711
Contact
Morgan McArdle
mmcardle@g.hmc.edu
909.607.0299
Details
Multi-layer transformer models form the backbone of modern deep learning, yet little mathematical work characterizes their benefits and deficiencies relative to other architectures. This makes it difficult to answer practical and fundamental questions about the transformer architecture: Which capabilities grow with model depth? Can alternative architectures improve efficiency without sacrificing expressivity? Clayton Sanford presents a communication-based theoretical framework for understanding the representational capabilities and limitations of multi-layer transformers. These results imply that parallelizability is a key property of the standard transformer that other architectures cannot easily replicate. Sanford contextualizes these results within the broader conversation about the challenges of developing a principled theory of neural networks and shares opinions on how theoretical computer science can remain relevant to their study.
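The parallelizability point can be made concrete with a small sketch (not from the talk; an illustrative contrast under standard definitions of self-attention and a simple recurrent cell): every output position of a self-attention layer is produced by the same batched matrix products, with no dependency between positions, whereas a recurrent model must compute step t after step t−1.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4                      # sequence length, model dimension
X = rng.normal(size=(T, d))      # toy input sequence

# Self-attention: all T output positions come from the same matrix
# products, so they can be evaluated in parallel across the sequence.
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv
scores = Q @ K.T / np.sqrt(d)
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
attn_out = weights @ V           # shape (T, d); no sequential dependency

# A simple recurrent cell: the hidden state at step t depends on the
# hidden state at step t-1, forcing an inherently serial loop over T.
Wh, Wx = rng.normal(size=(d, d)), rng.normal(size=(d, d))
h = np.zeros(d)
rnn_states = []
for t in range(T):               # cannot be parallelized across t
    h = np.tanh(h @ Wh + X[t] @ Wx)
    rnn_states.append(h)
rnn_out = np.stack(rnn_states)   # shape (T, d)
```

Both layers map a length-T sequence to a length-T sequence, but only the attention computation is depth-1 parallel in T; this is the kind of structural distinction the communication-based framework is meant to formalize.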
Speaker
Clayton Sanford is a senior research scientist at Google, where he works on distillation and pretraining for Gemini. He has a PhD in computer science from Columbia University and studied machine learning theory with advisors Rocco Servedio and Daniel Hsu. His research focuses on the theoretical capabilities of neural architectures, particularly transformers.
This event is for: faculty, staff, students
Community Connections events provide opportunities for HMC faculty, students and staff to cultivate community, foster open conversations and share important information as together we live out our mission and shape the future of the College.