Summer Seminar: “Text Analysis Isn’t a Piece of Cake,” Xanda Schofield ’13

June 4, 2021

12–1 p.m.



In recent years, experts across a variety of social science and humanistic disciplines have adopted natural language processing technologies to assist their analyses of large text collections. However, these new projects in computational text analysis are often stymied by obstacles in the critical human work of applying these models: obtaining access to data in a useful format, implementing a processing workflow that attends to things the expert cares about, and analyzing the limited information that a model of text can reflect. In this talk, Xanda Schofield will discuss how hard it can be for text analysis novices to navigate the underspecified “recipes” of the text analysis process, focusing specifically on latent Dirichlet allocation (LDA) topic models. She will touch on research she has done with students on how text analysis practitioners make meaning from LDA models and how to build software that better supports their work. Expect many baking analogies.

Xanda Schofield ’13 (CS & math) is an assistant professor of computer science at Harvey Mudd College. She completed her PhD in computer science at Cornell University in 2019. Her work focuses on practical applications of unsupervised models of text, particularly topic models, to research in the humanities and social sciences.

Contact Carissa Saugstad for Zoom meeting information.