Harvey Mudd College researchers developing models and algorithms for reconstructing gene histories across multiple species have had a paper accepted to the 11th Association for Computing Machinery Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB 2020).
Computer science professor Yi-Chieh (Jessica) Wu and seven students will present “An Integer Linear Programming Solution for the Most Parsimonious Reconciliation Problem Under the Duplication-Loss-Coalescence Model,” the culmination of a project that started in summer 2017. Student co-authors are Morgan Carothers ’20, Joseph Gardi ’20, Gianluca Gross (UPenn) ’19, Tatsuki Kuze ’22, Nuo (Ivy) Liu ’20, Fiona Plunkett ’21 and Julia Qian ’22.
The project builds on the duplication-loss-coalescence model of gene evolution in eukaryotic species.
“This model helps scientists understand differences within and across species, particularly in how genes form and function,” Wu says. “But so far, making inferences under the model relied on either an exhaustive search, which can be computationally expensive, or a heuristic search, which is not guaranteed to find an optimal solution.”
Rather than rely on exhaustive or heuristic search, Wu and her students reformulated the inference problem as an integer linear programming (ILP) problem, which allowed them to apply optimization techniques and tools from mathematics. The resulting software is well suited to genomic pipelines, particularly for large datasets, because it lets users trade off accuracy against scalability.
Each student co-author had ownership of a particular piece of the problem and made substantial contributions to the research. Gross and Liu began formulating the ILP problem in summer 2017. Gardi then completed the formulation in fall 2017, and Plunkett added support for a commercial solver in summer 2018. Carothers simplified the formulation in the 2018–2019 academic year, and Kuze and Qian made the system more robust and ran experiments in summer 2019.
“I am really pleased to see this project wrap up nicely,” Wu says. “It exemplifies how research progresses incrementally through the combined efforts of a team.”
Wu also thanks Harvey Mudd professors Ran Libeskind-Hadas (computer science), Susan Martonosi (mathematics) and Eliot Bush (biology) for their assistance on the project.
“Faculty are fortunate to have both motivated and hardworking students, and colleagues with expertise in other domains who are generous with their time,” Wu says. “These are the reasons we can do this type of research at Mudd.”