CS Research Applies Phylogenetic Reconciliation to Study Pathogen Evolution

Tracking and preventing the spread of emerging pathogens such as SARS-CoV-2 requires an accurate understanding of their evolutionary history. Phylogenetic methods enable researchers to reconstruct these histories from sequence data, but viruses evolve rapidly and recombine frequently, which makes them challenging to analyze with traditional methods. A study by Harvey Mudd College computer science professor Yi-Chieh (Jessica) Wu and colleagues presents a phylogenetic workflow, virDTL, for analyzing such evolution. The paper, “virDTL: Viral recombination analysis through phylogenetic reconciliation and its application to sarbecoviruses and SARS-CoV-2,” was published in the Journal of Computational Biology in August.

Often, researchers infer relationships between species by using the entire genome or using conserved regions of the genome. However, in viruses, recombination causes different parts of the genome to have different histories. By reconstructing histories at the gene level, and seeing how these histories change along the genome, researchers gain a better understanding of how evolution has shaped genes and species.

“In this work, we recognized that inferring recombination in viruses is in many ways similar to detecting horizontal gene transfer in prokaryotes, at least in terms of how it can be framed as a computational problem,” says Wu. “At the same time, since viruses evolve rapidly, we have to distinguish biological signal from noise, which can affect how we reconstruct a strain tree, how we reconstruct and root gene trees, and how we map gene trees inside a strain tree. Our approach is uniquely able to identify ancestral recombinations while accounting for these sources of uncertainty.

“We applied this workflow to SARS-CoV-2 and several related genomes to provide a more complete picture of its evolutionary history. Our analysis supports the growing body of work that the zoonotic origin of SARS-CoV-2 is likely horseshoe bats. In addition, we identify several ancestral recombination events with other viral strains that merit further study.”

Wu collaborated on the article with graduate students and professors at the University of Connecticut and MIT. “We started this work in March 2020, so it is nice to see the manuscript officially accepted, which will help disseminate our approach and findings,” she says.

Wu’s research interests lie in computational biology, with an emphasis on evolutionary genomics. She develops and applies computational and mathematical models and methods to reconstruct gene histories across multiple species, with the goal of understanding how evolution shapes gene content and function and leads to similarities and differences across species over time.