The Answer is Know

In the realm of computational biology, sometimes learning what isn’t possible is just as important as learning what is. That has been the case with recent research conducted by Professor Yi-Chieh (Jessica) Wu and her colleagues, who have presented an algorithm for assessing feasibility of gene families with multiple loci and samples.

Traditional methods of phylogenetic tree mapping have investigated gene families using only data from an individual sample. “If you allow for multiple individuals, you get data that you can’t explain,” Wu says. “That means that there is some explanation, but it can’t be explained by this model.”

Wu’s publication bridges these models by considering a joint model and allowing for multiple loci and multiple samples per species. “Reconciliation Feasibility in the Presence of Gene Duplication, Loss, and Coalescence with Multiple Individuals per Species,” recently accepted to BMC Bioinformatics, was coauthored by Wu, Jennifer Rogers ’16, Andrew Fishberg ’16 and Nora Youngs (Colby College).

“Now we know that the current models are insufficient,” says Wu. “So we know that we have to develop new models or extend current models. It’s revealing a problem that before we could have intuited but couldn’t see. Now we have algorithms that can reconstruct evolutionary history or identify errors in the data that make your model insufficient.”

This work is an extension of Wu’s postdoctoral work, which is already being used in the field. “My work is driven by the idea that biologists might have specific questions to ask, and I want to help them answer them,” Wu says. “As we sequence more genomes, data sets will increasingly call for the types of methods we present in this paper.”