HMC Biologists Develop New Software

Sometimes writing software is like any other kind of writing; it takes a few drafts to get it right, but the final result is worth the extra effort. Harvey Mudd College Biology Professor Eliot Bush knows that feeling. His recently released software program, xenoGI, which helps researchers reconstruct the history of genomic island insertions in clades of closely related microbes, didn’t exactly happen overnight. “We started it more than five years ago,” he says. “We had students working on it already in 2012. So it’s been percolating for a while. In summer 2016, I started to get more serious about it, and rewrote the system from scratch, and added a number of new features.”

A paper describing the project, “xenoGI: Reconstructing the history of genomic island insertions in clades of closely related bacteria,” was published in BMC Bioinformatics Feb. 5. The team (Bush, Anne Clark ’13, Carissa DeRanek ’19, Alexander Eng, Juliet Forman ’18, Kevin Heath ’16, Alexander Lee ’14, biology professor Daniel Stoebel, Zunyan Wang ’18, Matthew Wilber ’17 and Helen Wu ’12) has made the code available on GitHub so that others can use it and help work out any bugs.

While studying how bacteria regulate their genes, Harvey Mudd biology researchers couldn’t find software that could handle the scope of their work, so they decided to write their own.

“This really came out of Dan Stoebel’s research interests,” Bush says. Stoebel studies gene regulation in bacteria and was interested in studying the regulation of genes arriving in genomic islands. Genomic islands are clusters of genes that have entered a genome via horizontal transfer, that is, outside of the normal method of parent-offspring inheritance. Stoebel hoped to research a large clade of E. coli bacteria, but existing software packages didn’t look at a species’ whole evolutionary history in this way. So, Bush developed a software package that helps users understand the adaptive path that has produced specific living species.

“Every gene in that group has one of two origins,” says Bush. “Either it was present in the common ancestor of those species, or it entered the group in a horizontal transfer event. The goal of the software is to distinguish these two things, for all the genes in each strain.” The ability to identify the history of genomic island insertions in a clade of bacteria makes xenoGI novel compared to previous software programs.

Genomic islands play a broad role in microbial genome evolution, providing a mechanism for strains to adapt to new ecological conditions. “XenoGI is an effective tool for studying the history of genomic island insertions because it identifies genomic islands and determines which branch they inserted on within the phylogenetic tree for the clade,” says Bush.
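
To make the idea concrete, here is a minimal Python sketch of the two-origin distinction Bush describes. It is an illustration only and does not reproduce xenoGI's actual algorithm: the tree, the strain names and the function names are all hypothetical, and the rule it uses (find the smallest clade that contains every strain carrying a gene) ignores complications such as gene loss.

```python
# Toy illustration only; NOT xenoGI's method. All names here are hypothetical.
# A tree is a nested pair of subtrees; a leaf is a strain name (a string).
TREE = (("strainA", "strainB"), ("strainC", "strainD"))


def leaves(node):
    """Return the set of strain names under a node."""
    if isinstance(node, str):
        return {node}
    left, right = node
    return leaves(left) | leaves(right)


def smallest_clade(node, strains):
    """Return the smallest subtree whose leaves include every strain given."""
    if not isinstance(node, str):
        for child in node:
            if strains <= leaves(child):
                return smallest_clade(child, strains)
    return node


def classify_gene(tree, strains_with_gene):
    """Label a gene 'ancestral' or 'inserted', with the clade it maps to.

    Assumption: a gene whose carriers span the whole tree was present in the
    common ancestor; otherwise it arrived on the branch leading to the
    smallest clade that contains all carriers.
    """
    strains = set(strains_with_gene)
    all_strains = leaves(tree)
    if not strains or not strains <= all_strains:
        raise ValueError("gene must occur in at least one strain of the tree")
    clade = sorted(leaves(smallest_clade(tree, strains)))
    origin = "ancestral" if clade == sorted(all_strains) else "inserted"
    return origin, clade


if __name__ == "__main__":
    print(classify_gene(TREE, {"strainA", "strainB", "strainC", "strainD"}))
    print(classify_gene(TREE, {"strainA", "strainB"}))
    print(classify_gene(TREE, {"strainC"}))
```

In this toy version, a gene found only in strainA and strainB is assigned to the branch leading to their shared ancestor, while a gene found in every strain is treated as ancestral. A gene found only in strainA and strainC would also be labeled ancestral here, which shows the limits of so simple a rule.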

So far, Bush is pleased with the response he’s received from people using xenoGI, and there has been a fair amount of traffic on the GitHub page.

Bush believes the program holds potential to improve other areas of research, including whole genome alignment between bacteria. “Alignment involves figuring out which parts of one sequence correspond to which parts of another one,” he says. “These days it’s possible to make alignments between whole genomes, and many kinds of researchers make use of such alignments. I think a lot of the stuff xenoGI figures out in the process of doing its thing could be very useful for the whole genome alignment problem.”