Ancestral sequence reconstruction (ASR) is a still-burgeoning method that has revealed many key mechanisms of molecular evolution. One criticism of the approach is an inability to validate its algorithms within a biological context as opposed to a computer simulation. Here we build an experimental phylogeny using the gene of a single red fluorescent protein to address this criticism. The evolved phylogeny consists of 19 operational taxonomic units (leaves) and 17 ancestral bifurcations (nodes) that display a wide variety of fluorescent phenotypes. The 19 leaves then serve as 'modern' sequences that we subject to ASR analyses using various algorithms and benchmark against the known ancestral genotypes and ancestral phenotypes. We confirm computer simulations that show all algorithms infer ancient sequences with high accuracy, yet we also reveal wide variation in the phenotypes encoded by incorrectly inferred sequences. Specifically, Bayesian methods incorporating rate variation significantly outperform the maximum parsimony criterion in phenotypic accuracy. Subsampling of extant sequences had a minor effect on the inference of ancestral sequences.

Ancestral sequence reconstruction (ASR) is the process of analyzing modern sequences within an evolutionary/phylogenetic context to infer the ancestral sequences at particular nodes of a tree [1]. These ancient sequences are most often then synthesized, recombinantly expressed in laboratory microorganisms or cell lines, and then characterized to reveal the ancient properties of the extinct biomolecules [2, 3, 4, 5, 6]. This process has produced tremendous insights into the mechanisms of molecular adaptation and functional divergence [7]. Despite such insights, a major criticism of ASR is the general inability to benchmark the accuracy of the implemented algorithms. It is difficult to benchmark ASR for many reasons. Notably, genetic material is not preserved in fossils on a long enough time scale to satisfy most ASR studies (many millions to billions of years ago), and it is not yet physically possible to travel back in time to collect samples.

To overcome these limitations, we exploited an under-utilized yet effective procedure to develop a phylogeny in the laboratory [8]. The benefits of the procedure are at least twofold: (1) we can accelerate the process of evolution that generates the vertical inheritance of genetic information necessary for the functional divergence of encoded phenotypes and (2) we have a known record of the ancestral genotypes and phenotypes throughout the experimental phylogeny. The goal of the phylogeny is thus to create an opportunity to evolve sequences within a controlled framework that adds biological reality given practical limitations. We elected to build the phylogeny using a single monomeric red fluorescent protein (FP), since it is known that FP colour phenotypes are readily modified by a tractable number of amino acid replacements [9, 10].

The experimental phylogeny then provides us with an opportunity to benchmark the performance of algorithms that infer ancient sequences. In particular, we were interested in determining the accuracy of algorithms when inferring ancestral phenotypes, since computer simulations have shown that these algorithms infer ancient genotypes with reasonably high accuracy [11, 12, 13]. Our benchmarking exercise focused on Bayesian versus maximum parsimony (MP) algorithms, the effect of rate variation when modelled as a discrete gamma distribution [14], subsamples of taxa to infer ancestral sequences, and species-tree-aware versus unaware approaches within the Bayesian framework [15, 16]. Our study confirms that all ASR algorithms correctly infer the vast majority of residues in ancestral sequences. Yet, these algorithms differ in the amino acid identities of the small number of sites that are incorrectly inferred.
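The genotype side of such a benchmark can be pictured as a site-by-site comparison between a known ancestral sequence and the sequence an algorithm infers for the same node. The following is only a minimal sketch under stated assumptions, not the procedure used in the study: the function names (`site_accuracy`, `mismatched_sites`), the method labels, and the ten-residue sequences are all hypothetical and stand in for real aligned ancestral sequences.

```python
from typing import List, Tuple


def site_accuracy(true_seq: str, inferred_seq: str) -> float:
    """Return the fraction of aligned sites where the inferred residue matches the true one."""
    if len(true_seq) != len(inferred_seq):
        raise ValueError("sequences must be aligned to the same length")
    if not true_seq:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(true_seq, inferred_seq))
    return matches / len(true_seq)


def mismatched_sites(true_seq: str, inferred_seq: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, true residue, inferred residue) for every incorrect site."""
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(true_seq, inferred_seq))
        if a != b
    ]


if __name__ == "__main__":
    # Hypothetical toy data: NOT sequences or results from the study.
    true_ancestor = "MVSKGEEDNM"
    inferred = {
        "method_a": "MVSKGEEDNM",
        "method_b": "MVSKGEADNM",
    }
    for name, seq in inferred.items():
        print(name, site_accuracy(true_ancestor, seq), mismatched_sites(true_ancestor, seq))
```

Two methods can reach a similar accuracy score while erring at different sites or choosing different residues, which is why the sketch reports the mismatched residues alongside the score; the phenotypic consequences of those residues still have to be measured experimentally.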