Few models of sequence evolution incorporate parameters describing protein structure, despite its high conservation, essential functional role and increasing availability. We present a structurally aware empirical substitution model for amino acid sequence evolution in which proteins are expressed using an expanded alphabet that relays both amino acid identity and structural information. Each character specifies an amino acid as well as information about the rotamer configuration of its side-chain: the discrete geometric pattern of permitted side-chain atomic positions, as defined by the dihedral angles between covalently linked atoms. By assigning rotamer states in 251,194 protein structures and identifying 4,508,390 substitutions between closely related sequences, we generate a 55-state “Dayhoff-like” model that shows that the evolutionary properties of amino acids depend strongly upon side-chain geometry. The model performs as well as or better than traditional 20-state models for divergence time estimation, tree inference, and ancestral state reconstruction. We conclude not only that rotamer configuration is a valuable source of information for phylogenetic studies, but also that modeling the concomitant evolution of sequence and structure may have important implications for understanding protein folding and function.
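To make the expanded-alphabet idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical three-bin scheme based on a single dihedral angle (chi1) with illustrative boundaries at 120 and 240 degrees; the actual rotamer definitions that yield the 55 states are not given in this abstract. The function names `chi1_bin`, `expanded_state`, and `count_substitutions` are invented for illustration.

```python
from collections import Counter


def chi1_bin(chi1_deg):
    """Map a chi1 dihedral angle in degrees to a coarse bin (1, 2 or 3).

    The 120/240 degree boundaries are an illustrative assumption,
    not the binning used to build the published 55-state model.
    """
    angle = chi1_deg % 360.0  # normalise to the range [0, 360)
    if angle < 120.0:
        return 1  # roughly +60 degrees
    if angle < 240.0:
        return 2  # roughly 180 degrees
    return 3      # roughly -60 (300) degrees


def expanded_state(residue, chi1_deg=None):
    """Return an expanded-alphabet character such as 'S1' or 'L3'.

    Residues without a chi1 angle (e.g. glycine) keep their plain letter.
    """
    if chi1_deg is None:
        return residue
    return f"{residue}{chi1_bin(chi1_deg)}"


def count_substitutions(seq_a, seq_b):
    """Count unordered state changes between two aligned state sequences."""
    counts = Counter()
    for a, b in zip(seq_a, seq_b):
        if a != b:
            counts[tuple(sorted((a, b)))] += 1
    return counts


# Two aligned, closely related (hypothetical) sequences with chi1 angles.
seq_a = [expanded_state("S", 62.0), expanded_state("G"), expanded_state("L", -65.0)]
seq_b = [expanded_state("S", 178.0), expanded_state("G"), expanded_state("L", -70.0)]
substitutions = count_substitutions(seq_a, seq_b)
```

In this toy example the serine keeps its amino acid identity but changes rotamer state, so it is counted as a substitution in the expanded alphabet even though a 20-state alphabet would record no change. Counts of this kind, accumulated over many closely related pairs, are the raw material of a Dayhoff-style empirical matrix.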
SEEK ID: https://publications.h-its.org/publications/484
Research Groups: Computational Molecular Evolution
Publication type: Journal
Journal: Molecular Biology and Evolution
Publisher: Oxford University Press (OUP)
Citation: Molecular Biology and Evolution 36(9):2086-2103
Date Published: 1st Sep 2019
Registered Mode: by DOI
Created: 22nd Oct 2019 at 11:25
Last updated: 5th Mar 2024 at 21:23