Workshop Schedule
May 4, 2018, Temple University, SERC Building, Room 703
8:00-9:00 Registration and Coffee
|
Session I Chair: Allan Haldane |
|
- Michael L. Klein
- Temple CST Dean
- Welcome Remarks
- 9:00-9:10
|
|
- Vincenzo Carnevale
- Temple University
- From Sequence to Function: Coevolving Amino Acids Encode Structural and Functional Domains
- 9:10-9:55
-
 Abstract
Amino acid interactions within protein families are so optimized that the sole analysis of evolutionary co-mutations can identify pairs of contacting residues. It is also known that evolution conserves functional dynamics, i.e., the concerted motion or displacement of large protein regions or domains. Is it, therefore, possible to use a pure sequence-based analysis to identify these dynamical domains? I will address this question by introducing a general co-evolutionary coupling analysis strategy. This sequence-based method partitions amino acids into a few clusters. When viewed in the context of the native structure, these clusters have the signature characteristics of viable protein domains: they are spatially separated but individually compact. They have direct functional bearings too, as shown for various reference cases. Thus even large-scale structural and functionally related properties can be recovered from inference methods applied to evolutionarily related sequences.
|
|
- Andrea Pagnani
- Politecnico di Torino
- Inference of Inter-Protein Interaction from Co-Evolution sequence data
- 9:55-10:40
-
 Abstract
Interaction between proteins is a fundamental mechanism that underlies virtually all biological processes. Many important interactions are conserved across a large variety of species. The need to maintain interaction leads to a high degree of co-evolution between residues in the interface between partner proteins. The inference of protein-protein interaction networks from the rapidly growing sequence databases is one of the most formidable tasks in systems biology today. We propose here a novel approach based on the Direct-Coupling Analysis of the co-evolution between inter-protein residue pairs. Concrete examples discussed include the bacterial ribosome protein-protein interaction network and the Trp operon. Finally, I will present an interesting application in the field of immunology: the inference of antigen-antibody affinity from Repertoire Sequencing Data.
|
10:40-10:55 Coffee Break
|
Session II Chair: David Liberles |
|
- Lucy Colwell
- Cambridge University
- Using evolutionary sequence variation to build predictive models of protein structure and function
- 10:55-11:40
-
 Abstract
The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. The explosive growth in the number of available protein sequences raises the possibility of using the natural variation present in homologous protein sequences to infer these constraints and thus identify residues that control different protein phenotypes. Because in many cases phenotypic changes are controlled by more than one amino acid, the mutations that separate one phenotype from another may not be independent, requiring us to understand the correlation structure of the data. We show that models constrained by this structure are capable of (i) inference of residue pair interactions accurate enough to predict all-atom 3D structural models; and predictions of (ii) binding interactions between different proteins and (iii) binding between protein receptors and their target ligands.
The challenge is to distinguish true interactions from the noisy and under-sampled set of observed correlations in a large multiple sequence alignment. Current methods ignore the phylogenetic relationships between sequences, potentially corrupting the identification of covarying positions. Here, we use random matrix theory to demonstrate the existence of a power law tail that distinguishes the spectrum of covariance caused by phylogeny from that caused by phenotypic interactions. The power law is essentially independent of the phylogenetic tree topology, depending on just two parameters - the sequence length, and the average branch length of the tree. We demonstrate that these power law tails are ubiquitous in the large protein sequence alignments used to predict contacts in 3D structure, as predicted by our theory, and confirm that truncating the corresponding eigenvectors improves contact prediction.
|
|
- Robert Best
- National Institutes of Health
- Exploring sequence energy landscapes using coevolutionary information
- 11:40-12:25
-
 Abstract
Coevolutionary information can be used to construct a fitness landscape for protein sequences which, by virtue of its simplicity, allows sequence space to be sampled much more effectively than methods which explicitly consider protein coordinates. We have applied such models in three ways. First, in combination with experimental stability data for mutants, we have estimated the number of sequences which can fold to common protein structures (the "sequence capacity" of the fold), and the dependence of this quantity on the properties of the protein structure. Second, inspired by the finding by several groups that the stability of point mutants is strongly correlated with fitness derived from coevolutionary data, we tested whether such models were sufficiently good to be used for sequence design. We have shown that, for three folds representative of different protein structural classes, such models are sufficiently accurate to be used as a tool in protein design. Lastly, we have used coevolutionary fitness models to investigate chimeric sequences that could fold to different structures. We have constructed a joint fitness landscape that can capture the published results of Bryan, Orban and co-workers. We predict that the most likely paths between the two protein folds go through intrinsically disordered sequences.
|
12:25-1:45 Lunch and Poster Session (2nd floor SERC)
|
Session III Chair: Vincenzo Carnevale |
|
- Faruck Morcos
- University of Texas at Dallas
- Inferring Global Landscapes of Protein-RNA recognition to uncover and engineer specificity in RNA structured elements
- 1:45-2:30
-
 Abstract
RNA structured elements build a network of regulatory interactions with repercussions in important cell processes such as transcription, translation and mRNA decay. Although molecular interactions between proteins and RNA elements have been extensively studied, the question of how proteins preferentially interact with different sequences but similar structures is still unresolved. Collections of known binding elements are insufficient to characterize the spectrum of potential mutations that contribute to functional RNA molecules. We developed an integrated framework based on in vitro selection, high-throughput sequencing and global probabilistic modeling to quantify the landscapes of protein-RNA recognition. This approach allowed us to characterize the way that sequence and structural elements confer RNA binding recognition to proteins P22N, 1N and BIV TAT. The parameters of our global model helped discern the most important nucleotide sequence interactions that contribute to recognition. By creating a quantitative metric based on such parameter inference, we can discriminate between regulated and non-regulated elements in their genomic context as well as design functional variants that preserve or enhance specificity. We validated such predictions experimentally and used this framework to quantify pathways that reveal permissive/disruptive evolutionary trajectories. Our framework provides a detailed characterization of protein-RNA recognition with potential applications in unexplored systems.
|
|
- Allan Haldane
- Temple University
- Coevolutionary Landscapes of Kinase Family Proteins: Structural Propensities and Functional Motifs
- 2:30-3:15
-
 Abstract
The co-variation of pairs of mutations contained in multiple sequence alignments of protein families can be used to build a Potts Hamiltonian model of the sequence patterns which accurately predicts structural contacts. This observation paves the way to develop deeper connections between evolutionary fitness landscapes of entire protein families and the corresponding free energy landscapes which determine the conformational propensities of individual proteins. Using statistical energies determined from the Potts model and PDB crystal structures, we predict the propensity for particular kinase family proteins to assume a “DFG-out” conformation implicated in the susceptibility of some kinases to type-II inhibitors, and validate the predictions by comparison with the observed structural propensities of the corresponding proteins and experimental binding affinity data. We decompose the statistical energies to investigate which interactions contribute the most to the conformational preference for particular sequences and the corresponding proteins. We also show how this model has sufficient power to predict the probability of specific subsequences in the highly diverged kinase family, which we verify by comparing to experimental observations in the UniProt database. We find that the pairwise (residue-residue) epistatic interaction terms of the statistical model are necessary and sufficient to capture higher-than-pairwise mutation patterns of natural kinase sequences.
|
3:15-3:30 Coffee Break
|
Session IV Chair: Ron Levy |
|
- John Barton
- University of California, Riverside
- Putting the evolution in coevolutionary analysis
- 3:30-4:15
-
 Abstract
Statistical “coevolutionary” methods are powerful tools for the prediction of protein structure and function in a wide range of contexts. However, it is not entirely clear how energy-based models, which have no natural notion of time, are related to real evolutionary dynamics. This can make interpreting their predictions challenging. In this talk I will describe an alternative approach to infer the fitness effects of mutations, which is based on a stochastic model of population dynamics. Path integral methods from statistical physics are crucial for turning this intimidating inference problem into one that is computationally tractable. Remarkably, the expressions we obtain are similar to ones that have been applied in static coevolutionary analysis. I will describe examples from both simulations and real data of HIV evolution where our approach can be successfully applied.
|
|
- David Liberles
- Temple University
- Existing and New Models for Amino Acid Substitution in Protein Evolution
- 4:15-5:00
-
 Abstract
Comparative genomic and molecular evolutionary studies aim to uncover the relationship between selection on mutations in individual genes/proteins, phenotypic evolution, and adaptation. Theory and tools that enable this analysis have mostly not been rooted in underlying biochemical processes. My group has been involved in building this bottom-up approach to understanding the genotype-phenotype map and its behavior over evolutionary time. The first layer of this approach is to understand amino acid changes in the context of protein structure and inter-molecular interaction. Models have been developed that enable characterization of amino acid changes in the context of fit within the existing structure. The information and performance of these new models will be compared with the performance of different classes of existing models, both structure-aware and structure-independent. New structural models, still in development, are one path towards extracting additional information about functional change for protein-encoding genes under positive selection.
|
|
- Ron Levy
- Temple University
- Closing Remarks
- 5:00-5:10
|
5:30 Reception and Dinner (2nd floor SERC)