May 28, 24
Slide overview
Science by Computing: Classical, AI/ML Invited Talk (DAY-1 : Jan 29, 2024)
Jan Kosinski (EMBL, Hamburg)
The 6th R-CCS International Symposium
https://www.r-ccs.riken.jp/R-CCS-Symposium/2024/
R-CCS Computational Science Research Promotion Office
Integrative structural biology in the era of accurate AI-based structure prediction @jankosinski 6th R-CCS International Symposium 29.01.2024 Jan Kosinski
Macromolecules Biological cell
Principles of protein structure https://www.onlinebiologynotes.com/level-of-structural-organization-of-protein/
Why study the structure? Video credit: Harvard Online
Why study the structure? Inhibitor SARS-CoV-2 coronavirus main protease Image source: Protein Data Bank
Structure helped design the SARS-CoV-2 vaccine! MERS-CoV spike locked in the pre-fusion conformation (Pallesen, …, Jason McLellan, 2017). Structure of betacoronavirus in pre-fusion conformation (Kirchdoerfer, …, McLellan, Ward, 2016)
How are structures determined? Nuclear magnetic resonance (NMR); X-ray crystallography; cryogenic electron microscopy (cryoEM); cryogenic electron tomography (cryoET). High-resolution densities and structures; low-resolution density maps
Khanh Huy Bui
Nuclear pore complex: 30 proteins, 1000 copies, 120 MDa. Art: Agnieszka Obarska-Kosinska
Credit: Shyamal Mosalaganti, EMBL Heidelberg
Khanh Huy Bui
X-ray structures of nuclear pore proteins …and more
Predicting structures by homology modeling: sequence of the human protein + X-ray structure of the homolog in yeast → homology model of the human protein
?
But we have molecular rulers!
Crosslinking mass spectrometry
?
Integrative structural modeling. Inputs: EM map at 23 Å from cryo-electron tomography; crystal structures; homology models; distance restraints from crosslinking/mass spectrometry. With Martin Beck, EMBL Heidelberg. Nature, 2015; Science, 2016
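To make the idea of crosslinks as "molecular rulers" concrete, here is a minimal sketch of how a candidate model could be checked against crosslink distance restraints. It is not the scoring used in the published work: the 30 Å cutoff, the coordinate dictionary and the function names are assumptions made for illustration.

```python
import math

# Hypothetical cutoff: a Calpha-Calpha distance a crosslinker is assumed to span.
MAX_CROSSLINK_DISTANCE = 30.0  # in angstroms (assumption, depends on the reagent)


def evaluate_crosslinks(coords, crosslinks, cutoff=MAX_CROSSLINK_DISTANCE):
    """Return (satisfied, violated) lists of crosslinks for one model.

    coords     maps (chain, residue_number) -> (x, y, z) of the Calpha atom
    crosslinks is a list of pairs of (chain, residue_number) keys
    """
    satisfied, violated = [], []
    for res_a, res_b in crosslinks:
        d = math.dist(coords[res_a], coords[res_b])
        target = satisfied if d <= cutoff else violated
        target.append((res_a, res_b, round(d, 1)))
    return satisfied, violated


# Toy model: two subunits, three residues with made-up coordinates.
model = {
    ("A", 10): (0.0, 0.0, 0.0),
    ("A", 55): (12.0, 5.0, 0.0),
    ("B", 7): (40.0, 0.0, 0.0),
}
links = [(("A", 10), ("A", 55)), (("A", 10), ("B", 7))]

ok, bad = evaluate_crosslinks(model, links)
print(f"{len(ok)} satisfied, {len(bad)} violated")
```

A model that violates many crosslinks would be penalized or rejected; in practice such a term is combined with the fit to the EM map and other restraints.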
Assembline: assembly line of macromolecular complexes. Vasileios Rantos, Kai Karius. https://www.embl-hamburg.de/Assembline/ Nature Protocols, 2022. Custom pipeline based on IMP by the Andrej Sali lab
Assembline: coarse-grained multi-scale representation; Metropolis Markov chain Monte Carlo optimization algorithm; "energy" function for optimization; multi-step procedure to narrow down conformational space. Nature Protocols, 2022
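The Metropolis acceptance rule behind this kind of optimization can be sketched on a toy problem. The example below moves a single point (standing in for a rigid body) in 2D to minimize a made-up "energy"; it is not Assembline or IMP code, and the score function, step size and temperature are illustrative assumptions.

```python
import math
import random


def score(position, target=(3.0, -2.0)):
    """Toy "energy": squared distance of a rigid body from its best-fit position."""
    return (position[0] - target[0]) ** 2 + (position[1] - target[1]) ** 2


def metropolis(start, steps=5000, step_size=0.5, temperature=0.1, seed=0):
    """Metropolis Monte Carlo search; returns the best position and score seen."""
    rng = random.Random(seed)
    current, current_score = start, score(start)
    best, best_score = current, current_score
    for _ in range(steps):
        # Propose a small random translation of the rigid body.
        candidate = (
            current[0] + rng.uniform(-step_size, step_size),
            current[1] + rng.uniform(-step_size, step_size),
        )
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Metropolis criterion: always accept improvements, accept worse
        # moves with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score


best_position, best_score = metropolis(start=(10.0, 10.0))
print(best_position, best_score)
```

Accepting some worse moves is what lets the search escape local minima; a real integrative modeling run would sample rotations as well as translations and score against the EM map and restraints.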
30% Nature, 2015 Science, 2016
New cryo-ET maps at >12 Å. Martin Beck lab, MPI Biophysics, FRA
Attempting to update the model… 35 MDa, 69% of the scaffold. Integrative modeling using Assembline; new EM map; spring 2021; new crystal structure & homology models. Agnieszka Obarska-Kosinska, MPI Biophysics, FRA & EMBL HH
BBC Science Weekendavisen Nature NY Times Slide courtesy of Kresten Lindorff-Larsen
Results from CASP14 (Critical Assessment of protein Structure Prediction): avg GDT_TS of model1. AlphaFold2 from DeepMind. Slide courtesy of Sergey Ovchinnikov
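For readers unfamiliar with GDT_TS, the sketch below computes it from per-residue Cα deviations as the average fraction of residues within 1, 2, 4 and 8 Å, scaled to 0-100. It is a simplification: a single fixed superposition is assumed, and the deviations are made-up numbers.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT_TS (0-100) from per-residue Calpha deviations in angstroms.

    Simplification: one fixed superposition is assumed, whereas the full
    measure searches for the best superposition separately for each cutoff.
    """
    if not distances:
        raise ValueError("need at least one residue")
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances) for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Made-up deviations for a ten-residue toy model.
deviations = [0.4, 0.7, 0.9, 1.5, 1.8, 2.5, 3.0, 5.0, 7.5, 12.0]
print(gdt_ts(deviations))
```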
How good are the models? Jumper et al., Nature 2021
AlphaFold confidence metrics: pLDDT 89.3, pTM score 0.577
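pLDDT is reported per residue on a 0-100 scale, so a single average can hide poorly predicted regions. The sketch below summarizes per-residue values using the confidence bands of the AlphaFold Protein Structure Database (above 90, 70-90, 50-70, below 50); the scores are made up and the function names are illustrative.

```python
from collections import Counter


def plddt_band(value):
    """Map a per-residue pLDDT (0-100) to the AlphaFold Database confidence band."""
    if value > 90:
        return "very high"
    if value > 70:
        return "confident"
    if value > 50:
        return "low"
    return "very low"


def summarize_plddt(values):
    """Return the mean pLDDT and the number of residues in each band."""
    mean = sum(values) / len(values)
    return mean, Counter(plddt_band(v) for v in values)


# Made-up per-residue scores: a confident core with a disordered tail.
scores = [95.2, 93.1, 91.0, 88.4, 76.5, 62.3, 41.7, 35.0]
mean, bands = summarize_plddt(scores)
print(f"mean pLDDT {mean:.1f}", dict(bands))
```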
What is AlphaFold? • A machine-learning-based model for predicting the 3D structure of proteins using only sequence as input • Trained on known sequences and structures from the Protein Data Bank, as well as large databases of protein sequences Slide courtesy of Kresten Lindorff-Larsen
Slide courtesy of Sergey Ovchinnikov Image credit: Hetu Kamisetty
We can CHEAT by using evolutionary information! Contacts in proteins are evolutionarily conserved and encoded in an MSA (multiple sequence alignment) due to coevolution. By measuring coevolution, we can infer contacts in proteins! Citations: bit.ly/3Mr8351 Slide courtesy of Sergey Ovchinnikov. Image credit: Hetu Kamisetty
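As a rough illustration of measuring coevolution, the sketch below scores every pair of alignment columns by mutual information and reports the strongest pair. Mutual information is used here only as a simple stand-in: dedicated coevolution methods additionally correct for indirect couplings and phylogenetic bias. The four-sequence alignment is invented.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    col_i = Counter(seq[i] for seq in msa)
    col_j = Counter(seq[j] for seq in msa)
    pairs = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


# Toy alignment: columns 1 and 3 change together (K with E, R with D),
# column 0 is fully conserved and column 2 varies independently.
msa = [
    "AKLE",
    "AKVE",
    "ARLD",
    "ARVD",
]

scores = {
    (i, j): mutual_information(msa, i, j)
    for i, j in combinations(range(len(msa[0])), 2)
}
top_pair = max(scores, key=scores.get)
print(top_pair, scores[top_pair])
```

The covarying columns stand out because a change at one position is always accompanied by a compensating change at the other, which is the signal interpreted as a likely contact.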
How does AlphaFold work? “DeepMind’s most complex piece of AI so far” – Demis Hassabis. Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A. and Bridgland, A., 2021. Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), pp.583-589. 62-page Supplementary Information, Google Colab and open GitHub repository (https://github.com/deepmind/alphafold). See Jumper et al. 2021 (especially the SI) for details. Slide modified from Sameer Velankar
Models of single nuclear pore proteins
Comparison to unpublished crystal structures. Structures by Andre Hoelz lab, Caltech
Validation by fitting to the EM map
More complete building blocks: homology model (missing ~300 aa at C-terminus) vs. AlphaFold model
The Great Hack
Modeling complexes with ColabFold. Sergey Ovchinnikov, Martin Steinegger
ColabFold/AlphaFold models of subcomplexes
Comparison to unpublished crystal structures. Structures by Andre Hoelz lab, Caltech
Validation by fitting to the EM map
HOMOLOGY MODELS ☺ Agnieszka Obarska-Kosinska, MPI Biophysics, FRA & EMBL HH
Modeling procedure: + Assembline. No crystal structures. No homology models. + Elastic network to maintain AlphaFold interfaces. + ISOLDE program to build flexible linkers between domains. Agnieszka Obarska-Kosinska, MPI Biophysics, FRA & EMBL HH
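The role of an elastic network in preserving predicted interfaces can be sketched as follows: harmonic springs are placed between atoms of two chains that are close in the predicted subcomplex, and any later distortion of the interface is penalized. This is an illustration of the general technique, not the implementation used in the study; the 10 Å cutoff, the force constant and the coordinates are assumptions.

```python
import math


def build_elastic_network(chain_a, chain_b, cutoff=10.0):
    """List (i, j, reference_distance) for inter-chain atom pairs within cutoff."""
    springs = []
    for i, a in enumerate(chain_a):
        for j, b in enumerate(chain_b):
            d = math.dist(a, b)
            if d <= cutoff:
                springs.append((i, j, d))
    return springs


def elastic_energy(chain_a, chain_b, springs, k=1.0):
    """Harmonic penalty for deviating from the reference interface distances."""
    return sum(
        0.5 * k * (math.dist(chain_a[i], chain_b[j]) - d0) ** 2
        for i, j, d0 in springs
    )


# Made-up Calpha coordinates for two chains of a predicted subcomplex.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
chain_b = [(0.0, 6.0, 0.0), (3.8, 6.0, 0.0), (30.0, 6.0, 0.0)]

springs = build_elastic_network(chain_a, chain_b)

# Pull chain B 2 angstroms away from chain A and score the distortion.
moved_b = [(x, y + 2.0, z) for x, y, z in chain_b]
print(len(springs))
print(elastic_energy(chain_a, chain_b, springs))
print(elastic_energy(chain_a, moved_b, springs))
```

Only pairs that are close in the reference model get a spring, so distant parts of the chains remain free to move while the interface itself is held together.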
Human Nuclear Pore Complex modeled with AlphaFold and Assembline. Total mass of the nuclear pore complex: 120 MDa. Models: before AlphaFold, 35 MDa; after AlphaFold, 70 MDa. Science, 2022. With Martin Beck, MPI Biophysics, FRA. Agnieszka Obarska-Kosinska, MPI Biophysics, FRA & EMBL HH
Structural basis for understanding transport
Integrative structural biology in the era of accurate AI-based structure prediction. AlphaFold-Multimer v2.3, RoseTTAFold, OpenFold, UniFold, …, AlphaLink. Combinatorial assembly of larger complexes: MolPC, FoldDock, AFM-RL. Many, if not most, cryoEM structures are now built using AlphaFold or RoseTTAFold starting models
AlphaPulldown – protein-protein interaction screens with AlphaFold. Modes: pulldown mode (bait + preys), homo-oligomer mode, custom mode, all_vs_all mode. Dingquan (Geoffrey) Yu. Bioinformatics, 2022
AlphaPulldown – protein-protein interaction screens with AlphaFold: user friendly; workflow speed improvements; graphical summary (Jupyter notebook); tabular summary with scores. https://www.embl-hamburg.de/AlphaPulldown Dingquan (Geoffrey) Yu. Bioinformatics, 2022
AlphaPulldown – protein-protein interaction screens with AlphaFold. Workflow: (1) sequences → create_individual_features.py (MSA generation, template search; feature generation on CPU) → protein_1.pkl, protein_2.pkl, …, protein_n.pkl; (2) run_multimer_jobs.py (inference on GPU) in pulldown, homo-oligomer, custom or all_vs_all mode; (3) analysis tools: create_notebook.py → graphical summary (Jupyter notebook); alpha-analysis.sif → tabular summary with scores (iptm, iptm + ptm, PI-score, mpDockQ/pDockQ). Dingquan (Geoffrey) Yu. Bioinformatics, 2022
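What the screening modes mean in terms of prediction jobs can be sketched with plain Python. This is not AlphaPulldown code or its command-line interface: the function names and protein names are hypothetical, and excluding self-pairs in the all-versus-all case is an assumption made for the example.

```python
from itertools import combinations, product


def pulldown_jobs(baits, preys):
    """Every bait paired with every prey."""
    return [(bait, prey) for bait, prey in product(baits, preys)]


def all_vs_all_jobs(proteins):
    """Every unordered pair of distinct proteins (assumption: no self-pairs)."""
    return list(combinations(proteins, 2))


def homo_oligomer_jobs(proteins_with_copies):
    """One job per protein, repeated according to its copy number."""
    return [(name,) * copies for name, copies in proteins_with_copies]


# Hypothetical protein names, for illustration only.
baits = ["bait_1"]
preys = ["prey_1", "prey_2", "prey_3"]

print(pulldown_jobs(baits, preys))
print(all_vs_all_jobs(baits + preys))
print(homo_oligomer_jobs([("prey_1", 2), ("prey_2", 3)]))
```

The number of jobs grows linearly with the number of preys in pulldown mode but quadratically in all-versus-all mode, which is why computing per-protein features once and reusing them across jobs matters for speed.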
NEW ERA OF STRUCTURE PREDICTION
NEW OPPORTUNITIES IN INTEGRATIVE STRUCTURAL MODELING
Acknowledgements • Sergey Ovchinnikov, Harvard University • Sameer Velankar, EMBL-EBI • Arne Elofsson, Stockholm University • Kresten Lindorff-Larsen, University of Copenhagen for kindly providing slides about AlphaFold.
Modeling by: Agnieszka Obarska-Kosinska (MPI BP and EMBL). In collaboration with: MPI of Biophysics: Martin Beck, Shyamal Mosalaganti, Agnieszka Obarska, Christian Zimmerli, Matteo Allegretti, Beata Turonova, Gerhard Hummer, Marc Siggel. Assembline by: Jan Kosinski with help of Vasileios Rantos, Kai Karius. Molecular dynamics by: Marc Siggel (MPI BP). Funding: @jankosinski Two postdoc positions for computer science postdocs available in 2024!