"Integrative structural biology in the era of accurate AI-based structure prediction"

May 28, 2024

Slide overview

Science by Computing: Classical, AI/ML Invited Talk (DAY-1 : Jan 29, 2024)
Jan Kosinski (EMBL, Hamburg)

The 6th R-CCS International Symposium
https://www.r-ccs.riken.jp/R-CCS-Symposium/2024/

Text of each page
1.

Integrative structural biology in the era of accurate AI-based structure prediction @jankosinski 6th R-CCS International Symposium 29.01.2024 Jan Kosinski

2.

Macromolecules Biological cell

4.

Principles of protein structure https://www.onlinebiologynotes.com/level-of-structural-organization-of-protein/

5.

Why study the structure? Video credit: Harvard Online

6.

Why study the structure? Inhibitor SARS-CoV-2 coronavirus main protease Image source: Protein Data Bank

7.

Structure helped design the SARS-CoV-2 vaccine! MERS-CoV spike locked in the pre-fusion conformation (Pallesen, …, Jason McLellan, 2017). Structure of betacoronavirus in pre-fusion conformation (Kirchdoerfer, …, McLellan, Ward 2016)

8.

How are structures determined? Nuclear magnetic resonance (NMR) X-ray crystallography High resolution densities and structures Cryogenic electron microscopy (cryoEM) Cryogenic electron tomography (cryoET) Low resolution density maps

9.

Khanh Huy Bui

10.

Nuclear pore complex: 30 proteins, 1000 copies, 120 MDa. Art: Agnieszka Obarska-Kosinska

11.

Credit: Shyamal Mosalaganti, EMBL Heidelberg

12.

Khanh Huy Bui

13.

X-ray structures of nuclear pore proteins …and more

14.

Predicting structures by homology modeling: sequence of the human protein + X-ray structure of the homolog in yeast → homology model of the human protein

16.

But we have molecular rulers!

17.

Crosslinking mass spectrometry
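
The "molecular ruler" idea can be illustrated with a minimal sketch: a crosslink between two residues implies that they lie within some maximum distance in the assembled structure. The 30 Å cutoff, the residue labels, and the function name `crosslink_violations` below are illustrative assumptions, not values or code from the talk; the appropriate cutoff depends on the crosslinker used.

```python
import math

def crosslink_violations(coords, crosslinks, max_dist=30.0):
    """Return (res_a, res_b, distance) for crosslinks longer than max_dist (in Å).

    coords: dict mapping a residue label to an (x, y, z) position in Å.
    crosslinks: iterable of (res_a, res_b) label pairs observed as crosslinked.
    max_dist: assumed upper bound on the residue-residue distance (hypothetical).
    """
    violated = []
    for a, b in crosslinks:
        d = math.dist(coords[a], coords[b])
        if d > max_dist:
            violated.append((a, b, d))
    return violated

# Toy coordinates (hypothetical): two residues 20 Å apart, two 40 Å apart.
coords = {
    "A:10": (0.0, 0.0, 0.0),
    "B:25": (0.0, 0.0, 20.0),
    "C:7": (40.0, 0.0, 0.0),
}
crosslinks = [("A:10", "B:25"), ("A:10", "C:7")]
violations = crosslink_violations(coords, crosslinks)
```

A candidate model that leaves few or no crosslinks violated is more consistent with the crosslinking data than one that violates many.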

19.

Integrative structural modeling: EM map at 23 Å from cryo-electron tomography; crystal structures; homology models; distance restraints from crosslinking/mass spectrometry. With Martin Beck, EMBL Heidelberg. Nature, 2015; Science, 2016

20.

Assembline: assembly line of macromolecular complexes. Vasileios Rantos, Kai Karius. https://www.embl-hamburg.de/Assembline/ Nature Protocols, 2022. Custom pipeline based on IMP by the Andrej Sali lab

24.

Assembline: coarse-grained multi-scale representation; Metropolis Markov Chain Monte Carlo optimization algorithm; “energy” function for optimization; multi-step procedure to narrow down conformational space. Nature Protocols, 2022
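
The Metropolis Monte Carlo optimization named on this slide can be sketched generically as follows. This is not Assembline's implementation: the one-dimensional state, the quadratic `toy_score`, and all parameter values are assumptions chosen only to show the accept/reject rule. In integrative modeling the state would be the positions and orientations of the building blocks, and the score would combine terms such as EM fit and crosslink restraints.

```python
import math
import random

def metropolis(score, x0, n_steps=5000, step=0.5, temperature=1.0, seed=0):
    """Minimize `score` with the Metropolis criterion; return (best_x, best_score)."""
    rng = random.Random(seed)
    x, s = x0, score(x0)
    best_x, best_s = x, s
    for _ in range(n_steps):
        # Propose a small random move from the current state.
        cand = x + rng.uniform(-step, step)
        cs = score(cand)
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / T) so the search can escape local minima.
        if cs <= s or rng.random() < math.exp(-(cs - s) / temperature):
            x, s = cand, cs
            if s < best_s:
                best_x, best_s = x, s
    return best_x, best_s

def toy_score(x):
    """Hypothetical 'energy' with a single minimum at x = 3."""
    return (x - 3.0) ** 2

best_x, best_s = metropolis(toy_score, x0=-10.0)
```

Lowering `temperature` makes the search greedier, while raising it allows more uphill moves and therefore broader exploration.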

25.

30% Nature, 2015 Science, 2016

26.

New cryo-ET maps at >12 Å. Martin Beck lab, MPI Biophysics, FRA

27.

Attempting to update the model… 35 MDa, 69% of the scaffold. Integrative modeling using Assembline: new EM map, new crystal structure & homology models. Spring 2021. Agnieszka Obarska-Kosinska, MPI Biophysics, FRA & EMBL HH

29.

BBC Science Weekendavisen Nature NY Times Slide courtesy of Kresten Lindorff-Larsen

30.

Results from CASP14 (Critical Assessment of protein Structure Prediction): avg GDT_TS of model1. AlphaFold2 from DeepMind. Slide courtesy of Sergey Ovchinnikov
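
For readers unfamiliar with the metric, GDT_TS averages the percentage of residues whose Cα atoms fall within 1, 2, 4, and 8 Å of the reference structure. The sketch below is simplified: it assumes the per-residue distances come from a single, already chosen superposition, whereas the full CASP procedure searches over superpositions for each cutoff. The function name and the example distances are hypothetical.

```python
def gdt_ts(distances):
    """Simplified GDT_TS (0-100) from per-residue Cα distances in Å.

    Assumes the model is already superimposed on the reference; the real
    metric maximizes each cutoff's fraction over many superpositions.
    """
    cutoffs = (1.0, 2.0, 4.0, 8.0)
    n = len(distances)
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

# Hypothetical distances for a five-residue model.
example_distances = [0.5, 1.5, 3.0, 6.0, 10.0]
example_gdt = gdt_ts(example_distances)
```

In this toy case 1, 2, 3, and 4 of the 5 residues fall within the four cutoffs, so the score is the mean of 20, 40, 60, and 80, which is 50.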

31.

How good are the models? Jumper et al., Nature 2021

32.

AlphaFold confidence metrics: pLDDT 89.3, pTM score 0.577
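
As a reading aid for the pLDDT value, the sketch below maps a score onto the confidence bands used by the AlphaFold Protein Structure Database (above 90, 70 to 90, 50 to 70, below 50). The function name is hypothetical and the handling of values exactly on a boundary is an assumption.

```python
def plddt_band(plddt):
    """Map a pLDDT score (0-100) to an AlphaFold DB style confidence band."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"

# The average pLDDT shown on the slide.
slide_band = plddt_band(89.3)
```

Note that pLDDT is a per-residue, local measure; pTM estimates the accuracy of the overall fold, so the two numbers answer different questions.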

33.

What is AlphaFold? • A machine-learning-based model for predicting the 3D structure of proteins using only sequence as input • Trained on known sequences and structures from the Protein Data Bank, as well as large databases of protein sequences Slide courtesy of Kresten Lindorff-Larsen

34.

Slide courtesy of Sergey Ovchinnikov Image credit: Hetu Kamisetty

35.

We can CHEAT by using evolutionary information! Slide courtesy of Sergey Ovchinnikov Image credit: Hetu Kamisetty

36.

We can CHEAT by using evolutionary information! By measuring coevolution, we can infer contacts in proteins! citations: bit.ly/3Mr8351 Slide courtesy of Sergey Ovchinnikov Image credit: Hetu Kamisetty

37.

We can CHEAT by using evolutionary information! Contacts in proteins are evolutionarily conserved and encoded in an MSA (Multiple Sequence Alignment) due to coevolution. By measuring coevolution, we can infer contacts in proteins! citations: bit.ly/3Mr8351 Slide courtesy of Sergey Ovchinnikov Image credit: Hetu Kamisetty
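
One simple way to "measure coevolution" is the mutual information between two alignment columns, sketched below. This is a swapped-in textbook measure chosen for brevity, not the method from the slides: practical contact predictors use more elaborate statistics that separate direct from indirect couplings, and AlphaFold learns from the MSA with a neural network. The four-sequence alignment is a made-up toy.

```python
import math
from collections import Counter

def mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an MSA.

    msa: list of equal-length aligned sequences (strings).
    """
    n = len(msa)
    count_i = Counter(seq[i] for seq in msa)
    count_j = Counter(seq[j] for seq in msa)
    count_ij = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        mi += p_ab * math.log2(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi

# Toy alignment (hypothetical): columns 0 and 1 change together (A/K vs. G/R),
# column 2 is fully conserved, and column 3 varies independently of column 0.
msa = ["AKLE", "AKLD", "GRLE", "GRLD"]
mi_coupled = mutual_information(msa, 0, 1)
mi_independent = mutual_information(msa, 0, 3)
```

Columns that mutate together score high (1 bit here), while independent or fully conserved columns score zero, which is why a pair of high-scoring columns hints at residues in contact.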

38.

How does AlphaFold work? “DeepMind’s most complex piece of AI so far” – Demis Hassabis. Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A. and Bridgland, A., 2021. Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), pp.583-589. 62-page Supplementary Information, Google Colab, and open GitHub repository (https://github.com/deepmind/alphafold). See Jumper et al. 2021 (especially the SI) for details. Slide modified from Sameer Velankar

39.

Models of single nuclear pore proteins

40.

Comparison to unpublished crystal structures. Structures by the Andre Hoelz lab, Caltech

41.

Validation by fitting to the EM map

42.

More complete building blocks Missing ~300 aa at C-terminus homology model AlphaFold model

43.

The Great Hack

44.

Modeling complexes with ColabFold. Sergey Ovchinnikov, Martin Steinegger. ColabFold

45.

ColabFold/AlphaFold models of subcomplexes

46.

Comparison to unpublished crystal structures. Structures by the Andre Hoelz lab, Caltech

47.

Validation by fitting to the EM map

49.

HOMOLOGY MODELS :) Agnieszka Obarska-Kosinska, MPI Biophysics, FRA & EMBL HH

50.

Modeling procedure + Assembline. No crystal structures. No homology models. + elastic network to maintain AlphaFold interfaces + ISOLDE program to build flexible linkers between domains. Agnieszka Obarska-Kosinska, MPI Biophysics, FRA & EMBL HH

51.

Human Nuclear Pore Complex modeled with AlphaFold and Assembline. Total mass of the nuclear pore complex: 120 MDa. Models: before AlphaFold: 35 MDa; after AlphaFold: 70 MDa. Science, 2022. With Martin Beck, MPI Biophysics, FRA. Agnieszka Obarska-Kosinska, MPI Biophysics, FRA & EMBL HH

53.

Structural basis for understanding transport

54.

Integrative structural biology in the era of accurate AI-based structure prediction: AlphaFold-Multimer v2.3, RoseTTAFold, OpenFold, UniFold, …, AlphaLink. Combinatorial assembly of larger complexes: MolPC, FoldDock, AFM-RL. Many, if not most, cryoEM structures are now built using AlphaFold or RoseTTAFold starting models

55.

AlphaPulldown – protein-protein interaction screens with AlphaFold. Modes: pulldown mode (bait + preys), homo-oligomer mode, custom mode, all_vs_all mode. Dingquan (Geoffrey) Yu. Bioinformatics, 2022

56.

AlphaPulldown – protein-protein interaction screens with AlphaFold. User friendly; workflow speed improvements; graphical summary (Jupyter notebook); tabular summary with scores. https://www.embl-hamburg.de/AlphaPulldown Dingquan (Geoffrey) Yu. Bioinformatics, 2022

57.

AlphaPulldown – protein-protein interaction screens with AlphaFold. Feature generation on CPU: sequences → create_individual_features.py (MSA generation, template search) → protein_1.pkl, protein_2.pkl, …, protein_n.pkl. Inference on GPU: run_multimer_jobs.py (pulldown mode, homo-oligomer mode, custom mode, all_vs_all mode). Analysis tools: create_notebook.py → graphical summary (Jupyter notebook); alpha-analysis.sif → tabular summary with scores (iptm, iptm + ptm, PI-score, mpDockQ/pDockQ). Dingquan (Geoffrey) Yu. Bioinformatics, 2022
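
The screening modes can be pictured as different ways of enumerating which protein combinations get predicted. The sketch below is not AlphaPulldown code and does not call its scripts; the function names, the protein names, and the exact enumeration rules (for example, all_vs_all as unordered pairs of distinct proteins) are assumptions for illustration. The `ranking_confidence` helper uses the AlphaFold-Multimer model-ranking formula, 0.8·ipTM + 0.2·pTM, and it is assumed here that the "iptm + ptm" score in the summary refers to this weighted sum.

```python
from itertools import combinations, product

def pulldown_jobs(baits, preys):
    """Pulldown mode (sketch): every bait paired with every prey."""
    return list(product(baits, preys))

def all_vs_all_jobs(proteins):
    """all_vs_all mode (sketch): every unordered pair of distinct proteins."""
    return list(combinations(proteins, 2))

def homo_oligomer_jobs(copy_numbers):
    """Homo-oligomer mode (sketch): n copies of the same protein per job."""
    return [(name,) * n for name, n in copy_numbers.items()]

def ranking_confidence(iptm, ptm):
    """AlphaFold-Multimer ranking score: 0.8 * ipTM + 0.2 * pTM."""
    return 0.8 * iptm + 0.2 * ptm

# Hypothetical protein names.
pull = pulldown_jobs(["bait"], ["prey_1", "prey_2"])
pairs = all_vs_all_jobs(["a", "b", "c"])
homo = homo_oligomer_jobs({"a": 3})
```

Because the number of all_vs_all pairs grows quadratically with the number of proteins, computing per-protein features once and reusing them across jobs, as the workflow on this slide does with the .pkl files, avoids repeating the MSA and template searches.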

58.

NEW ERA OF STRUCTURE PREDICTION

59.

NEW OPPORTUNITIES IN INTEGRATIVE STRUCTURAL MODELING

60.

Acknowledgements • Sergey Ovchinnikov, Harvard University • Sameer Velankar, EMBL-EBI • Arne Elofsson, Stockholm University • Kresten Lindorff-Larsen, University of Copenhagen, for kindly providing slides about AlphaFold.

61.

Modeling by: In collaboration with: Agnieszka Obarska-Kosinska (MPI BP and EMBL) MPI of Biophysics: Martin Beck Assembline by: Shyamal Mosalaganti Agnieszka Obarska Christian Zimmerli Matteo Alegretti Beata Turonova Jan Kosinski with help of Vasileios Rantos Kai Karius Molecular dynamics by: Marc Siggel (MPI BP) @jankosinski Funding: Gerhard Hummer Marc Siggel Two positions for computer science postdocs available in 2024!