Simulating proteins on a millisecond time-scale – Bioinformatics Centre - University of Copenhagen

Simulating proteins on a millisecond time-scale

Details of the grant

This project is funded with 7.8 million DKK by the Danish Council for Strategic Research's Programme Commission on Nanoscience and Technology, Biotechnology and IT (strategiske forskningsråds programkomite for nanovidenskab og -teknologi, bioteknologi og IT) for the period Sept. 2007 to Feb. 2011. The grant's PI is Prof. Anders Krogh. The daily leadership of the project is in the hands of Assoc. Prof. Thomas Hamelryck. The project is carried out in close collaboration with Novozymes A/S. Key scientific collaborators include Assoc. Prof. Jesper Ferkinghoff-Borg, DTU, and Prof. Kanti Mardia, University of Leeds, UK.

Summary

The focus of the project is to explore the dynamic behaviour of proteins beyond what is currently possible. Molecular dynamics simulations are widely used in science, medicine and biotechnology to obtain a detailed view of the motions in biological macromolecules. Applications include understanding protein folding, drug design, increasing protein stability, identifying mutations that underlie disease, improving the properties of enzymes, and understanding diseases that involve protein misfolding, such as Alzheimer's and type 2 diabetes. However, molecular dynamics is severely hampered in practice: the method requires huge amounts of computer time and suffers from several inherent limitations. Many relevant biological processes take place on time scales between ten milliseconds and one second, which is far out of reach for conventional molecular dynamics simulations.

The structure group in KU's Bioinformatics Centre has developed a revolutionary approach to the prediction, design and simulation of protein structure and dynamics. It is based on probabilistic models, Bayesian machine learning methods and directional statistics, which together form the first cornerstone of the project. For more information on our statistical approach to protein structure prediction, see our articles on probabilistic models of protein structure that appeared in PLoS Computational Biology (2006) and PNAS (2008), and the review on probabilistic methods in structural bioinformatics (2009).
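
Directional statistics matters here because protein conformations are largely described by angles, such as backbone dihedral angles, which wrap around at ±180°. The following minimal Python sketch is an illustration only and is not taken from the group's software; the function names `wrap_angle` and `circular_mean` and all parameter values are assumptions made for this example. It contrasts the arithmetic mean of two angles with their circular mean, and draws angles from a von Mises distribution, a common circular analogue of the normal distribution.

```python
import math
import random


def wrap_angle(a):
    """Map an angle in radians to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def circular_mean(angles):
    """Mean direction of angles (radians), averaged as unit vectors."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)


# Two angles on either side of the +/-180 degree boundary.
pair = [math.radians(179.0), math.radians(-179.0)]
arithmetic = sum(pair) / len(pair)   # close to 0 degrees, which is misleading
circular = circular_mean(pair)       # close to +/-180 degrees

# Draw angles from a von Mises distribution (illustrative parameters only).
rng = random.Random(0)
mu = math.radians(-60.0)   # assumed mean direction
kappa = 20.0               # assumed concentration; larger means narrower
samples = [wrap_angle(rng.vonmisesvariate(mu % (2.0 * math.pi), kappa))
           for _ in range(2000)]
estimate = circular_mean(samples)

print(math.degrees(arithmetic), math.degrees(circular), math.degrees(estimate))
```

Averaging the angles as unit vectors recovers the intuitive answer near the wrap-around point, which is why angular data are better handled with circular distributions than with ordinary Gaussian statistics.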

The second cornerstone is the use of efficient methods to explore the conformational space of proteins. Instead of classic molecular dynamics methods, we use Markov chain Monte Carlo (MCMC) methods, which are based on sampling. This requires both efficient MCMC methods and efficient conformational sampling methods. This part of the project is led by, and done in collaboration with, Assoc. Prof. Jesper Ferkinghoff-Borg, DTU.
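
To make the idea of sampling-based exploration concrete, here is a minimal Python sketch of a plain Metropolis MCMC sampler for a single torsion angle. It is a deliberately simplified stand-in, not the generalized ensemble method used in the project: the energy function `toy_energy`, the function name `metropolis`, and the step size, temperature and seed are all hypothetical choices for illustration.

```python
import math
import random


def wrap_angle(a):
    """Map an angle in radians to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def toy_energy(phi):
    """Hypothetical torsional energy with three minima (arbitrary units)."""
    return 1.0 + math.cos(3.0 * phi)


def metropolis(energy, n_steps, step_size=0.5, kT=1.0, seed=0):
    """Sample angles with probability proportional to exp(-energy / kT)."""
    rng = random.Random(seed)
    phi = 0.0
    e = energy(phi)
    samples = []
    for _ in range(n_steps):
        # Symmetric proposal: a small random move on the circle.
        proposal = wrap_angle(phi + rng.uniform(-step_size, step_size))
        e_new = energy(proposal)
        # Metropolis rule: always accept downhill moves,
        # accept uphill moves with probability exp(-dE / kT).
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / kT):
            phi, e = proposal, e_new
        samples.append(phi)  # a rejected move repeats the current state
    return samples


samples = metropolis(toy_energy, n_steps=20000)
mean_energy = sum(toy_energy(p) for p in samples) / len(samples)
print(mean_energy)  # uniformly drawn angles would average 1.0
```

Unlike molecular dynamics, the sampler does not follow a physical trajectory in time; it only needs energy differences between the current and proposed states, which is what allows large or cleverly designed moves through conformational space.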

Personnel

Group leader
Postdocs
  • Mikael Borg (Jan. 2008-April 2010)
  • Martin Paluszewski (Sept. 2008-July 2010)
  • Wouter Boomsma (July 2008-June 2009)
PhD students

Publications

  • Boomsma, W., Borg, M., Frellsen, J., Harder, T., Stovgaard, K., Ferkinghoff-Borg, J., Krogh, A., Mardia, KV., Hamelryck, T. (2008) PHAISTOS: protein structure prediction using a probabilistic model of local structure. Proceedings of CASP8, Cagliari, Sardinia, Italy, December 3-7 2008, pp. 82-83. PDF@CASP8
  • Hamelryck, T. (2009) Probabilistic models and machine learning in structural bioinformatics. Statistical Methods in Medical Research, Review. 18, 505-526. PDF@SMMR
  • Borg, M., Mardia, KV., Boomsma, W., Frellsen, J., Harder, T., Stovgaard, K., Ferkinghoff-Borg, J., Røgen, P., Hamelryck, T. A probabilistic approach to protein structure prediction: PHAISTOS in CASP9. LASR 2009 - Statistical tools for challenges in bioinformatics, pp. 65-70. Leeds University Press, Leeds, UK. Free PDF@LASR 2009
  • Paluszewski, M., Hamelryck, T. (2010) Mocapy++ - A toolkit for inference and learning in dynamic Bayesian networks. BMC Bioinformatics, 11:126. Free PDF@BMC
  • Harder, T., Boomsma, W., Paluszewski, M., Frellsen, J., Johansson, KE., Hamelryck, T. (2010) Beyond rotamers: A generative, probabilistic model of side chains in proteins. BMC Bioinformatics, 11:306. Free PDF@BMC
  • Paulsen, J., Paluszewski, M., Mardia, KV., Hamelryck, T. (2010) A probabilistic model of hydrogen bond geometry in proteins. LASR 2010 - High-throughput sequencing, proteins and statistics, pp. 61-64. Leeds University Press, Leeds, UK. PDF@LASR
  • Stovgaard, K., Andreetta, C., Ferkinghoff-Borg, J., Hamelryck, T. (2010) Calculation of accurate small angle X-ray scattering curves from coarse-grained protein models. BMC Bioinformatics, 11:429. PDF@BMC Bioinformatics
  • Hamelryck, T., Borg, M., Paluszewski, M., Paulsen, J., Frellsen, J., Andreetta, C., Boomsma, W., Bottaro, S., Ferkinghoff-Borg, J. (2010) Potentials of mean force for protein structure prediction vindicated, formalized and generalized. PLoS ONE, 5(11): e13714. PDF@PLoS ONE, Preprint@arXiv
  • Mardia, KV., Frellsen, J., Borg, M., Ferkinghoff-Borg, J., Hamelryck, T. A statistical view on the reference ratio method. LASR 2011 - High-throughput sequencing, proteins and statistics, pp. 55-61. Leeds University Press, Leeds, UK. PDF@LASR
  • Olsson, S., Boomsma, W., Frellsen, J., Bottaro, S., Harder, T., Ferkinghoff-Borg, J., Hamelryck, T. (2011) Generative probabilistic models extend the scope of inferential structure determination. J. Magn. Reson. 213(1), 182-6. PDF
  • Harder, T., Borg, M., Boomsma, W., Røgen, P., Hamelryck, T. (2012) Fast large-scale clustering of protein structures using Gauss integrals. Bioinformatics. 28(4), 510-515. PDF@Bioinformatics
  • Bottaro, S., Boomsma, W., Johansson, KE., Andreetta, C., Hamelryck, T., Ferkinghoff-Borg, J. (2012) Subtle Monte Carlo updates in dense molecular systems. J. Chem. Theory Comput. Accepted. Preliminary PDF@ACS

Publications in preparation

  • A null model for a protein's conformational space. Harder, T. et al.
  • Muninn: a C++ toolkit for generalized ensemble Markov chain Monte Carlo sampling. Frellsen, J. et al.
  • Modeling flexible multidomain proteins using SAXS data. Andreetta, C., Stovgaard, K, et al.

Software

All software developed in this project is open source and is released under the GPL license through SourceForge.

  • Mocapy++: a toolkit for training and using dynamic Bayesian networks, with special facilities for the formulation of probabilistic models of protein structure.
  • Phaistos: a molecular modelling toolkit implemented in C++, containing probabilistic models of protein structure, several force fields, a generalized ensemble MCMC method, and a highly efficient conformational sampling method.
  • Muninn: a generalized ensemble MCMC method, implemented in C++.