C2D3 Computational Biology

We are living in a very exciting time for biology: whole-genome sequencing has opened up the field of genome-scale biology, and with it a trend towards larger-scale experiments, whether based on DNA sequencing or on other technologies such as microscopy. However, it is also a time of great opportunity for small-scale biology, as there is a new wealth of data to build on: one can turn to a computer to ask questions that previously might have taken months to answer in the laboratory. One of the great challenges for the field is analysing the large amounts of complex data generated and synthesising them into useful systems-wide models of biological processes. Whether one operates on a large or a small scale, mathematical and computational methods are becoming an integral part of biological research.

There remains a worldwide shortage of skilled computational biologists. An important part of C2D3 Computational Biology is an MPhil course based at the Centre for Mathematical Sciences. The 11-month course introduces students to bioinformatics and other quantitative aspects of modern biology and medicine. It is intended especially for those whose first degree is in mathematics and computer science, and for others wishing to learn about the subject in preparation for a PhD course or a career in industry. Complementing the MPhil course is the Wellcome Trust PhD programme in Mathematical Genomics and Medicine. Run jointly with the Wellcome Trust Sanger Institute, this programme provides opportunities for collaborative research across the Cambridge region at the exciting interfaces between mathematics, genomics and medicine.

History and financial support 

C2D3 Computational Biology came about through the merger of the Cambridge Computational Biology Institute (CCBI) into C2D3 in 2021. The CCBI was established in 2003 to promote computational biology, interpreted broadly, within the University and in the region. It established the MPhil in Computational Biology programme in 2004, founded the Wellcome Trust Mathematical Genomics and Medicine 4-year PhD programme in 2011, and, among other activities, started a popular annual computational biology symposium. The CCBI was involved in setting up and helping to run the Cambridge Big Data (CBD) Strategic Research Initiative, out of which the C2D3 Interdisciplinary Research Centre was formed. Similarly, the CCBI was part of the group that helped set up the Alan Turing Institute.

The CCBI received financial support equally from the four science schools of the University: 

  • The School of the Biological Sciences      
  • The School of Clinical Medicine      
  • The School of the Physical Sciences (via DAMTP, Physics, Chemistry)      
  • The School of Technology (via Engineering, Computer Science) 

Space was kindly provided by the Department of Applied Mathematics and Theoretical Physics, within the Centre for Mathematical Sciences. 

MPhil in Computational Biology  

The Cambridge-MIT Institute provided funds to establish the MPhil in Computational Biology, and studentships have subsequently been provided by:

  • Biotechnology and Biological Sciences Research Council      
  • Cancer Research UK      
  • Engineering and Physical Sciences Research Council      
  • Medical Research Council      
  • Microsoft Research 

MGM PhD Programme 

The PhD programme in Mathematical Genomics and Medicine is funded by the Wellcome Trust.

Mailing list

To sign up to the mailing list, with the option to join the main C2D3 mailing list, please complete the appropriate form here.

Talks

Protein Evolution in Sequence Landscapes - From Data to Models and Back

Monday, 25 November 2024, 12.30pm to 1.30pm
Speaker: Professor Martin Weigt, Institute of Biology, Paris
Venue: CRUK CI Lecture Theatre

In the course of evolution, proteins diversify their sequences via a complex interplay between random mutations and natural selection. As a consequence, we can today observe protein sequences of common evolutionary origin, with almost identical three-dimensional folds and biological functions, which nevertheless differ by as much as 70-80% of their amino acids. In my presentation, I will review our efforts to model protein evolution across multiple time scales, from the emergence of single mutations in a protein up to deep evolutionary time scales. To this end, we first model protein fitness landscapes via generative probabilistic models trained on genomic data, and we show that these models are able to predict the effect of individual mutations and to generate non-natural but biologically functional proteins. Second, we describe evolution as a stochastic process in these landscapes. The proposed framework accurately reproduces the sequence statistics of both short-time (experimental) and long-time (natural) protein evolution, suggesting applicability also to relatively data-poor intermediate evolutionary time scales, which are currently inaccessible to evolution experiments. Our model uncovers a highly collective nature of epistasis, which gradually changes the fitness effect of mutations in a diverging sequence context rather than acting via strong interactions between individual mutations. This collective nature triggers the emergence of a long evolutionary time scale, separating fast mutational processes inside a given sequence context from the slow evolution of the context itself.

How to Fold Every Protein: (Mission Accomplished?)

Wednesday, 11 December 2024, 11.00am to 12.00pm
Speaker: Stephen D Fried, Johns Hopkins University
Venue: CRUK CI Lecture Theatre

Recent advances in artificial intelligence have addressed a long-standing question in protein biophysics: What is the relationship between a protein’s primary sequence and its native three-dimensional structure? On the other hand, the process by which proteins navigate to these native states during their biosynthesis or following their denaturation is perilous, complex, and much less predictable. Many proteins misfold, a process which can sometimes be reverted through chaperones, but which is also associated with a wide range of ailments, particularly neurodegenerative diseases. We became interested in delineating which (kinds of) proteins are capable of refolding into their native conformations spontaneously versus which ones require chaperone assistance. To do so, we developed limited proteolysis mass spectrometry (LiP-MS) methods, a structural proteomic approach that can interrogate protein conformation and misfolding on the proteome scale. These experiments provide a holistic view of what properties facilitate refoldability and have highlighted an important and unexpected role for intrinsically disordered regions. I will also highlight a more recent study wherein we discovered a link between nonrefoldability and cognitive decline, by using LiP-MS to compare hippocampal proteins in old rats that have impaired cognition to those from age-matched animals that retain their spatial memory. These experiments uncover several hundred proteins that endure cognition-associated structural changes (CASCs), and provide evidence that the intersection between protein misfolding and age-related neurological disease expands beyond a small number of amyloid-forming proteins.

Engineering genomes and proteomes as foundational biotechnology for translational research

Monday, 31 March 2025, 1.30pm to 2.30pm
Speaker: Jesse Rinehart, PhD, Associate Professor, Yale University School of Medicine
Venue: CRUK CI Lecture Theatre

Abstract not available

About us

The Cambridge Centre for Data-Driven Discovery (C2D3) brings together researchers and expertise from across the academic departments and industry to drive research into the analysis, understanding and use of data science and AI. C2D3 is an Interdisciplinary Research Centre at the University of Cambridge.

  • Supports and connects the growing data science and AI research community 
  • Builds research capacity in data science and AI to tackle complex issues 
  • Drives new research challenges through collaborative research projects 
  • Promotes and provides opportunities for knowledge transfer 
  • Identifies and provides training courses for students, academics, industry and the third sector 
  • Serves as a gateway for external organisations 