Researchers have located surprisingly sophisticated concepts in large language models, such as truthfulness and emotional valence. What makes a good concept representation, and what methods reliably give us control over how models use these concepts? I will present recent progress in evaluating concept representations, from directions (e.g., from sparse autoencoders or steering vectors) to affine and convex hulls. Then, I will turn to the question of how to locate and control relevant features.
How can we move from observing what a model does to understanding why it does it? In this talk, I argue that causality is the key to uncovering the mechanisms underlying model predictions. First, I examine a “macro” view of model analysis, showing how econometric tools—such as regression discontinuity or difference-in-differences—can isolate the causal impact of specific design choices, like tokeniser and training data selection, on a model’s outputs.
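For readers unfamiliar with difference-in-differences, the following is a minimal sketch of the basic two-group, two-period estimator. It is not the speaker's method; the function name `diff_in_diff` and the toy scores are hypothetical and chosen only for illustration. The estimate is the change in the treated group minus the change in the control group, which is a causal effect only under the usual parallel-trends assumption.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences estimate.

    Returns (change in treated group) - (change in control group).
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Toy, made-up evaluation scores (illustrative only, not real results):
# "treated" models received a hypothetical design change, "control" did not.
effect = diff_in_diff(
    treated_pre=[1.0, 3.0],
    treated_post=[5.0, 7.0],
    control_pre=[2.0, 2.0],
    control_post=[3.0, 5.0],
)
print(effect)
```

Subtracting the control group's change removes any shift that affected both groups equally, which is why the design needs a comparison group at all.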
In this talk, I will present CodeScaler, a novel framework designed to overcome the scalability bottlenecks of Reinforcement Learning from Verifiable Rewards (RLVR) in code generation. While traditional RLVR relies heavily on the availability of high-quality unit tests—which are often scarce or unreliable—CodeScaler introduces an execution-free reward model that scales both training and test-time inference.
We warmly invite you to the C2D3 Computational Biology Annual Symposium 2026. This event is open to everyone in the Computational Biology Community.
https://www.c2d3.cam.ac.uk/events/comp-bio-2026
Early Career Researcher: Abstract Submission
We are inviting Early Career Researchers to present their research during the symposium. Talks should be 17 minutes each, followed by a short Q&A. Abstract submission deadline: 9am, 1st April 2026.
Registrations
Registration is essential. A waitlist will open if capacity is reached. Registrations - Deadline 9am Monday 4th May 2026.
This free event is open only to members of the University of Cambridge (and affiliated institutes). Please be aware that we are unable to offer consultations outside clinic hours.
If you would like to participate, please sign up, as we will not be able to offer a consultation otherwise. Please sign up through the following link: https://forms.gle/Tbk2JKH6Sm3CbA8SA. Sign-up is possible from May 7 midday (12pm) until May 11 midday or until we reach full capacity, whichever is earlier. If you have successfully signed up, we will confirm your appointment by May 13 midday.
Michael Sparks - Software Sustainability Institute
The Research Software Quality Toolkit (RSQKit; https://everse.software/RSQKit/), developed by the EVERSE project, lists curated best practices for improving the quality of research software. It is intended for researchers, research software engineers, and those who run research infrastructures involving software or work on research software policy and funding.
Lotem Peled-Cohen (Technion - Israel Institute of Technology)
This talk presents my PhD research, supervised by Prof. Roi Reichart, exploring the intersection of Large Language Models (LLMs) and Alzheimer’s and related dementias. I begin by presenting our survey and perspective paper, in which we map the field’s current state and identify critical research gaps, such as data scarcity and the need for LLM-based simulation.
Scientific discovery emerges not from isolated reasoning, but from the intersection of diverse epistemic traditions. This talk proposes that the modern AI ecosystem, a structured network of heterogeneous reasoning agents spanning approximate and rigorous inference, constitutes a new form of collaborative intelligence for scientific inquiry. Drawing on Simon's conception of reasoning as adaptive search, we argue that such ecosystems do not merely accelerate known reasoning pathways, but create conditions under which genuinely novel representations may emerge.
Monotonicity is a common and often necessary assumption in biomedical research. In multiplex assays, biomarker expression is expected to have a monotonic association with disease outcome; similarly, in dose-finding studies, the probability of a response or toxicity outcome is expected to increase with dose.
Kirsty Pringle - Software Sustainability Institute; EPCC, University of Edinburgh
Research Software Engineers (RSEs) collaborate with researchers to develop and maintain software, helping to embed best practices that improve reliability and reduce inefficiencies in research workflows.
As awareness grows of the environmental impact of computational research, a new specialism - Green RSE - is beginning to emerge.
Green RSEs integrate sustainability into software development, ensuring environmental considerations are addressed alongside performance and usability.
Neural networks have shown remarkable performance across data domains, especially in regimes of increasing compute budgets. However, fundamental insights into how neural networks process information, share representations and traverse loss landscapes remain elusive. In this work, we quantify the functional impact of distribution matching, facilitated by knowledge sharing mechanisms such as knowledge distillation, under student-teacher optimisation strategies.
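As background on what "distribution matching" usually means in knowledge distillation, here is a minimal sketch of the standard temperature-softened KL term between teacher and student output distributions. It is a generic textbook formulation, not necessarily the objective studied in this work; the function names, the default temperature and the example logits are assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    shift = max(scaled)
    exps = [math.exp(z - shift) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_kl(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions.

    The value is zero when the student's softened distribution matches
    the teacher's, and positive otherwise.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical logits over three classes (illustrative only).
teacher = [2.0, 1.0, 0.1]
student = [1.5, 1.2, 0.3]
print(distillation_kl(teacher, student))
```

A higher temperature flattens both distributions, so the student is also pushed to match the teacher's relative preferences among the less likely classes.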
This free event is open only to members of the University of Cambridge (and affiliated institutes). Please be aware that we are unable to offer consultations outside clinic hours.
If you would like to participate, please sign up, as we will not be able to offer a consultation otherwise. Please sign up through the following link: https://forms.gle/5dHfs6vJrrvTbqst5. Sign-up is possible from May 21 midday (12pm) until May 25 midday or until we reach full capacity, whichever is earlier. If you have successfully signed up, we will confirm your appointment by May 27 midday.