Research
Research Areas
Interpretability
Understanding how AI systems make decisions and what they learn from data.
Evaluations
Developing methods to assess AI capabilities, alignment, and safety properties.
AI Agency
Understanding and managing autonomous AI behavior and decision-making.
Security
Protecting AI systems from adversarial attacks and malicious use.
Research Opportunities
Undergraduates
We can advise and support you on dissertation and individual study projects.
Faculty
We can signpost promising research directions and funding opportunities, and support you throughout your project.
Recent Research by Durham AISI Members
Academic Publications
Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models
Leask, P., & Al Moubayed, N. (2025, July). Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models. Presented at International Conference on Machine Learning (ICML 2025), Vancouver, Canada.
Sparse Autoencoders Do Not Find Canonical Units of Analysis
Leask, P., Bussmann, B., Pearce, M. T., Bloom, J. I., Tigges, C., Al Moubayed, N., Sharkey, L., & Nanda, N. (2025, April). Sparse Autoencoders Do Not Find Canonical Units of Analysis. Presented at The Thirteenth International Conference on Learning Representations (ICLR 2025), Singapore.
Probing by Analogy: Decomposing Probes into Activations for Better Interpretability and Inter-Model Generalization
Leask, P., & Al Moubayed, N. (2025). Probing by Analogy: Decomposing Probes into Activations for Better Interpretability and Inter-Model Generalization. Presented at the Mechanistic Interpretability Workshop at NeurIPS 2025.
Order by Scale: Relative-Magnitude Relational Composition in Attention-Only Transformers
Farrell, T., Leask, P., & Al Moubayed, N. (2025). Order by Scale: Relative-Magnitude Relational Composition in Attention-Only Transformers. Presented at the Socially Responsible and Trustworthy Foundation Models Workshop at NeurIPS 2025.
Other Research & Projects
Ghost Marks in the Machine: A Critical Review of SynthID for Code Provenance Monitoring
Sherratt-Cross, E., Farrell, T., Ogden, S., & Ryley, O. (2025, November). Ghost Marks in the Machine: A Critical Review of SynthID for Code Provenance Monitoring. Presented at an Apart Research Sprint.