Patrick Leask

PhD Candidate

Durham University Computer Science

He/him

Conference poster
16 Jul 2025
Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models
Public symposium
2 Jul 2025
Interpreting intelligent machines?
Paper & Poster
24 Apr 2025
Sparse Autoencoders Do Not Find Canonical Units of Analysis
Conference poster
15 Dec 2024
BatchTopK Sparse Autoencoders
Conference poster
15 Dec 2024
Stitching Sparse Autoencoders of Different Sizes
Analysis
17 Aug 2024
Calendar feature geometry in GPT-2 layer 8 residual stream SAEs
Workshop
6 May 2024
AI Forensics IRL Meeting
Preprint
1 Nov 2023
CoinRun: Solving Goal Misgeneralisation