Sparse Autoencoders Do Not Find Canonical Units of Analysis

SAEs, Mechanistic Interpretability, Representations

Patrick Leask and Noura Al Moubayed present a paper and poster on sparse autoencoders (SAEs) at ICLR 2025.