Sparsity

Stitching Sparse Autoencoders of Different Sizes

Dec 1, 2024 SAE Stitching Stitching SAE Sparsity Autoencoders Latents Mechanistic Interpretability

Patrick Leask and Noura Al Moubayed introduce SAE stitching, a new method for mechanistic interpretability, in a poster at NeurIPS 2024.

BatchTopK Sparse Autoencoders

Dec 1, 2024 BatchTopK SAE Sparsity Autoencoders Mechanistic Interpretability Architecture

Patrick Leask contributes to BatchTopK, a new SAE architecture introduced in a poster at NeurIPS 2024.