A Tutorial of Interpretable and Biologically Plausible LLMs, Section 3

Date:

Slides and Recordings

In this section, we explore model-level interpretability through Transformer Circuit Theories. We introduce the theories and outline their mathematical foundations, show how current models benefit from these principles, and illustrate how the resulting insights inspire modular, sparse next-generation AI architectures.