Tried to outline some unique ideas in this one: "Glass-Box Transformers: How Circuits Illuminate Deep Learning’s Inner Workings"