April 25, 2026 · interpretability, topology, mechanistic, sparse-autoencoders

Topology of Thought — Three-Instrument Convergence

Three independent measurement instruments (persistent homology, SIPIT invertibility, and SAE decomposition) converge on a three-phase structure inside transformer residual streams. The structure appears in four attention-based architectures and is absent in Mamba.

By Adam Kruger

Three independent measurement instruments — persistent homology, SIPIT invertibility, and SAE decomposition — converge on the same structure inside transformer residual streams. The model thinks in three phases: Integration, Thinking, Codec. The pattern holds across four attention-based architectures and breaks exactly where theory predicts it should: in Mamba, where there is no attention.

This post is a summary. The full interactive site, the paper, and the underlying datasets live at topology-of-thought.com.

The three-phase model

Every transformer we measured organizes its residual stream into three computational phases:

Phase	What seems to happen	Where (Qwen3-4B)
Integration	Topological cluster collapse — hundreds of independent clusters merge into one unified manifold	L0–L16
Thinking	Deep computation on the unified representation; maximum intrinsic dimensionality, peak persistent loops	L16–L24
Codec	Human-readable encoding — the model reformats its internal state back into token probabilities	L24–L35

The integration boundary sits at roughly 40–44% of depth across every architecture we tested. We didn't pick that — the model finds it.
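As a quick arithmetic check on that range, the boundary layers reported in the cross-architecture table can be divided by each model's layer count. This is a minimal sketch: the 36-layer figure for Qwen3-4B follows from the L0–L35 range above, while the 26- and 32-layer counts for Gemma-3-1B and Mistral-7B are assumptions taken from the public model configurations, not from this post.

```python
# Integration boundary as a fraction of total depth.
# Layer counts: Qwen3-4B (36) follows from the L0-L35 range in the post;
# Gemma-3-1B (26) and Mistral-7B (32) are ASSUMED from public model configs.
LAYER_COUNTS = {"Qwen3-4B": 36, "Gemma-3-1B": 26, "Mistral-7B": 32}

# Boundary layers as reported in the post (Mistral is given as a range).
BOUNDARY_LAYERS = {
    "Qwen3-4B": (16, 16),
    "Gemma-3-1B": (11, 11),
    "Mistral-7B": (12, 14),
}


def depth_fraction(layer: int, n_layers: int) -> float:
    """Return the layer index as a fraction of model depth."""
    if not 0 <= layer < n_layers:
        raise ValueError("layer index out of range")
    return layer / n_layers


def boundary_fractions() -> dict:
    """Map each model to its (low, high) boundary fraction."""
    return {
        name: tuple(depth_fraction(layer, LAYER_COUNTS[name]) for layer in span)
        for name, span in BOUNDARY_LAYERS.items()
    }


if __name__ == "__main__":
    for name, (low, high) in boundary_fractions().items():
        print(f"{name}: {low:.1%} to {high:.1%}")
```

Under those assumed layer counts, Qwen3-4B lands near 44%, Gemma-3-1B near 42%, and Mistral-7B between roughly 38% and 44%.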

Three instruments, one structure

Each instrument measures something different. All three find the same phase boundaries.

1. Persistent homology

Tracks connected components (H0 clusters) and loops (H1 features) in the activation manifold across layers.

Qwen3-4B: 517 clusters → 1 cluster at L16 (complete collapse)
Gemma-4-31B: stays at ~1,900 clusters (sliding-window attention prevents full integration)
Mamba-370m: 551 → 987 clusters (diverges, with no collapse at all; see the Mamba falsification section below)

Cluster collapse is the topological signature of integration. When it reaches 1, the model has built a single unified representation.
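To make "counting clusters" concrete: at a fixed distance scale, the number of H0 classes of a Vietoris-Rips complex equals the number of connected components of the graph that links any two points closer than that scale. The sketch below counts them with union-find. The function name `count_h0_clusters`, the Euclidean metric, and the single threshold `eps` are illustrative assumptions; the post does not say which filtration or library produced its cluster counts.

```python
import math
from itertools import combinations


def count_h0_clusters(points, eps):
    """Count connected components when points within `eps` are linked.

    This is the H0 Betti number of the Vietoris-Rips complex at scale `eps`
    (illustrative; not necessarily the pipeline used in the post).
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= eps:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj  # merge the two components

    return len({find(i) for i in range(len(points))})


# Toy "activations": two tight groups far apart (made-up numbers).
toy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
print(count_h0_clusters(toy, eps=0.5))   # the two groups stay separate
print(count_h0_clusters(toy, eps=10.0))  # everything merges
```

Running the same count on each layer's activations, at a fixed scale, would give a per-layer cluster curve of the kind described above. A full persistent-homology computation sweeps `eps` instead of fixing it.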

2. SIPIT invertibility curve

SIPIT measures how much of the original input can be recovered from a given layer's activations. High invertibility means the layer preserves input information; low invertibility means the representation has moved far from the input.

Phase	SIPIT score	Reading
Early layers	0.403	Moderate — input still recognizable
Integration peak (L16)	0.208	Minimum — representation maximally transformed
Late layers	0.917	High — information re-encoded for output

The curve traces the same three phases: input preservation drops through integration, bottoms out during deep thinking, then recovers as the model builds its output encoding.
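The dip-then-recover shape can be checked mechanically. The sketch below uses only the three scores reported in the table above; the helper names are hypothetical, and it says nothing about how SIPIT scores themselves are computed.

```python
# Scores as reported in the table above, in layer order.
REPORTED = [
    ("early layers", 0.403),
    ("integration peak (L16)", 0.208),
    ("late layers", 0.917),
]


def find_trough(curve):
    """Return the (label, score) pair with the lowest score."""
    return min(curve, key=lambda item: item[1])


def has_interior_minimum(curve):
    """True if the lowest score is neither the first nor the last point."""
    scores = [score for _, score in curve]
    i = scores.index(min(scores))
    return 0 < i < len(scores) - 1


print(find_trough(REPORTED))
print(has_interior_minimum(REPORTED))
```

With a full per-layer curve in place of these three summary points, the same two helpers would locate the layer where invertibility bottoms out.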

3. SAE feature decomposition

Sparse autoencoders decompose each layer's activations into interpretable features. The character of those features shifts across phases.

SAE loss falls across depth, from 0.072 (early layers) to 0.044 (late layers): features become more compressible as the model moves toward output
225 features labeled across 9 layers, revealing a shift from abstract to concrete
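For readers unfamiliar with the setup, a sparse autoencoder encodes an activation vector into a wider, mostly-zero feature vector and then reconstructs the input from it. The sketch below shows one common formulation (ReLU encoder, linear decoder, squared-error loss plus an L1 sparsity penalty) in plain Python. The weights, the `l1_coeff` value, and the function names are made up for illustration; the post does not specify its SAE architecture or how its loss figures were computed.

```python
def relu(v):
    """Zero out negative entries."""
    return [max(0.0, x) for x in v]


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sae_forward(x, w_enc, b_enc, w_dec, b_dec):
    """Encode x into sparse features, then reconstruct it."""
    pre = [p + b for p, b in zip(matvec(w_enc, x), b_enc)]
    features = relu(pre)
    recon = [r + b for r, b in zip(matvec(w_dec, features), b_dec)]
    return features, recon


def sae_loss(x, features, recon, l1_coeff=0.01):
    """Mean squared reconstruction error plus an L1 sparsity penalty."""
    mse = sum((a - b) ** 2 for a, b in zip(x, recon)) / len(x)
    l1 = sum(abs(f) for f in features)
    return mse + l1_coeff * l1


# Toy example: a 2-dim activation and 3 hypothetical features.
x = [1.0, -1.0]
w_enc = [[1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]  # 3 x 2
b_enc = [0.0, 0.0, 0.0]
w_dec = [[1.0, 0.0, 0.0], [0.0, 1.0, -1.0]]    # 2 x 3
b_dec = [0.0, 0.0]

features, recon = sae_forward(x, w_enc, b_enc, w_dec, b_dec)
print(features, recon, sae_loss(x, features, recon))
```

In this toy case two of the three features fire and the reconstruction is exact, so the loss comes entirely from the sparsity term.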

Representative feature labels:

Layer	Example features	Character
L6	Semantic grouping, syntactic structure	Abstract, relational
L16	Conceptual integration, entity binding	Peak abstraction
L34	"Finality / irreversibility"	High-level but output-oriented
L50	"Punctuation / whitespace"	Pure formatting codec

The transition from L34 features like "finality / irreversibility" to L50 features like "punctuation / whitespace" is the Thinking-to-Codec boundary made visible at the feature level.

Mamba falsification

Mamba-370m is a state space model with zero attention. It processes sequences through selective state transitions instead of token-to-token attention.

Result: no cluster collapse. Clusters go from 551 to 987 — the opposite direction.

This is the falsification we wanted. The three-phase structure appears to require attention as the mechanism for integration. Without direct token interaction, there's no manifold unification, no phase transition. Mamba processes information; it just doesn't integrate it topologically.

Two ingredients appear to be required, and they must be present together:

A direct interaction mechanism (attention)
Optimization (training)

Neither is sufficient alone: untrained transformers, which have attention but no training, show the opposite pattern (503 → 968 clusters).

Cross-architecture results

Architecture	Type	Integration layer	Cluster collapse	Phases present
Qwen3-4B	Dense transformer	L16 (44%)	517 → 1	All three
Gemma-3-1B	Sliding attention	L11 (42%)	Partial	Modified
Gemma-4-31B	Sliding window	—	1,900 stable	Attenuated
Mistral-7B	Grouped-query attention	L12–14 (~40%)	Full collapse	All three
Mamba-370m	SSM (no attention)	—	551 → 987	None

Sliding-window attention (Gemma-4-31B) appears to prevent full collapse because each token can attend only to a local neighborhood, so the integration mechanism is constrained by design. This looks like an effect of architecture rather than scale: a 31B model with sliding windows shows less integration than a 4B model with full attention.
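The start and end cluster counts quoted in this post can be reduced to a single direction-of-change label per model. The sketch below uses only the reported numbers; the helper names and the "collapse" / "divergence" / "stable" labels are illustrative choices, not terminology from the paper.

```python
# (start, end) H0 cluster counts as reported in the post.
CLUSTER_COUNTS = {
    "Qwen3-4B": (517, 1),
    "Mamba-370m": (551, 987),
    "Untrained transformer": (503, 968),
}


def cluster_ratio(start, end):
    """End-to-start ratio: below 1 means merging, above 1 means fragmenting."""
    return end / start


def direction(start, end):
    """Label the change in cluster count."""
    if end < start:
        return "collapse"
    if end > start:
        return "divergence"
    return "stable"


for name, (start, end) in CLUSTER_COUNTS.items():
    print(f"{name}: {direction(start, end)} (x{cluster_ratio(start, end):.2f})")
```

On these figures the trained dense transformer is the only one whose count shrinks; Mamba and the untrained transformer both grow by a factor of a little under two.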

The progenitor — Gemma-4 writing Mojo kernels

The measurement instruments were built with help from Gemma-4-27B, which wrote the Mojo kernels used in the analysis pipeline:

Kernel	Latency	Purpose
SIPIT	148 µs	Invertibility scoring
SAE	949 µs	Sparse feature extraction
Activation tap	2 ms	Real-time layer hooking

These are production kernels: compiled Mojo, not Python prototypes. Gemma-4 helped build the instruments, and a Gemma-4 model is itself a subject of the measurements.

Where this leaves us

The three-phase structure isn't an artifact of any one measurement technique. Three independent instruments find the same boundaries. The pattern holds across dense transformers, breaks in the one SSM we tested (Mamba-370m), and degrades predictably with architectural constraints on attention span.

The model isn't just processing tokens. It's building a unified internal world, thinking in that space, then translating back for human consumption. Topology is one way to make that visible.

The full paper, methodology details, and interactive visualizations are at topology-of-thought.com.

Built by Light of Baldr LLC. Get in touch if any of this is useful to you.