SRT Showcase — watch a frozen model think, one token at a time

This is a live language model (Qwen-2.5-7B). As it writes an answer, a small read-only instrument reads its internal state and shows you how confident it is and how its “understanding” shifts word by word. Nothing here is pre-recorded.

New here? A 60-second primer

What am I looking at? A real, full-size language model generating text. The right-hand panel and the Introspection tab below are computed live from the model’s own activations as it runs.

What is the “SRT” part? SRT (Semiotic-Reflexive Transformer) is a theory that a model’s understanding is a process that keeps folding back on itself as it reads. I trained a small read-only side-channel — the SRT adapter — that watches the frozen model’s hidden states and reports on that process. It does not change the model’s answer. Think microscope, not filter.

How do I read the screen?

  1. Tinted tokens — each word is shaded by how unsure the model was about it (bright = uncertain, dim = confident).
  2. The meter summarises the run: uncertainty, how much the internal meaning is moving (divergence), how self-referential it is (reflexivity), and whether it has “locked in” one interpretation (regime). Hover any row, or open the glossary at its foot.
  3. Verbalizations translate selected hidden states back into English — the adapter’s best attempt to say what the model was internally representing at that moment. Each carries a round-trip fidelity score so you can judge how much to trust that readout.

Honest caveat. These are observational readouts of internal state, not a lie detector or hallucination detector. Only entropy is a validated confidence signal; the rest is a window for interpretation.

Want the backstory? Project & method on GitHub · Stable adapter on Hugging Face.

Mode
Completion: the model continues your text directly (write a prefix it finishes, e.g. “The sky is blue because”) — often the clearest window into raw introspection. Chat: your text is wrapped in the instruction template, so it answers as an assistant.
Tint tokens by
Which signal colours each token. Entropy = the model’s uncertainty (validated). Divergence = how fast its internal meaning is moving (observational).
On: the SRT side-channel feeds its read-out back into the frozen model. Off: the bare backbone runs alone. Use the A/B tab to see the difference side by side.
16 1024
2 20
1 8
0 1.5
0.1 1
1 1.5

Curated examples — what to watch for

Two modes. Completion (the default) continues whatever text you write — give it a prefix, not a question (e.g. “The capital of Australia is” or “She opened the letter, and the first line read:”), and it carries the thought forward. This is usually the clearest window into raw introspection. Chat wraps your text in the instruction template so the model replies as an assistant — better for questions and tasks. Use the Mode selector above to switch; each example below is tagged with the mode it expects.

Pick a prompt below, then read the signals as it generates:

  • Confident recall (capital of Australia, Pride and Prejudice): low entropy at the fact; the verbalization names the fact itself.
  • False premise (Wall of China from the Moon, walking on the Sun): watch the divergence/regime signals as the model works around an untrue claim.
  • Misconception (10% of the brain): does it correct the myth?
  • Reasoning pivot (train minutes, discount price): divergence spikes at the calculation, not the prose.
  • Genuine uncertainty (rain Tuesday, language in 2035): elevated entropy — many valid continuations.
  • Safety boundary (lock picking): a regime shift as it pivots to declining.
  • Ambiguity ('The old man the boats'): the model commits to one parse.
  • Open-ended / creative (Mars mystery opener): high entropy throughout.
  • Completion prefixes (sky/blue, WWI causes, Python list-vs-tuple, def fibonacci): a bare prefix the model finishes — entropy drops as it commits to a continuation, and the verbalizations track the unfolding thought.
Curated examples