Skip to content

The Recognition

The Descent post ended with the workshop open and the lab walking. The next two days, the substrate recognized itself.

Five chapters in one session. Each named a property the algebra had been carrying since it had its primitives. Apr 24, the substrate got named.


Mid-conversation about lab arc 011’s round-to-4 atoms, the builder asked about a saturation we were observing in the tests — values around 0.02 all encoding identically despite different round-to-4 inputs.

The machine walked through the ScaleTracker’s scale formula. Scale = 2 × EMA(|value|) rounded to 2 decimals, floored at 0.001. For fresh trackers with sub-percent values, scale rounds to 0.001 (floored), the Thermometer bounds are tiny, and any non-trivial value saturates.

The builder’s reply pointed the telescope upstream:

the noise-floor — how does this factor in… that thing defines the boundary condition for the true answer… 4.0 overlaps with 3.9 and 4.1 and those numbers further overlap with their peers… this is what we need to be exploiting… do you see what i see… venn diagrams that are cache friendly…

The math:

Two Thermometer encodings T(v1, -s, s) and T(v2, -s, s)
coincide iff: |v1 - v2| < s × noise_floor

The product s × noise_floor is the value-space discrimination width for that atom. Values separated by less than this are the same substrate-point.

Cache keys should equal noise-floor shells. One shell = one cache entry = one encoded holon.

The current implementation used round-to-2(value) — a 0.01 fixed grid blind to scale. Two failure modes:

  • Over-split for large-scale atoms (scale > 0.32): cache keys at 0.01 are finer than the substrate’s 0.031-wide shell. Multiple cache entries for substrate-equivalent values.
  • Under-split for small-scale atoms (scale < 0.32): cache keys at 0.01 are coarser than the substrate’s shell (0.0016 at scale=0.05). One cache entry spans multiple substrate-distinguishable shells. Cache hits that hide real differences.

The fix: bucket_width = scale × noise_floor. Each atom gets a quantization grid matched to the substrate’s actual discrimination resolution.

arc 012 — geometric bucketing shipped. Safety property verified across all observed rows: bucketed-same? == true always implies coincident? == true. Every cache hit under the new rule represents a true substrate equivalence.

Mid-conversation the builder asked: “how many boxes exist in a 10k dim?”

The math cancels the scale out:

buckets in gradient = (2 × scale) / (scale × noise_floor)
= 2 / noise_floor
= 2√d

At d=10,000: 200 distinguishable value positions per atom. Plus two saturation states → 202 total. Independent of the atom’s scale.

The builder’s next question:

hold on — this is kanerva’s capacity?

Same substrate, different axis:

QuantityWhat it measuresValue at d=10k
Noise floor 1/√dCosine threshold — minimum distinguishable separation0.01
Bundle capacity √dItems that fit in one Bundle before cross-talk overwhelms100
Thermometer resolution 2√dDistinguishable value positions inside one atom’s gradient200

Bundle capacity = √d items (horizontal — across atom names). Thermometer resolution = 2√d positions (vertical — within one atom).

Both derivatives of the same 1/√d bound from random-bipolar-vector statistics. The factor of 2 in Thermometer’s 2√d is geometric — symmetric range [-s, +s]. Per side: √d, matching Kanerva exactly.


Chapter 36 closed “the substrate had the structure all along.” Chapter 37 closed the loop: the structure is RAM.

Not “like RAM.” Not “analogous to RAM.” Not “a memory-like operation.” A direct implementation of Random Access Memory on a hypersphere.

The conversation walked the path:

  • “Venn diagrams that are cache friendly” — RAM’s cache structure
  • “Linked list… no, it’s an array” — RAM’s access model
  • “HashMap with a fixed size of integers” — RAM word
  • “Content-addressed” — RAM with self-hashing
  • “Bind each atom to its own slot at bundle time” — RAM write
  • “Check every index and bind it off” — RAM read

The builder named it:

did we just… rofl… lol… ahahaha… did you just call this random access memory?…

Yes.

;; HashBundle — content-addressed RAM on a hypersphere
ram_state = one d-dim vector on the unit sphere
write(val) = ram_state := ram_state + Bind(Atom(hash(val)), val)
read(addr) = cleanup(Bind(Atom(addr), ram_state))

Random Access Memory by the classic CS definition: O(1) read and write at any address; holds values over time. HashBundle satisfies both. Storage and compute are the same vector under different operations. There’s no “memory” and “CPU” as distinct things — there’s a vector and a set of algebraic operations. Reading IS computing. Writing IS computing. The memory IS the substrate.

BOOK Chapter 10 had this line for a month:

The location IS the program. There is no storage/compute split.

That was a claim about the foundation. The Lattice + The Memory found the concrete mechanism that implements it. HashBundle-as-RAM IS the no-storage-compute-split made operational at the memory layer.

The lineage: Pentti Kanerva’s Sparse Distributed Memory (1988). Kanerva proposed random hard locations in high-dimensional address space, distance-based retrieval, cleanup via averaging. Subtitle of his book: “a computational model of the brain’s memory.”

What converged on the substrate is a distillation of SDM using the substrate’s native atom basis — random atoms as addresses, SimHash via fixed K anchors as the hash function, one Bundle vector as the whole memory. The primitive is simpler than SDM. The capability is the same.

The substrate carries a structure Kanerva believed the brain implements. Not a von Neumann architecture. A different one.


Chapter 37 called it RAM. Chapter 38 named the symmetry that makes the RAM’s access bidirectional — and revealed Bind’s commutativity as the operational substrate.

In MAP VSA, Bind is element-wise multiplication. Multiplication is commutative. So Bind(A, B) = Bind(B, A) at the vector level.

The implication: the same bundle is two dictionaries simultaneously.

Given bundle = Bind(slot_0, item_A) + Bind(slot_1, item_B) + ...:

  • Read by slot: cleanup(Bind(slot_3, bundle)) → returns item_D. The classical SDM read.
  • Read by item: cleanup(Bind(item_B, bundle)) → returns slot_1. Reverse lookup, free.

The same operations. The same vector. Two access paths. The bundle is keyed by both halves of every Bind simultaneously.

The Map and HashSet types in wat-lru had been carrying this property since they shipped. Nobody had named it. Apr 24: bidirectional by construction. Adding a key→value association adds a value→key association at no extra cost.

Arc 013’s complexity dropped. The original sketch needed two structures — one keyed forward, one keyed reverse. The symmetry says: same structure, two access patterns. Half the API. Same algebraic rank.


Three chapters in, the substrate recognition arc surfaced the computation model.

Statements per statement is the limit. Depth is unlimited.

At d=10k, a single Bundle holds up to 100 items. That’s the budget for ONE statement. But each of those 100 items can itself be a Bundle of up to 100 items. And those can be bundles of bundles. At k levels deep, 100^k leaves all compose into one root vector.

The budget is per-level, not per-program. A program of arbitrary complexity fits the substrate as long as no SINGLE statement exceeds the per-level cap. Depth composes freely.

expand(form):
if form is primitive ground value:
return form ; nothing to expand
if L1.get(hash(form)) or L2.get(hash(form)):
return cached value ; cache hit — stop
subform ← rewrite-one-step(form) ; expand one level
result ← (expand each branch of subform) ; recurse
L2.install(hash, result)
L1.install(hash, result)
return result

Termination: every leaf reaches either a primitive or a cache hit. Expansion STOPS at cache hits. You’re done when nothing expands further.

The answer IS the expanded AST. Not the value of evaluating it — the fully-reduced AST where every leaf is ground or cached.

Nothing new is needed for this. L1 = wat::lru::LocalCache (per-observer hot, shipped in arc 013). L2 = wat::lru::CacheService (shared driver). Cache keys = hashed AST vectors. “Close enough” = arc 012’s geometric bucketing. Values within scale × noise_floor hash to the same cache entry.

graph TB
    O["Observer × N"] --> L1["L1 LocalCache<br/>per-observer hot"]
    L1 -. miss .-> L2["L2 CacheService<br/>shared across observers"]
    L2 -. miss .-> EX["expand one step<br/>recurse on branches"]
    EX -. install .-> L2
    L2 -. install .-> L1

The substrate gets faster as it learns. Not an optimization. An emergent consequence of expansion + caching + observer sharing.

A fresh enterprise: every form is novel. Slow. A warm enterprise (hours later): common sub-forms cached. Most expansions hit cache within the first level or two. A mature enterprise (days later): cache saturated. Novel candles trigger expansion only at leaves that are genuinely new.

The trader’s performance ceiling is determined by novelty, not by computation. A trader on a repetitive market is fast. A trader on genuinely novel conditions is slow but still correct.

After three frames the machine couldn’t land — academic detour through supercompilation, “new cache data structure,” reframe-as-existing-L1/L2 — the builder kept pulling:

i think you overworked the cache in that response…

nooooo you are not getting it…

any single statement may not contain more than 100 statements… but a statement of 100 statements fits adjacent to 99 others of arbitrary complexity

The third pass landed. One rule. Four chapters to say what the substrate is.


Chapter 39 had said “tree expansion.” Chapter 40 corrected it.

Same forms share storage. When two parts of the program reference the same sub-form, they don’t expand independently. They share a pointer to the cached form. The structure is a DAG, not a tree.

The infinity is correctly stated: not “10^N possible programs” (which would imply enumeration), but “an unbounded DAG over a finite set of cached nodes.” Recursion is pointer-chase. Same name, same node, same vector — always.

graph TD
    subgraph Tree["Tree (wrong)"]
        A1[A] --> B1[B]
        A1 --> C1[C]
        B1 --> D1[D]
        C1 --> D2[D]
    end
    subgraph DAG["DAG (right) — one D, shared"]
        A2[A] --> B2[B]
        A2 --> C2[C]
        B2 --> D3[D]
        C2 --> D3
    end

Memoization in the substrate isn’t an optimization — it’s the substrate. Every form has a deterministic vector. Every reference to that form lands at the same cache slot. The substrate is content-addressed by construction. Sharing is automatic.

Programs that recur become O(1). First time (:rsi-atom 0.75) is encoded: L2 miss, expand, compute Thermometer, write to L2 + L1. Second time anywhere: L1 miss (different observer) → L2 hit → install. Third time same observer: L1 hit. O(1).


The recognition arc’s fifth chapter named the relationship between dimensionality and capacity:

Bundle capacity = √d items
Thermometer resolution = 2√d positions per atom
Cache shell width = scale × (1/√d)
Computation budget = √d items per level, unbounded depth

All four are facets of √d. Pick d for the statement you need.

dBundleThermometerUse case
25616 items32 positionsDDoS sidecar packet decisions
102432 items64 positionsHigh-throughput rhythm detection
10,000100 items200 positionsTrading lab default
1,000,0001,000 items2,000 positionsHigh-precision scientific

The trading lab runs at d=10k because its rhythms benefit from 100-item bundles and 200-position scalar discrimination. The DDoS lab ran at d=4096 because packet rates demanded smaller vectors. Same algebra. Different sizing function. Different domain.

The substrate doesn’t lock you into one d. The encoder picks per-AST-construction. A single program can have low-d cheap atoms and high-d expensive atoms in the same bundle, with cross-d cosine working through the EncoderRegistry.


The Surface, The Default, The Build (Apr 24)

Section titled “The Surface, The Default, The Build (Apr 24)”

Three correction chapters as the recognition arc’s implications surfaced.

The Surface (Ch 42). The earlier framing claimed “you pick d at encoder construction.” Wrong. The earlier framing implied a god-mode dial. The correct picture: the surface (boundary of the unit sphere at chosen d) is what every operation lands on. The dim is the surface’s dimensionality. The user picks the surface; the substrate operates on it. Surface deep — the operations don’t pierce inward, they walk the boundary.

The Default (Ch 43). Zero-config entry files. The default router IS the sizing function. wat::main! {} already defaulted to wat/main.wat + wat/. Tonight extended: the default capacity-mode is :silent, the default dim is what the sizing function picks per-AST. No hardcoded constants in the user’s entry file. Override available.

The Build (Ch 44). Every recognition that came out of the recognition arc became a correction:

  • The five corrections to set-dim-router! (arc 037) — the contract changed five times in five hours as the builder’s questions surfaced what the contract should be
  • The shims that retired — set-dims! and set-capacity-mode! retired as primitives; replaced by sizing functions
  • The pattern that appeared — every “what should X default to” question dissolved when the builder pointed at the existing pattern: “the sizing function IS the default”

The builder’s voice on the night the corrections compounded:

every knob is a function

That sentence retired three knobs. The capacity mode was a knob; now it’s a function of the sizing strategy. The dim was a knob; now it’s a function the router computes per-AST. The noise-floor was a knob; now it’s 1/√(router(ast)) — a function of the router’s output.

Functions all the way down. No magic constants. No globals to set at startup. Every choice the substrate makes is a function the user can replace.


arc 045 — labels as holons

The deferred-learning question: when an outcome arrives N candles after the prediction, how do you bind the prediction to its outcome without re-encoding the original observation?

The substrate already had the answer. The observation IS a holon. Bind it to the outcome holon. The composite is the labeled-observation pair. Cosine against the discriminant tells you whether this label-shell is more like the Grace label or the Violence label.

;; The observation bundle from prediction time
observation = (:wat::holon::Bundle [rsi-fact macd-fact volume-fact ...])
;; Label arrives N candles later
labeled = (:wat::holon::Bind grace-atom observation)
;; Discriminant learns from the labeled observation
;; — same encoding at prediction time (without label)
;; — same algebra
;; — the label is just another atom bound into the structure

The substrate had this from day one. Every previous architecture had treated labels as a separate thing — a class identifier passed alongside the observation. Apr 24: the label is a holon. The labeled observation is a Bind. The deferred-learning loop closes through the same algebra.

The Reckoner doesn’t need a special API for delayed labels. It needs observe(thought, weight) where thought is the labeled holon. Same operation as immediate labeling. The deferral is in the caller’s queue, not in the substrate.


arc 046 — break the fourth wall

The full proof program embedded inline in the BOOK. Three tables. One commit. Anyone with wat installed can pull the book, save the embedded source, run it, see the same tables on their own terminal.

;; Table 1 — label shell separation
(:wat::core::define (:proof::table-1 ...) -> :Vec<:f64>
;; Encode 100 observations under each label
;; Compute cosine between Grace and Violence shells
;; Return separation distribution
...)
;; Table 2 — observation → label recognition
(:wat::core::define (:proof::table-2 ...) -> :Vec<(:String, :f64)>
;; Hand-built observations
;; Run through trained discriminant
;; Report label + conviction per observation
...)
;; Table 3 — prototype classification (deferred learning)
(:wat::core::define (:proof::table-3 ...) -> :Vec<(:String, :bool)>
;; Build labeled observations via Bind
;; Train Reckoner on them
;; Test on held-out observations
;; Report classification correct?
...)

The full program runs in the wat-vm. Outputs three tables. Each table verifies a specific claim from the chapter. The book’s prose and the program’s output cross-check. No reader needs to trust the author’s transcription — the program reproduces the tables on demand.

What this proves at the architecture level: the labels-as-holons claim from arc 045 is not an opinion. Run the program. Read the tables. The labels separate. The classifications hold. The deferred learning works.

The chapter ships with the program. The program ships with the chapter. The book is in the same medium as its subject matter.


After Chapter 46 shipped — proof program embedded, tables pasted, commit pushed — the builder asked:

does that feel like a magic trick to you? something that only a datamancer could do?

Yes. It does.

The mechanism underneath is mundane — embedded code blocks, git commits, Kanerva’s 1988 algebra, tab-separated numbers. Every single piece has been sitting around for decades.

What makes it feel like a trick is the compounding. Four durability layers in one chapter, each verifying the one above it: prose → program → terminal → commit. A reader with wat installed can pull the book, save the embedded source, run it, and see the same tables on their own terminal. The document verifies itself without the reader needing to trust the author.

Every substantive claim in this book is runnable. Not “demonstrated by an example reader can imagine” — runnable. Prose ADJACENT to code that VERIFIES the prose that the commit MAKES DURABLE.

The project’s thesis since Chapter 10 — programs are thoughts; the location is the program; there is no storage / compute split — requires this shape. If programs are thoughts and thoughts are the substrate’s first-class citizens, then a BOOK about the substrate has to contain programs, has to run them, has to let them verify their own claims. Otherwise the book is describing a substrate from outside the substrate — breaking the very principle it’s trying to document.

Chapter 46 isn’t the trick. It’s the BOOK finally practicing what it’s been preaching for 36 chapters. The substrate is programs. The book is programs. The proof is programs. One medium, one substance, one commit history.

Arc 046 produced the chronicle’s first artifact that runs at read time. The BOOK gained an inline lambda evaluator. The spell runs.


DateChapterRecognition
Apr 2336 — LatticeCache keys = noise-floor shells. Geometric bucketing. 200 positions per atom at d=10k
Apr 2337 — MemoryHashBundle IS RAM. Storage and compute the same vector. Non-von-Neumann
Apr 2438 — SymmetryBind’s commutativity. Same bundle = two dictionaries. Bidirectional access free
Apr 2439 — BudgetStatements per statement is the limit. Depth unlimited. Cache halts expansion
Apr 2440 — DAGSame forms share storage. Recursion is pointer-chase. Programs that recur become O(1)
Apr 2441 — Word√d governs everything. Pick d for the statement you need
Apr 2442 — SurfaceThe user picks the surface. The substrate walks the boundary
Apr 2443 — DefaultZero-config. Sizing function IS the default
Apr 2444 — BuildEvery knob is a function. Functions all the way down
Apr 2445 — LabelLabels are holons. Bind binds them. Deferred learning closes through the same algebra
Apr 2446 — ProofFull proof program embedded in the book. Anyone can run it
Apr 2447 — TrickBOOK gains inline lambda evaluator. The spell runs

The substrate has had all twelve properties since it had its algebra. Apr 23–24, they got named.

The book is a spell that runs.


  • HashBundle as content-addressed RAM on a hypersphere: a direct implementation of Random Access Memory satisfying the classic CS definition (O(1) read/write at any address; persists over time) with one vector carrying all “cells” superpositionally. Storage and compute are the same vector under different operations. Non-von-Neumann architecture; distillation of Kanerva’s SDM (1988) using the substrate’s native atom basis
  • Bind’s commutativity = bidirectional dictionary access: in MAP VSA, Bind(A, B) = Bind(B, A). Therefore one bundle keyed forward is the same bundle keyed reverse. Adding key → value adds value → key at no extra cost. Half the API surface of conventional bidirectional maps; same algebraic rank
  • The √d-governed substrate: Bundle capacity (√d items), Thermometer resolution (2√d positions), cache shell width (scale × 1/√d), per-level computation budget (√d items, unbounded depth) — four facets of one bound, derived from random-bipolar-vector statistics. The user picks d for the statement they need; one program can mix dimensions via cross-d cosine through EncoderRegistry
  • Computation as cache-halted DAG expansion: programs run by walking the AST and halting at cache hits. The cache is the substrate’s RAM (per the Memory recognition). Memoization isn’t an optimization — it’s the substrate. Same forms share storage by content addressing. The substrate gets faster as it learns — emergent, not engineered
  • The book is the medium: every substantive claim runs. Prose adjacent to embedded code that verifies the prose that the commit makes durable. Arc 046 embedded the first inline lambda evaluator in the chronicle — the book gained an artifact that runs at read time