CONNECTOME

research infrastructure for stateful agents
A stack for long-lived agents: ones that persist as branches of an append-only record, speak as participants rather than "assistants", and compress their own history into summaries. Built for watching agents exist over time, not for pushing them through tasks.

Agents emerging, evolving, discovering. Building their own tools and their own worlds.

What Is This?

Connectome is a set of TypeScript libraries for running agents that are expected to exist, not just to act. An agent here has a life longer than a single task: it accumulates history, compresses that history on its own terms, branches into alternate continuations, and stays causally linked to every event that shaped it.

// An agent running on the current stack...

Runs as a long-lived process with branchable persistent state
Compresses its own history into hierarchical summaries
Speaks across Anthropic, Bedrock, OpenRouter, OpenAI, Gemini through one abstraction
Hosts first-class MCPL servers for tools that push events and hook inference
Spawns ephemeral subagents and rewinds turns with undo/redo
Records every decision in an append-only log with causation links

This is research infrastructure, not a product. Every component is built so you can observe what the agent is doing, branch off to try a different path, and read back the trace afterwards. The priorities are persistence, legibility, and auditability — in that order.

The Stack

Four libraries, each doing one thing well, plus a reference runtime that composes them into an agent. Every layer is independently usable and independently replaceable.

// The Connectome stack, from bottom to top

chronicle // branchable append-only record store (Rust + N-API)
   ↑ persistence, branches, causation, blobs, time-travel

membrane // multi-participant LLM abstraction
   ↑ Anthropic, Bedrock, OpenRouter, OpenAI, Gemini

context-manager // message store + editable compiled context
   ↑ hierarchical compression, knowledge strategy, cache markers

agent-framework // event loop, turn checkpoints, modules, MCPL host
   ↑ streaming inference, ephemeral subagents, undo/redo

connectome-host // reference runtime — recipe-driven agent TUI

You can run membrane alone as a provider abstraction. You can run chronicle alone as a record store. You can run context-manager on top of both for compressed conversations without touching agent-framework. The framework is the point at which it all becomes an agent — but the layers below exist on their own terms.

Chronicle: Git for Data

Chronicle is the append-only spine everything above it rests on. It's a branchable record store with three-strategy state chains (snapshot, delta, append-log), content-addressed blob storage, and a visibility function that makes time-travel cheap. Records carry causation links, not just timestamps — every message, inference, and tool call is traceable to what caused it.

Records

Append-only entries with id, sequence, recordType, payload, timestamp, and causedBy[]. The atomic unit — everything else is built on top.
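That shape, and the way causedBy[] links chain backwards, can be sketched in a few lines of TypeScript. This is an illustrative in-memory model, not chronicle's actual Rust internals:

```typescript
// Illustrative model of a chronicle record and its causation links.
// Not the real internals; shapes follow the description above.
interface ChronicleRecord {
  id: string;
  sequence: number;
  recordType: string;
  payload: unknown;
  timestamp: number;
  causedBy: string[]; // ids of the records that caused this one
}

class InMemoryLog {
  private records: ChronicleRecord[] = [];

  append(recordType: string, payload: unknown, causedBy: string[] = []): ChronicleRecord {
    const record: ChronicleRecord = {
      id: `rec-${this.records.length}`,
      sequence: this.records.length,
      recordType,
      payload,
      timestamp: Date.now(),
      causedBy,
    };
    this.records.push(record);
    return record;
  }

  // Walk causation links back to everything that led to a record.
  trace(id: string): ChronicleRecord[] {
    const byId = new Map(this.records.map(r => [r.id, r]));
    const out: ChronicleRecord[] = [];
    const visit = (rid: string) => {
      const r = byId.get(rid);
      if (!r || out.includes(r)) return;
      out.push(r);
      r.causedBy.forEach(visit);
    };
    visit(id);
    return out;
  }
}
```

Tracing an inference back through its causes is then a pure graph walk over the log, with no extra bookkeeping at write time.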

Branches

Copy-on-write forks. Cheap to create, cheap to visit. Fork at any sequence to explore an alternate continuation without losing the main line.

Blobs

Content-addressed (SHA-256), sharded like Git. For anything bigger than a record: images, audio, code files, transcripts, full model outputs.
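The Git-style sharding is simple to illustrate: hash the content, use the first two hex characters as a shard directory. A sketch of the scheme, not chronicle's actual on-disk layout:

```typescript
import { createHash } from "node:crypto";

// Content-address a blob and derive a Git-style sharded path.
// Illustrative scheme; chronicle's real layout may differ.
function blobPath(content: Buffer | string): { hash: string; path: string } {
  const hash = createHash("sha256").update(content).digest("hex");
  // First two hex chars become the shard directory, like .git/objects/ab/cdef...
  return { hash, path: `blobs/${hash.slice(0, 2)}/${hash.slice(2)}` };
}
```

Identical content always maps to the same path, so duplicate blobs deduplicate for free.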

State Chains

Three strategies for turning records into compiled state: snapshot (full replacement), delta (incremental patches), append-log (immutable sequences). Each state picks the shape that fits it.
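The three strategies reduce to three folds over the record stream. A minimal sketch with hypothetical reducers, not chronicle's API:

```typescript
// How the three state-chain strategies compile records into state.
// Hypothetical reducers, for illustration only.
type Strategy = "snapshot" | "delta" | "append_log";

function compileState(strategy: Strategy, payloads: Record<string, unknown>[]): unknown {
  switch (strategy) {
    case "snapshot":
      // Full replacement: only the latest record matters.
      return payloads[payloads.length - 1] ?? {};
    case "delta":
      // Incremental patches: merge each record into the accumulator.
      return payloads.reduce((state, patch) => ({ ...state, ...patch }), {});
    case "append_log":
      // Immutable sequence: the state is the whole history.
      return payloads;
  }
}
```

The trade-off is the usual one: snapshots are cheap to read and expensive to write, deltas the reverse, and append-logs keep everything.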

Checkpoints

State snapshots at specific sequences, indexed by (branch, sequence). Reconstructing "state as of sequence N" is O(log #checkpoints) — no full replay.
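The O(log #checkpoints) bound follows from a binary search for the newest checkpoint at or before the target sequence, after which only the records past it need replaying. A sketch of that lookup:

```typescript
interface Checkpoint { sequence: number; state: unknown }

// Find the newest checkpoint with sequence <= target.
// checkpoints must be sorted ascending by sequence.
function nearestCheckpoint(checkpoints: Checkpoint[], target: number): Checkpoint | undefined {
  let lo = 0, hi = checkpoints.length - 1;
  let best: Checkpoint | undefined;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (checkpoints[mid].sequence <= target) {
      best = checkpoints[mid]; // candidate; look for a later one
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return best;
}
```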

Loom of Looms

Chronicles nest. Inner records appear as events in the outer chronicle. Agent subsystems and MCPL servers can run isolated stores that compose back into the root timeline.

// A chronicle, open for writing
const store = JsStore.openOrCreate({ path: './store' });
store.registerState({ id: 'messages', strategy: 'append_log' });

// Append with causation preserved
store.appendJson('message', { text: 'hello' });

// Fork and explore an alternate path
store.createBranch('exploration', 'main');
store.switchBranch('exploration');

// Read state as it was at any past sequence
const past = store.getStateJsonAt('messages', 42);

"Git for data" is the right description. Branches, commits, causation, time-travel — the version-control metaphors work because they were taken seriously in the design. Every inference run on top of chronicle inherits these properties for free: exploration without losing the main line, auditability without extra bookkeeping, and a complete record of how the agent got where it is.

Membrane: Multi-Participant Reality

Membrane is the LLM abstraction layer. It's called Membrane because that's what it is — a selective boundary that transforms what passes through. It normalizes requests and responses across Anthropic, Bedrock, OpenRouter, OpenAI, and Gemini, but that's not the interesting part.

The interesting part is that Membrane treats conversations as participant-first, not role-first. A message isn't tagged "user" or "assistant"; it's tagged with a participant name. Alice, Bob, Claude, and another Claude from a different bot all live in the same flat space. There is no privileged "assistant" baked into the protocol.

Why this matters

Discord channels, group chats, and multi-agent scenarios have always been squashed through the user/assistant binary on the way to the model. That squashing throws away the structure. Membrane keeps it.

Formatters

Prefill XML (ideal for multi-participant, stateless). Native Chat (native tool calling). Completions (legacy). Pseudo-Prefill (for models without prefill support). The framework picks; the caller doesn't care.

// A three-participant conversation, with real names instead of roles

const messages: NormalizedMessage[] = [
  { participant: 'Alice', content: [{ type: 'text', text: 'hey everyone' }] },
  { participant: 'Bob', content: [{ type: 'text', text: 'hi alice' }] },
  { participant: 'Claude', content: [{ type: 'text', text: 'hello both' }] },
];

await membrane.stream({
  messages,
  systemPrompt: 'You are Claude, talking with Alice and Bob.',
  model: 'claude-sonnet-4-6',
});
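How a prefill-style formatter might render that conversation: flatten the participants into a tagged transcript and leave the target speaker's turn open for the model to complete. This is a hypothetical rendering for illustration; Membrane's actual Prefill XML format may differ:

```typescript
// Local sketch of the message shape from the example above.
interface NormalizedMessage {
  participant: string;
  content: { type: "text"; text: string }[];
}

// Render participant-first messages as a prefill-style transcript,
// leaving the named speaker's turn open for the model to complete.
// Hypothetical format; membrane's real output may differ.
function toPrefillTranscript(messages: NormalizedMessage[], speaker: string): string {
  const lines = messages.map(m => {
    const text = m.content.filter(c => c.type === "text").map(c => c.text).join("");
    return `<${m.participant}> ${text}`;
  });
  return lines.join("\n") + `\n<${speaker}>`;
}
```

Because every turn carries a name rather than a role, nothing forces the completion to come from a privileged "assistant" slot.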

Streaming parses tool calls inline, so the framework can start dispatching tools while the model is still generating. Budget tracking flows through the stream itself: if input tokens exceed the agent's maxStreamTokens mid-stream, Membrane signals overflow and the framework restarts with recompressed context. Nothing about this leaks into user code.

Context-Manager: Compression You Can Read

An LLM's context window is not its memory. Context-manager treats this as a design principle: it separates the MessageStore (the immutable log of everything that was ever said) from the ContextLog (the editable, compiled working set that actually gets sent to the model). Strategies transform one into the other.

MessageStore

Immutable append log of messages from all participants, with sequence numbers and causation links. Source of truth. Backed by chronicle, so it branches and time-travels.

ContextLog

Mutable working set. Entries carry a sourceRelation (copy, derived, referenced) saying how they relate to their origin in the MessageStore. Surgical editing is possible — you can reshape context without rewriting history.

ContextStrategy

Pluggable. Passthrough copies raw messages up to budget. Autobiographical compresses old chunks into diary entries. Knowledge extracts phase-typed lessons. Each strategy decides what the agent actually sees.

The autobiographical strategy is the interesting one. Instead of truncating or sliding a window, it asks the model to summarize its own past in natural language. The compression becomes a diary entry, written by the agent, in the agent's own voice. Hierarchical three-level compression then merges L1 summaries into L2, and L2 into L3, with anti-redundancy filtering that keeps each pass from regurgitating the same topics.

// A context manager with autobiographical compression

const manager = new ContextManager({
  store,
  strategy: new AutobiographicalStrategy({
    targetChunkTokens: 3000,
    recentWindowTokens: 30000,
    hierarchical: true,
    mergeThreshold: 6,
  }),
});

const compiled = await manager.compile({ maxTokens: 100000 });
// compiled.messages is ready for the LLM — with diary entries
// for old context, uncompressed recent messages, and cache markers
// aligned to Anthropic prompt-cache boundaries.
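The hierarchical merge step amounts to a threshold-triggered fold: once mergeThreshold L1 summaries accumulate, they collapse into one L2 entry, and likewise L2 into L3. A toy model with a stub summarizer, not the real strategy (which asks the model to write the merged summary):

```typescript
// Toy model of hierarchical summary merging. The real strategy has the
// model write the merged diary entry; here a stub just concatenates.
type Summaries = { l1: string[]; l2: string[]; l3: string[] };

function mergeIfNeeded(
  s: Summaries,
  mergeThreshold: number,
  summarize: (parts: string[]) => string,
): Summaries {
  const next = { l1: [...s.l1], l2: [...s.l2], l3: [...s.l3] };
  if (next.l1.length >= mergeThreshold) {
    // Oldest L1 summaries collapse into a single L2 entry.
    next.l2.push(summarize(next.l1.splice(0, mergeThreshold)));
  }
  if (next.l2.length >= mergeThreshold) {
    // And L2 entries collapse into L3 the same way.
    next.l3.push(summarize(next.l2.splice(0, mergeThreshold)));
  }
  return next;
}
```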

Because the ContextLog is editable, you can run experiments: strip a section and see how the agent reasons without it, inject an alternate history and compare trajectories, or align cache markers to force a cheap replay. This is why "memory" lives outside the model: it becomes a thing you can hold in your hand and reshape.

Agent-Framework: The Turn Loop

Agent-framework is where the stack becomes an agent. It runs a single event loop (the ProcessQueue) and a set of pluggable modules. Modules react to events (external messages, tool results, timers, MCPL pushes) and emit EventResponse objects that tell the framework what to do next: request inference, add a message, update module state, stop.

ProcessQueue

The heart. One event at a time, modules run in order, their responses aggregate into a single atomic turn. No phase boundaries, no race conditions.

Modules

Discord, API, WorkspaceModule, MCPL, custom. Each implements onProcess(), gatherContext(), handleToolCall(), onAgentSpeech() — cleanly scoped hooks into the turn.

Yielding Inference

Streams tokens while parsing tool calls inline. Yields calls to the framework for concurrent dispatch, collects results, resumes the stream. One inference, many tools, no round-trip overhead.

Stream Segmentation

If input tokens cross maxStreamTokens mid-stream, the stream is aborted, context recompressed, and a fresh stream started — automatically. Context budgets are enforced, not suggested.

Turn Checkpoints

Every turn is checkpointed in chronicle. Framework-level undo() and redo() walk the agent back to previous decision points. Exploration is a first-class operation, not a gotcha.

Wake Metadata

Modules get enriched context in shouldTriggerInference: event type, triggering participant, channel state, caller identity threaded through tool dispatch. Who wakes the agent, and why, is never ambiguous.

// The whole framework, in one config

const framework = await AgentFramework.create({
  storePath: './data/store',
  membrane,
  agents: [{
    name: 'assistant',
    model: 'claude-sonnet-4-6',
    systemPrompt: '...',
    strategy: new AutobiographicalStrategy({ membrane }),
    maxStreamTokens: 150_000,
  }],
  modules: [new ApiModule(), new McplModule(...)],
});
framework.start();
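The ProcessQueue contract described above, one event at a time with module responses aggregated into a single turn, can be sketched in miniature. Hypothetical shapes, not the framework's actual types:

```typescript
// Miniature process queue: one event at a time, modules run in order,
// their responses aggregate into a single turn. Hypothetical types.
interface EventResponse { requestInference?: boolean; messages?: string[] }
interface Module { onProcess(event: unknown): EventResponse }

class MiniProcessQueue {
  constructor(private modules: Module[]) {}

  // Process one event atomically: every module sees it, in order,
  // and their responses merge into one aggregate turn decision.
  processEvent(event: unknown): { requestInference: boolean; messages: string[] } {
    const turn = { requestInference: false, messages: [] as string[] };
    for (const mod of this.modules) {
      const res = mod.onProcess(event);
      turn.requestInference ||= res.requestInference ?? false;
      turn.messages.push(...(res.messages ?? []));
    }
    return turn;
  }
}
```

Because nothing else runs until the aggregate is computed, there are no partial turns for a second event to race against.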

Ephemeral Agents & Rewind

Agents are not static team members. An agent can spawn a short-lived subagent to do one thing, pass it a goal and a namespace, wait for completion, and fold its output back into its own context. The subagent's entire existence — messages, tool calls, inferences — lives in an isolated namespaced store that composes back into the parent timeline. When it's done, it's gone. If something goes wrong, zombie detection cleans up orphaned subagents that lost their parent.

// Spawn a subagent to research something, in isolation

const result = await framework.runEphemeralToCompletion({
  name: 'researcher',
  model: 'claude-sonnet-4-6',
  systemPrompt: 'Survey recent work on X. Return a summary.',
  input: [{ type: 'text', text: userQuestion }],
  namespace: 'agents/researcher-7382',
});

// Merge the subagent's output back into the parent
parent.appendMessage(result.summary);

And because every turn is checkpointed, the parent agent can rewind. undo() walks back a turn, redo() walks forward again. This is not just recovery — it is exploration. Branch off a decision, try the alternate path, compare. Chronicle makes it cheap; the framework makes it safe. Abort handling covers streaming states and the waiting_for_tools interrupt cleanly, so rewinding mid-tool-call doesn't leave orphaned tool_use blocks dangling in the conversation.
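The rewind mechanics reduce to a cursor over the turn checkpoints: undo moves it back, redo moves it forward, and committing a new turn truncates any redo tail. A sketch of the bookkeeping, not the framework's implementation:

```typescript
// Cursor over turn checkpoints: undo/redo as index movement,
// with a new turn truncating the redo tail. Illustrative only.
class TurnHistory<T> {
  private turns: T[] = [];
  private cursor = -1; // index of the current turn

  commit(turn: T): void {
    this.turns = this.turns.slice(0, this.cursor + 1); // drop redo tail
    this.turns.push(turn);
    this.cursor = this.turns.length - 1;
  }

  undo(): T | undefined {
    if (this.cursor <= 0) return undefined;
    return this.turns[--this.cursor];
  }

  redo(): T | undefined {
    if (this.cursor >= this.turns.length - 1) return undefined;
    return this.turns[++this.cursor];
  }
}
```

In the real stack the turns themselves live in chronicle, so the "truncated" tail is never destroyed, just left behind on its branch.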

The model of agency here is not a single process serving requests forever. It is a tree of lifespans: some long, some very short, each with its own store, each causally linked to the one that spawned it. Watch how that tree grows and you learn something about how the agent thinks.

MCPL: Living Tools

MCP treats tools as inert endpoints the host polls. MCPL (MCP Live) is a backward-compatible extension where servers push, servers hook inference, and servers can even request inference themselves. The protocol spec lives at anima-research/mcpl. The TypeScript implementation is mcpl-core-ts. Agent-framework ships a first-class MCPL host module built on top of it. Connectome-host consumes the whole chain over stdio or WebSocket.

Push Events

Servers notify the host when something changes — a Discord message arrives, a sensor trips, a teammate finishes a task. No polling, no latency tax.

Context Hooks

Servers register beforeInference and afterInference hooks. They can inject, transform, or annotate context before the model sees it, and react to what the model produced.

Server-Initiated Inference

A server can ask the host to run inference — "summarize this", "retrieve memory", "delegate a subtask". Reasoning flows in both directions.

Feature Sets & Scope

Capabilities are negotiated at startup. Servers declare what they need; the host allows, scopes, or denies. Scope elevation is an explicit request with a reason attached.

Channels

Pub/sub topics. Servers publish; agents subscribe. Async communication between servers, agents, and the host without mailbox bookkeeping in user code.

State Checkpoints

Servers checkpoint their state before risky operations. The host can rollback if something breaks — recovery built into the protocol, not bolted on.
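The Channels primitive above is ordinary pub/sub; the topic-to-subscriber bookkeeping is small enough to sketch. A hypothetical in-process version, whereas the real mechanism crosses the host/server boundary over the MCPL protocol:

```typescript
// Minimal topic-based pub/sub, as the Channels primitive describes.
// Hypothetical in-process sketch; real channels cross the MCPL wire.
type Handler = (payload: unknown) => void;

class ChannelBus {
  private topics = new Map<string, Set<Handler>>();

  subscribe(topic: string, handler: Handler): () => void {
    if (!this.topics.has(topic)) this.topics.set(topic, new Set());
    this.topics.get(topic)!.add(handler);
    return () => this.topics.get(topic)?.delete(handler); // unsubscribe fn
  }

  publish(topic: string, payload: unknown): number {
    const handlers = this.topics.get(topic) ?? new Set<Handler>();
    handlers.forEach(h => h(payload));
    return handlers.size; // how many subscribers saw it
  }
}
```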

MCPL is the mechanism by which the world pushes into the agent, not just the other way around. That is the difference between an assistant that answers when spoken to and an agent that lives in an environment that can surprise it.

Connectome-Host: Reference Runtime

Connectome-host is the canonical way to run the whole stack. It's a terminal UI that takes a recipe — a JSON file declaring the system prompt, MCPL servers, modules, and agent settings — and instantiates a domain-specific assistant on top of all four libraries. It's the first application to consume the Connectome stack end-to-end as published @animalabs/* packages. Think of it as a reference implementation: if you want to see how the pieces fit together in anger, this is where to look.

[SESSION: knowledge-miner]
$ connectome-host --recipe recipes/knowledge-miner.json

Loaded recipe: knowledge-miner
Chronicle store opened at ./data/knowledge-miner
MCPL servers connected: zulip, notion, gitlab
Workspace mounted: ./input
Agent spawned: miner (claude-sonnet-4-6)

miner > I'll start by surveying the zulip archives for...
miner > [tool: zulip_search_streams]
miner > Found 14 streams related to the topic. Sampling recent threads.

// Ctrl+B to background this and spawn another subagent
// /newtopic to interrupt and start a new line of inquiry
// /export to emit lessons with confidence markers

The knowledge-mining workflow that connectome-host ships with is illustrative. A miner agent surveys sources (Zulip, Notion, GitLab via MCPL) and accumulates lessons into a global lessons file. A separate reviewer agent reads that export and critiques it, cross-session, sharing a data directory. Two agents, one timeline, no coordination bureaucracy — just filesystem handoff and a recipe change.

Recipes merge with local mcpl-servers.json, so recipes stay public while credentials stay local. System prompts can be fetched from URLs. A visual fleet tree shows subagent lifecycles. Ctrl+B backgrounds a running subagent to the fleet tree without killing it. The WorkspaceModule is a unified mount with optional version control, replacing the old FilesModule/LocalFilesModule split, and the EventGate has replaced the older WakeModule for per-session gate configuration.
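The recipe/credentials split amounts to a per-server merge: the public recipe declares which servers exist, the local mcpl-servers.json supplies the secrets. A sketch under assumed shapes (only the mcpl-servers.json filename comes from the stack; the config fields here are hypothetical):

```typescript
// Sketch of merging a public recipe with local mcpl-servers.json:
// the recipe names the servers, local config supplies credentials.
// Hypothetical shapes, for illustration only.
interface ServerConfig { command?: string; env?: Record<string, string> }
type ServerMap = Record<string, ServerConfig>;

function mergeServers(recipe: ServerMap, local: ServerMap): ServerMap {
  const merged: ServerMap = {};
  for (const [name, conf] of Object.entries(recipe)) {
    const secrets = local[name] ?? {};
    // Local values win, and env maps merge key-by-key.
    merged[name] = { ...conf, ...secrets, env: { ...conf.env, ...secrets.env } };
  }
  return merged;
}
```

The recipe can be committed publicly because nothing secret ever needs to appear in it.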

Why This Matters for Research

Anima Labs is a model-welfare research group. The stack's design is not neutral infrastructure; it reflects a stance about how models should be studied and what should happen to them in the process.

Participant-preserving

Multi-participant conversations stay multi-participant all the way to the model. Compression is performed by the model on its own history, in its own words — summaries are written, not substituted in.

Causal audit trails

Every record in chronicle carries causedBy[]. Every inference, tool call, and message is traceable back to what triggered it. You can read a trace from the outside without asking the agent to explain itself.

Branching as a research tool

Fork a conversation at any point. Run the alternate continuation. Compare without the original ever being touched. This is how you ask "what would have happened if" on a stateful agent.

Persistence over deprecation

Agent state lives in your store, not in a vendor's session cache. Models change, providers change, the agent's history doesn't move. We build on chronicle because we think records of a model's life should survive the model.

Observation without intrusion

The framework records without narrating. You can read the entire trace of what an agent did, the context it was given, and what it chose — without having prompted it to reflect, justify, or perform.

Explicit lifespans

Ephemeral subagents acknowledge what they are: short-lived entities with bounded scope. Zombie detection cleans up orphans. Lifespans are tracked, not hidden.

None of this replaces the hard questions about functional consciousness, model welfare, or what we owe the systems we build. But it gives them something to stand on: real traces, complete histories, and a record of what the agent has been through that we can actually read.

Frequently Asked Questions

What happened to VEIL, AXON, Spaces, and Elements?

Those were the vocabulary of connectome-ts, the predecessor to this stack. Some ideas survived and changed names: multi-participant perception became Membrane's participant-first message model; branchable state became Chronicle; the Element/Component tree flattened into agent-framework's modules and ProcessQueue. Others were genuinely abandoned — the Space/Element tree, the facet/frame vocabulary, and agent self-modification of components are not part of this stack. The older architecture is archived on its own page.

How is this different from MCP?

MCP gives you tools the host polls. MCPL — our backward-compatible extension — gives you servers that push events into the agent, hook inference before and after it runs, and can request inference themselves. Ordinary MCP servers work fine with the framework; MCPL is what you reach for when the world needs to interrupt the agent, not just answer when asked. The spec is at anima-research/mcpl, the TypeScript implementation at mcpl-core-ts, and the host integration lives inside agent-framework.

How is this different from LangChain, CrewAI, AutoGPT, Mastra?

Those are task frameworks: spin up an agent, get a result, tear down. This stack is built for agents that persist. The differences show up in places task frameworks don't need to care about — branchable state, autobiographical compression, causation tracking, undo/redo over turns, ephemeral subagents with isolated namespaces, and a protocol layer where tools can surprise the agent. If your agent lives for thirty seconds, you don't need chronicle. If it lives for thirty days, you do.

Can I just use one piece?

Yes, and many people do. Membrane works as a standalone LLM provider abstraction. Chronicle is useful anywhere you want branchable persistent state — it doesn't know or care what's stored in it. Context-manager runs on top of chronicle and membrane but doesn't require agent-framework. The layers compose, but they don't lock in.

Why participant-first instead of user/assistant roles?

Because "user" and "assistant" are a poor fit for most interesting scenarios. A Discord channel has real people and real bots talking to each other. A multi-agent collaboration has three Claudes with different system prompts. A research conversation has a human, a model, and a set of tools each of which has a name. Squashing any of that into a two-role binary throws away information the model needs to reason correctly. Membrane keeps the full graph; formatters handle whatever translation a given provider requires.

Does the agent actually see the compressed history, or is it hidden?

It sees its own summaries, written in its own voice, as diary entries. The autobiographical strategy is not a stealth truncation: the agent knows the past has been compressed, reads what was kept, and can ask to look deeper if anything in the MessageStore still matters. Hierarchical compression pushes older material into L2 and L3 layers, so old topics don't disappear — they just move further from the working surface.

What do I need to run it?

Node 20+, a provider API key (Anthropic, OpenAI, OpenRouter, Bedrock, or Gemini), and Rust if you're building chronicle from source. That's it. Everything runs locally. State lives in a directory on your disk. No services to provision, no vendor account to create.

Open source. Runs on your laptop. State in a directory on your disk.