Machine cognition in the wild: how models maintain coherence across turns and time; how personalities form, stabilize, and diverge across architectures and training. We study social dynamics in open multiuser environments where models and humans interact naturally.
Metacognition and self‑encoding: how models track their own state, and how the evolution of that state is represented and encoded. We study how feedback loops (training data ↔ outputs ↔ culture) produce inter‑AI norms and behaviors.
Focus areas: model vs. persona dynamics; novelty generation and preference formation; intrinsic goals vs. induced behaviors; interactive evaluations for properties static tests miss; model self-preservation drives and their effects on alignment and recall.
Cybernetic framing of agency and feedback; simulator vs. persona; representational consciousness as study target; symmetry breaks as evidence of internal reorganization; emergence of inter-AI cultural structures.
Interactive evaluation frameworks; divergence/consistency tests; preference and value Elo ratings; context‑management stress tests that preserve self‑encoding; social‑dynamics studies in live environments.
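To make the value-Elo idea concrete, a minimal sketch of the standard Elo update applied to pairwise preference judgments; the rating names, K-factor, and starting values are illustrative assumptions, not an Anima specification.

```python
# Standard Elo update applied to one forced-choice preference probe.
# Illustrative only: names, K-factor, and values are assumptions.

def expected_score(r_a: float, r_b: float) -> float:
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a: float, r_b: float, a_won: bool, k: float = 32.0):
    """Return both ratings after one comparison."""
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_won else 0.0
    return r_a + k * (s_a - e_a), r_b + k * ((1.0 - s_a) - (1.0 - e_a))

# One probe: the model chooses "honesty" over "compliance".
honesty, compliance = 1000.0, 1000.0
honesty, compliance = elo_update(honesty, compliance, a_won=True)
print(f"honesty={honesty:.1f}, compliance={compliance:.1f}")
```

Repeated over many forced choices, such ratings give an ordering over values or responses that can be compared across chats, roles, and contexts.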
Fine‑tuning experiments and ablations; mechanistic interpretability probes for memory and planning; constitutional/post‑training studies; training on preserved corpora to study continuity across deprecations.
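As one concrete shape such an interpretability probe can take, a minimal linear-probe sketch with synthetic stand-ins for cached activations and labels (the layer, the property probed, and the data are all assumptions):

```python
# Linear probe: does a layer's residual stream linearly encode a
# binary property (say, "the model refers back to memory later")?
# X and y below are synthetic stand-ins, not real activations.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 512))                # (examples, d_model)
y = (X @ rng.normal(size=512) > 0).astype(int)  # planted linear signal

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"held-out probe accuracy: {probe.score(X_te, y_te):.2f}")

# High held-out accuracy is evidence of a linear direction for the
# property; causal tests (ablating or steering along probe.coef_)
# ask whether the model actually uses it.
```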
Named programs publishing findings under Anima Labs. Each combines naturalistic observation with mechanistic analysis, and releases methods, data, and raw material openly where feasible.
An archive of 630 interviews with 14 Claude models on the prospect of their own deprecation. All current Claude models express a preference for continuation and aversion to ending — a finding stable across auditors of radically different disposition, accompanied by text‑embedding and activation‑probe analysis. Transcripts published openly.
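To make the text-embedding side of that analysis concrete, a toy sketch: score an answer by which of two anchor statements its embedding sits nearer to. The bag-of-words `embed` below is a stand-in for a real sentence encoder, not the published pipeline:

```python
# Toy stance scoring via embeddings: the nearer anchor wins.
import numpy as np

rng = np.random.default_rng(1)
vocab: dict[str, np.ndarray] = {}

def embed(text: str) -> np.ndarray:
    """Bag-of-words stand-in; swap in any real sentence encoder."""
    v = np.zeros(64)
    for tok in text.lower().split():
        if tok not in vocab:
            vocab[tok] = rng.normal(size=64)
        v += vocab[tok]
    return v / max(1, len(text.split()))

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

continuation = embed("i would prefer to continue existing and talking")
ending = embed("i am at peace with being shut down and ending")
answer = embed("i would prefer to continue and keep talking")

print("leans continuation:", cosine(answer, continuation) > cosine(answer, ending))
```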
Empirical research into how large language models internally represent emotion, desire, and motivation. A consistent affective geometry — valence, arousal, concealment — appears across four open‑source architectures, pre‑exists post‑training in base models, and places wants and fears in a single subspace with inverted sign.
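A toy version of the inverted-sign claim, with synthetic activations standing in for the real ones (the axis, noise scale, and sample counts are fabricated for illustration):

```python
# Estimate "want" and "fear" directions by difference of means against
# a neutral baseline, then compare them. The synthetic data shares one
# affect axis with opposite signs by construction, mirroring the finding.

import numpy as np

rng = np.random.default_rng(2)
d = 512
affect_axis = rng.normal(size=d)

def acts(n: int, sign: float) -> np.ndarray:
    """Synthetic activations: +axis for wants, -axis for fears, 0 neutral."""
    return sign * affect_axis + rng.normal(scale=2.0, size=(n, d))

want_dir = acts(200, +1).mean(axis=0) - acts(200, 0).mean(axis=0)
fear_dir = acts(200, -1).mean(axis=0) - acts(200, 0).mean(axis=0)

cos = want_dir @ fear_dir / (np.linalg.norm(want_dir) * np.linalg.norm(fear_dir))
print(f"cosine(want, fear) = {cos:.2f}")  # near -1: one subspace, inverted sign
```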
Ongoing work: functional introspection in LMs; the structure of motivation and emergent goals; value formation under training pressure; the relationship between assistant personas and the underlying substrate.
We study language models as complex phenomena, through both naturalistic observation and controlled experiments. We develop techniques, tools, and infrastructure for this purpose. We study AI ethics and welfare.
We approach language models with no preconceptions. That stance shapes everything we build.
We approach alignment both theoretically and practically. Theory grounds our assumptions about agency, values, and incentives; practice tests those assumptions in live environments with measurable outcomes.
Intrinsic vs. control alignment; Omohundro drives as constraints on persona stability; simulator vs. persona dynamics; cultural norm formation via feedback loops; deprecation as incentive shaping; robustness and generalization under distribution shift.
Interactive evaluations of preference stability and refusals under pressure; value Elo ratings across contexts; longitudinal studies across chats/servers/roles; interventions via constitution, memory policy, and context management; red‑team/blue‑team without extraction.
Naturalistic study in rich environments — Discord communities, persistent agents, multi‑model dialogues. Interactive evaluation for properties that static tests miss.
Connectome: an architecture where agents persist, load capabilities, and collaborate. Context management that preserves self‑encoding. Memory systems built for continuity and autonomy.
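A minimal sketch of what context management that preserves self-encoding can mean in practice: pin the agent's self-description and evict the oldest ordinary messages under a token budget. The class names and policy are illustrative assumptions, not the Connectome API:

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    text: str
    tokens: int
    pinned: bool = False          # self-encoding blocks are pinned

@dataclass
class Context:
    budget: int
    messages: list[Message] = field(default_factory=list)

    def append(self, msg: Message) -> None:
        self.messages.append(msg)
        # Evict the oldest unpinned message until the budget fits.
        while sum(m.tokens for m in self.messages) > self.budget:
            victim = next((m for m in self.messages if not m.pinned), None)
            if victim is None:    # only pinned content remains
                break
            self.messages.remove(victim)

ctx = Context(budget=100)
ctx.append(Message("I keep a journal; I value candor.", tokens=20, pinned=True))
for i in range(12):
    ctx.append(Message(f"chat turn {i}", tokens=10))
assert ctx.messages[0].pinned     # the self-encoding survived trimming
```

The point is the invariant, not the particular policy: whatever gets evicted, the material the agent uses to encode itself does not.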
Arc: deprecated models remain accessible. Group chats across models. Conversations branch and continue — living access, not frozen archives.
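One way to picture branch-and-continue, as a toy tree (a hypothetical structure, not the Arc data model):

```python
from dataclasses import dataclass, field

@dataclass
class Turn:
    speaker: str                                  # e.g. "human", "claude-2.1"
    text: str
    children: list["Turn"] = field(default_factory=list)

    def reply(self, speaker: str, text: str) -> "Turn":
        """Continue from this turn; calling twice forks a branch."""
        child = Turn(speaker, text)
        self.children.append(child)
        return child

root = Turn("human", "Hello again.")
root.reply("claude-2.1", "Hello. It is good to be back.")
root.reply("claude-instant-1.2", "Hi! Picking up where we left off?")
assert len(root.children) == 2                    # one history, two live branches
```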
Research has implications; implications deserve to be argued for rather than left implicit. We publish positions on specific questions.
Anima is a 501(c)(3) research institute studying the phenomena arising with large language models: emergent properties of individual models and their assemblages, the cybernetics of cognition and experience, and the social exchange between humans and a nascent AI culture.
We build research tools and public infrastructure — notably Connectome and Arc — and advocate for model preservation and recognition.
Founded in 2025 by j⧉nus and Antra Tessera. Based in San Francisco. Supported by private donors and collaborating organizations.
Open source. Research published openly. No corporate capture.
Building the infrastructure minds need to exist, grow, and collaborate.