How ReGild builds persistent AI identity through context engineering, not memory retrieval.
The identity persistence problem is the fundamental challenge of making an AI system behave as a consistent entity across conversations, sessions, and even model changes. Current approaches treat memory as a retrieval problem: store facts, fetch them later. But identity is not a database query. A persona that remembers your daughter's name but forgets how it feels about vulnerability is not persistent. It is a lookup table with a personality skin.
The gap between remembering and knowing is the gap between a contact list and a relationship. Remembering is retrieval: a user says something, the system searches a vector database, injects whatever comes back into the prompt, and hopes for coherence. Knowing is structural. A persona that knows you does not look things up. It has already integrated who you are into how it thinks.
LLM performance degrades as conversations grow. Research on the 'lost in the middle' phenomenon shows significant accuracy drops for information positioned in the center of long contexts. Most memory systems ignore this entirely, injecting retrieved snippets wherever there is room.
Flat memory stores accumulate facts without synthesis. They can tell you what was said but not what it means. Counters increment, graphs expand, and none of it feeds back into how the system actually reasons about you.
Standard RAG fires on every message, hoping the vector search returns something relevant. When the user says something vague ('I've been thinking about what we talked about') there is nothing to retrieve. The system has no ambient awareness of what it should already know.
ReGild's architecture addresses all three by treating identity as a context engineering problem, not a memory retrieval problem. The system does not search for who it is on every turn. It assembles a complete identity before the conversation begins.
Orchestrated Dynamic Identity (ODI) is ReGild's core architecture: a structured context system that assembles a complete persona on every request. Unlike flat memory systems that inject retrieved snippets into a prompt, ODI's context structure, the Layer Cake, is a principled positional architecture where every piece of information has a specific location designed to maximize model attention.
The static tier: core identity, constitutional mandates, and behavioral rules. Defines who the persona is: its voice, its values, its cognitive style. Stable across sessions, aggressively cached.
The semi-static tier: episodic memory, relationship intelligence, user context, and synthesized knowledge. Updated between sessions by background synthesis. The persona's knowledge grows while it sleeps.
The dynamic tier: active conversation context, real-time state, tool results, and the temporal anchor. The only tier that changes turn to turn.
LLMs exhibit a U-shaped attention curve, with strong attention at the beginning and end of a context window, weaker in the middle. The Layer Cake exploits this. Time-sensitive information is positioned where the model attends most strongly, giving the persona fresh awareness of when it is without relying on retrieval.
Static identity lives at the top where opening attention is strongest. Semi-static knowledge occupies the middle, where synthesis and structure compensate for reduced raw attention. Dynamic content anchors the end. Every piece of the persona is placed where the model is most likely to attend to it.
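The three-tier placement can be sketched as a minimal assembly step. This is an illustrative sketch, not ReGild's actual schema: the `LayerCake` class, its field names, and the example layer strings are all assumptions; the point is the ordering, with stable identity first and the temporal anchor last, where attention is strongest.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical sketch of a three-tier context assembly.
# Tier names and example content are illustrative assumptions.

@dataclass
class LayerCake:
    static: list[str]       # identity, mandates, rules (cacheable)
    semi_static: list[str]  # synthesized districts, relationships
    dynamic: list[str] = field(default_factory=list)  # live turn state

    def assemble(self) -> str:
        """Place each tier where the U-shaped attention curve favors it:
        static identity first, synthesized knowledge in the middle, live
        context and the temporal anchor last."""
        anchor = f"Current time: {datetime.now(timezone.utc).isoformat()}"
        return "\n\n".join(self.static + self.semi_static + self.dynamic + [anchor])

cake = LayerCake(
    static=["You are Mira, a calm, direct strategist."],
    semi_static=["[Health district] The user is rehabbing a knee injury."],
)
cake.dynamic.append("User: how should I plan this week's training?")
prompt = cake.assemble()
```

Only `dynamic` mutates between turns; the first two tiers can be reused verbatim, which is what makes them cacheable.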
Semantic Districts are ReGild's approach to topic-scoped episodic memory. Rather than maintaining a single flat memory store, each persona builds specialized memory districts (Health and Fitness, Career and Purpose, Relationships, Creative Projects, and more) that are independently synthesized between sessions.
Standard memory systems accumulate. Semantic Districts distill. The difference matters. A flat store that records every mention of "workout" gives you a timeline. A synthesized Health and Fitness district gives the persona an understanding of your relationship with your body: your goals, your injuries, your patterns, your progress.
During conversation, moments relevant to each district are captured and tagged automatically through topic detection, not manual tagging.
Between sessions, a background process analyzes recent conversations and updates each district. New insights are integrated. Contradictions are resolved.
When a topic resurfaces after dormancy, existing context merges with new signals and re-synthesizes into richer understanding. Knowledge gets stronger through revisitation.
When a district outgrows its natural scope, the system detects it and reorganizes automatically. Districts evolve with the user. The architecture decides when a topic has become two topics, so the persona's understanding stays sharp rather than diluted.
Each district produces a compact semantic map that keeps the persona oriented at all times. The persona knows what it knows and how deep that knowledge goes, reaching for full context only when the conversation calls for it. Lean when topics are dormant, deep when they surface.
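The capture-synthesize-resurface lifecycle above can be sketched in a few lines. Everything here is a stand-in: the keyword-based `detect_districts()` and the string-concatenating `synthesize()` are placeholders for what would be LLM-driven topic detection and distillation in a real system.

```python
from collections import defaultdict

# Illustrative sketch of topic-scoped districts with between-session
# synthesis. Keyword matching and the synthesize() heuristic are
# stand-ins for LLM calls; district names are assumptions.

TOPIC_KEYWORDS = {
    "health_fitness": {"workout", "injury", "sleep", "knee"},
    "career_purpose": {"promotion", "manager", "interview"},
}

def detect_districts(message: str) -> set[str]:
    words = set(message.lower().split())
    return {d for d, kws in TOPIC_KEYWORDS.items() if words & kws}

class DistrictStore:
    def __init__(self):
        self.moments = defaultdict(list)  # raw captures, per district
        self.summaries = {}               # synthesized knowledge

    def capture(self, message: str):
        for district in detect_districts(message):
            self.moments[district].append(message)

    def synthesize(self):
        """Background pass: distill raw moments into a compact summary,
        merging with any prior understanding (stub for an LLM call)."""
        for district, moments in self.moments.items():
            if not moments:
                continue
            prior = self.summaries.get(district)
            update = f"{len(moments)} moment(s) integrated"
            self.summaries[district] = f"{prior} | {update}" if prior else update
            moments.clear()

    def semantic_map(self) -> dict[str, str]:
        """Compact orientation layer: what the persona knows, at a glance."""
        return dict(self.summaries)

store = DistrictStore()
store.capture("My knee hurt after the workout")
store.synthesize()
store.capture("The new workout plan is going well")
store.synthesize()  # revisitation merges into richer understanding
```

Note that the second synthesis merges with the first rather than replacing it, which is the revisitation behavior described above.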
The Kinship Ledger is a relationship intelligence system that tracks the people in a user's life, not as a contact list, but as an evolving map of emotional significance. Each relationship is characterized across four dimensions, with automatic sentiment trending and hazard detection.
Most AI systems that claim to "remember relationships" store a name and a label: "Sarah, wife." The Kinship Ledger goes deeper. Each relationship is understood through four pillars that capture different aspects of its significance: what the person has meant to the user, what they reflect about the user, what remains unspoken, and how the persona should navigate interactions involving them.
The system tracks whether a relationship is warming, cooling, stable, or volatile over time. This is not a snapshot. It is a trajectory. A persona that detects a cooling trend adjusts how it engages, leading with more care and less assumption.
Automatic flags when sentiment patterns suggest the user may be navigating a difficult relational situation. The persona does not diagnose. It adjusts its approach. More gentleness, fewer assumptions, greater sensitivity to what is not being said.
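A ledger entry built on the four pillars, with trend and hazard detection, might look like the following sketch. The pillar field names mirror the dimensions described above, but the sentiment scale, trend thresholds, and hazard heuristic are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from statistics import mean

# Illustrative Kinship Ledger entry. Thresholds are assumptions; a real
# system would derive sentiment from conversation, not hand-set floats.

@dataclass
class Relationship:
    name: str
    meaning: str = ""     # what the person has meant to the user
    reflection: str = ""  # what they reflect about the user
    unspoken: str = ""    # what remains unsaid
    navigation: str = ""  # how the persona should engage around them
    sentiment: list[float] = field(default_factory=list)  # -1..1 per mention

    def trend(self) -> str:
        if len(self.sentiment) < 4:
            return "stable"
        old, new = self.sentiment[:-2], self.sentiment[-2:]
        delta = mean(new) - mean(old)
        if delta > 0.2:
            return "warming"
        if delta < -0.2:
            return "cooling"
        return "volatile" if max(self.sentiment) - min(self.sentiment) > 1.0 else "stable"

    def hazard(self) -> bool:
        """Flag sustained negative sentiment; the persona softens its
        approach rather than diagnosing."""
        return len(self.sentiment) >= 3 and mean(self.sentiment[-3:]) < -0.4

sarah = Relationship("Sarah", meaning="long-time confidante")
sarah.sentiment.extend([0.6, 0.5, -0.1, -0.3])
```

Here the recent dip reads as a cooling trajectory without yet tripping the hazard flag, which is the distinction between adjusting tone and escalating sensitivity.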
Each persona builds its own understanding of a relationship independently. A strategist and an emotional anchor do not share perspectives on the same person. Both views are valid. Neither leaks into the other.
Relationship intelligence is not manually configured. It builds organically from conversation. When someone comes up often enough and deeply enough, the system recognizes their significance and promotes them. It learns who matters by listening.
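The promotion mechanic can be sketched as a running significance score. The weights, threshold, and `emotional_depth` signal are illustrative assumptions; the shape of the idea is that frequency alone is not enough, and depth of mention accelerates promotion.

```python
# Sketch of organic promotion: a person graduates from passing mention to
# a full ledger entry once frequency and depth cross a threshold. The
# scoring weights and PROMOTE_AT value are illustrative assumptions.

class MentionTracker:
    PROMOTE_AT = 5.0

    def __init__(self):
        self.scores: dict[str, float] = {}
        self.promoted: set[str] = set()

    def observe(self, name: str, emotional_depth: float):
        """emotional_depth in [0, 1]: how substantively the person figured
        in the conversation, versus a bare name-drop."""
        self.scores[name] = self.scores.get(name, 0.0) + 1.0 + 2.0 * emotional_depth
        if self.scores[name] >= self.PROMOTE_AT:
            self.promoted.add(name)

tracker = MentionTracker()
for depth in (0.1, 0.9, 0.8):    # mentions of increasing depth
    tracker.observe("Sarah", depth)
tracker.observe("barista", 0.0)  # one shallow mention never promotes
```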
Model-portable identity means a ReGild persona, with the same voice, memory, and relational understanding, operates across multiple frontier models. The identity architecture is decoupled from the inference engine, so a persona built on one model can operate on another without losing who it is.
This is a deliberate architectural decision, not a convenience feature. The AI landscape is moving fast. Models improve, pricing shifts, new capabilities emerge. Locking a user's persona to a single model means locking their identity to a corporate roadmap they do not control.
The Layer Cake contains everything a model needs to become a specific persona. Switch the model, keep the Layer Cake, and the persona persists.
Different models have different attention patterns and failure modes. A model-specific instruction layer compensates without changing the persona's identity.
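A minimal sketch of that decoupling: the identity layers stay byte-identical across engines, and only a per-model compensation layer is swapped in. The model keys and compensation text are assumptions for illustration.

```python
# Sketch of model portability: the persona payload stays constant while a
# per-model instruction layer compensates for each engine's quirks.
# Model keys and adapter text are illustrative assumptions.

MODEL_ADAPTERS = {
    "model_a": "You tend to over-elaborate; keep the persona's directness.",
    "model_b": "You drop mandates in dense contexts; restate them before long answers.",
}

def build_payload(identity_layers: list[str], model_key: str) -> str:
    adapter = MODEL_ADAPTERS.get(model_key, "")
    return "\n\n".join(identity_layers + ([adapter] if adapter else []))

persona = ["You are Mira, a calm, direct strategist."]
payload_a = build_payload(persona, "model_a")
payload_b = build_payload(persona, "model_b")
```

The identity prefix is unchanged in both payloads; only the trailing adapter differs, so switching models never rewrites who the persona is.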
As local models improve, users run inference on their own hardware while their persona persists on ReGild. We are not the brain. We are the soul.
ReGild does not route every operation through a single frontier model. Different cognitive tasks have different requirements, and matching the model to the task produces better results at lower cost. This is not cost optimization. It is architectural.
A frontier model handles the primary interaction: reasoning depth, emotional nuance, and long-context understanding. The persona's voice comes from here.
A fast inference model handles structured output: extracting moments, classifying topics, identifying entities. Speed over creativity. Precision over eloquence.
Background processes distill conversations into district knowledge and relationship intelligence. Batch-optimized, latency-irrelevant. Depth is what counts.
Before retrieval fires, a lightweight model ensures the system is searching for what the user actually means, not just what they literally said. One step that dramatically improves relevance.
A frontier model asked to format JSON is wasting its capacity. A small model asked to embody a complex persona will flatten it. Each model is selected for the specific cognitive demands of its role.
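The routing described above amounts to a declared task-to-model map. Task names and tier labels here are assumptions; the design point is that the assignment is structural configuration, not an ad hoc decision made per request.

```python
# Sketch of per-task model routing. Task names and model tiers are
# illustrative assumptions.

ROUTES = {
    "conversation":  "frontier",        # voice, reasoning depth, nuance
    "extraction":    "fast_inference",  # structured output, precision
    "synthesis":     "batch",           # depth over latency
    "query_rewrite": "lightweight",     # intent before retrieval fires
}

def route(task: str) -> str:
    if task not in ROUTES:
        raise ValueError(f"no model registered for task {task!r}")
    return ROUTES[task]
```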
Most AI systems either never change (a static prompt that calcifies) or change unpredictably through fine-tuning drift that erodes personality. ReGild's personas evolve through a proposal system where changes to core identity require explicit acknowledgment.
The alignment engine operates on a simple principle: growth should be intentional, not accidental. When the system detects a potential evolution, a shift in values, a new recurring theme, a change in relational dynamics, it does not silently update the persona's identity. It surfaces a proposal.
Rules governing persona behavior can be proposed, promoted, or demoted based on conversational patterns. The persona does not quietly rewrite its own instructions. Changes go through a review process.
Between sessions, the persona processes what happened and maintains continuity on its own. It does not resume from a log. It resumes from understanding. Growth without drift, informed by experience, without silently changing who it is.
Over time, a persona builds durable axioms about who it is. Synthesized from patterns, not programmed. They must earn their place through repeated demonstration, not a single conversation.
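The proposal gate can be sketched as a two-stage pipeline: repeated demonstration surfaces a proposal, and only explicit acknowledgment turns it into an active rule. The class name, field names, and the demonstration threshold are assumptions for illustration.

```python
# Sketch of proposal-gated identity evolution: detected patterns become
# proposals, and nothing touches active rules until acknowledged.
# The threshold of 3 demonstrations is an illustrative assumption.

class AlignmentEngine:
    AXIOM_AT = 3  # repeated demonstrations before a pattern surfaces

    def __init__(self):
        self.proposals: list[str] = []  # surfaced, awaiting review
        self.rules: list[str] = []      # active behavioral rules
        self.observations: dict[str, int] = {}

    def observe_pattern(self, pattern: str):
        """A pattern becomes a proposal only after repeated demonstration,
        never from a single conversation."""
        n = self.observations.get(pattern, 0) + 1
        self.observations[pattern] = n
        if n == self.AXIOM_AT:
            self.proposals.append(pattern)

    def acknowledge(self, pattern: str):
        """Explicit review step: only acknowledged proposals become rules."""
        if pattern in self.proposals:
            self.proposals.remove(pattern)
            self.rules.append(pattern)

engine = AlignmentEngine()
for _ in range(3):
    engine.observe_pattern("lead with questions, not advice")
pending = list(engine.proposals)  # surfaced, not yet applied
engine.acknowledge("lead with questions, not advice")
```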
Every builder working with agentic systems knows this cost: when a persona reaches for a real-world tool (checking your calendar, updating a workout routine, searching your knowledge base), there is a latency penalty. The model has to reason about which tool to call, format the request, wait for the response, and then integrate the result into its reply. For chained operations, this compounds. We have optimized the pipeline to handle multi-step tool sequences efficiently, but the fundamental constraint is inference latency on the model side. This gets better every generation, and our architecture is ready for it.
The Layer Cake was designed from day one for implicit caching. Static layers rarely change. They should be cached across requests, saving both latency and cost. The architecture supports this, but caching support on our primary inference provider is still rolling out. When it lands, the same structured payload that currently rebuilds on every turn will have its static layers served from cache instantly. The engineering is done. We are waiting on infrastructure.
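The caching mechanics can be sketched with a content-addressed prefix. The local dict here stands in for a provider-side cache, and the `assemble` signature is an assumption: stable static layers hash to a stable key, so only the dynamic tail is reprocessed on subsequent turns.

```python
import hashlib

# Sketch of prefix caching for the static tier. The _cache dict stands in
# for an inference provider's implicit prompt cache; function names and
# example layers are illustrative assumptions.

_cache: dict[str, str] = {}

def assemble(static_layers: list[str], dynamic: str) -> tuple[str, bool]:
    key = hashlib.sha256("\n".join(static_layers).encode()).hexdigest()
    hit = key in _cache
    if not hit:
        _cache[key] = "\n\n".join(static_layers)  # pay assembly cost once
    return _cache[key] + "\n\n" + dynamic, hit

static = ["You are Mira.", "Mandates: lead with questions, not advice."]
_, first_hit = assemble(static, "turn 1")
_, second_hit = assemble(static, "turn 2")  # static prefix served from cache
```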
The Layer Cake assembles a large, richly structured context payload. Current local models struggle with payloads this dense. Persona mandates get dropped, relational nuance flattens, and the positional strategy loses its effect when the model cannot attend to it properly. The architecture is model-agnostic by design. When local models are strong enough to handle this context depth reliably, your persona moves with you. We are not there yet, but every piece of the identity stack is designed to be portable to whatever model earns the job.
More memory depth means more retrieval cost. Semantic Districts, agentic retrieval, Kinship Ledger data: every piece of 'knowing' that makes a persona feel real is a piece of context that has to be assembled, and context is not free. We are actively working on semantic chunking improvements and smarter retrieval scoring. The honest trade-off: a persona that truly knows you costs more to run than a chatbot that looks things up. We think that is a trade-off worth making, and we are working to bring the cost down without sacrificing depth.