
Builtin Memory

The default memory system — dual-track conversation history with vector search. No external services required. Sufficient for most use cases.


What It Provides

Builtin memory gives agents two complementary views of conversation history, loaded automatically by the ChatModule during every pipeline run.

Short-term Memory

The 15 most recent messages from other Narratives (other topics). Gives the agent cross-topic awareness — it knows what you were just talking about even if the current Narrative is different. Serialized into the system prompt as a "Recent Other Topics" section.
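
A minimal sketch of what that serialization might look like, assuming a simple Message record; the function name, section header, and per-line format are illustrative, while the 15-message cap and the 40,000-character budget come from the budgets table below:

```python
from dataclasses import dataclass

SHORT_TERM_LIMIT = 15             # most recent cross-Narrative messages
SHORT_TERM_CHAR_BUDGET = 40_000   # total character budget for the section

@dataclass
class Message:
    narrative_id: str
    role: str          # "user" or "assistant"
    text: str
    timestamp: float

def build_recent_other_topics(messages: list[Message]) -> str:
    """Serialize recent cross-Narrative messages into a system-prompt section."""
    # Newest first, capped at the short-term message limit
    recent = sorted(messages, key=lambda m: m.timestamp, reverse=True)[:SHORT_TERM_LIMIT]
    lines, used = ["Recent Other Topics:"], 0
    for msg in recent:
        line = f"[{msg.narrative_id}] {msg.role}: {msg.text}"
        if used + len(line) > SHORT_TERM_CHAR_BUDGET:
            break  # stay inside the total character budget
        lines.append(line)
        used += len(line)
    return "\n".join(lines)
```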

Long-term Memory

The current Narrative's full conversation history, limited to the 30 most recent messages. Included as chronological message pairs in the LLM messages array. Per-message truncation at 4,000 characters prevents any single paste from dominating context.
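
For illustration, a sketch of building those chronological pairs, assuming history is a list of role/text dicts in chronological order; the 30-message limit and 4,000-character truncation are the documented budgets:

```python
LONG_TERM_LIMIT = 30    # most recent messages from the current Narrative
TRUNCATE_AT = 4_000     # per-message character cap

def build_long_term_messages(history: list[dict]) -> list[dict]:
    """Convert Narrative history into chronological user/assistant messages."""
    out = []
    for msg in history[-LONG_TERM_LIMIT:]:   # keep the 30 most recent, in order
        text = msg["text"]
        if len(text) > TRUNCATE_AT:
            # Truncate large pastes so no single message dominates context
            text = text[:TRUNCATE_AT] + " [truncated]"
        out.append({"role": msg["role"], "content": text})
    return out
```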

Vector Search

Narratives and Events carry embedding vectors (1536 dimensions). When a user switches topics, the system searches existing Narratives by vector similarity to find the right storyline to resume — or creates a new one if the topic is genuinely new.
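
A rough sketch of the resume-or-create decision using plain cosine similarity; the 0.80 threshold and the None-means-create convention are assumptions, only the 1536-dimension embeddings are documented:

```python
import math

SIMILARITY_THRESHOLD = 0.80   # assumed cutoff, not specified by the docs

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def resume_or_create(query_vec: list[float],
                     narratives: dict[str, list[float]]) -> str | None:
    """Return the best-matching Narrative id, or None to create a new one."""
    best_id, best_score = None, -1.0
    for nid, vec in narratives.items():   # vec: 1536-dim Narrative embedding
        score = cosine(query_vec, vec)
        if score > best_score:
            best_id, best_score = nid, score
    # Resume only if the match clears the threshold; otherwise start fresh
    return best_id if best_score >= SIMILARITY_THRESHOLD else None
```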

How It Works

The ChatModule's hook_data_gathering method loads memory in a specific sequence (a code sketch follows the list):

  1. Load long-term memory from the current Narrative's ChatModule instances
  2. Load short-term memory: recent messages from all other Narratives for this user
  3. Tag each message with memory_type (long_term or short_term) and source instance_id
  4. For non-chat sources (jobs, A2A): only load the assistant side to avoid clutter
  5. Merge, sort, and return the combined history into ContextData
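
Here is the sketch referenced above: the merge logic as a standalone function over already-loaded message lists. The field names (source, source_instance, timestamp) and dict shapes are assumptions:

```python
def gather_memory(long_term_raw: list[dict], short_term_raw: list[dict]) -> list[dict]:
    """Merge logic behind hook_data_gathering; field names are assumptions."""
    # Steps 1 and 3: current-Narrative history, tagged as long-term
    long_term = [{**m, "memory_type": "long_term",
                  "instance_id": m.get("source_instance")} for m in long_term_raw]

    # Steps 2-4: other-Narrative messages, tagged as short-term
    short_term = []
    for m in short_term_raw:
        # Non-chat sources (jobs, A2A) contribute only their assistant side
        if m.get("source") != "chat" and m.get("role") != "assistant":
            continue
        short_term.append({**m, "memory_type": "short_term",
                           "instance_id": m.get("source_instance")})

    # Step 5: merge and sort chronologically before returning into ContextData
    return sorted(long_term + short_term, key=lambda m: m["timestamp"])
```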

Memory in the LLM Context

The two memory tracks occupy different positions in the final LLM call, optimized for how language models process context:

System prompt: short-term memory (cross-topic) + Narrative summary + module instructions

System prompt is read once and cached. Cross-topic context here gives broad awareness without consuming message slots.

Messages array: long-term memory (current topic) as user/assistant pairs

Conversation turns in the messages array let the LLM follow the thread naturally, maintaining dialogue coherence.
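
A small sketch of how the two tracks might be assembled into one chat-completions-style call; all names here are illustrative:

```python
def assemble_llm_call(base_prompt: str, recent_other_topics: str,
                      long_term_messages: list[dict], user_turn: str) -> list[dict]:
    """Place each memory track where it belongs in the final call."""
    # Short-term (cross-topic) memory rides in the system prompt, which is
    # read once and can be cached by the provider.
    system_prompt = f"{base_prompt}\n\n{recent_other_topics}"

    # Long-term (current-topic) memory arrives as real user/assistant turns,
    # so the model follows the thread as ordinary dialogue.
    return ([{"role": "system", "content": system_prompt}]
            + long_term_messages
            + [{"role": "user", "content": user_turn}])
```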

Memory Budgets

Long-term messages: 30 most recent
Short-term messages: 15 most recent (cross-Narrative)
Per-message truncation: 4,000 characters
Short-term budget: 40,000 characters total

When To Use

Builtin memory is the right choice for most deployments:

  • Local development and testing — zero setup, works immediately
  • Single-user or small-team agents with moderate conversation history
  • Agents where topic-switching and cross-session continuity are needed but conversation depth is manageable
  • Any deployment where you want to avoid external infrastructure dependencies

If you need deeper cross-session recall across hundreds of conversations, consider adding EverMemOS alongside builtin memory.