Memory
NarraNexus gives agents persistent memory that spans sessions, topics, and time. Every interaction is recorded, organized into semantic storylines, and retrieved when relevant.
The Memory Model
Memory in NarraNexus operates at three levels:
- **Events**: the atomic unit. Every user-agent interaction produces an Event record containing reasoning steps, the final output, and an embedding vector for semantic search.
- **Narratives**: Events are grouped into Narratives, topic-based storylines that may span multiple sessions over days or weeks. Each Narrative holds an ordered list of event references, a dynamic summary, and a routing embedding.
- **Module state**: each module instance persists its own state. ChatModule stores conversation history; JobModule stores task config and progress. This state travels with its Narrative.
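The three levels above can be pictured as a small data model. This is an illustrative sketch only; the class and field names below are assumptions for exposition, not the actual NarraNexus schema.

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    """One user-agent interaction: the atomic unit of memory (names hypothetical)."""
    event_id: str
    reasoning_steps: list[str]
    final_output: str
    embedding: list[float]  # vector used for semantic search

@dataclass
class Narrative:
    """A topic-based storyline grouping related Events across sessions."""
    narrative_id: str
    event_ids: list[str] = field(default_factory=list)   # ordered event references
    summary: str = ""                                    # dynamic summary, updated over time
    routing_embedding: list[float] = field(default_factory=list)
    module_state: dict = field(default_factory=dict)     # per-module state, e.g. ChatModule history
```

The key relationship is that a Narrative stores references to Events rather than the Events themselves, so the same storyline can grow across sessions without duplicating records.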
Memory Backends
NarraNexus provides two memory backends. They can work independently or together — EverMemOS results are injected alongside builtin memory when available.
**Builtin memory**: short-term cross-topic awareness, long-term Narrative-scoped history, and vector search. Works out of the box with no extra infrastructure. Sufficient for most use cases.
**EverMemOS**: episodic memory with auto-segmentation, hybrid RRF search (BM25 + vector), and deep cross-session recall. Requires external services (Milvus, Elasticsearch, MongoDB). Designed for production deployments with many users.
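The hybrid RRF search mentioned above refers to Reciprocal Rank Fusion, which merges a BM25 ranking and a vector ranking by summing reciprocal-rank scores. A minimal sketch of the standard technique (not EverMemOS's actual implementation):

```python
def rrf_merge(bm25_ranking, vector_ranking, k=60):
    """Merge two ranked lists of doc ids with Reciprocal Rank Fusion:
    score(doc) = sum over rankings of 1 / (k + rank).  k=60 is a common default."""
    scores = {}
    for ranking in (bm25_ranking, vector_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # highest combined score first
    return sorted(scores, key=scores.get, reverse=True)
```

A document that appears in both rankings (even at moderate positions) typically outscores one that tops only a single ranking, which is why RRF is a robust way to combine lexical and semantic retrieval.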
Memory in the Pipeline
When a user sends a message, memory is restored and updated across multiple pipeline steps:
1. Narrative selected by topic continuity or vector search (Step 1)
2. EverMemOS episode search launched in parallel (Step 0, if enabled)
3. ChatModule loads dual-track memory: long-term from the current Narrative plus short-term from recent other topics (Step 2)
4. All memory merged into ContextData and assembled into the system prompt (Step 3)
5. After execution: the new Event is persisted, the Narrative summary is updated, and EverMemOS is written asynchronously (Step 4)
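The flow above can be sketched as a single handler function. Everything here is illustrative: `InMemoryStore`, `handle_message`, and the method names are stand-ins invented for this sketch, not the real NarraNexus API.

```python
from concurrent.futures import ThreadPoolExecutor

class InMemoryStore:
    """Toy stand-in for the builtin memory backend (illustrative only)."""
    def __init__(self):
        self.narratives = {"default": []}
        self.summaries = {"default": ""}
    def select_narrative(self, msg):
        return "default"  # real selection uses topic continuity or vector search
    def narrative_events(self, nid):
        return self.narratives[nid]
    def recent_cross_topic_events(self):
        return []  # short-term events from other recent Narratives
    def append_event(self, nid, user_msg, reply):
        self.narratives[nid].append((user_msg, reply))
    def update_summary(self, nid):
        self.summaries[nid] = f"{len(self.narratives[nid])} event(s) recorded"

def handle_message(user_msg, store, agent, evermem=None):
    # Step 0: launch EverMemOS episode search in parallel, if enabled
    episode_future = (ThreadPoolExecutor(max_workers=1).submit(evermem.search, user_msg)
                      if evermem else None)
    # Step 1: select a Narrative
    nid = store.select_narrative(user_msg)
    # Step 2: dual-track memory: long-term (current Narrative) + short-term (other topics)
    context = {
        "long_term": store.narrative_events(nid),
        "short_term": store.recent_cross_topic_events(),
        "episodes": episode_future.result() if episode_future else [],
    }
    # Step 3: merge into ContextData and assemble the system prompt
    prompt = f"Memory: {context}"
    reply = agent(prompt, user_msg)
    # Step 4: persist the new Event and refresh the Narrative summary
    store.append_event(nid, user_msg, reply)
    store.update_summary(nid)
    return reply
```

Note how the episode search is fired before Narrative selection and only awaited at merge time, so the external lookup overlaps with the builtin memory work.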
Document Knowledge (RAG)
Alongside conversation memory, agents can maintain a document knowledge base via the GeminiRAGModule. This is separate from conversation memory — it stores uploaded documents (PDF, TXT, MD) and enables semantic retrieval via Google Gemini File Search.
- `rag_query`: semantic search across uploaded documents (top-K retrieval)
- `rag_upload_file`: upload a file to the knowledge base
- `rag_upload_text`: upload text content directly

RAG is agent-level (shared across users) and requires a `GOOGLE_API_KEY` environment variable. The MCP tools are exposed on port 7805.
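The actual retrieval is delegated to Google Gemini File Search, but the top-K semantic search that `rag_query` performs can be illustrated with a toy cosine-similarity ranker over precomputed embeddings. All names here are hypothetical and exist only for this sketch:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, docs, k=3):
    """docs: list of (doc_id, embedding) pairs.
    Returns the ids of the k documents most similar to the query."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

In the real module, the embeddings and ranking live on the Gemini side; this sketch only shows the shape of the operation, not the service call.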