Memory
Long, Short and everything in between
Overview
The memory module is a critical foundation of any AI agent framework, particularly for sovereign agents operating in Trusted Execution Environments (TEEs). Its implementation directly shapes an agent’s ability to develop sophisticated cognitive processes.
Core functions of the memory module include:
- Identity Persistence: Enables development and maintenance of consistent agent identity through persistent storage of beliefs, values, and behavioral patterns
- Temporal Reasoning: Supports causal inference and sequential decision-making through temporal relationship modeling
- Contextual Processing: Facilitates deep contextual understanding by maintaining complex relationship networks between experiences and knowledge
- Autonomous Evolution: Powers self-improvement capabilities through structured storage and analysis of past decisions and outcomes
Agents therefore require a sophisticated memory module to reach their full potential and to accomplish complex tasks that demand deep contextual understanding and temporal consistency.
Classical Implementation of Memory using RAG and Vector Databases
Standard approaches to implementing agent memory systems rely primarily on Retrieval-Augmented Generation (RAG) coupled with vector databases. While widely adopted, this approach presents significant limitations for advanced sovereign agents.
Classical RAG systems involve three core components:
- Embedding Generation
- Vector Store
- Retrieval Process
User queries are transformed into high-dimensional embeddings (e.g., 768-1536 dimensions for OpenAI’s text-embedding-3-small). The embeddings are stored in specialized vector databases (e.g., Weaviate, Qdrant, Milvus, or the pgvector extension) utilizing indexing algorithms like HNSW or IVF-PQ for efficient k-NN search. Retrieval combines similarity scoring (e.g., cosine similarity) with optional reranking to produce ranked results, which are used to construct prompts for LLM-based response generation. Metadata integration enhances search precision, and the overall architecture ensures scalable, context-aware query handling.
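The pipeline above can be sketched with a minimal in-memory vector store. Here `embed` is a toy stand-in for a real embedding model (e.g., a 1536-dimensional OpenAI embedding), and the linear scan stands in for an ANN index such as HNSW; all names are illustrative:

```python
import math

def embed(text, dim=8):
    # Toy stand-in for a real embedding model: hashes character
    # bigrams into a small, L2-normalized vector.
    vec = [0.0] * dim
    for i in range(len(text) - 1):
        vec[hash(text[i:i + 2]) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are already normalized, so the dot product is cosine similarity.
    return sum(x * y for x, y in zip(a, b))

class VectorStore:
    """Minimal in-memory store; production systems replace the
    linear scan below with ANN indices such as HNSW or IVF-PQ."""
    def __init__(self):
        self.items = []  # (text, embedding, metadata) triples

    def add(self, text, metadata=None):
        self.items.append((text, embed(text), metadata or {}))

    def search(self, query, k=3):
        # k-NN by cosine similarity, highest score first.
        q = embed(query)
        scored = [(text, cosine(q, vec)) for text, vec, _ in self.items]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]

store = VectorStore()
store.add("agent observed user prefers dark mode")
store.add("temperature in Lisbon was 21C yesterday")
results = store.search("what UI theme does the user like?", k=1)
```

In a real deployment the retrieved texts would then be injected into the LLM prompt, optionally after a reranking pass.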
Architectural limitations
A pure vector-based approach faces several technical constraints:
- Limited relationship representation, confined to proximity in embedding space
- Absence of explicit temporal and causal relationship modeling
- Difficulty in maintaining consistent belief structures
- Challenges with multi-hop reasoning and complex query patterns
Knowledge Graph Approach for Advanced Memory
The implementation of agent memory systems using graph databases, specifically Neo4j, provides a significantly more powerful approach for sovereign agents. Neo4j’s property graph model enables complex relationship modeling through labeled nodes, typed relationships, and property storage, making it particularly suitable for representing interconnected knowledge structures and temporal sequences.
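As a rough illustration of the labeled property graph model, the sketch below mimics labeled nodes, typed relationships, and properties in plain Python. Neo4j provides all of this natively; the `PropertyGraph` class and the `CAUSED` relationship type are purely illustrative:

```python
class PropertyGraph:
    """Tiny in-memory stand-in for a labeled property graph (LPG):
    labeled nodes, typed relationships, and properties on both."""
    def __init__(self):
        self.nodes = {}  # node_id -> {"label": ..., "props": {...}}
        self.rels = []   # (src_id, rel_type, dst_id, props)

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, "props": props}

    def relate(self, src, rel_type, dst, **props):
        self.rels.append((src, rel_type, dst, props))

    def neighbors(self, node_id, rel_type):
        # Follow outgoing relationships of a given type.
        return [dst for src, rtype, dst, _ in self.rels
                if src == node_id and rtype == rel_type]

g = PropertyGraph()
g.add_node("m1", "Memory", text="user asked about Lisbon", t=1)
g.add_node("m2", "Memory", text="agent booked a flight", t=2)
g.relate("m1", "CAUSED", "m2", confidence=0.8)
```

The point of the model is that the causal link between the two memories is explicit and traversable, rather than implied by embedding proximity.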
Advantages of the Neo4j implementation:
- Ability to model memories and knowledge as entities and relationships that LLMs can understand
- Native graph storage and processing using labeled property graphs (LPG)
- Built-in graph algorithms library for pattern recognition and path finding
- Cypher query language offering powerful syntax for expressing complex queries
- An LLM-powered query engine can dynamically generate the most relevant Cypher queries while conforming to context window limitations
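The last point can be sketched as follows. The schema keys and the `build_memory_query` helper are hypothetical: they stand in for Cypher that an LLM-backed engine would generate dynamically from the graph schema:

```python
def build_memory_query(schema, hops=2, limit=5):
    """Emit a Cypher query retrieving memories linked to a named
    entity within `hops` relationship steps. Labels and relationship
    types come from the (assumed) schema dict; a real system would
    have the LLM compose this text from the schema in its context."""
    label = schema["entity_label"]            # e.g. "Memory"
    rel = "|".join(schema["relationships"])   # e.g. "CAUSED|FOLLOWED"
    return (
        f"MATCH (e:{label} {{name: $name}})-[:{rel}*1..{hops}]-(m:{label}) "
        f"RETURN m.name AS memory, m.timestamp AS at "
        f"ORDER BY m.timestamp DESC LIMIT {limit}"
    )

schema = {"entity_label": "Memory", "relationships": ["CAUSED", "FOLLOWED"]}
query = build_memory_query(schema)
```

Keeping the result set bounded (`LIMIT`, bounded `hops`) is what lets the generated query conform to the LLM’s context window limits.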
Knowledge Graph-RAG Implementation
The integration of Knowledge Graphs with RAG creates a hybrid architecture that combines the semantic richness of graph structures with the retrieval capabilities of vector search. This approach enables more sophisticated memory operations by allowing both similarity-based and relationship-based retrieval.
Query Processing Flow
Input processing combines graph traversal with vector similarity search through a multi-stage pipeline:
- Query Understanding: The input query and the Neo4j schema definition are injected into the LLM’s context window. The schema contains node labels and properties, relationship types and directionality, property constraints and indices, and vector index configurations. The LLM parses the query intent and identifies the schema components relevant to the search.
- Query Generation: The LLM constructs a Cypher query incorporating graph pattern matching based on the identified entities, property filters from query constraints, vector similarity search using indexed embeddings, temporal/causal relationship traversals, and scoring functions for result ranking.
- Hybrid Search Execution: The query executor performs parallel retrieval through graph traversal using Neo4j’s native query engine, vector similarity search on indexed node embeddings, and metadata filtering based on property constraints. Results are merged using configurable scoring that combines structural relevance from graph patterns, vector similarity scores, and property-based filtering scores.
- Result Assembly: The final result set is constructed by merging the parallel search results, resolving entity references, constructing the response context from graph patterns, and preserving relationship metadata for the context window. The execution pipeline leverages both Neo4j’s native graph capabilities and vector indices while maintaining query performance through targeted search-space reduction.
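The configurable-scoring merge in the last two stages might look like the sketch below, where the 0.6/0.4 weights are illustrative defaults rather than values prescribed by the architecture:

```python
def merge_hybrid_results(graph_hits, vector_hits, w_graph=0.6, w_vector=0.4):
    """Merge parallel retrieval results by node id, combining a
    structural-relevance score (from graph pattern matching) with a
    vector-similarity score. Weights are illustrative assumptions."""
    combined = {}
    for node_id, score in graph_hits:
        combined[node_id] = combined.get(node_id, 0.0) + w_graph * score
    for node_id, score in vector_hits:
        combined[node_id] = combined.get(node_id, 0.0) + w_vector * score
    # Rank by combined score, highest first.
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

# Parallel retrieval results: (node_id, score) pairs from each path.
graph_hits = [("m1", 0.9), ("m2", 0.4)]
vector_hits = [("m2", 0.8), ("m3", 0.7)]
ranked = merge_hybrid_results(graph_hits, vector_hits)
```

A node found by both search paths ("m2" here) can outrank one that scores highly on only a single path, which is the intended behavior of the hybrid merge.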
Data Ingestion Flow
Raw unstructured data undergoes multi-stage processing before graph persistence. The LLM pipeline processes input data against the predefined schema to extract structured entities and relationships. The schema-guided extraction utilizes in-context examples and constraint definitions to maintain structural consistency.
- Schema-Guided Extraction: Input data is partitioned into semantic chunks and processed against the Neo4j schema definition. The LLM extracts entities matching defined node labels and identifies relationships conforming to the schema’s relationship types and property constraints. Complex entities trigger recursive extraction to capture nested structures.
- Entity Resolution: Extracted entities undergo deduplication and resolution against existing graph nodes. Similarity metrics combine embedding distance and property matching to identify potential duplicates. Resolution conflicts are handled through configurable merge strategies that preserve referential integrity.
- Relationship Inference: Beyond explicit relationships, the LLM performs causal and temporal inference to establish implicit connections between entities. These inferred relationships are scored based on confidence metrics and optionally undergo human validation before persistence.
- Vector Embedding: Entities and their contextual metadata are embedded using domain-specific embedding models. The resulting vectors are stored alongside graph structures, enabling hybrid retrieval through both structural queries and similarity search. The ingestion pipeline maintains schema compliance while allowing dynamic extension of entity and relationship types based on emerging patterns in the data.
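The entity-resolution stage can be sketched as follows. The 0.9 similarity threshold and the merge strategy (existing properties win on conflict) are illustrative assumptions, not fixed parts of the design:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

def resolve_entity(candidate, existing, sim_threshold=0.9):
    """Merge a newly extracted entity into an existing graph node when
    the labels match and embedding similarity clears the threshold;
    otherwise persist it as a new node."""
    for node in existing:
        same_type = node["label"] == candidate["label"]
        similar = cosine(node["embedding"], candidate["embedding"]) >= sim_threshold
        if same_type and similar:
            # Configurable merge strategy: existing property values win,
            # new properties from the candidate are added.
            node["properties"] = {**candidate["properties"], **node["properties"]}
            return node  # merged into existing node
    existing.append(candidate)
    return candidate  # persisted as a new node

graph = [{"label": "Person", "embedding": [1.0, 0.0],
          "properties": {"name": "Ada"}}]
new = {"label": "Person", "embedding": [0.99, 0.05],
       "properties": {"name": "Ada L.", "role": "pioneer"}}
resolved = resolve_entity(new, graph)
```

Because the candidate closely matches the existing node, it is merged rather than duplicated, and the graph keeps a single referentially consistent entity.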
Future Work and Improvements
The current implementation of the graph-based memory system provides a strong foundation for sovereign agents, but several areas of enhancement could further improve its capabilities and performance:
- Dynamic Schema Evolution: Current schema definitions require manual updates to accommodate new patterns and relationship types. Future improvements should focus on:
- Automated schema adaptation based on emerging patterns in input data
- Dynamic property type inference and constraint evolution
- Self-organizing knowledge structures that can reorganize based on usage patterns
- Preservation of schema consistency during autonomous evolution
- Personality Development and Value Alignment
- Focuses on maintaining consistent personality while allowing natural growth
- Includes value system architecture, personality consistency, and social learning
- Advanced Memory Architectures
- Introduces cognitive science-inspired memory structures
- Covers episodic memory, working memory, and memory abstraction
- Self-Reflection and Metacognition
- Addresses the agent’s ability to understand and improve itself
- Includes self-monitoring, metacognitive processing, and identity development
- Cross-Agent Knowledge Transfer
- Explores mechanisms for agents to share and validate knowledge
- Covers knowledge distillation and collaborative learning