Most AI agents forget everything the moment a conversation ends. Coherence, the company behind an XRM (extended relationship management) platform, just published a detailed breakdown of the five-layer memory system they built to fix that problem for their autonomous agent, Nash.
The core issue is familiar to anyone who's used AI tools for ongoing work: every new session starts from zero. Your assistant doesn't remember that you prefer bullet points, that your client hates Monday meetings, or that the last three attempts at a particular task failed for the same reason. Coherence argues this is the gap between a tool and a teammate.
How the Five Layers Work
The architecture stacks from simple to complex:
Layer 1 - Atomic Memories stores individual observations as 1536-dimensional vector embeddings (essentially converting facts into mathematical coordinates so similar ideas cluster together). Each memory gets a confidence score that changes over time.
Layer 2 - Automatic Extraction runs a lightweight LLM after each task to decide what's worth keeping. Batch processing here cuts costs by 60-70% compared to real-time analysis.
Layer 3 - Semantic Deduplication prevents the system from storing the same insight fifty times. When a new memory is 90%+ similar to an existing one, it reinforces the old memory instead of creating a duplicate.
Layer 4 - The Dreaming Job is the most interesting piece. A background process periodically reviews stored memories, promotes frequently-reinforced observations, and runs "specialist passes" that draw logical conclusions from accumulated data. Think of it as the agent sleeping on a problem.
Layer 5 - Peer Cards are stable biographical profiles containing up to 40 categorical facts about people the agent interacts with. These are always loaded into context, no search required.
Memory is scoped at three levels: user-private, account-shared, and anonymized platform-wide patterns. The platform-wide tier is especially interesting because it means insights from one organization can improve the agent's behavior for others, without exposing any specific data.
This is a thoughtful architecture, but it's built for Coherence's specific use case of managing client relationships. The real question is whether these patterns become standard infrastructure for AI agents broadly. Right now, ChatGPT's memory is a flat list of facts. Claude's memory is similar. Neither has anything close to a "dreaming" consolidation layer or confidence-weighted reinforcement. As AI agents take on longer-running, multi-session work, something like this five-layer approach will probably become table stakes.