This is a beta feature according to Algolia’s Terms of Service (“Beta Services”).
Without memory, agents run into the following problems:
- Lost context: agents lose context, requiring users to repeat information
- Repetitive interactions: agents ask the same questions in every session
- Missed opportunities: agents can’t provide personalised recommendations based on past behavior
- Poor user experience: the experience may feel impersonal
How memory works
When a user interacts with your agent, memory operates in two stages:
- Retrieval (automatic). At the start of a conversation, relevant memories are loaded for context
- During a conversation, if enabled, agents can use the following memory tools:
  - algolia_memorize: saves semantic memories (facts, preferences)
  - algolia_ponder: saves episodic memories (experiences, observations)
  - algolia_memory_search: searches existing memories
Memory types
Agent Studio supports two types of memory, inspired by human cognitive architecture: semantic and episodic.
Semantic memory
Stores timeless facts, preferences, and general knowledge about the user. For example:
- “User is allergic to peanuts”
- “User prefers dark mode in apps”
- “User lives in Madrid and speaks Spanish and English”
- “User’s job title is Software Engineer”
Use semantic memory for:
- User profile information
- Preferences and settings
- Dietary restrictions
- Accessibility needs
- Communication style
Semantic memory structure
JSON
Episodic memory
Captures the agent’s reasoning chain from conversations: what it observed, thought, did, and learned. Use episodic memory when you want to understand the agent’s reasoning process and extract meta-learnings for process improvement and analysis. To structure this information, Agent Studio uses the OTAR pattern:
- Observation: what happened (user input, context, problem)
- Thoughts: why the agent chose this approach (reasoning, constraints)
- Action: what the agent did (tool calls, responses, workflow)
- Result: what happened and what was learned
Use episodic memory to:
- Analyze agent performance across customer segments
- Identify successful problem-solving patterns
- Review conversations to improve prompts and instructions
- Answer questions like “How does prompt A perform for premium customers asking about returns?”
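The OTAR pattern described above can be pictured as a record with four required parts. The following is a minimal sketch only: the helper `buildOtarRecord`, the field names, and the sample values are hypothetical and are not Agent Studio's stored schema.

```javascript
// Hypothetical helper: builds an OTAR-shaped record for illustration.
// Field names are assumptions, not Agent Studio's actual storage format.
const OTAR_FIELDS = ["observation", "thoughts", "action", "result"];

function buildOtarRecord(parts) {
  const record = {};
  for (const field of OTAR_FIELDS) {
    const value = parts[field];
    // Every OTAR part must be a non-empty string.
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error(`Missing OTAR field: ${field}`);
    }
    record[field] = value.trim();
  }
  return record;
}

// Made-up sample values, for illustration only.
const example = buildOtarRecord({
  observation: "User asked how to return a damaged item.",
  thoughts: "Return policy depends on account type, so check it first.",
  action: "Looked up the account type, then shared the return steps.",
  result: "User confirmed the steps worked; checking account type first helped.",
});
```

Requiring all four parts keeps each record usable for later analysis, since a record without a result says nothing about what was learned.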
Episodic memory structure
JSON
Enable memory
To enable memory, you must:
- Enable the feature on your agent
- Verify data retention
- Set up user authentication
Enable on your agent
- From the dashboard
- With the API
From the Agent Studio agent edit view:
- Open your agent’s settings
- Go to the Customizations section
- Find the Memory toggle
- Click Configure to check prerequisites
- Enable memory if prerequisites are met
- Save changes
The dashboard validates prerequisites automatically and guides you through any missing configuration.
Verify data retention
Memory learns from conversation history, so conversations must be stored (retention > 0 days).
From the Algolia dashboard, go to Agent Studio > Settings:
- Check that the Retention period is set to 30, 60, or 90 days (not 0)
- Available retention values: only four discrete values are supported: 0, 30, 60, or 90 days.
- Application-wide setting: retention applies to all agents in your Algolia application, not individual agents.
Why this is required:
- Memory extracts information from past conversations
- If conversations are deleted immediately (0 days retention), there’s nothing to learn from
- Longer retention enables better memory extraction and consolidation
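The retention rule above can be expressed as a small pre-check. This is a sketch under stated assumptions: `checkRetentionForMemory` is a hypothetical helper you would run on a value you already know, and it does not read anything from the Algolia dashboard or API.

```javascript
// The four retention values listed in this guide.
const SUPPORTED_RETENTION_DAYS = [0, 30, 60, 90];

// Hypothetical helper: reports whether a retention value allows memory.
function checkRetentionForMemory(days) {
  if (!SUPPORTED_RETENTION_DAYS.includes(days)) {
    return { ok: false, reason: `Unsupported retention value: ${days}` };
  }
  if (days === 0) {
    // With 0 days, conversations are not stored, so memory has no input.
    return { ok: false, reason: "Retention is 0 days, so no conversations are stored" };
  }
  return { ok: true, reason: `Retention of ${days} days allows memory extraction` };
}
```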
Set up user authentication
Memory requires user authentication to identify which user’s memories to load and save. Each user has their own set of memories that are isolated from other users.
For complete setup instructions, see User authentication. This guide covers:
- Getting your secret key from the dashboard
- Generating JWTs (JSON Web Tokens) on your backend
- Security guidance and token management
The same secure JWTs work for both memory and conversations.
If you’ve already set up JWT authentication for conversations, you can reuse the same setup.
Once set up, include the X-Algolia-Secure-User-Token header in your completion requests to enable user-scoped memory.
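As a rough illustration of attaching the header, the sketch below adds the user token to whatever headers your completion request already sends. The helper `withUserToken`, the placeholder token, and the base headers are assumptions; the endpoint and the other required headers are not shown because they depend on your setup.

```javascript
// Header name taken from this guide.
const SECURE_USER_TOKEN_HEADER = "X-Algolia-Secure-User-Token";

// Hypothetical helper: returns a copy of the headers with the user JWT added.
function withUserToken(baseHeaders, userJwt) {
  if (typeof userJwt !== "string" || userJwt.length === 0) {
    throw new Error("A user JWT is required for user-scoped memory");
  }
  return { ...baseHeaders, [SECURE_USER_TOKEN_HEADER]: userJwt };
}

// The token below is a placeholder; generate real JWTs on your backend.
const headers = withUserToken(
  { "Content-Type": "application/json" },
  "<jwt-generated-on-your-backend>"
);
```

Pass the resulting object as the `headers` option of the HTTP client you use for completion requests.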
Memory tools
When toolsEnabled is true, your agent has access to three memory tools.
Each tool includes default logic for when to activate, but you can customize this behavior in your agent’s instructions.
algolia_memorize
Saves semantic memories (facts and preferences) during conversation.
Default triggers (built into tool prompt):
- User explicitly says “remember X”
- Agent detects a stable preference or fact (for example, dietary restrictions, account type)
- User provides information useful for future interactions
In these cases, the agent calls algolia_memorize to save the information as a self-contained fact.
algolia_ponder
Saves episodic memories (the agent’s reasoning chain) during conversation.
Default triggers (built into tool prompt):
- User says “remember this conversation” or “learn from this interaction”
- After solving a problem worth learning from
- After a successful workflow that could help similar future cases
In these cases, the agent calls algolia_ponder to record what it observed, how it reasoned, what it did, and what it learned (OTAR pattern).
algolia_memory_search
Searches existing memories during conversation using Algolia Search.
Default triggers (built into tool prompt):
- Before claiming “I don’t know” about the user
- Before answering questions about user preferences or history
- When user asks “what did I say about X?”
- When context from previous sessions would improve the response
For example, the agent calls algolia_memory_search to find dietary preferences before making a recommendation.
Customizing tool behavior
The default triggers work for most cases, but you can override them in your agent’s instructions.
Use cases
Personalise user experiences
Problem: generic responses don’t account for individual user preferences and context.
Solution: memory enables agents to tailor responses based on what they know about each user.
For example, an ecommerce agent remembers a user’s size preferences, favourite brands, and past purchases, providing relevant recommendations without asking repetitive questions.
Business impact:
- Increase conversion rates through personalisation
- Improve user satisfaction with relevant suggestions
- Reduce friction in repeat interactions
- Build long-term user relationships
Reduce repetitive questions
Problem: users get frustrated repeating the same information in every conversation.
Solution: agents recall previously shared information, eliminating redundant questions.
For example, a support agent remembers a user’s account type, previous issues, and preferred contact method, jumping straight to solving the current problem.
Business impact:
- Improve user satisfaction and retention
- Reduce conversation length and support costs
- Create seamless experiences across sessions
- Demonstrate that you value users’ time
Improve agent performance through analysis
Problem: you can’t see how your agent reasons through problems or identify what approaches work best.
Solution: episodic memory captures the agent’s reasoning chain (OTAR) for each conversation, enabling analysis across customer segments and scenarios.
For example, you can export episodic memories from users who mentioned “returns”, especially those with premium accounts. Then, analyze how the agent handled those conversations:
- Did it resolve return requests effectively?
- Are there patterns in failed resolutions?
Business impact:
- Identify successful problem-solving patterns
- Improve prompts and instructions based on real reasoning
- Compare agent performance across customer segments
- Make data-driven decisions about agent configuration
Enable continuous conversations
Problem: conversations reset with every new session, breaking continuity.
Solution: memory retains user context between sessions, even long after the initial conversation.
For example, a shopping agent recalls that a user was considering a laptop last week and proactively asks if they’re still interested or need more information.
Business impact:
- Increase engagement through follow-up opportunities
- Convert consideration into purchases
- Build trust through demonstrated attention
- Create differentiated user experiences
How memory extraction works
When the agent calls memory tools, Agent Studio uses AI to process and store the information.
Intelligent extraction
Quality filters ensure only valuable information is stored:
- Utility: would this fact improve future responses?
- Specificity: is it concrete and factual (not mood or chitchat)?
- Effect on behavior: can you think of a query where it changes behavior?
What gets stored:
- Factual statements about user preferences
- Important events and interactions
- Skills, knowledge, and relationships
- Patterns inferred from past experiences
What gets filtered out:
- Greetings and pleasantries (“Hello”, “I appreciate it”)
- Generic traits without specifics (“User is friendly”)
- Temporary moods or states
- Duplicate information already stored
Memory lifecycle
Memory retrieval happens automatically before the agent generates a response. You can configure two retrieval modes: preload (recent memories) and preflight (query-relevant memories).
Retrieval modes compared
| Feature | Preload | Preflight | Tools |
|---|---|---|---|
| Timing | Conversation start | Before each response | During response |
| Search method | Most recent N | Query-based semantic search | Agent-initiated |
| Extra latency | None | None | +1 roundtrip |
| Best for | Always-on context, few memories | Large memory sets (100+) | Dynamic, explicit recall |
Recent memories (preload)
Preload fetches the N most recent memories when a conversation starts, regardless of what the user asks.
- Identify user: extract the user ID from the JWT
- Retrieve memories: fetch up to N recent memories (configurable limit)
- Filter by type: semantic, episodic, or both
- Include in context: memories are added to the agent’s initial prompt
JSON
Best for:
- Small memory sets where all memories fit in context
- Always-on personalization (user preferences should always be available)
- Predictable use cases where recent memories are likely relevant
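The preload steps described above (identify the user, fetch up to N recent memories, filter by type) can be sketched over an in-memory array. This is an illustration only: `preloadMemories`, the memory shape (`userId`, `type`, `text`, `createdAt`), and the sample data are hypothetical, and Agent Studio performs this retrieval for you.

```javascript
// Hypothetical sketch of preload: most recent N memories for one user.
function preloadMemories(memories, { userId, limit, type = "both" }) {
  return memories
    .filter((m) => m.userId === userId)                  // isolate this user's memories
    .filter((m) => type === "both" || m.type === type)   // semantic, episodic, or both
    .sort((a, b) => b.createdAt - a.createdAt)           // newest first
    .slice(0, limit);                                    // cap at the configured limit
}

// Made-up sample data; createdAt is a simple increasing counter.
const store = [
  { userId: "u1", type: "semantic", text: "Prefers dark mode", createdAt: 1 },
  { userId: "u1", type: "episodic", text: "Resolved a login issue", createdAt: 2 },
  { userId: "u2", type: "semantic", text: "Lives in Madrid", createdAt: 3 },
  { userId: "u1", type: "semantic", text: "Allergic to peanuts", createdAt: 4 },
];

const preloaded = preloadMemories(store, { userId: "u1", limit: 2, type: "semantic" });
// preloaded texts, newest first: "Allergic to peanuts", "Prefers dark mode"
```

Note that the selection ignores the user's question entirely, which is why preload suits small memory sets.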
Query-based retrieval (preflight)
Preflight searches memories based on what the user is asking, not just recency. It runs before the agent responds, injecting only relevant memories into context.
Configuration parameters:
- limit: maximum memories to retrieve per query
- conversationWindow: number of recent messages to analyze for search context
Best for:
- Large memory sets (100+ memories) where loading all recent memories is less useful
- Diverse memory content where only some memories apply to each query
- When you want to maximize relevant context without wasting tokens
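To make the roles of `limit` and `conversationWindow` concrete, here is a minimal sketch of query-based selection. It swaps in plain keyword overlap as a stand-in for the semantic search that preflight actually uses, and `preflightMemories`, `tokenize`, the message and memory shapes, and the sample data are all hypothetical.

```javascript
// Split text into lowercase alphanumeric terms.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Hypothetical sketch: rank memories by keyword overlap with recent messages.
// Keyword overlap is a simplification, not the real semantic search.
function preflightMemories(memories, messages, { limit, conversationWindow }) {
  // Only the last `conversationWindow` messages form the search context.
  const recent = conversationWindow > 0 ? messages.slice(-conversationWindow) : [];
  const queryTerms = new Set(recent.flatMap((m) => tokenize(m.content)));
  return memories
    .map((memory) => ({
      memory,
      score: tokenize(memory.text).filter((t) => queryTerms.has(t)).length,
    }))
    .filter((entry) => entry.score > 0)   // drop memories with no overlap
    .sort((a, b) => b.score - a.score)    // most overlapping first
    .slice(0, limit)                      // cap at `limit` memories per query
    .map((entry) => entry.memory);
}

// Made-up sample data.
const memories = [
  { text: "User is allergic to peanuts" },
  { text: "User prefers dark mode in apps" },
  { text: "User lives in Madrid" },
];
const messages = [
  { role: "user", content: "Hi there" },
  { role: "user", content: "Can you suggest a snack without peanuts?" },
];
const relevant = preflightMemories(memories, messages, { limit: 2, conversationWindow: 1 });
```

Here only the peanut allergy memory is selected, because it is the only one sharing a term with the latest message.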
During conversation (tools)
Agents can dynamically save and search memories during the conversation using memory tools. For example, if a user reports an error similar to one resolved before, the agent calls algolia_memory_search to find past resolutions with matching symptoms.
Common integration issues
Memory not enabled - prerequisites not met
Symptoms: you can’t enable the memory toggle in the dashboard.
Check the following:
- Verify data retention is greater than 0 days
- Ensure you have settingsRanking permission to modify retention settings
Agent doesn't remember information
Symptoms: the agent doesn’t recall previous information, even when memory is enabled.
Possible causes:
- No JWT passed: conversations must include the X-Algolia-Secure-User-Token header
- Memory tools not enabled: set toolsEnabled: true in the agent configuration
- Preload limit too low: increase the number of memories loaded at conversation start
- Wrong memory type: if you set the preload type to semantic, episodic memories won’t be loaded
Memories not relevant to user's query
Symptoms: the agent loads memories, but they aren’t relevant to what the user is asking about.
Possible causes:
- Using preload with large memory sets: preload fetches recent memories, not the most relevant
- Preflight not configured: query-based retrieval isn’t enabled
You can use both preload and preflight together: preload provides baseline context while preflight adds query-specific memories.
JSON