Memory - Algolia

This is a beta feature according to Algolia’s Terms of Service (“Beta Services”).

AI agents use memory to retain and recall information across multiple conversations with the same user, creating personalised experiences that improve over time. Without memory, agents start fresh with every conversation:

Lost context: agents lose context, requiring users to repeat information
Repetitive interactions: agents ask the same questions in every session
Missed opportunities: agents can’t provide personalised recommendations based on past behavior
Poor user experience: the experience may feel impersonal

How memory works

When a user interacts with your agent, memory operates in two stages:

Retrieval (automatic): at the start of a conversation, relevant memories are loaded for context.
Tools (optional): during a conversation, if enabled, agents can use the following memory tools:
- algolia_memorize: saves semantic memories (facts, preferences)
- algolia_ponder: saves episodic memories (experiences, observations)
- algolia_memory_search: searches existing memories

For example, suppose a user mentions “I’m vegetarian” in one conversation. In the next conversation, when asked for restaurant recommendations, the agent recalls this preference and suggests vegetarian options.

Memory types

Agent Studio supports two types of memory, inspired by human cognitive architecture: semantic and episodic.

Semantic memory

Stores timeless facts, preferences, and general knowledge about the user. For example:

“User is allergic to peanuts”
“User prefers dark mode in apps”
“User lives in Madrid and speaks Spanish and English”
“User’s job title is Software Engineer”

Use cases:

User profile information
Preferences and settings
Dietary restrictions
Accessibility needs
Communication style

Semantic memory structure

JSON

{
  "text": "User prefers organic vegetables and shops at farmer's markets weekly",
  "rawExtract": "I love buying organic vegetables at the farmer's market every Saturday",
  "keywords": ["organic", "vegetables", "farmer's market", "weekly", "shopping"],
  "topics": ["food", "preferences", "shopping"],
  "recallTriggers": ["vegetables", "shopping habits", "organic food"],
  "memoryType": "semantic"
}

Semantic memories are self-contained facts that remain relevant regardless of when they were created.
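
As a rough illustration of this structure, the sketch below checks that a record has the fields shown in the example. The function name `isSemanticMemory` is hypothetical, and which fields Agent Studio actually requires is an assumption: the check only mirrors the example above.

```javascript
// Hypothetical helper: checks a record against the fields shown in the example above.
// Which fields are mandatory is an assumption, not documented behavior.
function isSemanticMemory(record) {
  const isStringArray = (value) =>
    Array.isArray(value) && value.every((item) => typeof item === "string");

  return (
    record !== null &&
    typeof record === "object" &&
    record.memoryType === "semantic" &&
    typeof record.text === "string" &&
    record.text.length > 0 &&
    isStringArray(record.keywords) &&
    isStringArray(record.topics)
  );
}

const memory = {
  text: "User prefers organic vegetables and shops at farmer's markets weekly",
  rawExtract: "I love buying organic vegetables at the farmer's market every Saturday",
  keywords: ["organic", "vegetables", "farmer's market", "weekly", "shopping"],
  topics: ["food", "preferences", "shopping"],
  recallTriggers: ["vegetables", "shopping habits", "organic food"],
  memoryType: "semantic",
};

console.log(isSemanticMemory(memory)); // true
```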

Episodic memory

Captures the agent’s reasoning chain from conversations: what it observed, thought, did, and learned. Use episodic memory to understand the agent’s reasoning process and to extract meta-learnings for process improvement and analysis. To structure this information, Agent Studio uses the OTAR pattern:

Observation: what happened (user input, context, problem)
Thoughts: why the agent chose this approach (reasoning, constraints)
Action: what the agent did (tool calls, responses, workflow)
Result: what happened and what was learned

Use cases:

Analyze agent performance across customer segments
Identify successful problem-solving patterns
Review conversations to improve prompts and instructions
Answer questions like “How does prompt A perform for premium customers asking about returns?”

Episodic memory structure

JSON

{
  "text": "Experience: Resolved account lockout via password reset",
  "episode": {
    "observation": "User unable to log in despite correct password. Error: 'invalid credentials'. User tried 3 times.",
    "thoughts": "Account lockout likely triggered by security measure after multiple failed attempts. Password reset would unlock account, not just retry.",
    "action": "search_kb(query:'account locked') → found lockout policy → Explained 3-attempt limit → Sent password reset link → User confirmed receipt",
    "result": "User successfully logged in with new password. Learning: account locks after 3 failed attempts always require password reset."
  },
  "keywords": ["login", "account lock", "password reset", "security"],
  "topics": ["technical"],
  "memoryType": "episodic"
}

OTAR captures reasoning chains to inform similar future situations.
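
When reviewing episodic memories, it can help to render the four OTAR fields as plain text. The sketch below assumes a record shaped like the example above; `formatEpisode` is a hypothetical helper, and the output layout is only an illustration.

```javascript
// Hypothetical helper: renders an episodic memory's OTAR fields as readable text.
// Field names follow the example above; the output format is purely illustrative.
function formatEpisode(memory) {
  if (memory.memoryType !== "episodic" || !memory.episode) {
    throw new Error("Expected an episodic memory with an episode object");
  }
  const { observation, thoughts, action, result } = memory.episode;
  return [
    `Observation: ${observation}`,
    `Thoughts: ${thoughts}`,
    `Action: ${action}`,
    `Result: ${result}`,
  ].join("\n");
}

const episodicMemory = {
  text: "Experience: Resolved account lockout via password reset",
  episode: {
    observation: "User unable to log in despite correct password.",
    thoughts: "Account lockout likely triggered after multiple failed attempts.",
    action: "Explained 3-attempt limit and sent password reset link.",
    result: "User successfully logged in with new password.",
  },
  keywords: ["login", "account lock", "password reset", "security"],
  topics: ["technical"],
  memoryType: "episodic",
};

console.log(formatEpisode(episodicMemory));
```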

Enable memory

To enable memory, you must:

Enable the feature on your agent
Verify data retention
Set up user authentication

Enable on your agent

From the dashboard

In the Agent Studio agent edit view:

Open your agent’s settings
Go to the Customizations section
Find the Memory toggle
Click Configure to check prerequisites
Enable memory if prerequisites are met
Save changes

The dashboard validates prerequisites automatically and guides you through any missing configuration.

With the API

Enable memory when creating or updating an agent:

JSON

{
  "name": "My Agent",
  "config": {
    "memory": {
      "enabled": true,
      "toolsEnabled": true,
      "preload": {
        "limit": 10,
        "type": "semantic"
      },
      "preflight": {
        "limit": 5,
        "conversationWindow": 3
      }
    }
  }
}

enabled: activates memory retrieval
toolsEnabled: gives the agent access to memory tools
preload.limit: number of recent memories to load at conversation start
preload.type: "semantic", "episodic", or "all"
preflight.limit: maximum memories to retrieve per query (semantic search)
preflight.conversationWindow: recent messages to use as search context

Memory configuration is nested inside the config object, not as a top-level field.

For more information, see the Agent Studio API reference.
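
To avoid misplacing the memory settings, you could build the request body with a small helper. This is a minimal sketch: `buildAgentBody` is a hypothetical function, not part of any Algolia client, and it only assembles the JSON shown above. How you send the body depends on the endpoint described in the Agent Studio API reference.

```javascript
// Hypothetical helper: builds the request body shown above and rejects
// preload types other than the three documented values.
const PRELOAD_TYPES = ["semantic", "episodic", "all"];

function buildAgentBody(name, { toolsEnabled = true, preload, preflight } = {}) {
  // Assumption: this sketch requires an explicit preload.type whenever preload is set.
  if (preload && !PRELOAD_TYPES.includes(preload.type)) {
    throw new Error(`preload.type must be one of: ${PRELOAD_TYPES.join(", ")}`);
  }
  const memory = { enabled: true, toolsEnabled };
  if (preload) memory.preload = preload;
  if (preflight) memory.preflight = preflight;
  // Memory settings are nested under config, not placed at the top level.
  return { name, config: { memory } };
}

const body = buildAgentBody("My Agent", {
  preload: { limit: 10, type: "semantic" },
  preflight: { limit: 5, conversationWindow: 3 },
});

console.log(JSON.stringify(body, null, 2));
```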

Verify data retention

Memory learns from conversation history, so conversations must be stored (retention > 0 days).

From the Algolia dashboard, go to Agent Studio > Settings:

Check that the Retention period is set to 30, 60, or 90 days (not 0)

Available retention values: only 0, 30, 60, and 90 days are supported.
Application-wide setting: retention applies to all agents in your Algolia application, not individual agents.

Why this is required:

Memory extracts information from past conversations
If conversations are deleted immediately (0 days retention), there’s nothing to learn from
Longer retention enables better memory extraction and consolidation

A 30-day retention usually provides a good balance of memory quality, privacy, and compliance.
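
The retention rule can be expressed as a small check. This is a sketch only: `retentionAllowsMemory` is a hypothetical helper that encodes the values listed above, and it doesn't read your actual dashboard setting.

```javascript
// Retention values listed in this guide.
const SUPPORTED_RETENTION_DAYS = [0, 30, 60, 90];

// Hypothetical helper: reports whether a retention setting satisfies the memory prerequisite.
function retentionAllowsMemory(days) {
  if (!SUPPORTED_RETENTION_DAYS.includes(days)) {
    throw new RangeError(`Unsupported retention value: ${days}`);
  }
  // Memory needs stored conversations, so 0 days doesn't qualify.
  return days > 0;
}

console.log(retentionAllowsMemory(30)); // true
console.log(retentionAllowsMemory(0)); // false
```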

Set up user authentication

Memory requires user authentication to identify which user’s memories to load and save. Each user has their own set of memories that are isolated from other users.

For complete setup instructions, see User authentication. This guide covers:

Getting your secret key from the dashboard
Generating JWTs (JSON Web Tokens) on your backend
Security guidance and token management

The same secure JWTs work for both memory and conversations. If you’ve already set up JWT authentication for conversations, you can reuse the same setup.
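
To give a sense of what generating a JWT on your backend involves, here is a minimal Node.js sketch using only the built-in `crypto` module. The signing algorithm (HS256), the claim that carries the user ID (`sub`), and the one-hour expiry are all assumptions made for illustration, and `createUserToken` is a hypothetical helper. Check the User authentication guide for the claims and algorithm Agent Studio actually expects.

```javascript
const crypto = require("node:crypto");

// Encodes a value as base64url JSON, as used for JWT header and payload segments.
const encodeSegment = (value) =>
  Buffer.from(JSON.stringify(value)).toString("base64url");

// Assumptions: HS256 signing, the `sub` claim carrying the user ID, and a
// one-hour expiry. Check the User authentication guide for the real requirements.
function createUserToken(userId, secretKey, ttlSeconds = 3600) {
  const now = Math.floor(Date.now() / 1000);
  const header = { alg: "HS256", typ: "JWT" };
  const payload = { sub: userId, iat: now, exp: now + ttlSeconds };

  const signingInput = `${encodeSegment(header)}.${encodeSegment(payload)}`;
  const signature = crypto
    .createHmac("sha256", secretKey)
    .update(signingInput)
    .digest("base64url");

  return `${signingInput}.${signature}`;
}

// Run this on your backend only; never expose the secret key to the browser.
const userToken = createUserToken("user-123", "YOUR_SECRET_KEY");
console.log(userToken.split(".").length); // 3
```

In production, a maintained JWT library is usually a safer choice than hand-rolled signing; the sketch only shows the moving parts.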

Once set up, include the X-Algolia-Secure-User-Token header in your completion requests to enable user-scoped memory:

JavaScript

const appID = "ALGOLIA_APPLICATION_ID";
const agentId = "AGENT_ID";
const apiKey = "ALGOLIA_SEARCH_API_KEY";
// Secure user token (JWT) generated on your backend
const userToken = "SECURE_USER_TOKEN";

const response = await fetch(
  `https://${appID}.algolia.net/agent-studio/1/agents/${agentId}/completions`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',  // The body is JSON
      'X-Algolia-Application-Id': appID,
      'X-Algolia-API-Key': apiKey,
      'X-Algolia-Secure-User-Token': userToken  // Enables user-scoped memory
    },
    body: JSON.stringify({
      messages: [{ role: 'user', content: 'Hello' }]
    })
  }
);

Memory tools

When toolsEnabled is true, your agent has access to three memory tools. Each tool includes default logic for when to activate, but you can customize this behavior in your agent’s instructions.

`algolia_memorize`

Saves semantic memories (facts and preferences) during conversation. Default triggers (built into tool prompt):

User explicitly says “remember X”
Agent detects a stable preference or fact (for example, dietary restrictions, account type)
User provides information useful for future interactions

For example, if a user says “I’m allergic to shellfish”, the agent calls algolia_memorize to save this self-contained fact.

`algolia_ponder`

Saves episodic memories (the agent’s reasoning chain) during conversation. Default triggers (built into tool prompt):

User says “remember this conversation” or “learn from this interaction”
After solving a problem worth learning from
After a successful workflow that could help similar future cases

For example, after resolving a support ticket, the agent calls algolia_ponder to record what it observed, how it reasoned, what it did, and what it learned (OTAR pattern).

`algolia_memory_search`

Searches existing memories during conversation using Algolia Search. Default triggers (built into tool prompt):

Before claiming “I don’t know” about the user
Before answering questions about user preferences or history
When user asks “what did I say about X?”
When context from previous sessions would improve the response

For example, if a user asks “What restaurants would I like?”, the agent calls algolia_memory_search to find dietary preferences before recommending.

Customizing tool behavior

The default triggers work for most cases, but you can override them in your agent’s instructions:

# Memory guidelines
- Always ponder after resolving support tickets
- Memorize product preferences when users browse categories
- Never memorize payment information

This gives you fine-grained control over what your agent remembers and when.

Use cases

Personalise user experiences

Problem: generic responses don’t account for individual user preferences and context.

Solution: memory enables agents to tailor responses based on what they know about each user.

For example, an ecommerce agent remembers a user’s size preferences, favourite brands, and past purchases, providing relevant recommendations without asking repetitive questions.

Business impact:

Increase conversion rates through personalisation
Improve user satisfaction with relevant suggestions
Reduce friction in repeat interactions
Build long-term user relationships

Reduce repetitive questions

Problem: users get frustrated repeating the same information in every conversation.

Solution: agents recall previously shared information, eliminating redundant questions.

For example, a support agent remembers a user’s account type, previous issues, and preferred contact method, jumping straight to solving the current problem.

Business impact:

Improve user satisfaction and retention
Reduce conversation length and support costs
Create seamless experiences across sessions
Demonstrate that you value users’ time

Improve agent performance through analysis

Problem: you can’t see how your agent reasons through problems or identify what approaches work best.

Solution: episodic memory captures the agent’s reasoning chain (OTAR) for each conversation, enabling analysis across customer segments and scenarios.

For example, you can export episodic memories from users who mentioned “returns”, especially those with premium accounts. Then, analyze how the agent handled those conversations:

Did it resolve return requests effectively?
Are there patterns in failed resolutions?

Business impact:

Identify successful problem-solving patterns
Improve prompts and instructions based on real reasoning
Compare agent performance across customer segments
Make data-driven decisions about agent configuration
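
As a sketch of this kind of analysis, the snippet below filters a set of memories down to episodic ones that mention a term. It assumes the exported memories are an array of objects shaped like the examples in Memory types; the export format itself isn't specified here, `findEpisodes` is a hypothetical helper, and the sample records are made up for illustration.

```javascript
// Hypothetical helper: selects episodic memories that mention a term in their
// keywords, topics, or observation. The array-of-objects export shape is an assumption.
function findEpisodes(memories, term) {
  const needle = term.toLowerCase();
  const mentions = (text) => String(text ?? "").toLowerCase().includes(needle);

  return memories.filter(
    (memory) =>
      memory.memoryType === "episodic" &&
      (mentions(memory.episode?.observation) ||
        (memory.keywords ?? []).some(mentions) ||
        (memory.topics ?? []).some(mentions))
  );
}

// Made-up sample data for illustration.
const exported = [
  {
    text: "Experience: Processed a return for a damaged item",
    episode: {
      observation: "User asked how to return a damaged blender.",
      thoughts: "Damaged items qualify for a free return label.",
      action: "Sent a prepaid return label.",
      result: "User confirmed the label arrived.",
    },
    keywords: ["returns", "damaged item"],
    topics: ["orders"],
    memoryType: "episodic",
  },
  {
    text: "User prefers dark mode in apps",
    keywords: ["dark mode"],
    topics: ["preferences"],
    memoryType: "semantic",
  },
];

const matches = findEpisodes(exported, "returns");
console.log(matches.map((memory) => memory.episode.result));
```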

Enable continuous conversations

Problem: conversations reset with every new session, breaking continuity.

Solution: memory retains user context between sessions, even long after the initial conversation.

For example, a shopping agent recalls that a user was considering a laptop last week and proactively asks if they’re still interested or need more information.

Business impact:

Increase engagement through follow-up opportunities
Convert consideration into purchases
Build trust through demonstrated attention
Create differentiated user experiences

How memory extraction works

When the agent calls memory tools, Agent Studio uses AI to process and store the information.

Intelligent extraction

Quality filters ensure only valuable information is stored:

Utility: would this fact improve future responses?
Specificity: is it concrete and factual (not mood or chitchat)?
Effect on behavior: can you think of a query where it changes behavior?

What gets extracted:

Factual statements about user preferences
Important events and interactions
Skills, knowledge, and relationships
Patterns inferred from past experiences

What gets filtered out:

Greetings and pleasantries (“Hello”, “I appreciate it”)
Generic traits without specifics (“User is friendly”)
Temporary moods or states
Duplicate information already stored

Memory lifecycle

Memory retrieval happens automatically before the agent generates a response. You can configure two retrieval modes: preload (recent memories) and preflight (query-relevant memories).

Retrieval modes compared

Feature	Preload	Preflight	Tools
Timing	Conversation start	Before each response	During response
Search method	Most recent N	Query-based semantic search	Agent-initiated
Extra latency	None	None	+1 roundtrip
Best for	Always-on context, few memories	Large memory sets (100+)	Dynamic, explicit recall

You can enable both modes together: preload provides baseline context while preflight adds query-specific memories.

Recent memories (preload)

Preload fetches the N most recent memories when a conversation starts, regardless of what the user asks.

Identify user: extract user ID from the JWT token
Retrieve memories: fetch up to N recent memories (configurable limit)
Filter by type: semantic, episodic, or both
Include in context: memories are added to the agent’s initial prompt

Configuration example:

JSON

{
  "memory": {
    "enabled": true,
    "preload": {
      "limit": 10,
      "type": "semantic"
    }
  }
}
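
The preload steps above can be sketched on an in-memory array. This is an illustration of the selection logic only, not how Agent Studio stores or queries memories: the `userId` and `createdAt` fields and the `preloadMemories` function are hypothetical.

```javascript
// Illustrative only: mimics the preload steps on an in-memory array.
// The `userId` and `createdAt` fields are hypothetical, not a documented schema.
function preloadMemories(memories, userId, { limit, type }) {
  return memories
    .filter((memory) => memory.userId === userId) // identify user
    .filter((memory) => type === "all" || memory.memoryType === type) // filter by type
    .sort((a, b) => b.createdAt - a.createdAt) // newest first
    .slice(0, limit); // keep up to N memories
}

const stored = [
  { userId: "u1", memoryType: "semantic", text: "User is vegetarian", createdAt: 3 },
  { userId: "u1", memoryType: "episodic", text: "Experience: Booked a table", createdAt: 2 },
  { userId: "u1", memoryType: "semantic", text: "User lives in Madrid", createdAt: 1 },
  { userId: "u2", memoryType: "semantic", text: "User prefers dark mode", createdAt: 4 },
];

const context = preloadMemories(stored, "u1", { limit: 1, type: "semantic" });
console.log(context.map((memory) => memory.text)); // [ 'User is vegetarian' ]
```

Note that the selection ignores what the user asks: only recency, type, and the limit matter.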

When to use preload:

Small memory sets where all memories fit in context
Always-on personalization (user preferences should always be available)
Predictable use cases where recent memories are likely relevant

Query-based retrieval (preflight)

Preflight searches memories based on what the user is asking, not just recency. It runs before the agent responds, injecting only relevant memories into context. Configuration example:

JSON

{
  "memory": {
    "enabled": true,
    "preflight": {
      "limit": 5,
      "conversationWindow": 3
    }
  }
}

limit: maximum memories to retrieve per query
conversationWindow: number of recent messages to analyze for search context
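
To illustrate how the two parameters interact, the sketch below builds search context from the last `conversationWindow` messages and returns at most `limit` memories. Agent Studio uses semantic search for this step; the keyword-overlap score here is a deliberately simple stand-in, and `preflightMemories` is a hypothetical function.

```javascript
// Illustrative only: builds search context from the last `conversationWindow`
// messages and ranks memories by keyword overlap. Agent Studio uses semantic
// search; the overlap score here is a simple stand-in.
function preflightMemories(memories, messages, { limit, conversationWindow }) {
  const context = messages
    .slice(-conversationWindow)
    .map((message) => message.content.toLowerCase())
    .join(" ");

  return memories
    .map((memory) => ({
      memory,
      score: memory.keywords.filter((keyword) =>
        context.includes(keyword.toLowerCase())
      ).length,
    }))
    .filter((entry) => entry.score > 0) // drop memories unrelated to the query
    .sort((a, b) => b.score - a.score) // most relevant first
    .slice(0, limit)
    .map((entry) => entry.memory);
}

const memories = [
  { text: "User is vegetarian", keywords: ["vegetarian", "restaurant", "food"] },
  { text: "User prefers dark mode in apps", keywords: ["dark mode", "apps"] },
];
const messages = [
  { role: "user", content: "Hi there" },
  { role: "assistant", content: "Hello! How can I help?" },
  { role: "user", content: "Can you suggest a restaurant for tonight?" },
];

const relevant = preflightMemories(memories, messages, { limit: 5, conversationWindow: 3 });
console.log(relevant.map((memory) => memory.text)); // [ 'User is vegetarian' ]
```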

When to use preflight:

Large memory sets (100+ memories) where loading all recent memories is less useful
Diverse memory content where only some memories apply to each query
When you want to maximize relevant context without wasting tokens

During conversation (tools)

Agents can dynamically save and search memories during the conversation using memory tools. For example, if a user reports an error similar to one resolved before, the agent calls algolia_memory_search to find past resolutions with matching symptoms.

Common integration issues

Memory not enabled - prerequisites not met

Symptoms: you can’t enable the memory toggle in the dashboard.

Check the following:

Verify data retention is greater than 0 days
Ensure you have settingsRanking permission to modify retention settings

Solution: follow the configuration modal’s guidance to set up missing prerequisites.

Agent doesn't remember information

Symptoms: the agent doesn’t recall previous information, even when memory is enabled.

Possible causes:

No JWT token passed: conversations must include X-Algolia-Secure-User-Token header
Memory tools not enabled: set toolsEnabled: true in the agent configuration
Preload limit too low: increase the number of memories loaded at conversation start
Wrong memory type: if you set the preload type to semantic, episodic memories won’t be loaded

Solution: verify JWT authentication is working, ensure memory tools are enabled, and adjust preload configuration.
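
The checks above can be turned into a quick diagnostic for a request and its agent configuration. This is a sketch under assumptions: `diagnoseMemory` is a hypothetical helper that inspects plain objects you pass in, and it doesn't call any Algolia API.

```javascript
// Hypothetical helper: maps the possible causes above to checks on a request's
// headers and an agent's memory configuration.
function diagnoseMemory(headers, memoryConfig) {
  const findings = [];
  if (!headers["X-Algolia-Secure-User-Token"]) {
    findings.push("Missing X-Algolia-Secure-User-Token header");
  }
  if (!memoryConfig.enabled) {
    findings.push("Memory is not enabled");
  }
  if (!memoryConfig.toolsEnabled) {
    findings.push("Memory tools are not enabled (set toolsEnabled: true)");
  }
  if (memoryConfig.preload && memoryConfig.preload.type !== "all") {
    findings.push(`Preload only loads ${memoryConfig.preload.type} memories`);
  }
  return findings;
}

const findings = diagnoseMemory(
  { "X-Algolia-Application-Id": "ALGOLIA_APPLICATION_ID" },
  { enabled: true, toolsEnabled: false, preload: { limit: 10, type: "semantic" } }
);
console.log(findings);
```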

Memories not relevant to user's query

Symptoms: the agent loads memories, but they aren’t relevant to what the user is asking about.

Possible causes:

Using preload with large memory sets: preload fetches recent memories, not the most relevant
Preflight not configured: query-based retrieval isn’t enabled

Solution: for users with many memories (100+), enable preflight to retrieve query-relevant memories:

JSON

{
  "memory": {
    "enabled": true,
    "preflight": {
      "limit": 5,
      "conversationWindow": 3
    }
  }
}

You can use both preload and preflight together: preload provides baseline context while preflight adds query-specific memories.
