Configure your agents and app

This is a beta feature according to Algolia’s Terms of Service (“Beta Services”).

You can configure Agent Studio at two levels: app-wide and per agent. You update both through the API. App-wide changes take effect immediately, while agent changes apply to the agent's draft until you publish it.

App settings

Configure app-wide behavior using the /configuration endpoint.

Data retention

Control how long Agent Studio retains your data:

Command line

curl -X PATCH "https://$ALGOLIA_APPLICATION_ID.algolia.net/agent-studio/1/configuration" \
  -H 'Content-Type: application/json' \
  -H "x-algolia-application-id: $ALGOLIA_APPLICATION_ID" \
  -H "x-algolia-api-key: $ALGOLIA_API_KEY" \
  -d '{ "maxRetentionDays": 30 }'

This operation requires an API key with the logs ACL.

Value	Effect
`90` (default)	Data retained for 90 days
`60`	Data retained for 60 days
`30`	Data retained for 30 days
`0`	Privacy mode (see below)
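A minimal JavaScript sketch of the same request. It assumes that the four values in the table are the only ones accepted, which the table implies but doesn't state. `buildRetentionRequest` is a hypothetical helper, not part of an Algolia client.

```javascript
// Values listed in the retention table. Treating them as the only
// accepted values is an assumption, not documented behavior.
const RETENTION_DAYS = [0, 30, 60, 90];

// Hypothetical helper: returns the pieces of the PATCH /configuration call.
function buildRetentionRequest(appId, apiKey, days) {
  if (!RETENTION_DAYS.includes(days)) {
    throw new RangeError(`maxRetentionDays must be one of ${RETENTION_DAYS.join(', ')}`);
  }
  return {
    url: `https://${appId}.algolia.net/agent-studio/1/configuration`,
    method: 'PATCH',
    headers: {
      'Content-Type': 'application/json',
      'x-algolia-application-id': appId,
      'x-algolia-api-key': apiKey, // needs the logs ACL
    },
    body: JSON.stringify({ maxRetentionDays: days }),
  };
}

// Usage (Node 18+ or a browser):
// const { url, ...init } = buildRetentionRequest(appId, apiKey, 30);
// await fetch(url, init);
```

Checking the value on the client catches typos such as `45` before the request is sent.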

Data affected by retention settings

Data	Behavior
Completion cache	Cached responses expire after the retention period
Conversations	Conversation history deleted after the retention period
Messages	Message content deleted after the retention period

Privacy mode (`maxRetentionDays: 0`)

When set to 0, Agent Studio operates in privacy mode:

Completion caching is turned off, so every request calls the LLM.
Agent Studio saves conversation metadata but doesn't store message content.
This mode suits strict data privacy requirements.

In privacy mode, the agent only sees the messages your client sends in each completion request. To preserve context across turns, include the full message history every time. Sending only the latest user message makes each request stateless.
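One way to keep context in privacy mode is to hold the history on the client and resend it with every request. The sketch below assumes that messages are `{ role, content }` objects sent in a `messages` array. Both are assumptions, so check the API reference for the actual completion request shape. `ConversationBuffer` is a hypothetical name.

```javascript
// Client-side history buffer for privacy mode (illustrative only).
class ConversationBuffer {
  constructor() {
    this.messages = [];
  }

  // Record the user's message and return the full history to send.
  addUserMessage(content) {
    this.messages.push({ role: 'user', content });
    return this.toRequestBody();
  }

  // Record the assistant's reply so the next turn includes it.
  addAssistantMessage(content) {
    this.messages.push({ role: 'assistant', content });
  }

  // Copy the history so callers can't mutate the buffer by accident.
  toRequestBody() {
    return { messages: this.messages.map((m) => ({ ...m })) };
  }
}
```

Each call to `addUserMessage` returns every earlier turn plus the new one, so the request is never reduced to the latest message alone.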

Conversation history

Agent Studio stores conversations automatically, subject to your retention settings. Each conversation gets an auto-generated title based on its content. Stored data includes:

Conversation metadata (ID, timestamps, user token)
Message content (user queries, assistant responses, tool calls)
Auto-generated titles for browsing

For GDPR compliance, users can export or delete their data with the GET /user-data/{userToken} and DELETE /user-data/{userToken} endpoints. For more information, see the API reference.
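A sketch of building both requests. It assumes that these endpoints share the `/agent-studio/1` base path used by the other examples on this page, since only the relative paths are given here. `userDataRequest` is a hypothetical helper.

```javascript
// Hypothetical helper: builds the export (GET) or delete (DELETE) call
// for one user token. The base path is an assumption.
function userDataRequest(appId, apiKey, userToken, action) {
  const method = action === 'export' ? 'GET' : action === 'delete' ? 'DELETE' : null;
  if (method === null) {
    throw new TypeError("action must be 'export' or 'delete'");
  }
  return {
    // Encode the token so characters such as "/" can't change the path.
    url: `https://${appId}.algolia.net/agent-studio/1/user-data/${encodeURIComponent(userToken)}`,
    method,
    headers: {
      'x-algolia-application-id': appId,
      'x-algolia-api-key': apiKey,
    },
  };
}

// Usage:
// const { url, ...init } = userDataRequest(appId, apiKey, 'user-42', 'export');
// const exported = await (await fetch(url, init)).json();
```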

Agent settings

Configure individual agents using the /agents/{agentId} endpoint.

Agent properties

Property	Type	Description
`name`	string	Display name (1-128 chars)
`description`	string	Optional description
`providerId`	UUID	LLM provider credentials
`model`	string	Model identifier. For example, `gpt-5`, `gemini-2.5-pro`
`instructions`	string	System prompt
`config`	object	Feature flags and settings
`tools`	array	Algolia search and custom tools

Update agent settings

Update any property without affecting others:

Command line

curl -X PATCH "https://$ALGOLIA_APPLICATION_ID.algolia.net/agent-studio/1/agents/$AGENT_ID" \
  -H 'Content-Type: application/json' \
  -H "x-algolia-application-id: $ALGOLIA_APPLICATION_ID" \
  -H "x-algolia-api-key: $ALGOLIA_API_KEY" \
  -d '{ "instructions": "You are a helpful shopping assistant." }'

This operation requires an API key with the editSettings ACL.

Configuration options

The config object controls agent behavior:

Option	Type	Default	Description
`sendUsage`	boolean	`false`	Include token usage in response
`sendReasoning`	boolean	`false`	Include model reasoning (if supported)
`useCache`	boolean	`true`	Enable response caching
`features`	array	`[]`	Experimental features
`suggestions`	object	`null`	Prompt suggestions (see below)
`max_tokens`	integer	`0`	Cap on output tokens per LLM call (see Cost control)
`max_iterations`	integer	`50`	Maximum tool or reasoning loops per request
`thread_depth`	object	`null`	Conversation length limit
`rate_limit`	object	`null`	Per-agent and per-IP request limits (see Rate limiting)
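To see what an agent's effective configuration looks like when you set only a few options, you can merge a partial `config` over the defaults from the table. This is a client-side illustration that assumes the API fills omitted keys with those defaults. `withConfigDefaults` is a hypothetical helper.

```javascript
// Defaults copied from the table above.
const CONFIG_DEFAULTS = {
  sendUsage: false,
  sendReasoning: false,
  useCache: true,
  features: [],
  suggestions: null,
  max_tokens: 0,
  max_iterations: 50,
  thread_depth: null,
  rate_limit: null,
};

// Shallow merge: keys present in `config` win, everything else keeps
// its default. A fresh `features` array avoids sharing the constant.
function withConfigDefaults(config = {}) {
  return { ...CONFIG_DEFAULTS, features: [], ...config };
}

// Usage:
// withConfigDefaults({ sendUsage: true }).useCache; // true
```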

Prompt suggestions

Generate contextual follow-up questions after each agent response. Suggestions help users discover capabilities and continue conversations naturally.

{
  "config": {
    "suggestions": {
      "enabled": true,
      "model": "gpt-5-mini"
    }
  }
}

When enabled, the agent streams a suggestions-chunk after the main response:

{
  "type": "suggestions-chunk",
  "suggestions": ["How do I filter by price?", "Show me trending products", "What categories are available?"]
}
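If you read the stream without a client library, a small guard can pick out suggestion chunks. The sketch assumes each chunk has already been parsed from JSON into an object with the shape shown above. How chunks are framed on the wire isn't covered here. `extractSuggestions` is a hypothetical helper.

```javascript
// Returns the suggestions from a parsed stream chunk, or null when the
// chunk is anything other than a well-formed suggestions-chunk.
function extractSuggestions(chunk) {
  if (chunk === null || typeof chunk !== 'object') return null;
  if (chunk.type !== 'suggestions-chunk') return null;
  if (!Array.isArray(chunk.suggestions)) return null;
  // Keep only strings so the UI never renders unexpected values.
  return chunk.suggestions.filter((s) => typeof s === 'string');
}
```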

Configuration options

Option	Type	Default	Description
`enabled`	boolean	`false`	Enable prompt suggestions
`model`	string	Agent’s model	Model for generating suggestions
`system_prompt`	string	Built-in	Custom prompt for suggestion generation

Generation settings (suggestions.generation):

Option	Range	Default	Description
`max_count`	1-5	3	Number of suggestions
`max_words`	5-15	8	Max words per suggestion
`timeout_seconds`	1-30	10	Timeout for generation

Context settings (suggestions.context):

Option	Range	Default	Description
`max_messages`	1-50	10	Conversation history to include
`include_tool_outputs`	-	`false`	Include tool results in context
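A sketch of checking these ranges before you send a `suggestions` object. It assumes that the ranges are inclusive and that the values are integers, neither of which is stated in the tables. `checkSuggestionRanges` is a hypothetical helper.

```javascript
// Ranges copied from the generation and context tables above.
const SUGGESTION_RANGES = {
  generation: { max_count: [1, 5], max_words: [5, 15], timeout_seconds: [1, 30] },
  context: { max_messages: [1, 50] },
};

// Returns a list of problems. An empty list means every value that was
// provided is within its range.
function checkSuggestionRanges(suggestions) {
  const problems = [];
  for (const [group, fields] of Object.entries(SUGGESTION_RANGES)) {
    for (const [field, [min, max]] of Object.entries(fields)) {
      const value = suggestions?.[group]?.[field];
      if (value === undefined) continue; // omitted: the default applies
      if (!Number.isInteger(value) || value < min || value > max) {
        problems.push(`${group}.${field} must be an integer from ${min} to ${max}`);
      }
    }
  }
  return problems;
}
```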

Client-side handling

With the AI SDK:

React

import { useChat } from '@ai-sdk/react';

function Chat() {
  const { messages, data } = useChat({ /* ... */ });

  // Suggestions arrive in the data stream; use the most recent chunk
  const suggestions = data?.findLast(d => d?.type === 'suggestions-chunk')?.suggestions;

  return (
    <>
      {/* Chat messages */}
      {suggestions && (
        <div className="suggestions">
          {suggestions.map(s => <button key={s}>{s}</button>)}
        </div>
      )}
    </>
  );
}

Use a faster, cheaper model (like gpt-5-mini) for suggestions. They don’t need the same reasoning depth as the main response.

Cost control

Cost control settings limit three sources of token usage: output per call, the number of reasoning or tool loops, and conversation history size.

JSON

{
  "config": {
    "max_tokens": 1500,
    "max_iterations": 20,
    "thread_depth": {
      "max_messages": 100
    }
  }
}

Option	Type	Default	Description
`max_tokens`	integer	`0`	Maximum output tokens per LLM call. `0` uses the model or provider default.
`max_iterations`	integer	`50`	Maximum tool or reasoning loops per request. Each loop is a separate LLM call
`thread_depth.max_messages`	integer	`null`	Maximum messages (user and assistant) in a conversation. The API rejects new requests when a conversation exceeds this limit.

For example, to update an agent’s cost control settings:

Command line

curl -X PATCH "https://$ALGOLIA_APPLICATION_ID.algolia.net/agent-studio/1/agents/$AGENT_ID" \
  -H 'Content-Type: application/json' \
  -H "x-algolia-application-id: $ALGOLIA_APPLICATION_ID" \
  -H "x-algolia-api-key: $ALGOLIA_API_KEY" \
  -d '{ "config": { "max_tokens": 1500, "max_iterations": 20, "thread_depth": { "max_messages": 100 } } }'

Default values and no-limit behavior

Each cost control setting has either a default value or no limit:

max_tokens: 0 or omitted uses the model or provider default.
max_iterations: 0 or omitted uses the default of 50.
thread_depth.max_messages: null, 0, or omitted means the conversation has no message limit.

Each iteration is billed as a separate LLM call. Lower max_iterations if your agent doesn’t need long tool chains.
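The rules above can be sketched as a small resolver that also derives an upper bound on output tokens per request. The bound is `max_tokens` multiplied by `max_iterations`, because each iteration is a separate LLM call. It excludes input tokens. `effectiveCostLimits` is a hypothetical helper.

```javascript
const DEFAULT_MAX_ITERATIONS = 50;

// Resolves the 0 / null / omitted rules into explicit values.
// null means "no explicit cap" (provider default or unlimited).
function effectiveCostLimits(config = {}) {
  const maxTokens = config.max_tokens || null;
  const maxIterations = config.max_iterations || DEFAULT_MAX_ITERATIONS;
  const maxMessages = config.thread_depth?.max_messages || null;
  // Known only when max_tokens is set explicitly.
  const worstCaseOutputTokens = maxTokens === null ? null : maxTokens * maxIterations;
  return { maxTokens, maxIterations, maxMessages, worstCaseOutputTokens };
}
```

With the example settings (`max_tokens: 1500`, `max_iterations: 20`), one request can produce at most 1500 × 20 = 30,000 output tokens.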

Rate limiting

Limit how often clients can call an agent’s /completions endpoint. You can configure two independent rate limits:

Per-agent: maximum requests an agent can receive within a time interval
Per-IP: maximum requests a client IP can make to an agent within a time interval

When a limit is exceeded, the API returns a 429 response.

JSON

{
  "config": {
    "rate_limit": {
      "agent": {
        "enabled": true,
        "max_requests": 100,
        "window_seconds": 60
      },
      "ip": {
        "enabled": true,
        "max_requests": 300,
        "window_seconds": 60
      }
    }
  }
}

`rate_limit.agent`

Field	Type	Default	Description
`enabled`	boolean	`true` (when any field is set)	If `false`, the per-agent limit isn't enforced.
`max_requests`	integer	none	Maximum requests allowed per time interval (minimum is 1). Required when this rate limit is enabled
`window_seconds`	integer	`60`	Time interval in seconds. Must be `30` or `60`

`rate_limit.ip`

Field	Type	Default	Description
`enabled`	boolean	`true` (when any field is set)	If `false`, the per-IP limit isn't enforced.
`max_requests`	integer	none	Maximum requests allowed per IP per time interval (minimum is 1). Required when this rate limit is enabled
`window_seconds`	integer	`60`	Time interval in seconds. Must be `30` or `60`

For example, to update an agent’s rate limit settings:

Command line

curl -X PATCH "https://$ALGOLIA_APPLICATION_ID.algolia.net/agent-studio/1/agents/$AGENT_ID" \
  -H 'Content-Type: application/json' \
  -H "x-algolia-application-id: $ALGOLIA_APPLICATION_ID" \
  -H "x-algolia-api-key: $ALGOLIA_API_KEY" \
  -d '{ "config": { "rate_limit": { "agent": { "max_requests": 50, "window_seconds": 60 } } } }'

Default behavior and when limits don’t apply

You can configure the agent and IP rate limits independently:

If you omit rate_limit, the API doesn’t enforce agent or IP request limits.
To turn off either the agent or IP rate limit, set enabled: false for that limit
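A sketch of how these rules combine for one limit (`rate_limit.agent` or `rate_limit.ip`). Treating an empty object the same as an omitted limit is an assumption. `normalizeRateLimit` is a hypothetical helper.

```javascript
const ALLOWED_WINDOWS = [30, 60];

// Returns null when the limit isn't enforced, otherwise the effective
// settings with defaults applied. Throws on invalid values.
function normalizeRateLimit(limit) {
  // Omitted (or empty, by assumption): no limit.
  if (limit == null || Object.keys(limit).length === 0) return null;
  const enabled = limit.enabled ?? true; // defaults to true once any field is set
  if (!enabled) return null;
  const windowSeconds = limit.window_seconds ?? 60;
  if (!ALLOWED_WINDOWS.includes(windowSeconds)) {
    throw new RangeError('window_seconds must be 30 or 60');
  }
  if (!Number.isInteger(limit.max_requests) || limit.max_requests < 1) {
    throw new RangeError('max_requests is required and must be at least 1');
  }
  return { max_requests: limit.max_requests, window_seconds: windowSeconds };
}
```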

429 response

When a limit is exceeded, the API returns:

JSON

{
  "error": "TOO_MANY_REQUESTS",
  "message": "Rate limit exceeded. Retry after 60 seconds."
}

The 429 response includes these headers:

Header	Description
`X-RateLimit-Limit`	Maximum requests allowed in the current time interval
`X-RateLimit-Remaining`	Remaining requests in the current time interval
`Retry-After`	Seconds until the current time interval resets

On successful responses, X-RateLimit-Limit and X-RateLimit-Remaining reflect the configured per-agent limit.
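On the client, you can use `Retry-After` to decide how long to wait before retrying. The sketch below assumes the header holds a whole number of seconds, as the table describes, and falls back to 60 seconds (the longest allowed window) when the header is missing. Both the fallback and the name `retryDelayMs` are illustrative choices.

```javascript
// Returns how long to wait before retrying, in milliseconds, or null
// when the response isn't a 429. `headers` is anything with a
// get(name) method, such as the Headers object on a fetch Response.
function retryDelayMs(status, headers, fallbackSeconds = 60) {
  if (status !== 429) return null;
  const seconds = Number.parseInt(headers.get('Retry-After') ?? '', 10);
  const usable = Number.isFinite(seconds) && seconds >= 0;
  return (usable ? seconds : fallbackSeconds) * 1000;
}

// Usage:
// const res = await fetch(url, init);
// const delay = retryDelayMs(res.status, res.headers);
// if (delay !== null) await new Promise((resolve) => setTimeout(resolve, delay));
```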

Publish workflow

Agents have two states:

Draft: test changes in preview.
Published: live for API consumers.

Command line

curl -X POST "https://$ALGOLIA_APPLICATION_ID.algolia.net/agent-studio/1/agents/$AGENT_ID/publish" \
  -H "x-algolia-application-id: $ALGOLIA_APPLICATION_ID" \
  -H "x-algolia-api-key: $ALGOLIA_API_KEY"

When you make changes to an agent using the PATCH /agents/{agentId} endpoint, you’re modifying the draft version of the agent. These changes aren’t visible to API consumers until you publish the agent using the POST /agents/{agentId}/publish endpoint.
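The two steps can be sketched as a pair of request descriptors that you run in order: first the PATCH that updates the draft, then the POST that publishes it. `updateAndPublishRequests` is a hypothetical helper. The URLs and headers mirror the curl examples on this page.

```javascript
// Returns the two calls, in order, needed to change a live agent.
function updateAndPublishRequests(appId, apiKey, agentId, patch) {
  const base = `https://${appId}.algolia.net/agent-studio/1/agents/${encodeURIComponent(agentId)}`;
  const auth = {
    'x-algolia-application-id': appId,
    'x-algolia-api-key': apiKey,
  };
  return [
    // 1. Update the draft.
    {
      url: base,
      method: 'PATCH',
      headers: { ...auth, 'Content-Type': 'application/json' },
      body: JSON.stringify(patch),
    },
    // 2. Publish the draft so API consumers see the change.
    { url: `${base}/publish`, method: 'POST', headers: auth },
  ];
}

// Usage: run the calls in order and stop at the first failure.
// for (const { url, ...init } of updateAndPublishRequests(appId, apiKey, agentId, patch)) {
//   const res = await fetch(url, init);
//   if (!res.ok) throw new Error(`${init.method} ${url} failed: ${res.status}`);
// }
```

Stopping at the first failure avoids publishing a draft whose update was rejected.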
