Skip to main content
This is a beta feature according to Algolia’s Terms of Service (“Beta Services”).
You can configure Agent Studio at two levels: app-wide and per agent. Both are updated from the API with changes taking effect immediately.

App settings

Configure app-wide behavior using the /configuration endpoint.

Data retention

Control how long Agent Studio retains your data:
Command line
curl -X PATCH "https://$ALGOLIA_APPLICATION_ID.algolia.net/agent-studio/1/configuration" \
  -H 'Content-Type: application/json' \
  -H "x-algolia-application-id: $ALGOLIA_APPLICATION_ID" \
  -H "x-algolia-api-key: $ALGOLIA_API_KEY" \
  -d '{ "maxRetentionDays": 30 }'
This operation requires an API key with the logs ACL.
ValueEffect
90 (default)Data retained for 90 days
60Data retained for 60 days
30Data retained for 30 days
0Privacy mode (see below)

Data affected by retention settings

DataBehavior
Completion cacheCached responses expire after the retention period
ConversationsConversation history deleted after the retention period
MessagesMessage content deleted after the retention period

Privacy mode (maxRetentionDays: 0)

When set to 0, Agent Studio operates in privacy mode:
  • Completion caching is turned off (every request calls the LLM)
  • Agent Studio saves conversation metadata but the message content isn’t stored.
  • Ideal for strict data privacy requirements
In privacy mode, the agent only sees the messages your client sends in each completion request. To preserve context across turns, include the full message history every time. Sending only the latest user message makes each request stateless.

Conversation history

Conversations are automatically stored per retention settings. Each conversation gets an auto-generated title based on content. What’s stored:
  • Conversation metadata (ID, timestamps, user token)
  • Message content (user queries, assistant responses, tool calls)
  • Auto-generated titles for browsing
For GDPR compliance, users can export or delete their data with the GET /user-data/{userToken} and DELETE /user-data/{userToken} endpoints. For more information, see the API reference.

Agent settings

Configure individual agents using the /agents/{agentId} endpoint.

Agent properties

PropertyTypeDescription
namestringDisplay name (1-128 chars)
descriptionstringOptional description
providerIdUUIDLLM provider credentials
modelstringModel identifier. For example, gpt-5, gemini-2.5-pro
instructionsstringSystem prompt
configobjectFeature flags and settings
toolsarrayAlgolia search and custom tools

Update agent settings

Update any property without affecting others:
Command line
curl -X PATCH "https://$ALGOLIA_APPLICATION_ID.algolia.net/agent-studio/1/agents/$AGENT_ID" \
  -H 'Content-Type: application/json' \
  -H "x-algolia-application-id: $ALGOLIA_APPLICATION_ID" \
  -H "x-algolia-api-key: $ALGOLIA_API_KEY" \
  -d '{ "instructions": "You are a helpful shopping assistant." }'
This operation requires an API key with the editSettings ACL.

Configuration options

The config object controls agent behavior:
OptionTypeDefaultDescription
sendUsagebooleanfalseInclude token usage in response
sendReasoningbooleanfalseInclude model reasoning (if supported)
useCachebooleantrueEnable response caching
featuresarray[]Experimental features
suggestionsobjectnullPrompt suggestions (see below)
max_tokensinteger0Cap on output tokens per LLM call (see Cost control)
max_iterationsinteger50Maximum tool or reasoning loops per request
thread_depthobjectnullConversation length limit
rate_limitobjectnullPer-agent and per-IP request limits (see Rate limiting)

Prompt suggestions

Generate contextual follow-up questions after each agent response. Suggestions help users discover capabilities and continue conversations naturally.
{
  "config": {
    "suggestions": {
      "enabled": true,
      "model": "gpt-5-mini"
    }
  }
}
When enabled, the agent streams a suggestions-chunk after the main response:
{
  "type": "suggestions-chunk",
  "suggestions": ["How do I filter by price?", "Show me trending products", "What categories are available?"]
}

Configuration options

OptionTypeDefaultDescription
enabledbooleanfalseEnable prompt suggestions
modelstringAgent’s modelModel for generating suggestions
system_promptstringBuilt-inCustom prompt for suggestion generation
Generation settings (suggestions.generation):
OptionRangeDefaultDescription
max_count1-53Number of suggestions
max_words5-158Max words per suggestion
timeout_seconds1-3010Timeout for generation
Context settings (suggestions.context):
OptionRangeDefaultDescription
max_messages1-5010Conversation history to include
include_tool_outputs-falseInclude tool results in context

Client-side handling

With AI SDK:
React
import { useChat } from '@ai-sdk/react';

function Chat() {
  const { messages, data } = useChat({ /* ... */ });

  // Suggestions arrive in the data stream
  const suggestions = data?.find(d => d.type === 'suggestions-chunk')?.suggestions;

  return (
    <>
      {/* Chat messages */}
      {suggestions && (
        <div className="suggestions">
          {suggestions.map(s => <button key={s}>{s}</button>)}
        </div>
      )}
    </>
  );
}
Use a faster, cheaper model (like gpt-5-mini) for suggestions. They don’t need the same reasoning depth as the main response.

Cost control

Cost control settings limit these sources of token usage: output per call, the number of reasoning or tool loops, and conversation history size.
JSON
{
  "config": {
    "max_tokens": 1500,
    "max_iterations": 20,
    "thread_depth": {
      "max_messages": 100
    }
  }
}
OptionTypeDefaultDescription
max_tokensinteger0Maximum output tokens per LLM call. 0 uses the model or provider default.
max_iterationsinteger50Maximum tool or reasoning loops per request. Each loop is a separate LLM call
thread_depth.max_messagesintegernullMaximum messages (user and assistant) in a conversation. The API rejects new requests when a conversation exceeds this limit.
For example, to update an agent’s cost control settings:
Command line
curl -X PATCH "https://$ALGOLIA_APPLICATION_ID.algolia.net/agent-studio/1/agents/$AGENT_ID" \
  -H 'Content-Type: application/json' \
  -H "x-algolia-application-id: $ALGOLIA_APPLICATION_ID" \
  -H "x-algolia-api-key: $ALGOLIA_API_KEY" \
  -d '{ "config": { "max_tokens": 1500, "max_iterations": 20, "thread_depth": { "max_messages": 100 } } }'

Default values and no-limit behavior

Each cost control setting has either a default value or no limit:
  • max_tokens: 0 (or omitted) uses the model or provider default.
  • max_iterations: 0 (or omitted) uses the default of 50.
  • thread_depth.max_messages: null, 0, or omitted: the conversation doesn’t have a message limit.
Each iteration is billed as a separate LLM call. Lower max_iterations if your agent doesn’t need long tool chains.

Rate limiting

Limit how often clients can call an agent’s /completions endpoint. You can configure two independent rate limits:
  • Per-agent: maximum requests an agent can receive within a time interval
  • Per-IP: maximum requests a client IP can make to an agent within a time interval
When a limit is exceeded, the API returns a 429 response.
JSON
{
  "config": {
    "rate_limit": {
      "agent": {
        "enabled": true,
        "max_requests": 100,
        "window_seconds": 60
      },
      "ip": {
        "enabled": true,
        "max_requests": 300,
        "window_seconds": 60
      }
    }
  }
}

rate_limit.agent

FieldTypeDefaultDescription
enabledbooleantrueIf you set any fields on this layer, enabled defaults to true. If you set enabled: false, there’s no request limit.
max_requestsintegernoneMaximum requests allowed per time interval (minimum is 1). Required when this rate limit is enabled
window_secondsinteger60Time interval in seconds. Must be 30 or 60

rate_limit.ip

FieldTypeDefaultDescription
enabledbooleantrue (when any field is set)If false, this layer is unlimited
max_requestsintegernoneMaximum requests allowed per IP per time interval (minimum is 1). Required when this rate limit is enabled
window_secondsinteger60Time interval in seconds. Must be 30 or 60
For example, to update an agent’s rate limit settings:
Command line
curl -X PATCH "https://$ALGOLIA_APPLICATION_ID.algolia.net/agent-studio/1/agents/$AGENT_ID" \
  -H 'Content-Type: application/json' \
  -H "x-algolia-application-id: $ALGOLIA_APPLICATION_ID" \
  -H "x-algolia-api-key: $ALGOLIA_API_KEY" \
  -d '{ "config": { "rate_limit": { "agent": { "max_requests": 50, "window_seconds": 60 } } } }'

Default behavior and when limits don’t apply

You can configure the agent and IP rate limits independently:
  • If you omit rate_limit, the API doesn’t enforce agent or IP request limits.
  • To turn off either the agent or IP rate limit, set enabled: false for that limit

429 response

When a limit is exceeded, the API returns:
JSON
{
  "error": "TOO_MANY_REQUESTS",
  "message": "Rate limit exceeded. Retry after 60 seconds."
}
HeaderDescription
X-RateLimit-LimitMaximum requests allowed in the current time interval
X-RateLimit-RemainingRemaining requests in the current time interval
Retry-AfterSeconds until the current time interval resets
On successful responses, X-RateLimit-Limit and X-RateLimit-Remaining reflect the configured per-agent limit.

Publish workflow

Agents have two states:
  • Draft: test changes in preview.
  • Published: live for API consumers.
Command line
curl -X POST "https://$ALGOLIA_APPLICATION_ID.algolia.net/agent-studio/1/agents/$AGENT_ID/publish" \
  -H "x-algolia-application-id: $ALGOLIA_APPLICATION_ID" \
  -H "x-algolia-api-key: $ALGOLIA_API_KEY"
When you make changes to an agent using the PATCH /agents/{agentId} endpoint, you’re modifying the draft version of the agent. These changes aren’t visible to API consumers until you publish the agent using the POST /agents/{agentId}/publish endpoint.

See also

Last modified on May 21, 2026