To ensure good performance, Algolia limits the size of each record. Long content, like a detailed Wikipedia page, might be too big to fit into one of these records. To handle this, divide long pages into smaller “chunks”. This not only helps you stay within the size limit but also makes your search more relevant. Break the page into sections or even paragraphs, and store each as a separate record. When splitting into chunks, organize them based on the page structure. For instance, if you’re dealing with a lengthy Wikipedia article, create separate records for each section like “Introduction” or “History”.
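As a minimal sketch of this kind of chunking, the following Python function turns a page into one record per paragraph and tags each record with the section it came from. The function name, the record fields other than section, and the size limit value are hypothetical and only illustrate the idea; check your plan for the actual record size limit.

```python
import json
from typing import Dict, List

# Hypothetical limit for illustration; the real limit depends on your Algolia plan.
MAX_RECORD_BYTES = 10_000


def split_page_into_records(page_id: str, title: str, sections: Dict[str, str]) -> List[dict]:
    """Create one record per paragraph, tagged with the section it belongs to."""
    records = []
    for section, text in sections.items():
        # Paragraphs are assumed to be separated by blank lines.
        paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
        for position, paragraph in enumerate(paragraphs):
            records.append({
                "objectID": f"{page_id}-{section}-{position}",
                "title": title,
                "section": section,
                "content": paragraph,
            })
    return records


def oversized(records: List[dict]) -> List[dict]:
    """Return the records whose JSON encoding exceeds the assumed size limit."""
    return [r for r in records if len(json.dumps(r).encode("utf-8")) > MAX_RECORD_BYTES]


page_sections = {
    "Introduction": "First paragraph.\n\nSecond paragraph.",
    "History": "A single paragraph.",
}
page_records = split_page_into_records("wiki-1", "Example article", page_sections)
```

Keeping the section name on every record is what later makes it possible to group or deduplicate results by section.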
If you’re using the Algolia Crawler and the record size exceeds the limit, use the helpers.splitContentIntoRecords() helper to split the page into smaller chunks.
Avoid duplicates
When you split a page, the same content might appear in multiple records. By setting the distinct parameter to true,
Algolia ensures only the most relevant of these duplicate records is shown.
You decide what counts as ‘distinct’ by choosing a meaningful attribute,
like the title of a section.
Example
In the following example, you’ve structured your records for a long page. To make sure that search results show only one entry per section, you:
- Set distinct to true
- Choose section as your attributeForDistinct.
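To make the effect concrete, here is a rough Python illustration. The records and their field values are hypothetical, and deduplicate is a local stand-in written for this sketch, not an Algolia function: it assumes the hits are already ranked best first and that every record carries the section attribute, and it keeps only the first hit for each section value. In practice the Algolia engine does this for you once the two settings are in place.

```python
# Hypothetical records for one long page, assumed to be ranked best first.
ranked_hits = [
    {"objectID": "wiki-1-History-0", "section": "History", "content": "Founding years."},
    {"objectID": "wiki-1-History-1", "section": "History", "content": "Later growth."},
    {"objectID": "wiki-1-Introduction-0", "section": "Introduction", "content": "Overview."},
]

# The two settings described above.
index_settings = {"attributeForDistinct": "section", "distinct": True}


def deduplicate(hits, settings):
    """Local simulation: keep the best-ranked hit per distinct attribute value."""
    if not settings.get("distinct"):
        return list(hits)
    attribute = settings["attributeForDistinct"]
    seen = set()
    kept = []
    for hit in hits:
        value = hit[attribute]
        if value in seen:
            continue  # a better-ranked hit from this section was already kept
        seen.add(value)
        kept.append(hit)
    return kept


unique_hits = deduplicate(ranked_hits, index_settings)
```

With these sample records, the second History record is dropped because a better-ranked record from the same section is already in the results.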
How to enable the distinct feature
You can enable distinct from Algolia’s dashboard or API.
Using the dashboard
- Go to the Algolia dashboard and select your Algolia application.
- On the left sidebar, select Search.
- Select your Algolia index and go to the Configuration tab.
- In the Search behavior section, select Deduplication and Grouping.
- Set the Distinct option to true.
- In the Attribute for Distinct field, select your attribute.
- Save your changes.
Using the API
To run the code examples on this page, install the latest API client. If you use the API to enable distinct, you can do it either at indexing time (when you add records to your indices) or at query time (when users search).
- Set an attribute, such as section, as the attributeForDistinct
- Set distinct to true to deduplicate your results.
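The two steps above can be sketched in Python as follows. This sketch does not use an official API client: it only builds HTTP requests with the standard library, assuming Algolia's REST endpoints for index settings (PUT /1/indexes/{indexName}/settings) and search (POST /1/indexes/{indexName}/query). The application ID, API key, index name, and helper function names are placeholders invented for this example, so verify the endpoints and parameters against the API reference before relying on them.

```python
import json
import urllib.parse
import urllib.request

APP_ID = "YOUR_APP_ID"      # placeholder
API_KEY = "YOUR_API_KEY"    # placeholder
INDEX_NAME = "wiki_pages"   # hypothetical index name


def _build_request(method: str, path: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON request against the assumed REST endpoint."""
    return urllib.request.Request(
        url=f"https://{APP_ID}.algolia.net{path}",
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={
            "X-Algolia-Application-Id": APP_ID,
            "X-Algolia-API-Key": API_KEY,
            "Content-Type": "application/json",
        },
    )


def build_settings_request() -> urllib.request.Request:
    """Indexing time: store attributeForDistinct and distinct in the index settings."""
    index = urllib.parse.quote(INDEX_NAME, safe="")
    body = {"attributeForDistinct": "section", "distinct": True}
    return _build_request("PUT", f"/1/indexes/{index}/settings", body)


def build_search_request(query: str) -> urllib.request.Request:
    """Query time: turn on distinct for a single search."""
    index = urllib.parse.quote(INDEX_NAME, safe="")
    body = {"query": query, "distinct": True}
    return _build_request("POST", f"/1/indexes/{index}/query", body)


settings_request = build_settings_request()
search_request = build_search_request("history")
```

To actually send a request, pass it to urllib.request.urlopen with real credentials. The official API clients wrap these calls and are the more convenient route for production code.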