Algolia’s ranking strategy
Ranking is the second half of the relevance equation. Once the engine has found matching records, it needs to rank them, putting the best matches at the top.
The ranking criteria
Algolia uses a tie-breaking algorithm, which reuses many of the same textual matching criteria to determine the best matches. A good example is typos and spelling differences. When searching, you might accidentally swap letters, omit or add characters, or use a different spelling from the one in the index. For example, you could type “iphoen” instead of “iphone”, or use the British spelling “theatre” while the index you search uses “theater”. Thanks to typo tolerance, the engine returns both exact matches and records that match with typos. Algolia then reuses typo tolerance to rank records: all exact matches rank higher than records with typos. The tie-breaking algorithm favors the best matches. If your index contained both “theater” and “theatre”, a search for “theatre” would return records for both, with the ones containing “theatre” first.
The ranking criteria are, in order:
- Number of typos
- Geolocation
- Number of words in the query matching the result
- Filters
- Distance between words
- Best matching attribute in the record
- Number of words matching exactly (without typos)
- Custom ranking
You can change the order of the default criteria, but you shouldn’t.
The out-of-the-box ranking order works for most uses.
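For reference, the criteria listed above can be written out as an ordered setting. The sketch below pairs each criterion with a short identifier and serializes the order as JSON. It assumes the index setting is named ranking and that the short names (typo, geo, words, and so on) are the identifiers for the default criteria; treat both as assumptions to check against the API reference. The payload only restates the default order and doesn't need to be sent anywhere.

```python
import json

# Default criteria in tie-breaking order, paired with the descriptions used above.
# The short identifiers are assumed names for the `ranking` index setting.
DEFAULT_RANKING = [
    ("typo", "Number of typos"),
    ("geo", "Geolocation"),
    ("words", "Number of words in the query matching the result"),
    ("filters", "Filters"),
    ("proximity", "Distance between words"),
    ("attribute", "Best matching attribute in the record"),
    ("exact", "Number of words matching exactly (without typos)"),
    ("custom", "Custom ranking"),
]

# Settings payload equivalent to leaving the default order untouched.
settings = {"ranking": [name for name, _ in DEFAULT_RANKING]}
settings_json = json.dumps(settings)
print(settings_json)
```

Keeping the order in one list makes it easy to see that custom ranking is consulted last, after the seven textual and contextual criteria.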
How tie-breaking works
Most search engines use a coefficient-based approach and rank results by a single float score, which is hard, if not impossible, to decipher. Algolia uses a tie-breaking algorithm instead:
- It orders all matching records according to the first criterion (number of typos, so exact matches rank first).
- For all tied records, it orders them according to the second criterion (geolocation).
- If there are still tied records (with the same geolocation), it orders them according to the third criterion (number of matching query words), and so on until each record in the search results has a distinct ranking position.
- If, after going through the first seven criteria, there are still tied records, Algolia uses each of your custom ranking attributes to break the tie.
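The steps above amount to a lexicographic sort: a later criterion is consulted only when every earlier one is tied. The sketch below illustrates the idea with a tuple sort key. It is a simplified model, not Algolia's implementation: the Match class, its field names, and the reduced set of four criteria are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Match:
    """Hypothetical per-record match data; field names are illustrative only."""
    name: str
    typos: int           # fewer is better
    geo_distance: int    # distance from the user; smaller is better
    matched_words: int   # more is better
    popularity: int      # custom ranking attribute; higher is better


def tie_break_key(m: Match) -> Tuple[int, int, int, int]:
    # Tuples compare element by element, so a later element is only
    # consulted when all earlier ones are equal: that is tie-breaking.
    # "Higher is better" values are negated so an ascending sort puts them first.
    return (m.typos, m.geo_distance, -m.matched_words, -m.popularity)


def rank(matches: List[Match]) -> List[Match]:
    return sorted(matches, key=tie_break_key)


matches = [
    Match("A", typos=1, geo_distance=0, matched_words=2, popularity=900),
    Match("B", typos=0, geo_distance=0, matched_words=2, popularity=10),
    Match("C", typos=0, geo_distance=0, matched_words=2, popularity=50),
    Match("D", typos=0, geo_distance=0, matched_words=1, popularity=999),
]
ranked_names = [m.name for m in rank(matches)]
print(ranked_names)  # ['C', 'B', 'D', 'A']
```

Record A has the highest popularity but ranks last because it has a typo. Popularity only separates B and C, which tie on every earlier criterion.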
Custom ranking
Finding matching records using typos, geolocation, filters, and so on is only part of what makes a compelling search experience. Algolia’s default ranking formula handles this kind of record-matching relevance well. However, ranking on these properties alone misses the value of ordering results by custom metrics stored in each record. Custom ranking gives you direct control and is often the deciding factor in which records appear in the first set of results. Here are some examples that apply popularity to the ranking formula:
- For a movie database app, if users type “spielberg films”, the custom ranking puts Spielberg’s most popular films at the top of the results.
- For a retail store, if users type “t-shirt”, the most popular t-shirts appear at the top.
- For a blog website, if users type “positive thinking”, the most popular articles on that subject appear at the top.
Related guides and blog posts:
- Create custom ranking attributes
- Boost or penalize records
- Use business performance data to improve search (blog)
- Use custom ranking to reflect popularity (blog)
- Use custom ranking to personalize a Google-like search (blog)
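For example, you can set custom ranking on the popularity attribute (descending), so that the most popular records come first among otherwise tied results.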
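A custom ranking on popularity in descending order can be written as a small settings payload. The sketch below builds it in Python and prints the JSON. It assumes the customRanking index setting and its desc(...) modifier syntax. Applying the payload to an index is left out.

```python
import json

# Custom ranking on the `popularity` attribute, highest values first.
# Assumes the `customRanking` setting with the desc(...) / asc(...) modifiers.
settings = {"customRanking": ["desc(popularity)"]}

settings_json = json.dumps(settings, indent=2)
print(settings_json)
```

Listing more than one attribute in the array would make each one a further tie-breaker, applied in the order given.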
Custom ranking precision
Tie-breaking only works if there are tied records to begin with. If a particular custom ranking metric is too precise, the next custom metric might never come into play. That’s why reducing precision is critical to effective tie-breaking. Consider a custom ranking configuration for movies that sets both rating and views (in that order) as custom ranking attributes. If the rating across records is unnecessarily precise, say “4.321321”, views may never be used to break the tie. However, the search results may not be more relevant: a movie with a slightly lower rating but many more views would rank lower, while you would expect it to rank higher.
To fix the situation, create another attribute (say truncated_rating) with values rounded to one decimal place. In this example, “4.321321” becomes “4.3”. Reducing the precision of the rating attribute makes it more likely that several records share the same truncated_rating, allowing them to tie-break on the views attribute.
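The effect of reducing precision can be checked with a small simulation. The sketch below sorts a few made-up movie records, first by the precise rating and then by a rating rounded to one decimal place, with views as the second custom ranking attribute in both cases. The records, titles, and helper names are hypothetical, and the sort stands in for the custom ranking step only.

```python
# Hypothetical movie records; the values are made up for illustration.
movies = [
    {"title": "Movie A", "rating": 4.321321, "views": 1_200},
    {"title": "Movie B", "rating": 4.318754, "views": 95_000},
    {"title": "Movie C", "rating": 3.9, "views": 400_000},
]


def add_truncated_rating(record):
    # Round the rating to one decimal place so near-equal ratings tie.
    return {**record, "truncated_rating": round(record["rating"], 1)}


def custom_rank(records, rating_field):
    # Descending rating first, then descending views as the tie-breaker.
    return sorted(records, key=lambda r: (-r[rating_field], -r["views"]))


precise_order = [m["title"] for m in custom_rank(movies, "rating")]

records = [add_truncated_rating(m) for m in movies]
truncated_order = [m["title"] for m in custom_rank(records, "truncated_rating")]

print(precise_order)    # ['Movie A', 'Movie B', 'Movie C']
print(truncated_order)  # ['Movie B', 'Movie A', 'Movie C']
```

With the precise rating, views never matter because no two ratings are equal. After rounding, Movie A and Movie B tie at 4.3, and the record with many more views moves ahead.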