Precision of custom ranking metrics

When you rank records with business metrics, you want to make sure that differences between their values are significant. Sometimes the difference is too small to matter. Yet when the engine breaks a tie between two records, the size of the gap is irrelevant: an article with 1,003 views goes ahead of one with 1,000 simply because it has more. Therefore, make sure the values you use for custom ranking are coarse enough that insignificant differences don’t decide the order.

To fix this issue, reduce the precision of the values. For example, suppose you have blog articles ranked by several attributes, such as unique page views and number of comments. Two posts may have nearly the same number of unique page views (1,000 and 1,003). In that case, you don’t want the engine to break the tie on this attribute. Instead, you want it to move on to the next criterion for better tie-breaking and more relevant results.

To run the code examples on this page, install the latest API client.

Modifying the data: an example

Before

In this example, you’re adding search to a blog. In addition to Algolia’s default ranking formula, you’ve set two custom ranking attributes: pageviews first, followed by comments. Here’s what the dataset looks like:

JSON

[
  {
    "title": "Algolia + SeaUrchin.IO",
    "author": "Nicolas Dessaigne",
    "pageviews": 1023,
    "comments": 23
  },
  {
    "title": "Search Analytics: Gain Insights from User Search Data",
    "author": "Nicolas Baissas",
    "pageviews": 508,
    "comments": 42
  },
  {
    "title": "Algolia Vault – Bringing Physical & Digital Data Security to Search",
    "author": "Liam Boogar",
    "pageviews": 1022,
    "comments": 54
  }
]

And here’s the setting:

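// Assumes `client` is an already initialized Algolia search client.
// Rank by page views first, then by comments, both in descending order.
var response = await client.SetSettingsAsync(
  "INDEX_NAME",
  new IndexSettings
  {
    CustomRanking = new List<string> { "desc(pageviews)", "desc(comments)" },
  }
);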

Because custom ranking applies its criteria sequentially, pageviews has more weight than comments. When you search for “Algolia”, the “Algolia + SeaUrchin.IO” article comes before “Algolia Vault” only because it has one more unique page view. This additional view isn’t meaningful and doesn’t make the article more relevant, yet because of it the engine never reaches the next custom ranking attribute.

For these two records, it would be better if the article with the most comments appeared first: “Algolia Vault” has significantly more comments (54 versus 23). But changing the order of the attributes in the customRanking parameter isn’t a good solution:

It doesn’t necessarily make sense business-wise: in this example, unique page views matter more than the number of comments.
You may face the same issue with the number of comments, so reordering doesn’t solve anything.

That’s why you should reduce precision to effectively use tie-breaking.
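The sequential behavior described above can be sketched locally with LINQ. This is only an illustration under simplifying assumptions: it models the two custom ranking criteria and nothing else (no textual matching, no default ranking formula), and it doesn’t call the Algolia API. The tuple field names are made up for the example.

```csharp
using System;
using System.Linq;

// Local stand-in for the three records above (not an Algolia API call).
var articles = new[]
{
    (Title: "Algolia + SeaUrchin.IO", Pageviews: 1023, Comments: 23),
    (Title: "Search Analytics: Gain Insights from User Search Data", Pageviews: 508, Comments: 42),
    (Title: "Algolia Vault – Bringing Physical & Digital Data Security to Search", Pageviews: 1022, Comments: 54),
};

// Sequential criteria: comments are compared only when pageviews are exactly equal.
var ranked = articles
    .OrderByDescending(a => a.Pageviews)
    .ThenByDescending(a => a.Comments)
    .Select(a => a.Title)
    .ToList();

foreach (var title in ranked)
{
    Console.WriteLine(title);
}
```

In this sketch, “Algolia + SeaUrchin.IO” is listed ahead of “Algolia Vault” because 1023 is greater than 1022, so the comments criterion is never consulted for that pair.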

After

You can solve the issue by adding a rounded_pageviews attribute where you reduce the precision of pageviews, and use it for custom ranking instead. For example, round page views to the nearest hundred.

JSON

[
  {
    "title": "Algolia + SeaUrchin.IO",
    "author": "Nicolas Dessaigne",
    "pageviews": 1023,
    "rounded_pageviews": 1000,
    "comments": 23
  },
  {
    "title": "Search Analytics: Gain Insights from User Search Data",
    "author": "Nicolas Baissas",
    "pageviews": 508,
    "rounded_pageviews": 500,
    "comments": 42
  },
  {
    "title": "Algolia Vault – Bringing Physical & Digital Data Security to Search",
    "author": "Liam Boogar",
    "pageviews": 1022,
    "rounded_pageviews": 1000,
    "comments": 54
  }
]
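The rounded values above could be computed before indexing. The following is a minimal sketch, assuming non-negative integer counts. `RoundToNearest` is a hypothetical helper and not part of any Algolia client. The sort at the end only imitates the two custom ranking criteria locally to show the effect of rounding.

```csharp
using System;
using System.Linq;

// Hypothetical helper: rounds a non-negative count to the nearest multiple of `step`
// using integer arithmetic (halves round up).
int RoundToNearest(int value, int step) => (value + step / 2) / step * step;

// Local stand-in for the records (not an Algolia API call).
var articles = new[]
{
    (Title: "Algolia + SeaUrchin.IO", Pageviews: 1023, Comments: 23),
    (Title: "Search Analytics: Gain Insights from User Search Data", Pageviews: 508, Comments: 42),
    (Title: "Algolia Vault – Bringing Physical & Digital Data Security to Search", Pageviews: 1022, Comments: 54),
};

// Add the reduced-precision value, then apply the criteria sequentially:
// rounded page views first, comments only when the rounded values tie.
var ranked = articles
    .Select(a => (
        Title: a.Title,
        Pageviews: a.Pageviews,
        RoundedPageviews: RoundToNearest(a.Pageviews, 100),
        Comments: a.Comments))
    .OrderByDescending(a => a.RoundedPageviews)
    .ThenByDescending(a => a.Comments)
    .ToList();

foreach (var a in ranked)
{
    Console.WriteLine($"{a.Title}: rounded_pageviews={a.RoundedPageviews}, comments={a.Comments}");
}
```

With a step of 100, both 1023 and 1022 become 1000, so the two articles tie and the number of comments decides their order. The step size is a judgment call: a larger step makes ties more frequent and hands more decisions to the next criterion.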

Then, use the rounded_pageviews attribute for custom ranking instead of pageviews. With the following setting, searching for “Algolia” returns the “Algolia Vault” article first: the two matching articles tie on rounded_pageviews, so the engine moves on to the comments attribute.

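// Assumes `client` is an already initialized Algolia search client.
// Rank by the reduced-precision page views first, then by comments.
var response = await client.SetSettingsAsync(
  "INDEX_NAME",
  new IndexSettings
  {
    CustomRanking = new List<string> { "desc(rounded_pageviews)", "desc(comments)" },
  }
);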

Note that this doesn’t remove pageviews from the dataset. You may want to keep it for display purposes. To learn more about ranking by custom attributes, see the guide.
