You can’t store images directly in Algolia.
Instead, store the image on a content delivery network (CDN) or web server and add the image URL to a field in your records.
When you retrieve a record from Algolia, use this URL to display the image in your app.
Why use image classification?
Retailers spend a lot of time building their catalogs. To offer a relevant search and discovery experience, they often manually classify each item, adding metadata such as item type, material, and style. Removing some of this manual work lets you focus on the data that matters most to your business, such as price, stock quantity, and popularity.
Visual recognition extracts this information automatically by analyzing each product image, and it makes feature tagging more consistent. For example, you may have various names for the color “blue” within your product descriptions. Item descriptions could include “cerulean” or “sapphire,” but not “blue.” Without an attribute that consistently has the value “blue,” you could fail to surface all relevant products when users filter on or search for “blue” items. Image classification lets you add the “blue” tag consistently to all blue products.
Image classification is particularly valuable in C2C marketplaces, where users may not describe their products consistently or fully. Tags from image classification can increase the number of product attributes, which makes products easier to discover. It’s equally valuable anywhere your team manually tags features like “type,” “neckline,” and “sleeve length.”
What do image classification and tagging entail?
This guide outlines how to use a third-party API or platform to classify images and enrich your Algolia records with these classifications. It provides examples for Google Cloud Vision API and ViSenze, but the process is the same for other providers like Amazon Rekognition.
The goal is to enrich your records so that each one includes additional descriptive text. This text comes from running the product image through an image classifier, which returns classifications, also known as “tags” or “labels.” Adding these classifications to your Algolia records makes the records easier to surface in searches, whether users search with text or images.
Enriching your records with classifications is a two-step process:
- Image classification - sending image URLs to a third-party image recognition platform to retrieve classifications.
- Indexing - adding the relevant classification information to your Algolia records.
Platform considerations
Google Cloud Vision API is an all-purpose image recognition API. Because it draws on a large corpus of image data, it can return a wide variety of classifications with high accuracy. The downside is that these classifications aren’t highly specialized or structured, and all-purpose platforms can introduce irrelevant ones. Google Cloud Vision API returns tags and confidence scores for all objects that it identifies in an image. An image of a model wearing a t-shirt could return relevant classifications, like “t-shirt” and the color and style of the shirt, but it could also return classifications like “neck” and “arm” if these are present in the image.
If a specialized platform exists for your use case, for example ViSenze for fashion retail, prefer it over a general one. Use-case-specific platforms usually produce better classifications because they tailor them to industry-relevant terms and structure them consistently. For example, ViSenze would take an image of a model wearing a t-shirt and identify only fashion-related objects, excluding objects like “neck” and “arm.” For each included item, such as “t-shirt,” it returns relevant attributes like “neckline,” “fit,” and “sleeve length,” and their values: “v-neck,” “trim,” and “short,” respectively. You can be sure that all shirt images return these same attributes in the same structure.
Before you begin
This tutorial requires a set of Algolia records, each containing an image URL, and access to an image recognition platform such as Google Cloud Vision API. Algolia doesn’t search your original data source, only the data you index to Algolia. Algolia accepts and stores JSON data, meaning it doesn’t store image files. Instead, it’s common to index an image URL so that you can display the image in your results.
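As an illustration of such a record, here is a minimal sketch. The attribute names (name, price, imageURL), the objectID value, and the URL are hypothetical; only the idea of storing the image location as a string comes from this guide.

```javascript
// Hypothetical product record: attribute names and values are illustrative only.
const record = {
  objectID: "shirt-001",
  name: "V-neck t-shirt",
  price: 19.99,
  // Algolia stores this URL as a plain string; the image file stays on your CDN or web server.
  imageURL: "https://cdn.example.com/images/shirt-001.jpg",
};

// The record is plain JSON-serializable data, which is what Algolia accepts and stores.
const payload = JSON.stringify(record);
```

Whatever attribute name you choose for the URL, use it consistently, since the classification step reads the image location from it.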
Image classification
Image classification takes an image and returns a set of classifications or labels for it. Thanks to advances in AI, image classification is getting better and easier for non-experts to use. When using the Google Cloud Vision API, ViSenze, or other similar platforms, it can be as straightforward as feeding the platform an image URL and receiving the classifications in its response.
Using Google Vision API
If you haven’t already, create a Google Account and enable the Google Vision API for it. Set up authentication so that you can retrieve credentials and use the Vision API client library.
The Google Vision API returns an array of classifications: JSON objects with different properties. Of these, description and score (how certain the API is about the description) are particularly useful.
After initializing an instance of Google Cloud Vision’s Node.js client, you can write a function to retrieve labels from an image URL. For example, a getImageLabels function can take a public image URL, the Algolia record’s objectID, and a scoreLimit. The scoreLimit is the threshold for how certain the platform must be about an object to include it in the classifications.
Since score is a number between 0 and 1, the scoreLimit should be between 0 and 1 too. The higher the scoreLimit, the more certain the API must be about a label for it to be included.
You can write a function to retrieve just these or any other attributes you find useful. In this example, getImageLabels returns an object with a labels array. The array contains only label descriptions and scores, and only for labels whose score is higher than the scoreLimit. The returned object also includes the original imageURL and objectID. The objectID is important for sending this data to your Algolia index later.
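The function described above could be sketched as follows. This is not the guide’s original code: it assumes the @google-cloud/vision Node.js client, and it takes the client as an extra first argument (an assumption made here so the sketch runs without credentials). The filterLabels helper name is hypothetical.

```javascript
// Keep only labels scoring above scoreLimit, and keep only
// the description and score properties of each label.
function filterLabels(labelAnnotations, scoreLimit) {
  return (labelAnnotations || [])
    .filter((label) => label.score > scoreLimit)
    .map(({ description, score }) => ({ description, score }));
}

// `client` is assumed to be an ImageAnnotatorClient instance, for example:
//   const vision = require("@google-cloud/vision");
//   const client = new vision.ImageAnnotatorClient();
async function getImageLabels(client, imageURL, objectID, scoreLimit) {
  // labelDetection resolves to an array whose first element is the response.
  const [result] = await client.labelDetection({
    image: { source: { imageUri: imageURL } },
  });

  return {
    imageURL,
    objectID, // needed later to update the matching Algolia record
    labels: filterLabels(result.labelAnnotations, scoreLimit),
  };
}
```

A call such as getImageLabels(client, record.imageURL, record.objectID, 0.5) would then resolve to an object holding the URL, the objectID, and the labels that scored above 0.5.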
When fetching images from HTTP(S) URLs, Google can’t guarantee that the request succeeds. Your request may fail if the specified host denies the request (for example, due to request throttling or denial-of-service prevention), or if Google throttles requests to the site for abuse prevention. Google advises against depending on externally hosted images for production applications.
Using ViSenze
When using a use-case-specific platform like ViSenze, the general idea is the same. Set up an account and credentials, and send public image URLs to their Recognition API to receive classifications. You need to tailor your function to the data structure the platform returns.
For example, a getImageLabels function for ViSenze can take a public image URL, the Algolia record’s objectID, and a scoreLimit. The scoreLimit is the threshold for how certain the platform must be about an object to include it in the classifications.
Since score is a number between 0 and 1, the scoreLimit should be between 0 and 1 too. The higher the scoreLimit, the more certain the API must be about a label for it to be included.
The function returns an object with an objects array. The objects array contains all relevant identified objects (for example, “t-shirt” or “belt”) and their coordinates, labels, and scores.
The returned object also includes the original imageURL and objectID. The objectID is important for sending this data to your Algolia index later.
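A rough sketch of such a function follows. The response shape used here (type, box, and tags with tag and score) is an assumption for illustration, not ViSenze’s documented schema, so adapt the property names to what the Recognition API actually returns. The recognize argument and the filterObjects helper are hypothetical as well.

```javascript
// ASSUMPTION: each recognized object looks like
//   { type: "t-shirt", box: [x1, y1, x2, y2], tags: [{ tag: "neckline:v-neck", score: 0.93 }] }
// This shape is hypothetical; adapt the property names to the real response.
function filterObjects(recognizedObjects, scoreLimit) {
  return (recognizedObjects || [])
    .map((object) => ({
      type: object.type,
      coordinates: object.box,
      labels: (object.tags || [])
        .filter((tag) => tag.score > scoreLimit)
        .map(({ tag, score }) => ({ description: tag, score })),
    }))
    // Drop objects that have no labels left after filtering.
    .filter((object) => object.labels.length > 0);
}

// `recognize` is a hypothetical async function you supply: it sends the image
// URL to the recognition API and resolves to the array of recognized objects.
async function getImageLabels(recognize, imageURL, objectID, scoreLimit) {
  const recognizedObjects = await recognize(imageURL);
  return {
    imageURL,
    objectID, // needed later to update the matching Algolia record
    objects: filterObjects(recognizedObjects, scoreLimit),
  };
}
```

Keeping the HTTP call behind recognize means only that one function needs to change if the platform’s request format or authentication differs from what you expect.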
Indexing image classifications
Once you’ve retrieved the classifications from your third-party image recognition platform, you need to index them to Algolia. You can include classifications either when you initially index your data, or later with the browse method, which lets you retrieve your existing records and update them as needed.
Using Google Vision API
This example uses the getImageLabels function from the classification section to retrieve labels for each record while using browse. It then uses the partialUpdateObjects method to add the labels to the record.
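The browse-then-update flow might look like the sketch below. It assumes version 4 of the Algolia JavaScript API client, where an index exposes browseObjects and partialUpdateObjects; other client versions use different method signatures. The names addLabelsToIndex, toPartialUpdate, classify, and the imageURL attribute are hypothetical.

```javascript
// Build a partial update that touches only the labels attribute.
function toPartialUpdate({ objectID, labels }) {
  return { objectID, labels };
}

// `index` is assumed to be an Algolia JavaScript API client v4 index, for example:
//   const index = algoliasearch(appId, apiKey).initIndex(indexName);
// `classify` is assumed to be (imageURL, objectID) => Promise<{ objectID, labels }>,
// for example a wrapper around getImageLabels with a fixed scoreLimit.
async function addLabelsToIndex(index, classify, imageAttribute = "imageURL") {
  const records = [];

  // browseObjects calls `batch` once for every page of hits it retrieves.
  await index.browseObjects({
    batch: (hits) => records.push(...hits),
  });

  const updates = [];
  for (const record of records) {
    if (!record[imageAttribute]) continue; // skip records without an image URL
    const result = await classify(record[imageAttribute], record.objectID);
    updates.push(toPartialUpdate(result));
  }

  // partialUpdateObjects updates only the attributes present in each object.
  await index.partialUpdateObjects(updates);
  return updates;
}
```

This version collects every record in memory before classifying, which keeps the sketch short. For a large index, you may prefer to classify and update one batch at a time.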
Each updated record then has a labels attribute.
To make the labels searchable, add labels.description to your searchableAttributes.
To implement search by image, or if you want to filter on labels, you must include labels.description in attributesForFaceting.
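One way to apply both settings without overwriting what is already configured is sketched below. The withLabelSettings helper is hypothetical; getSettings and setSettings are assumed to be those of the Algolia JavaScript API client v4.

```javascript
// Return the settings to send to Algolia so that `attribute` is both
// searchable and available for faceting, without duplicating entries.
function withLabelSettings(settings, attribute = "labels.description") {
  const add = (list) => (list.includes(attribute) ? list : [...list, attribute]);

  const next = {
    attributesForFaceting: add(settings.attributesForFaceting || []),
  };

  // When searchableAttributes is unset, Algolia searches all attributes,
  // so only extend the list if one has already been configured.
  if (Array.isArray(settings.searchableAttributes)) {
    next.searchableAttributes = add(settings.searchableAttributes);
  }

  return next;
}

// Assumed usage with a v4 index:
//   const current = await index.getSettings();
//   await index.setSettings(withLabelSettings(current));
```

Appending to the existing lists rather than replacing them avoids accidentally narrowing search to the label attribute alone.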
Using ViSenze
This example uses the getImageLabels function from the classification section to retrieve labels for each record while using browse. It then uses the partialUpdateObjects method to add the labels to the record, and updates the index settings to include each object’s labels in attributesForFaceting and searchableAttributes.