30 x 50 or 30mm x 50mm, while your records may use a different format like 30x50mm.
Algolia’s features support some formatting variations but may not handle all dimension formats reliably.
This is because:
- User query formats vary. A query like 30mm x 50mm won’t necessarily match a record with 30x50mm.
- Unit differences cause mismatches. Queries may include ", inches, mm, cm, or ft, while records might use only one format.
- Typo tolerance has limits. It can match slight variations such as 30by50 or 30 x 50, but not different units or separators.
- AI doesn’t consistently interpret dimensions. Although NeuralSearch can identify dimensions, it doesn’t do so consistently because the input is ambiguous.
Transform data into dimension-friendly formats
To address this issue, pre-process your data with a transformation function like the one below. For each record you pass to it, the transform function returns a transformed record, or undefined if no dimensions are found.
To run this function, create a Push to Algolia connector, using the following transformation code.
JavaScript
Customization
You can customize the function to support other measurement units or non-standard formats. To add new units (for example, yd, mil, µm, kg) or handle alternative patterns (for example, D30, H50, Ø20mm x 40mm),
update the following:
- DIMENSIONS_RE. Extend the regular expression to detect new unit symbols or structural patterns. Consider using AI-assisted tools to build and test regular expressions.
- normalizeUnit(raw, fallback). Map any new unit symbol or abbreviation to a standard form. For example, ‘yard’ and ‘yd’ become ‘yd’.
- dimensionKeywords(). The function defaults to mm if it doesn’t find a unit. To change this default (for example, to cm or in), update the second argument in the dimensionKeywords() call.
- unitForms(unit). Add alternative spellings and symbols for each unit. For example, ["yd", "yard", "yards"].
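One way to keep unit normalization and alternative spellings in sync is to derive both from a single table. The sketch below is hypothetical: UNIT_FORMS_SKETCH, canonicalUnit, and unitFormsSketch are illustrative names, not the helpers from the transformation code, and the table lists only a few units.

```javascript
// Hypothetical table (not the connector's code): canonical unit -> accepted spellings.
const UNIT_FORMS_SKETCH = {
  mm: ["mm"],
  in: ["in", '"', "inch", "inches"],
  yd: ["yd", "yard", "yards"], // newly added unit
};

// Reverse lookup built from the table: spelling -> canonical unit.
const CANONICAL_UNITS = new Map(
  Object.entries(UNIT_FORMS_SKETCH).flatMap(([unit, forms]) =>
    forms.map((form) => [form, unit])
  )
);

// Assumed behavior: unknown or missing units fall back to the given default.
function canonicalUnit(raw, fallback = "mm") {
  const key = String(raw ?? "").trim().toLowerCase();
  return CANONICAL_UNITS.get(key) ?? fallback;
}

// Returns every spelling to generate keywords for.
function unitFormsSketch(unit) {
  return UNIT_FORMS_SKETCH[unit] ?? [unit];
}
```

With this layout, adding a unit means adding one line to the table, and both lookups pick it up.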
How the transformation function works
The function improves search by extracting keyword variants from dimension patterns in the following steps:
- Identify dimensions with a regular expression
- Extract numbers and units
- Generate variants
- Attach keywords to a new attribute
Identify dimensions
The function uses the DIMENSIONS_RE regular expression to detect one-part, two-part, or three-part dimensions,
such as:
- One-part: 600mm, 2.4m
- Two-part: 30x50, 3"x6", 20mm x 30mm
- Three-part: 245x148x65mm, 30mm x 50mm x 2m
It supports separators (x, *, by) and units (including mm, ", inches, and ft),
with or without spaces.
The regular expression handles a wide range of edge cases,
but test it against your data to confirm it captures the formats you use.
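To make the detection step concrete, here is a much-simplified pattern in the same spirit. It is not DIMENSIONS_RE: SIMPLE_DIMENSIONS_RE and findDimensions are hypothetical names, the unit list is shortened, and the sketch assumes that a bare number with no unit or separator shouldn't count as a dimension. Expect false positives on text such as "30 in stock".

```javascript
// Simplified, hypothetical pattern for illustration only.
const NUM = String.raw`\d+(?:\.\d+)?`;
// A letter unit must not run into another letter, unless that letter is an
// "x" separator followed by the next number (as in 30mmx50mm).
const UNIT = String.raw`(?:"|(?:mm|cm|m|ft|inches|inch|in)(?:(?![a-z])|(?=x\s*\d)))`;
const PART = String.raw`${NUM}(?:\s*${UNIT})?`;
const SEP = String.raw`\s*(?:x|\*|by)\s*`;

// One part, optionally followed by up to two more parts.
const SIMPLE_DIMENSIONS_RE = new RegExp(`${PART}(?:${SEP}${PART}){0,2}`, "gi");

function findDimensions(text) {
  const matches = text.match(SIMPLE_DIMENSIONS_RE) ?? [];
  // Drop bare numbers: keep only matches that contain a unit or a separator.
  return matches.filter((match) => /[^\d.]/.test(match));
}

const found = findDimensions('Shelf 600mm wide, board 245x148x65mm, tile 3"x6"');
// found: ["600mm", "245x148x65mm", '3"x6"']
```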
Extract numbers and units
For each match, the function extracts the numbers and their associated units. The function standardizes unit variants like ", inch,
and inches to in.
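The extraction step could look roughly like the sketch below. The names normalizeUnitSketch and extractParts are hypothetical, and so is one rule the text doesn't spell out: a part without its own unit takes the nearest explicit unit to its right, and the fallback otherwise.

```javascript
// Hypothetical alias table: raw spelling -> standard unit.
const UNIT_ALIASES = new Map([
  ['"', "in"], ["in", "in"], ["inch", "in"], ["inches", "in"],
  ["mm", "mm"], ["cm", "cm"], ["m", "m"], ["ft", "ft"],
]);

// Maps a raw unit to its standard form, or to the fallback if unknown.
function normalizeUnitSketch(raw, fallback) {
  if (!raw) return fallback;
  return UNIT_ALIASES.get(raw.trim().toLowerCase()) ?? fallback;
}

// Splits one detected dimension into { value, unit } parts.
function extractParts(dimension, fallback = "mm") {
  const partRe = /(\d+(?:\.\d+)?)\s*("|inches|inch|in|mm|cm|m|ft)?/gi;
  const parts = [...dimension.matchAll(partRe)].map(([, value, unit]) => ({
    value,
    unit: unit ? normalizeUnitSketch(unit, fallback) : null,
  }));
  // Assumption: unitless parts inherit the nearest unit to their right.
  let inherited = fallback;
  for (let i = parts.length - 1; i >= 0; i--) {
    if (parts[i].unit) inherited = parts[i].unit;
    else parts[i].unit = inherited;
  }
  return parts;
}
```

For example, extractParts("245x148x65mm") yields three parts that all carry the unit mm, and extractParts('3"x6"') yields two parts with the unit in.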
Generate variants
Each detected dimension expands into these keyword-friendly formats:
- Bare numbers: 30, 50, 2
- Normalized units: 30mm, 2m, 3in
- Commonly accepted synonyms: 30in, 30", 30inch, 30inches
- Joined forms:
  - Without spacing: 30mm50mm
  - With separators: 30x50, 30 mm by 50 mm, 30mmx50mm
For example, for 30mm x 50mm,
the function generates:
JavaScript
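As a rough sketch of how such variants could be produced (not the connector's code or its exact output), the function below expands a list of parts into the formats listed above. The names variantsFor and UNIT_SYNONYMS are hypothetical, and only the synonyms for in are filled in.

```javascript
// Hypothetical synonym table; units not listed here have a single form.
const UNIT_SYNONYMS = { in: ["in", '"', "inch", "inches"] };

// parts: array of { value, unit } objects with already-normalized units.
function variantsFor(parts) {
  const keywords = new Set();
  for (const { value, unit } of parts) {
    keywords.add(value); // bare number
    for (const form of UNIT_SYNONYMS[unit] ?? [unit]) {
      keywords.add(`${value}${form}`); // normalized unit and its synonyms
    }
  }
  if (parts.length > 1) {
    const withUnits = parts.map((p) => `${p.value}${p.unit}`);
    keywords.add(withUnits.join("")); // joined without spacing
    keywords.add(parts.map((p) => p.value).join("x")); // numbers with separator
    keywords.add(withUnits.join("x")); // units with separator
    keywords.add(parts.map((p) => `${p.value} ${p.unit}`).join(" by "));
  }
  return [...keywords];
}

const sketchKeywords = variantsFor([
  { value: "30", unit: "mm" },
  { value: "50", unit: "mm" },
]);
// sketchKeywords: ["30", "30mm", "50", "50mm", "30mm50mm", "30x50",
//                  "30mmx50mm", "30 mm by 50 mm"]
```

A Set keeps the list free of duplicates when several parts share the same number.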
Attach keywords to an attribute
The function adds a new attribute, dimension_keywords, to each record it processes.
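The shape of this last step can be sketched as follows. Everything here is illustrative: transformSketch and keywordsFromText are hypothetical names, the keyword generation is reduced to a stand-in that handles only mm, cm, and m, and the sketch assumes that dimension_keywords is an array of strings built from every string attribute except objectID.

```javascript
// Hypothetical stand-in for the detection and variant-generation steps.
function keywordsFromText(text) {
  const re = /\d+(?:\.\d+)?(?:mm|cm|m)(?:\s*x\s*\d+(?:\.\d+)?(?:mm|cm|m))*/gi;
  return (text.match(re) ?? []).map((m) => m.replace(/\s+/g, "").toLowerCase());
}

// Returns the record with dimension_keywords attached,
// or undefined if no dimensions are found.
function transformSketch(record) {
  const text = Object.entries(record)
    .filter(([key, value]) => key !== "objectID" && typeof value === "string")
    .map(([, value]) => value)
    .join(" ");
  const keywords = [...new Set(keywordsFromText(text))];
  if (keywords.length === 0) return undefined;
  return { ...record, dimension_keywords: keywords };
}

const transformed = transformSketch({ objectID: "1", name: "Tile 30mm x 50mm" });
// transformed.dimension_keywords: ["30mmx50mm"]
```

The original attributes are copied unchanged, so the added attribute doesn't affect how the rest of the record is displayed.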
Add dimension keywords to your index
To use the generated keywords in Algolia:
- With the taskID generated by the Push to Algolia connector, send your records to Algolia with the Ingestion API pushTask method, making sure each record includes the dimension_keywords attribute.
- Configure dimension_keywords as a searchable attribute in your index settings.
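When updating the settings, take care not to overwrite an existing searchableAttributes list. The helper below is a hypothetical sketch (withDimensionKeywords isn't part of any Algolia client) that builds the settings object to send with your API client's settings call. It relies on two Algolia behaviors: an empty or unset searchableAttributes list means all attributes are searchable, and the order of the list sets attribute priority, so appending puts dimension_keywords last.

```javascript
// Returns index settings in which dimension_keywords is searchable.
function withDimensionKeywords(settings = {}) {
  const current = settings.searchableAttributes ?? [];
  // An empty or unset list means every attribute is already searchable.
  if (current.length === 0 || current.includes("dimension_keywords")) {
    return settings;
  }
  return {
    ...settings,
    searchableAttributes: [...current, "dimension_keywords"],
  };
}

// "name" and "description" are hypothetical attribute names.
const updatedSettings = withDimensionKeywords({
  searchableAttributes: ["name", "description"],
});
// updatedSettings.searchableAttributes:
//   ["name", "description", "dimension_keywords"]
```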