GEM: Translation-free zero-shot global entity matcher for global catalogs

Karim Bouyarmane

2021

We propose a modular BiLSTM/CNN/Transformer deep-learning encoder architecture, together with a data synthesis and training approach, to solve the problem of matching catalog products across different languages, different local catalogs, and different catalog data contributors. The end-to-end model relies solely on the raw natural-language text of the catalog entries and on images of the products, without any feature engineering. It is entirely translation-free: inference does not require translating the catalog's natural-language data into a common base language. We report experimental results for a four-language model (English, French, German, Spanish) that matches entities across four local catalogs (UK, France, Germany, Spain) of a retail website. We demonstrate that the model achieves performance comparable to existing state-of-the-art entity matchers that operate within a single language, and that it achieves high-performance zero-shot inference on language pairs not seen in training.
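To make the idea of translation-free matching concrete, the following is a minimal sketch and not the architecture described above. It assumes a shared encoder that maps raw catalog text from any language into one vector space, and scores a pair of entries by cosine similarity. The learned BiLSTM/CNN/Transformer encoder is replaced here by a simple stand-in (hashed character trigrams), product images are ignored, and the names `encode`, `match_score`, and `DIM` are hypothetical.

```python
import math
import zlib

DIM = 4096  # hypothetical vector size, chosen only for this sketch


def encode(text, dim=DIM):
    """Stand-in encoder: hash character trigrams of the raw text into a count vector.

    This is NOT the paper's learned encoder; it only illustrates a single
    encoder shared by all languages, with no translation step.
    """
    vec = [0.0] * dim
    s = text.lower()
    for i in range(len(s) - 2):
        bucket = zlib.crc32(s[i:i + 3].encode("utf-8")) % dim
        vec[bucket] += 1.0
    return vec


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_score(entry_a, entry_b):
    """Score a pair of catalog entries by encoding both with the same encoder."""
    return cosine(encode(entry_a), encode(entry_b))


# Illustrative entries (made up for this example, not taken from the paper).
uk_entry = "Sony WH-1000XM4 wireless headphones black"
fr_entry = "Sony WH-1000XM4 casque sans fil noir"
other_entry = "Wooden kitchen chopping board"

print(match_score(uk_entry, fr_entry))
print(match_score(uk_entry, other_entry))
```

In this toy version the cross-language pair scores higher only because the two entries share surface strings such as the model number. A learned encoder of the kind the abstract describes would be trained so that matching entries land close together even without such overlap.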
