SST: Semantic and structural transformers for hierarchy-aware language models in e-commerce

Karan Samel; Houyu Zhang; Jun Ma; Haoming Jiang; Qing Ping; Sheng Wang; Yi Xu; Belinda Zeng; Trishul Chilimbi

2023

Hierarchies are common structures for organizing data, such as the e-commerce hierarchies associated with product data. Given these product hierarchies, we aim to learn hierarchy-aware product text embeddings that improve fine-tuning performance on a variety of downstream e-commerce tasks. Existing methods leverage hierarchies either by aligning the text embeddings to separate hierarchical embeddings or by aligning the hierarchical information implicitly within a unified text Transformer. Although these models are optimized to predict hierarchy information, fine-tuning them further on new tasks is non-trivial. To bridge this gap, we propose a pre-training architecture that implicitly encodes the hierarchy within the product text and then directly reuses a subset of the pre-trained model during fine-tuning. Pre-training is done through Semantic and Structural Transformers (SST): the Semantic-Transformer first encodes the product text into a contextual embedding, which the Structural-Transformer then uses to infer the product’s path in the hierarchy. Because the text embeddings are hierarchy-aware after pre-training, fine-tuning uses only the initial Semantic-Transformer. This design eliminates the need to link each fine-tuning dataset with a corresponding hierarchy. It leads to fine-tuning performance improvements on critical e-commerce downstream tasks over the existing state-of-the-art hierarchy models, even when hierarchy data is available during fine-tuning. Moreover, this improvement holds even after the baseline models are augmented to support fine-tuning. We conclude by discussing how such implicit structural encodings can be leveraged beyond the e-commerce domain.
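
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two functions below are untrained toy stand-ins for the Semantic-Transformer and the Structural-Transformer, and the hierarchy, the hashing encoder, the greedy root-to-leaf walk, and all names (`HIERARCHY`, `semantic_encoder`, `structural_decoder`, `pretrain_forward`, `finetune_forward`) are assumptions made for illustration. The sketch only shows which component is used in which stage.

```python
from typing import Dict, List

Vector = List[float]

# Hypothetical toy hierarchy (not from the paper): node -> children.
HIERARCHY: Dict[str, List[str]] = {
    "root": ["electronics", "clothing"],
    "electronics": ["headphones", "laptops"],
    "clothing": ["shoes", "jackets"],
}

DIM = 16  # embedding size, chosen arbitrarily for the sketch


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def semantic_encoder(text: str) -> Vector:
    """Stand-in for the Semantic-Transformer: text -> unit-length embedding.

    Hashes character trigrams into DIM buckets; a real model would be a
    trained Transformer encoder.
    """
    vec = [0.0] * DIM
    padded = f" {text.lower()} "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        bucket = sum(ord(c) * (k + 1) for k, c in enumerate(trigram)) % DIM
        vec[bucket] += 1.0
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]


def structural_decoder(embedding: Vector) -> List[str]:
    """Stand-in for the Structural-Transformer: embedding -> hierarchy path.

    Assumption: the path is inferred level by level, picking the child whose
    name embedding best matches the text embedding.
    """
    path: List[str] = []
    node = "root"
    while node in HIERARCHY:
        node = max(HIERARCHY[node],
                   key=lambda child: dot(embedding, semantic_encoder(child)))
        path.append(node)
    return path


def pretrain_forward(text: str) -> List[str]:
    """Pre-training stage: both components are used to predict the path."""
    return structural_decoder(semantic_encoder(text))


def finetune_forward(text: str, head_weights: Vector) -> float:
    """Fine-tuning stage: only the encoder plus a new task head is used.

    No hierarchy lookup is needed for the fine-tuning example.
    """
    return dot(semantic_encoder(text), head_weights)
```

Calling `pretrain_forward("wireless headphones")` returns a two-level path through the toy hierarchy, while `finetune_forward` never touches `HIERARCHY`, which mirrors the claim that fine-tuning datasets do not have to be linked to a hierarchy.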
