Customer-obsessed science


Research areas
-
July 29, 2025New cost-to-serve-software metric that accounts for the full software development lifecycle helps determine which software development innovations provide quantifiable value.
Featured news
-
CIKM 2024 Workshop on Generative AI for E-commerce2024We introduce VARM, variant relationship matcher strategy, to identify pairs of variant products in e-commerce catalogs. Traditional definitions of entity resolution are concerned with whether product mentions refer to the same underlying product. However, this fails to capture product relationships that are critical for e-commerce applications, such as having similar, but not identical, products listed
-
RecSys 20242024Improving search functionality poses challenges such as data scarcity for model training, metadata enrichment for comprehensive document indexing, and the labor-intensive manual annotation for evaluation. Traditionally, iterative methods relying on human annotators and customer feedback have been used. However, recent advancements in Large Language Models (LLMs) offer new solutions. This paper focuses on
-
E-commerce stores enable multilingual product discovery which require accurate product title translation. Multilingual large language models (LLMs) have shown promising capacity to perform machine translation tasks, and it can also enhance and translate product titles cross-lingually in one step. However, product title translation often requires more than just language conversion because titles are short
-
CIKM 2024 Workshop on Generative AI for E-commerce2024Large Language Models (LLMs) have been employed as crowd-sourced annotators to alleviate the burden of human labeling. However, the broader adoption of LLM-based automated labeling systems encounters two main challenges: 1) LLMs are prone to producing unexpected and unreliable predictions, and 2) no single LLM excels at all labeling tasks. To address these challenges, we first develop fast and effective
-
CIKM 2024 Workshop on Data-Centric AI, EMNLP 2024 Workshop on Multilingual Representation Learning2024Dense retrieval systems are commonly used for information retrieval (IR). They rely on learning text representations through an encoder and usually require supervised modeling via labelled data which can be costly to obtain or simply unavailable. In this study, we introduce a novel unsupervised text representation learning technique via instruction-tuning the pre-trained encoder-decoder large language models
Academia
View allWhether you're a faculty member or student, there are number of ways you can engage with Amazon.
View all