Using large language models to improve product information in e-commerce catalogs
2025
To give customers a good experience, an e-commerce retailer needs high-quality product information in its catalog. Yet the raw product information is often of insufficient quality. In a large catalog, which can contain billions of products, fixing this information manually is highly labor-intensive. To address this issue, we propose using the tool-use functionality of large language models to improve product information automatically. In this talk, we show why existing data-cleaning methods are not well suited to this task and describe how we designed our automated system. When evaluated on a random sample of products from an e-commerce catalog, our system improved product information completeness by 78% with no major drop in information accuracy.