Customer-obsessed science


Research areas
- July 22, 2025: Generating diverse synthetic prior distributions leads to a tabular foundation model that outperforms task-specific baselines.
Featured news
- 2025: Products on e-commerce platforms are usually organized based on seller-provided product attributes. Customers looking for a product typically have certain needs or use cases in mind, such as headphones for gym classes, or a printer for school projects. However, they often struggle to map these use cases to product attributes, thereby failing to find the product they need. To help customers shop online confidently
- ECIR 2025 (2025): Traditional Query Auto-completion (QAC) systems optimise for query relevance based on past user interactions. This approach excels at surfacing frequently searched queries, but ensuring a diverse range of suggestions and incorporating new products or trends often requires post-processing heuristics. This limitation stems from relying on user search logs, which may not fully capture the evolving product
- 2025: Audio-Visual Speech-to-Speech Translation (AVS2S) typically prioritizes improving translation quality and naturalness. However, an equally critical aspect in audio-visual content is lip-synchrony (ensuring that the movements of the lips match the spoken content), which is essential for maintaining realism in dubbed videos. Despite its importance, the inclusion of lip-synchrony constraints in AVS2S models has been largely
- 2025: We propose a lightweight neural front-end framework for on-device speech generation and highlight its benefits towards low-resource language scaling. While data-driven models have shown potential in front-end literature, especially since they can enable fast language expansion, they are often extremely large and of high latency. There is limited work focusing on their usability in real-time settings, and
- 2025: Self-supervised pretraining has transformed speech representation learning, enabling models to generalize across various downstream tasks. However, empirical studies have highlighted two notable gaps. First, different speech tasks require varying levels of acoustic and semantic information, which are encoded at different layers within the model. This adds the extra complexity of layer selection on downstream
Academia
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.