Customer-obsessed science
Research areas
- September 26, 2025: To transform scientific domains, foundation models will require physical-constraint satisfaction, uncertainty quantification, and specialized forecasting techniques that overcome data scarcity while maintaining scientific rigor.
Featured news
- CHIIR 2024: To help customers who are still in the exploration phase, Web search engines and e-commerce websites often provide relevant Q&As in widgets, such as ‘People Also Ask’ and ‘Customers Also Ask Alexa’, with additional information. In this work, we propose to enrich this customer experience by rendering related products under each Q&A based on an automated online query recommendation. We define what are the
- AAAI 2024 Workshop on Scientific Document Understanding, 2024: Watermark text spotting in document images can offer access to an often unexplored source of information, providing crucial evidence about a record’s scope, audience and sometimes even authenticity. Stemming from the problem of text spotting, detecting and understanding watermarks in documents inherits the same hardships: in the wild, writing can come in various fonts, sizes and forms, making generic recognition
- 2024: While word error rates of automatic speech recognition (ASR) systems have consistently fallen, natural language understanding (NLU) applications built on top of ASR systems still attribute significant numbers of failures to low-quality speech recognition results. Existing assistant systems collect large numbers of these unsuccessful interactions, but these systems usually fail to learn from these interactions
- 2024: In this work, we propose a novel sequence-discriminative training criterion for automatic speech recognition (ASR) based on the Conformer Transducer. Inspired by the large-margin classifier framework, we separate the “good” and the “bad” hypotheses in an N-best list produced from a pre-trained transducer model by a margin (τ), hence the term Max-Margin Transducer (MMT) loss. It is observed that fine-tuning
- 2024: Speech codec enhancement methods are designed to remove distortions added by speech codecs. While classical methods are very low in complexity and add zero delay, their effectiveness is rather limited. Compared to that, DNN-based methods deliver higher quality but they are typically high in complexity and/or require delay. The recently proposed Linear Adaptive Coding Enhancer (LACE) addresses this problem
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.