Customer-obsessed science
Research areas
-
November 20, 2025 · 4 min read
A new evaluation pipeline called FiSCo uncovers hidden biases and offers an assessment framework that evolves alongside language models.
-
September 2, 2025 · 3 min read
Featured news
-
AAAI 2024 Workshop on Scientific Document Understanding · 2024
Watermark text spotting in document images can offer access to an often unexplored source of information, providing crucial evidence about a record’s scope, audience and sometimes even authenticity. Stemming from the problem of text spotting, detecting and understanding watermarks in documents inherits the same hardships - in the wild, writing can come in various fonts, sizes and forms, making generic recognition
-
2024
While word error rates of automatic speech recognition (ASR) systems have consistently fallen, natural language understanding (NLU) applications built on top of ASR systems still attribute significant numbers of failures to low-quality speech recognition results. Existing assistant systems collect large numbers of these unsuccessful interactions, but these systems usually fail to learn from these interactions
-
2024
In this work, we propose a novel sequence-discriminative training criterion for automatic speech recognition (ASR) based on the Conformer Transducer. Inspired by the large-margin classifier framework, we separate the “good” and the “bad” hypotheses in an N-best list produced from a pre-trained transducer model by a margin (τ), hence the term, Max-Margin Transducer (MMT) loss. It is observed that fine-tuning
-
2024
Speech codec enhancement methods are designed to remove distortions added by speech codecs. While classical methods are very low in complexity and add zero delay, their effectiveness is rather limited. Compared to that, DNN-based methods deliver higher quality but they are typically high in complexity and/or require delay. The recently proposed Linear Adaptive Coding Enhancer (LACE) addresses this problem
-
2024
The traditional cascading Entity Resolution (ER) pipeline suffers from propagated errors from upstream tasks. We address this issue by formulating a new end-to-end (E2E) ER problem, Signal-to-Entity (S2E), resolving query entity mentions to actionable entities in textual catalogs directly from audio queries instead of audio transcriptions in raw or parsed format. Additionally, we extend the E2E Spoken Language
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.