Customer-obsessed science
Research areas
-
May 15, 2026, 5 min read. A new scaling law that relates particular architectural choices to loss helps identify models that improve throughput by up to 47% with no loss of accuracy.
-
May 14, 2026, 16 min read
-
April 15, 2026, 8 min read
Featured news
-
Interspeech 2023. Automatic speech recognition (ASR) training can utilize multiple experts as teacher models, each trained on a specific domain or accent. Teacher models may be opaque in nature since their architecture may not be known or their training cadence is different from that of the student ASR model. Still, the student models are updated incrementally using the pseudo-labels generated independently by the expert
-
Interspeech 2023. In interactive automatic speech recognition (ASR) systems, low-latency requirements limit the amount of search space that can be explored during decoding, particularly in end-to-end neural ASR. In this paper, we present a novel streaming ASR architecture that outputs a confusion network while maintaining limited latency, as needed for interactive applications. We show that 1-best results of our model are
-
KDD 2023. In Learning-to-Rank (LTR) problems, the task of delivering relevant search results and allocating fair exposure to items of a protected group can conflict. Previous works in Fair LTR have attempted to resolve this by combining the objectives of relevant ranking and fair ranking into a single linear combination, but this approach is limited by the nonconvexity of the objective functions and can result in
-
KDD 2023. The Learning to Rank (LTR) technique is ubiquitous in Information Retrieval systems, especially in search ranking applications. The relevance labels used to train ranking models are often noisy measurements of human behavior, such as product ratings in product searches. This results in non-unique ground truth rankings and ambiguity. To address this, Multi-Label LTR (MLLTR) is used to train models using multiple
-
Interspeech 2023. Speaker diarization (SD) is typically used with an automatic speech recognition (ASR) system to ascribe speaker labels to recognized words. The conventional approach reconciles outputs from independently optimized ASR and SD systems, where the SD system typically uses only acoustic information to identify the speakers in the audio stream. This approach can lead to speaker errors especially around speaker
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.