Customer-obsessed science
Research areas
-
May 15, 2026 | 5 min read | A new scaling law that relates particular architectural choices to loss helps identify models that improve throughput by up to 47% with no loss of accuracy.
-
May 14, 2026 | 16 min read
-
April 15, 2026 | 8 min read
Featured news
-
ICLR 2023 Tiny Papers | 2023 | For deep learning training, learning rate schedules are often picked through trial and error, or hand-crafted optimization algorithms that focus mostly on maintaining stability and convergence without systemic incorporation of higher order derivative information to optimize the convergence slope. In this paper, we consider a stochastic version of Non-negative Matrix Factorization (NMF) where only a noisy
-
ICML 2023 | 2023 | We study efficient mechanisms for differentially private kernel density estimation (DP-KDE). Prior work for the Gaussian kernel described algorithms that run in time exponential in the number of dimensions d. This paper breaks the exponential barrier, and shows how the KDE can privately be approximated in time linear in d, making it feasible for high-dimensional data. We also present improved bounds for
-
ACL 2023 | 2023 | Bias in machine learning models can be an issue when the models are trained on particular types of data that do not generalize well, causing underperformance in certain groups of users. In this work, we focus on reducing the bias related to new customers in a digital voice assistant system. It is observed that natural language understanding models often have lower performance when dealing with requests
-
Machine Translation Summit 2023 (MTS) | 2023 | Brand translations need to be consistently localized in e-commerce stores. Emerging brands and their localized forms are constantly appearing in the dynamic e-commerce landscape. These variant brand forms and aliases pose a challenge to brand handling in MT. This study examines the enforcement of brand consistency in MT at scale on the e-commerce sites worldwide. We propose various practical and sustainable
-
Interspeech 2023 | 2023 | Speech-to-text errors made by automatic speech recognition (ASR) systems negatively impact downstream models. Error correction models as a post-processing text editing method have been recently developed for refining the ASR outputs. However, efficient models that meet the low latency requirements of industrial grade production systems have not been well studied. We propose PATCorrect, a novel non-autoregressive
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.