Customer-obsessed science


Research areas
-
June 25, 2025
With large datasets, directly generating data ID codes from query embeddings is much more efficient than performing pairwise comparisons between queries and candidate responses.
Featured news
-
Large language models (LLMs) have demonstrated remarkable capabilities in natural language processing tasks. However, their practical application in high-stakes domains, such as fraud and abuse detection, remains an area that requires further exploration. The existing applications often narrowly focus on specific tasks like toxicity or hate speech detection. In this paper, we present a comprehensive benchmark
-
The increasing use of Retrieval-Augmented Generation (RAG) systems in various applications necessitates stringent protocols to ensure RAG systems’ accuracy, safety, and alignment with user intentions. In this paper, we introduce VERA (Validation and Evaluation of Retrieval-Augmented Systems), a framework designed to enhance the transparency and reliability of outputs from large language models (LLMs) that
-
VLDB 2024
Forecasting extrapolates the values of a time series into the future, and is crucial to optimize core operations for many businesses and organizations. Building machine learning (ML)-based forecasting applications presents a challenge though, due to non-stationary data and large numbers of time series. As there is no single dominating approach to forecasting, forecasting systems have to support a wide variety
-
2024
We present Diffusion Soup, a compartmentalization method for Text-to-Image Generation that averages the weights of diffusion models trained on sharded data. By construction, our approach enables training-free continual learning and unlearning with no additional memory or inference costs, since models corresponding to data shards can be added or removed by re-averaging. We show that Diffusion Soup samples
-
2024
Explicitly adding language information to multilingual ASR models during training has been shown to improve their performance. However, this also requires using language information during inference. In cascaded systems, this language label may come from external language identification models, which are susceptible to errors. In this work, we characterize the sensitivity to errors in language inputs of
Academia
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.