Customer-obsessed science


Research areas
-
June 25, 2025
With large datasets, directly generating data ID codes from query embeddings is much more efficient than performing pairwise comparisons between queries and candidate responses.
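As a rough illustration of the contrast (my own sketch, not the method described in the post), the snippet below compares scoring a query against every candidate embedding with greedily decoding a short ID code directly from the query. All names (`retrieve_pairwise`, `retrieve_generative`, the code length and vocabulary size, and the randomly initialized decoder heads) are hypothetical; the point is only that the generative path's cost does not grow with the corpus size.

```python
import numpy as np

rng = np.random.default_rng(0)
d, num_candidates = 64, 200_000

# --- Pairwise approach: score the query against every candidate embedding ---
candidate_embs = rng.normal(size=(num_candidates, d)).astype(np.float32)

def retrieve_pairwise(query_emb):
    # One dot product per candidate: cost grows linearly with the corpus size.
    scores = candidate_embs @ query_emb
    return int(np.argmax(scores))

# --- Generative approach: decode a short ID code directly from the query ---
# Hypothetical setup: each item is assigned a 4-token code over a 256-symbol
# vocabulary, and a (here randomly initialized) decoder head predicts one
# code token at a time, conditioned on the query embedding.
code_len, vocab = 4, 256
decoder_heads = [rng.normal(size=(d, vocab)).astype(np.float32) for _ in range(code_len)]

def retrieve_generative(query_emb):
    # Cost depends on the code length and vocabulary, not on the corpus size.
    code = []
    for head in decoder_heads:
        logits = query_emb @ head            # (vocab,)
        code.append(int(np.argmax(logits)))  # greedy decoding of one code token
    return tuple(code)  # the code is then looked up in an ID -> item table

query = rng.normal(size=d).astype(np.float32)
print("pairwise best candidate:", retrieve_pairwise(query))
print("generated ID code:      ", retrieve_generative(query))
```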
Featured news
-
2025
In real-world NLP applications, Large Language Models (LLMs) offer promising solutions due to their extensive training on vast datasets. However, the large size and high computation demands of LLMs limit their practicality in many applications, especially when further fine-tuning is required. To address these limitations, smaller models are typically preferred for deployment. However, their training is…
-
ACM SIGOPS 2025 Workshop on Hot Topics in Operating Systems, 2025
A metastable failure is a self-sustaining congestive collapse in which a system degrades in response to a transient stressor (e.g., a load surge) but fails to recover after the stressor is removed. These rare but potentially catastrophic events are notoriously hard to diagnose and mitigate, sometimes causing prolonged outages affecting millions of users. Ideally, we would discover susceptibility to metastable…
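To make the "fails to recover after the stressor is removed" behavior concrete, here is a toy discrete-time retry-amplification model (my own illustration, not the workshop paper's model or terminology). All constants are assumptions chosen to show the effect: once the transient surge fills the retry backlog, the offered load stays above capacity even though the base load alone is well below it.

```python
# Toy model of a metastable failure driven by retry amplification.
# Assumptions (not from the paper): fixed server capacity, a temporary load
# surge, and clients that retry every failed request twice, up to a cap.

CAPACITY = 100            # requests the server can complete per tick
BASE_LOAD = 80            # steady offered load (below capacity)
SURGE_LOAD = 150          # offered load during the transient stressor
SURGE_TICKS = range(20, 30)
RETRIES_PER_FAILURE = 2   # amplification > 1 makes the collapse self-sustaining
MAX_RETRY_BACKLOG = 1000  # client-side cap on outstanding retries

retry_backlog = 0.0
for t in range(100):
    new_work = SURGE_LOAD if t in SURGE_TICKS else BASE_LOAD
    offered = new_work + retry_backlog
    completed = min(offered, CAPACITY)
    failed = offered - completed
    # Each failure spawns retries in the next tick, up to the cap.
    retry_backlog = min(failed * RETRIES_PER_FAILURE, MAX_RETRY_BACKLOG)
    if t % 10 == 0 or t in (19, 30, 31):
        print(f"t={t:3d} offered={offered:7.1f} completed={completed:6.1f} failed={failed:7.1f}")
```

Before the surge the system runs comfortably under capacity; after the surge ends (tick 30 onward), the retry backlog alone keeps the offered load above capacity indefinitely, which is the bistable, self-sustaining state the abstract describes.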
-
2025
Recent advancements in speech encoders have drawn attention due to their integration with Large Language Models for various speech tasks. While most research has focused on either causal or full-context speech encoders, there has been limited exploration of how to handle both streaming and non-streaming applications effectively while achieving state-of-the-art performance. We introduce DuRep, a Dual-mode Speech Representation…
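The causal vs. full-context distinction the teaser draws can be seen in a single self-attention layer whose masking is toggled. The sketch below is a generic illustration under that assumption, not DuRep's architecture; the layer sizes and the `encode` helper are made up for the example.

```python
# Minimal sketch: one self-attention layer run in streaming mode (causal mask,
# each frame attends only to current and past frames) or non-streaming mode
# (full context over the whole utterance).
import torch
import torch.nn.functional as F

torch.manual_seed(0)
frames, dim = 50, 64                  # e.g., 50 acoustic frames, 64-dim features
x = torch.randn(1, frames, dim)       # (batch, time, feature)
q_proj = torch.nn.Linear(dim, dim)
k_proj = torch.nn.Linear(dim, dim)
v_proj = torch.nn.Linear(dim, dim)

def encode(x, streaming: bool):
    q, k, v = q_proj(x), k_proj(x), v_proj(x)
    # is_causal=True restricts attention to past frames (streaming use);
    # is_causal=False lets every frame attend to the full utterance.
    return F.scaled_dot_product_attention(q, k, v, is_causal=streaming)

streaming_out = encode(x, streaming=True)
offline_out = encode(x, streaming=False)
print(streaming_out.shape, offline_out.shape)  # both (1, 50, 64)
```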
-
2025
The use of human speech to train LLMs poses privacy concerns due to these models’ ability to generate samples that closely resemble artifacts in the training data. We propose a speaker privacy-preserving representation learning method through the Universal Speech Codec (USC), a computationally efficient codec that disentangles speech into: (i) privacy-preserving semantically rich representations, capturing…
-
2025
This paper introduces MO-LightGBM, an open-source library built upon LightGBM, specifically designed to offer an integrated, versatile, and easily adaptable framework for Multi-objective Learning to Rank (MOLTR). MO-LightGBM supports diverse Multi-objective optimization (MOO) settings and incorporates 12 state-of-the-art optimization strategies. Its modular architecture enhances usability and flexibility…
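For context on what such a framework extends, here is a standard single-objective learning-to-rank setup in plain LightGBM (LambdaRank). This is deliberately not MO-LightGBM's own API, which the teaser does not spell out; the synthetic data and hyperparameters are assumptions for illustration only.

```python
# Baseline single-objective learning to rank with LightGBM's LGBMRanker.
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
n_queries, docs_per_query, n_features = 100, 10, 20

X = rng.normal(size=(n_queries * docs_per_query, n_features))
y = rng.integers(0, 4, size=n_queries * docs_per_query)  # graded relevance 0-3
group = [docs_per_query] * n_queries                      # docs per query, in order

ranker = lgb.LGBMRanker(
    objective="lambdarank",  # single relevance objective; MOLTR adds further objectives
    n_estimators=50,
    learning_rate=0.1,
)
ranker.fit(X, y, group=group)
print(ranker.predict(X[:docs_per_query]))  # scores for the first query's documents
```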
Academia
Whether you're a faculty member or a student, there are a number of ways you can engage with Amazon.