Customer-obsessed science


Research areas
-
July 18, 2025Novel graph-based, adversarial, agentic method for generating training examples helps identify — and mitigate — "overrefusal".
Featured news
-
ACL Findings 20242024Large language models (LLMs) have demonstrated remarkable open-domain capabilities. LLMs tailored for a domain are typically trained entirely on a domain corpus to excel at handling domain-specific tasks. In this work, we explore an alternative strategy of continual pre-training as a means to develop domain-specific LLMs over an existing open-domain LLM. We introduce FinPythia-6.9B, developed through domain-adaptive
-
EDM 20242024Bayesian Knowledge Tracing (BKT) is a probabilistic model of a learner’s state of mastery for a knowledge component. The learner’s state is a “hidden” binary variable updated based on the correctness of the learner’s responses to questions corresponding to that knowledge component. The parameters used for this update are inferred/learned from historical ground truth data. For this, BKT is often represented
-
2024An important requirement for the reliable deployment of pre-trained large language models (LLMs) is the well-calibrated quantification of the uncertainty in their outputs. While the likelihood of predicting the next token is a practical surrogate of the data uncertainty learned during training, model uncertainty is challenging to estimate, i.e., due to lack of knowledge acquired during training. Prior efforts
-
2024In many real-world applications, it is hard to provide a reward signal in each step of a Reinforcement Learning (RL) process and more natural to give feedback when an episode ends. To this end, we study the recently proposed model of RL with Aggregate Bandit Feedback (RL-ABF), where the agent only observes the sum of rewards at the end of an episode instead of each reward individually. Prior work studied
-
2024In this paper, we tackle the challenge of inadequate and costly training data that has hindered the development of conversational question answering (ConvQA) systems. Enterprises have a large corpus of diverse internal documents. Instead of relying on a searching engine, a more compelling approach for people to comprehend these documents is to create a dialogue system. In this paper, we propose a robust
Academia
View allWhether you're a faculty member or student, there are number of ways you can engage with Amazon.
View all