Customer-obsessed science
Research areas
-
May 15, 20265 min readA new scaling law that relates particular architectural choices to loss helps identify models that improve throughput by up to 47% with no loss of accuracy.
-
May 14, 202616 min read
-
-
April 15, 20268 min read
Featured news
-
2025Recent advancements in code completion models have primarily focused on local file contexts (Ding et al., 2023b; Jimenez et al., 2024). However, these studies do not fully capture the complexity of real-world software development, which often requires the use of rapidlyevolving public libraries. To fill the gap, we introduce LIBEVOLUTIONEVAL, a detailed study requiring an understanding of library evolution
-
2025The rise of LLMs has deflected a growing portion of human-computer interactions towards LLM-based chatbots. The remarkable abilities of these models allow users to interact using long, diverse natural language text covering a wide range of topics and styles. Phrasing these messages is a time and effort consuming task, calling for an autocomplete solution to assist users. We present ChaI-TeA: Chat Interaction
-
2025The training and fine-tuning of large language models (LLMs) often involve diverse textual data from multiple sources, which poses challenges due to conflicting gradient directions, hindering optimization and specialization. These challenges can undermine model generalization across tasks, resulting in reduced downstream performance. Recent research suggests that fine-tuning LLMs on carefully selected,
-
As the demand for online A/B testing continues to rises for tech companies, the opportunity cost of conducting these experiments becomes increasingly significant. Consequently, there is a rising need for an efficient continuous monitoring system capable of early terminating experiments when necessary. Existing literature and tools primarily focuses on early terminating experiments with evidently significant
-
2025The instruction hierarchy, which establishes a priority order from system messages to user messages, conversation history, and tool outputs, is essential for ensuring consistent and safe behavior in language models (LMs). Despite its importance, this topic receives limited attention, and there is a lack of comprehensive benchmarks for evaluating models’ ability to follow the instruction hierarchy. We bridge
Collaborations
View allWhether you're a faculty member or student, there are number of ways you can engage with Amazon.
View all