Customer-obsessed science
Research areas
-
April 27, 20264 min readA new framework provides a statistical method for estimating the likelihood of catastrophic failures in large language models in adversarial conversations.
-
April 15, 20268 min read
-
April 7, 202613 min read
-
April 1, 20265 min read
Featured news
-
2024Trained on vast corpora of human language, language models demonstrate emergent human-like reasoning abilities. Yet they are still far from true intelligence, which opens up intriguing opportunities to explore the parallels of humans and model behaviors. In this work, we study the ability to skip steps in reasoning—a hallmark of human expertise developed through practice. Unlike humans, who may skip steps
-
NeurIPS 2024 Workshop on GenAI for Health2024We explore how private synthetic text can be generated by suitably prompting a large language model (LLM). This addresses a challenge for organizations like hospitals, which hold sensitive text data like patient medical records, and wish to share it in order to train machine learning models for medical tasks, while preserving patient privacy. Methods that rely on training or finetuning a model may be out
-
ICSR 20242024In human-robot interaction (HRI) design, it is critical to con-sider the needs of users early on in development. One ideation technique for doing so is bodystorming, in which participants act out the parts of humans and robots in a certain scenario. This practice allows potential users to draw on and express insights for how robot should help, or interact around, them in a given context. In this paper,
-
NeurIPS 2024 Workshop on Safe Generative AI2024Data sanitization in the context of language modeling involves identifying sensitive content, such as personally identifiable information (PII), and redacting them from a dataset corpus. It is a common practice used in natural language processing (NLP) to maintain privacy. Nevertheless, the impact of data sanitization on the language understanding capability of a language model remains less studied. This
-
2024How can we precisely estimate a large language model’s (LLM) accuracy on questions belonging to a specific topic within a larger question-answering dataset? The standard direct estimator, which averages the model’s accuracy on the questions in each subgroup, may exhibit high variance for subgroups (topics) with small sample sizes. Synthetic regression modeling, which leverages the model’s accuracy on questions
Collaborations
View allWhether you're a faculty member or student, there are number of ways you can engage with Amazon.
View all