2024: Task-oriented Dialog (ToD) systems have to solve multiple subgoals to accomplish user goals, whereas feedback is often obtained only at the end of the dialog. In this work, we propose SUIT (SUbgoal-aware ITerative Training), an iterative training approach for improving ToD systems. We sample dialogs from the model we aim to improve and determine subgoals that contribute to dialog success using distant supervision…
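To make the loop concrete, below is a minimal, self-contained sketch of the sample-then-credit-then-retrain cycle the abstract describes. The dialog representation, the stub sampler, and the toy "fine-tuning" update are illustrative assumptions, not the paper's implementation.

```python
"""Toy sketch of a subgoal-aware iterative training loop in the spirit
of SUIT. Every component here is an illustrative stand-in."""

import random
from dataclasses import dataclass

@dataclass
class Turn:
    context: str
    response: str
    subgoal: str          # e.g. "find-restaurant", "book-table"

@dataclass
class Dialog:
    turns: list
    successful: bool      # feedback is only available at dialog end

def sample_dialogs(model, goal, n=4):
    """Stand-in for rolling out the current ToD model on a user goal."""
    subgoals = ["find-restaurant", "book-table"]
    return [
        Dialog(
            turns=[Turn(f"user asks about {goal}", f"reply-{i}", sg)
                   for i, sg in enumerate(subgoals)],
            successful=random.random() < model["quality"],
        )
        for _ in range(n)
    ]

def label_subgoals(dialog):
    """Distant supervision: credit every subgoal a successful dialog covers."""
    return [(t.context, t.response, t.subgoal) for t in dialog.turns]

def suit_round(model, goals):
    examples = []
    for goal in goals:
        for dialog in sample_dialogs(model, goal):
            if dialog.successful:            # end-of-dialog feedback only
                examples.extend(label_subgoals(dialog))
    # Stand-in for fine-tuning on the subgoal-credited turns.
    model["quality"] = min(1.0, model["quality"] + 0.05 * bool(examples))
    return model

model = {"quality": 0.5}
for _ in range(3):        # iterate: the improved model generates the next batch
    model = suit_round(model, ["a cheap restaurant downtown"])
print(model)
```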
Creating children's stories through text generation is a creative task that requires stories to be both entertaining and suitable for young audiences. However, since current story generation systems often rely on pre-trained language models fine-tuned with limited story data, they may not always prioritize child-friendliness. This can lead to the unintended generation of stories containing problematic elements…
Getting a good understanding of customer intent is essential in e-commerce search engines. In particular, associating the correct product type with a search query plays a vital role in surfacing correct products to customers. Query product type classification (Q2PT) is a particularly challenging task because search queries are short and ambiguous, and the number of existing product categories is extremely large…
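As a toy illustration of the Q2PT setup, the following trains a classifier that maps short, ambiguous queries to product types. The six-example dataset, the category labels, and the choice of character n-gram TF-IDF with logistic regression are assumptions for the sketch, not the production approach.

```python
"""Toy query-to-product-type (Q2PT) classifier. The data and model
choice are illustrative assumptions only."""

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Search queries are short; character n-grams help with ambiguity and typos.
train_queries = ["iphone case", "running shoes", "usb c cable",
                 "trail sneakers", "phone charger", "leather wallet"]
train_types = ["CELLPHONE_ACCESSORY", "SHOES", "CABLE",
               "SHOES", "CELLPHONE_ACCESSORY", "WALLET"]

clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
clf.fit(train_queries, train_types)

print(clf.predict(["iphone charger", "sneakers"]))
```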
2024: Training large language models (LLMs) for external tool usage is a rapidly expanding field, with recent research focusing on generating synthetic data to address the shortage of available data. However, the absence of systematic data quality checks poses complications for properly training and testing models. To that end, we propose two approaches for assessing the reliability of data for training LLMs…
2024: Current instruction-tuned language models are exclusively trained with textual preference data and thus are often not aligned with the unique requirements of other modalities, such as speech. To better align language models with the speech domain, we explore (i) prompting strategies grounded in radio-industry best practices and (ii) preference learning using a novel speech-based preference dataset of 20K samples…
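As one possible reading of the preference-learning step, the sketch below computes a DPO-style pairwise loss over chosen and rejected responses. The use of DPO specifically, and the random tensors standing in for model log-probabilities, are assumptions; the excerpt only says "preference learning".

```python
"""Minimal pairwise preference-learning sketch using a DPO-style loss
(Rafailov et al., 2023). The method choice is an assumption."""

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Each argument is the summed log-probability a model assigns to the
    chosen or rejected response for a batch of preference pairs."""
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # Push the policy to prefer chosen over rejected, relative to the reference.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Toy batch of 4 preference pairs; real log-probs would come from the models.
lp = lambda: torch.randn(4)
print(dpo_loss(lp(), lp(), lp(), lp()))
```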
Related content
- August 28, 2023: AWS service enables machine learning innovation on a robust foundation.
- August 23, 2023: Senior principal scientist Jasha Droppo on the shared architectures of large language models and spectrum quantization text-to-speech models, and other convergences between the two fields.
- August 18, 2023: Speech recognition predominates, but Amazon's research takes in data representation, dialogue management, question answering, and more.
- August 16, 2023: Learning to represent truncated sentences with semantic graphs improves models' ability to infer missing content.
- August 15, 2023: Guo's second internship is linked to a fellowship awarded through the Amazon–Virginia Tech Initiative for Efficient and Robust Machine Learning.
- August 09, 2023: Combining low-rank approximation, a residual binary autoencoder, and a new loss function enables a fivefold increase in compression ratio.