- Task-oriented dialogue systems are essential for applications ranging from customer service to personal assistants and are widely used across various industries. However, developing effective multi-domain systems remains a significant challenge due to the complexity of handling diverse user intents, entity types, and domain-specific knowledge across several domains. In this work, we propose DARD (Domain …
- (2024) The rise of large language models (LLMs) has significantly influenced the quality of information in decision-making systems, leading to the prevalence of AI-generated content and challenges in detecting misinformation and managing conflicting information, or "inter-evidence conflicts." This study introduces a method for generating diverse, validated evidence conflicts to simulate real-world misinformation …
- In-Context Learning (ICL) has enabled Large Language Models (LLMs) to excel as general-purpose models in zero- and few-shot task settings. However, since LLMs are often not trained on the downstream tasks, they lack crucial contextual knowledge from the data distributions, which limits their task adaptability. This paper explores using data priors to automatically customize prompts in ICL. We extract these … (an illustrative sketch of data-prior prompt customization follows this list)
- (2024) Pre-trained language models, trained on large-scale corpora, demonstrate strong generalizability across various NLP tasks. Fine-tuning these models for specific tasks typically involves updating all parameters, which is resource-intensive. Parameter-efficient fine-tuning (PEFT) methods, such as the popular LoRA family, introduce low-rank matrices to learn only a few parameters efficiently. However, during … (a minimal LoRA-style adapter sketch follows this list)
- In-context learning (ICL) is a powerful paradigm where large language models (LLMs) benefit from task demonstrations added to the prompt. Yet, selecting optimal demonstrations is not trivial, especially for complex or multi-modal tasks where input and output distributions differ. We hypothesize that forming task-specific representations of the input is key. In this paper, we propose a method to align representations … (a demonstration-selection sketch follows this list)
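The data-priors abstract above is cut off before it explains how the priors are extracted, so the sketch below is only a generic illustration of the idea, not the paper's method: compute a simple statistic from the training data (here, the label distribution) and prepend it to an ICL prompt. The helpers `label_prior_prefix` and `build_prompt` are hypothetical names introduced for this example.

```python
from collections import Counter

def label_prior_prefix(train_labels):
    """Summarize a dataset's label distribution as a one-line prompt prefix.

    Illustrative 'data prior': the model is told how often each class occurs
    before it sees any demonstrations or the test input. (Hypothetical helper,
    not the paper's extraction method.)
    """
    counts = Counter(train_labels)
    total = sum(counts.values())
    parts = [f"{label}: {count / total:.0%}" for label, count in counts.most_common()]
    return "Label distribution for this task: " + ", ".join(parts) + "\n\n"

def build_prompt(train_labels, demonstrations, test_input):
    """Assemble an ICL prompt: data prior, then demonstrations, then the query."""
    prompt = label_prior_prefix(train_labels)
    for text, label in demonstrations:
        prompt += f"Input: {text}\nLabel: {label}\n\n"
    return prompt + f"Input: {test_input}\nLabel:"

# Toy usage with an imbalanced sentiment dataset.
labels = ["positive"] * 70 + ["negative"] * 30
demos = [("great service", "positive"), ("slow and rude staff", "negative")]
print(build_prompt(labels, demos, "the food was cold"))
```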
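The PEFT entry notes that LoRA-family methods add low-rank matrices and train only those few parameters. Below is a minimal, generic LoRA-style linear adapter in PyTorch, assuming the usual formulation W x + (alpha / r) * B A x with the pre-trained weight frozen; the class name `LoRALinear` and the default rank and scaling are illustrative choices, not code from the paper.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pre-trained linear layer plus a trainable low-rank update B @ A.

    Only A and B are trained, so the trainable parameter count is
    r * (in_features + out_features) instead of in_features * out_features.
    """
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                      # freeze pre-trained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * (x @ self.A.T @ self.B.T)

# Toy usage: wrap a 768x768 projection and count trainable parameters.
layer = LoRALinear(nn.Linear(768, 768), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 2 * 8 * 768 = 12,288 trainable parameters vs. ~590k for full fine-tuning
```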
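The last abstract is truncated before it describes its representation-alignment method, so purely as a point of reference the sketch below shows a common demonstration-selection baseline: embed the candidate demonstrations and the query, keep the most similar ones, and concatenate them into the prompt. The `embed` function here is a hash-based stand-in for a real sentence encoder and is not the paper's approach.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Hashed bag-of-words embedding; a stand-in for a real sentence encoder."""
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def select_demonstrations(pool, query, k=2):
    """Return the k pool examples whose inputs are most similar to the query."""
    q = embed(query)
    return sorted(pool, key=lambda ex: float(embed(ex[0]) @ q), reverse=True)[:k]

# Toy usage: pick demonstrations for a movie-review query, then build the prompt.
pool = [("the film was dull", "negative"),
        ("an instant classic", "positive"),
        ("battery drains fast", "negative"),
        ("camera quality is superb", "positive")]
query = "the movie was a classic"
prompt = "".join(f"Input: {x}\nLabel: {y}\n\n" for x, y in select_demonstrations(pool, query))
prompt += f"Input: {query}\nLabel:"
print(prompt)
```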
Related content
- June 06, 2022: Combination of distillation and distillation-aware quantization compresses BART model to 1/16th its size. (Based on a figure from "TernaryBERT: Distillation-aware ultra-low bit BERT".)
- June 01, 2022: Knowledge distillation and discriminative training enable efficient use of a BERT-based model to rescore automatic-speech-recognition hypotheses.
- May 27, 2022: Amazon Scholar and Columbia professor Kathleen McKeown on model compression, data distribution shifts, language revitalization, and more.
- May 17, 2022: Papers focus on speech conversion and data augmentation — and sometimes both at once.
- May 12, 2022: Multimodal training, signal-to-interpretation, and BERT rescoring are just a few topics covered by Amazon’s 21 speech-related papers.
- May 10, 2022: Topics range from the predictable, such as speech recognition and signal processing, to time series forecasting and personalization.