- ACL 2022: Multiple metrics have been introduced to measure fairness in various natural language processing tasks. These metrics fall roughly into two categories: 1) extrinsic metrics for evaluating fairness in downstream applications and 2) intrinsic metrics for estimating fairness in upstream contextualized language representation models. In this paper, we conduct an extensive correlation study between…
- ACL 2022: Large-scale pre-trained sequence-to-sequence models like BART and T5 achieve state-of-the-art performance on many generative NLP tasks. However, such models pose a great challenge in resource-constrained scenarios owing to their large memory requirements and high latency. To alleviate this issue, we propose to jointly distill and quantize the model, where knowledge is transferred from the full-precision… (a rough sketch of joint distillation and quantization follows this list)
- ACL 2022: In this study, we investigate robustness against covariate drift in spoken language understanding (SLU). Covariate drift can occur in SLU when there is a drift between training and testing regarding what users request or how they request it. To study this, we propose a method that exploits natural variations in data to create a covariate drift in SLU datasets. Experiments show that a state-of-the-art BERT-based… (an illustrative covariate-drift split is sketched after this list)
- ACL 2022: Several methods have been proposed for classifying long textual documents using Transformers. However, there is a lack of consensus on a benchmark to enable a fair comparison among different approaches. In this paper, we provide a comprehensive evaluation of their relative efficacy against various baselines and on diverse datasets — in terms of accuracy as well as time and space overheads. Our…
- ICLR 2022: In NLP, a large volume of tasks involve pairwise comparison between two sequences (e.g., sentence similarity and paraphrase identification). Predominantly, two formulations are used for sentence-pair tasks: bi-encoders and cross-encoders. Bi-encoders produce fixed-dimensional sentence representations and are computationally efficient; however, they usually underperform cross-encoders. Cross-encoders can… (the difference between the two formulations is sketched after this list)
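The joint distillation-and-quantization idea in the second abstract above can be pictured with a minimal sketch. This is not the paper's method: the model classes, the 8-bit fake-quantization helper, and the plain MSE distillation loss are all assumptions chosen only to show the general pattern of training a quantization-aware student against a full-precision teacher.

```python
# Hedged sketch: quantization-aware training of a small "student" guided by a
# full-precision "teacher" via a distillation loss. Model classes and shapes
# are illustrative placeholders, not the paper's architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fake_quantize(w: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Uniform symmetric fake quantization with a straight-through estimator."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax + 1e-8
    w_q = torch.round(w / scale).clamp(-qmax - 1, qmax) * scale
    return w + (w_q - w).detach()  # forward uses w_q; gradients flow through w


class TinyEncoder(nn.Module):
    """Placeholder model: embedding + one linear layer, optionally fake-quantized."""

    def __init__(self, vocab=1000, dim=64, quantized=False):
        super().__init__()
        self.quantized = quantized
        self.emb = nn.Embedding(vocab, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, tokens):
        h = self.emb(tokens).mean(dim=1)  # crude pooled representation
        w = fake_quantize(self.proj.weight) if self.quantized else self.proj.weight
        return F.linear(h, w, self.proj.bias)


teacher = TinyEncoder(quantized=False)  # full-precision teacher
student = TinyEncoder(quantized=True)   # low-precision, quantization-aware student
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

tokens = torch.randint(0, 1000, (8, 16))  # dummy batch of token ids
with torch.no_grad():
    teacher_out = teacher(tokens)
student_out = student(tokens)

# Distillation step: pull the quantized student's outputs toward the teacher's.
loss = F.mse_loss(student_out, teacher_out)
loss.backward()
optimizer.step()
print("distillation loss:", loss.item())
```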
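The covariate-drift abstract above mentions exploiting natural variations in data to create a drift between training and test sets, but the snippet cuts off before describing the method. As an illustration only (an assumed covariate, not the paper's procedure), one way to induce such a shift is to partition utterances by a surface property such as length, so that train and test share intents but differ in how requests are phrased.

```python
# Hedged sketch: inducing a covariate shift in a toy SLU-style dataset by
# splitting on utterance length, so train and test differ in *how* users ask.
# The data and the chosen covariate are illustrative assumptions only.
from dataclasses import dataclass


@dataclass
class Utterance:
    text: str
    intent: str


data = [
    Utterance("play jazz", "PlayMusic"),
    Utterance("weather tomorrow", "GetWeather"),
    Utterance("could you please put on some relaxing jazz music", "PlayMusic"),
    Utterance("what is the weather going to be like tomorrow afternoon", "GetWeather"),
]


def covariate_split(examples, threshold_tokens=4):
    """Short utterances go to train, long ones to test: same intents, shifted phrasing."""
    train = [ex for ex in examples if len(ex.text.split()) <= threshold_tokens]
    test = [ex for ex in examples if len(ex.text.split()) > threshold_tokens]
    return train, test


train, test = covariate_split(data)
print("train:", [ex.text for ex in train])
print("test: ", [ex.text for ex in test])
```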
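The ICLR 2022 abstract contrasts bi-encoders and cross-encoders for sentence-pair tasks. The sketch below shows only the structural difference between the two formulations; the toy `Encoder`, the pooling, and the scoring head are placeholders, not the models evaluated in the paper.

```python
# Hedged sketch: the two standard formulations for sentence-pair scoring.
# `Encoder` is a stand-in for any contextual encoder (e.g., a BERT-like model);
# its internals here are deliberately trivial.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    """Toy encoder: embeds token ids and mean-pools into a fixed-size vector."""

    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)

    def forward(self, tokens):               # tokens: (batch, seq_len)
        return self.emb(tokens).mean(dim=1)  # (batch, dim)


class BiEncoder(nn.Module):
    """Encodes each sentence independently; the score is a similarity of the two vectors."""

    def __init__(self):
        super().__init__()
        self.encoder = Encoder()

    def forward(self, a, b):
        return F.cosine_similarity(self.encoder(a), self.encoder(b), dim=-1)


class CrossEncoder(nn.Module):
    """Encodes both sentences as one joint sequence and scores the pooled representation
    (with a real Transformer, tokens of A and B would attend to each other)."""

    def __init__(self):
        super().__init__()
        self.encoder = Encoder()
        self.scorer = nn.Linear(64, 1)

    def forward(self, a, b):
        pair = torch.cat([a, b], dim=1)  # both sentences in a single input sequence
        return self.scorer(self.encoder(pair)).squeeze(-1)


a = torch.randint(0, 1000, (2, 10))  # dummy token ids for sentence A
b = torch.randint(0, 1000, (2, 12))  # dummy token ids for sentence B
print("bi-encoder scores:   ", BiEncoder()(a, b))
print("cross-encoder scores:", CrossEncoder()(a, b))
```

The efficiency gap mentioned in the abstract falls out of this structure: a bi-encoder can precompute and cache each sentence's vector, whereas a cross-encoder must re-encode every candidate pair from scratch.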
Related content
- September 10, 2021: Data augmentation makes examples more realistic, while continual-learning techniques prevent “catastrophic forgetting”.
- September 09, 2021: Model using ASR hypotheses as extra inputs reduces word error rate of human transcriptions by almost 11%.
- September 02, 2021: Branching encoder networks make operation more efficient, while “neural diffing” reduces bandwidth requirements for model updates.
- August 27, 2021: Liu discusses her work in speech recognition and understanding, prosody modeling, summarization, and natural language processing.
- August 27, 2021: New voice for Alexa’s Reading Sidekick feature avoids the instabilities common to models with variable prosody.
- August 25, 2021: Katrin Kirchhoff, director of speech processing for Amazon Web Services, on the many scientific challenges her teams are tackling.