RobustQA: Benchmarking the robustness of domain adaptation for open-domain question answering
2023
Open-domain question answering (ODQA) is a crucial task in natural language processing. A typical ODQA system relies on a retriever module to select relevant contexts from a large corpus for a downstream reading comprehension model. Existing ODQA datasets are built mainly on the Wikipedia corpus and are insufficient for studying models’ generalizability across diverse domains, since models are trained and evaluated on the same genre of data. We propose RobustQA, a novel benchmark consisting of datasets from 8 different domains, which facilitates the evaluation of ODQA’s domain robustness. To build RobustQA, we annotate QA pairs in retrieval datasets with rigorous quality control. We further examine improving QA performance by incorporating unsupervised learning methods with target-domain corpora and by adopting large generative language models. These methods can effectively improve model performance on RobustQA. However, experimental results demonstrate a significant gap relative to in-domain training, suggesting that RobustQA is a challenging benchmark for evaluating ODQA domain robustness.
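For illustration, the sketch below lays out the generic retriever-reader flow described in the abstract, using a simple TF-IDF retriever from scikit-learn and a trivial placeholder reader. The toy passage corpus, the function names, and the placeholder reader are hypothetical stand-ins for exposition only; they are not the retrievers, readers, or datasets evaluated in RobustQA.

# A minimal, illustrative sketch of the generic retriever-reader ODQA pipeline.
# The TF-IDF retriever, the toy corpus, and the placeholder reader below are
# hypothetical stand-ins, not the systems evaluated in RobustQA.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
    """Rank passages by TF-IDF cosine similarity to the question; return the top k."""
    vectorizer = TfidfVectorizer().fit(passages + [question])
    scores = cosine_similarity(
        vectorizer.transform([question]), vectorizer.transform(passages)
    )[0]
    ranked = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    return [passages[i] for i in ranked[:k]]


def read(question: str, context: str) -> str:
    """Placeholder reader: a real system would run an extractive or generative
    reading-comprehension model over (question, context)."""
    return context  # stand-in for the extracted or generated answer


corpus = [
    "Open-domain QA systems first retrieve passages and then extract answers.",
    "The capital of France is Paris.",
    "RobustQA covers question answering over eight different domains.",
]
question = "What is the capital of France?"
top_passages = retrieve(question, corpus)
print(read(question, top_passages[0]))

In a full system, the TF-IDF retriever would typically be replaced by a dense or sparse neural retriever and the placeholder reader by a trained reading-comprehension or generative model; the point here is only the two-stage retrieve-then-read structure.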