Go beyond plain fine-tuning: Improving pre-trained models for social commonsense

Ting-Yun Chang; Yang Liu; Karthik Gopalakrishnan; Behnam Hedayatnia; Pei Zhou; Dilek Hakkani-Tür

2020

Pre-trained language models have recently demonstrated outstanding performance on many NLP tasks. However, their social intelligence, which requires commonsense reasoning about the current situation and the mental states of others, is still developing. To improve language models’ social intelligence, this study focuses on the Social IQA dataset, which poses a task requiring social and emotional commonsense reasoning. Building on the pre-trained RoBERTa and GPT2 models, we propose several architecture variations and extensions, and we leverage external commonsense corpora, to optimize the models for Social IQA. Our proposed system achieves results competitive with those of the top-ranking models on the leaderboard. This work demonstrates the strengths of pre-trained language models and provides viable ways to improve their performance on a particular task.
