CliniQG4QA

CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering

  • Advisor: Dr. Huan Sun
  • Duration: July 2019 - May 2020
  • Publication venue: IEEE BIBM 2021 (Best Paper Award)
  • Summary: Clinical question answering (QA) aims to automatically answer questions from medical professionals based on clinical texts. Studies show that neural QA models trained on one corpus may not generalize well to new clinical texts from a different institute or a different patient group, where large-scale QA pairs are not readily available for model retraining. To address this challenge, we propose a simple yet effective framework, CliniQG4QA, which leverages question generation (QG) to synthesize QA pairs on new clinical contexts and boosts QA models without requiring manual annotations. To generate the diverse types of questions that are essential for training QA models, we further introduce a seq2seq-based question phrase prediction (QPP) module that can be used together with most existing QG models to diversify the generation. Our comprehensive experimental results show that the QA corpus generated by our framework can improve QA models on the new contexts (up to 8% absolute gain in Exact Match), and that the QPP module plays a crucial role in achieving the gain. A minimal sketch of this generate-then-train pipeline is shown below. Our dataset and code are available at: https://github.com/sunlab-osu/CliniQG4QA/.
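
The following Python sketch illustrates the framework's overall flow under stated assumptions: a QPP module first proposes question phrases for an answer evidence span, and a QG model then completes a full question conditioned on each phrase, yielding a synthetic QA corpus for the new clinical texts. The names `QPPModel`, `QGModel`, `generate_qa_corpus`, and the toy stand-ins are hypothetical and do not reflect the actual interfaces in the released repository.

```python
# Illustrative sketch of a CliniQG4QA-style generation pipeline.
# All class/function names here are hypothetical stand-ins, not the repo's API.

from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class QAPair:
    """A synthesized question-answer pair grounded in a clinical context."""
    context: str
    answer_evidence: str  # span in the context that the question asks about
    question: str


# A QPP module predicts question phrases (e.g., "What dose of",
# "Why did the patient receive") for a given answer evidence, which
# diversifies the types of questions that get generated.
QPPModel = Callable[[str, str], List[str]]

# A QG model completes a full question conditioned on the context,
# the answer evidence, and the question phrase chosen by QPP.
QGModel = Callable[[str, str, str], str]


def generate_qa_corpus(
    contexts_with_evidence: List[Tuple[str, str]],
    qpp: QPPModel,
    qg: QGModel,
) -> List[QAPair]:
    """Synthesize a QA corpus on new clinical texts without manual annotation."""
    corpus: List[QAPair] = []
    for context, evidence in contexts_with_evidence:
        for phrase in qpp(context, evidence):          # diversify question types
            question = qg(context, evidence, phrase)   # generate a full question
            corpus.append(QAPair(context, evidence, question))
    return corpus


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end; in the paper these are
    # trained seq2seq models.
    def toy_qpp(context: str, evidence: str) -> List[str]:
        return ["What dose of", "Why did the patient receive"]

    def toy_qg(context: str, evidence: str, phrase: str) -> str:
        return f"{phrase} {evidence}?"

    data = [("Patient was started on 5 mg warfarin for atrial fibrillation.",
             "warfarin")]
    for pair in generate_qa_corpus(data, toy_qpp, toy_qg):
        print(pair.question, "->", pair.answer_evidence)
```

The synthesized pairs would then be used to fine-tune a reading-comprehension QA model on the new clinical contexts, in place of manually annotated QA data.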