
Boosting low-resource biomedical QA via entity-aware masking strategies
Pergola, Gabriele, Kochkina, Elena, Gui, Lin, Liakata, Maria and He, Yulan (2021) Boosting low-resource biomedical QA via entity-aware masking strategies. In: EACL 2021: The 16th Conference of the European Chapter of the Association for Computational Linguistics, Virtual conference, 19-23 Apr 2021. Published in: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume pp. 1977-1985. doi:10.18653/v1/2021.eacl-main.169
PDF: WRAP-Boosting-low-resource-biomedical-QA-entity-aware-masking-2021.pdf - Accepted Version (904Kb)
Official URL: https://doi.org/10.18653/v1/2021.eacl-main.169
Abstract
Biomedical question-answering (QA) has gained increased attention for its capability to provide users with high-quality information from the vast scientific literature. Although an increasing number of biomedical QA datasets have recently been made available, those resources are still rather limited and expensive to produce; thus, transfer learning via pre-trained language models (LMs) has been shown to be a promising approach to leverage existing general-purpose knowledge. However, fine-tuning these large models can be costly and time-consuming, and often yields limited benefits when adapting to specific themes of specialised domains, such as the COVID-19 literature. Therefore, to further bootstrap their domain adaptation, we propose a simple yet unexplored approach, which we call the biomedical entity-aware masking (BEM) strategy. It encourages masked language models to learn entity-centric knowledge based on the pivotal entities characterizing the domain at hand, and employs those entities to drive the LM fine-tuning. The resulting strategy is a downstream process applicable to a wide variety of masked LMs, and it requires no additional memory or components in the neural architectures. Experimental results show performance on par with the state-of-the-art models on several biomedical QA datasets.
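The abstract does not spell out the masking procedure, so the following is only a minimal sketch of the general idea of entity-aware masking: whole entity mentions are masked rather than randomly chosen tokens. The function name `entity_aware_mask`, the `mask_prob` parameter, the `[MASK]` token string, and the assumption that entity spans come from an external entity recogniser are all illustrative and are not taken from the paper.

```python
import random
from typing import List, Optional, Sequence, Tuple

MASK = "[MASK]"  # assumed mask token string; real tokenizers define their own


def entity_aware_mask(
    tokens: Sequence[str],
    entity_spans: Sequence[Tuple[int, int]],
    mask_prob: float = 1.0,
    seed: Optional[int] = None,
) -> Tuple[List[str], List[Optional[str]]]:
    """Mask whole entity mentions instead of random tokens.

    entity_spans holds (start, end) token indices, end exclusive, assumed to
    come from some entity recogniser (hypothetical here). Each span is masked
    as a unit with probability mask_prob. Returns the masked tokens and a
    label list holding the original token at masked positions, None elsewhere.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    labels: List[Optional[str]] = [None] * len(tokens)
    for start, end in entity_spans:
        if not 0 <= start < end <= len(tokens):
            raise ValueError(f"invalid entity span: {(start, end)}")
        if rng.random() < mask_prob:
            for i in range(start, end):
                labels[i] = tokens[i]  # the LM would be trained to recover this
                masked[i] = MASK
    return masked, labels


# Illustrative input: two entity mentions in a tokenised sentence.
tokens = ["remdesivir", "inhibits", "sars", "-", "cov", "-", "2", "replication"]
spans = [(0, 1), (2, 7)]
masked_tokens, target_labels = entity_aware_mask(tokens, spans, mask_prob=1.0)
```

In a masked-LM fine-tuning loop, `masked_tokens` would be the model input and `target_labels` the prediction targets, so the loss concentrates on domain entities. How entities are selected and how often they are masked in the actual BEM strategy is described in the paper itself, not here.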
| Item Type: | Conference Item (Paper) |
|---|---|
| Subjects: | Q Science > Q Science (General); Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software; R Medicine > R Medicine (General) |
| Divisions: | Faculty of Science, Engineering and Medicine > Science > Computer Science |
| Library of Congress Subject Headings (LCSH): | Question-answering systems, Transfer learning (Machine learning), Data sets, Medicine -- Research -- Data processing |
| Journal or Publication Title: | Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume |
| Publisher: | Association for Computational Linguistics |
| Official Date: | April 2021 |
| Page Range: | pp. 1977-1985 |
| DOI: | 10.18653/v1/2021.eacl-main.169 |
| Status: | Peer Reviewed |
| Publication Status: | Published |
| Access rights to Published version: | Restricted or Subscription Access |
| Copyright Holders: | Copyright © 1963–2021 ACL |
| Date of first compliant deposit: | 3 March 2021 |
| Date of first compliant Open Access: | 13 December 2021 |
| Conference Paper Type: | Paper |
| Title of Event: | EACL 2021: The 16th Conference of the European Chapter of the Association for Computational Linguistics |
| Type of Event: | Conference |
| Location of Event: | Virtual conference |
| Date(s) of Event: | 19-23 Apr 2021 |