
Evaluating the generalisability of neural rumour verification models
Kochkina, Elena, Hossain, Tamanna, Logan, Robert L., Arana-Catania, Miguel, Procter, Rob, Zubiaga, Arkaitz, Singh, Sameer, He, Yulan and Liakata, Maria (2023) Evaluating the generalisability of neural rumour verification models. Information Processing & Management, 60 (1). 103116. doi:10.1016/j.ipm.2022.103116. ISSN 0306-4573.
PDF: WRAP-evaluating-generalisability-neural-rumour-verification-models-2022.pdf (Published Version, 1773 KB). Available under a Creative Commons Attribution 4.0 licence.
Official URL: http://doi.org/10.1016/j.ipm.2022.103116
Abstract
Research on automated social media rumour verification, the task of identifying the veracity of questionable information circulating on social media, has yielded neural models achieving high performance, with accuracy scores that often exceed 90%. However, none of these studies focus on the real-world generalisability of the proposed approaches, that is, whether the models perform well on datasets other than those on which they were initially trained and tested. In this work we aim to fill this gap by assessing the generalisability of top performing neural rumour verification models covering a range of different architectures from the perspectives of both topic and temporal robustness. For a more complete evaluation of generalisability, we collect and release COVID-RV, a novel dataset of Twitter conversations revolving around COVID-19 rumours. Unlike other existing COVID-19 datasets, our COVID-RV contains conversations around rumours that follow the format of prominent rumour verification benchmarks, while being different from them in terms of topic and time scale, thus allowing better assessment of the temporal robustness of the models. We evaluate model performance on COVID-RV and three popular rumour verification datasets to understand limitations and advantages of different model architectures, training datasets and evaluation scenarios. We find a dramatic drop in performance when testing models on a different dataset from that used for training. Further, we evaluate the ability of models to generalise in a few-shot learning setup, as well as when word embeddings are updated with the vocabulary of a new, unseen rumour. Drawing upon our experiments, we discuss challenges and make recommendations for future research directions in addressing this important problem.
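The abstract describes an evaluation protocol rather than an implementation, but the core idea (train on one benchmark, test on another, optionally adding a handful of labelled target-domain examples for few-shot transfer) can be sketched as follows. This is a hypothetical illustration, not code from the paper: `load_dataset` is a placeholder loader, the dataset names in the usage comments are only examples of the benchmarks mentioned in the abstract, and a TF-IDF plus logistic-regression classifier stands in for the neural verification models.

```python
# Minimal sketch of a cross-dataset (topic/temporal) generalisability check.
# NOT the authors' code: a TF-IDF + logistic-regression stand-in replaces the
# neural verifiers, and load_dataset() is a hypothetical placeholder expected
# to return (claim_texts, veracity_labels) for a named rumour dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score


def load_dataset(name):
    """Placeholder: return (list_of_claim_texts, list_of_labels) for `name`."""
    raise NotImplementedError(f"Provide a loader for {name}")


def evaluate_transfer(source_name, target_name, few_shot_k=0):
    """Train on `source_name`, evaluate on `target_name`.

    With few_shot_k > 0, the first k labelled target examples are moved into
    the training pool (a simple stand-in for few-shot adaptation)."""
    X_src, y_src = load_dataset(source_name)
    X_tgt, y_tgt = load_dataset(target_name)

    X_train, y_train = list(X_src), list(y_src)
    X_train += list(X_tgt[:few_shot_k])
    y_train += list(y_tgt[:few_shot_k])
    X_test, y_test = X_tgt[few_shot_k:], y_tgt[few_shot_k:]

    # Fit the vectoriser on the training pool only, so vocabulary specific to
    # the unseen rumours stays unseen at training time (mirroring the paper's
    # concern about new-rumour vocabulary).
    vec = TfidfVectorizer(min_df=2)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(vec.fit_transform(X_train), y_train)
    preds = clf.predict(vec.transform(X_test))

    return {
        "accuracy": accuracy_score(y_test, preds),
        "macro_f1": f1_score(y_test, preds, average="macro"),
    }


# Illustrative usage (dataset names are examples, not bundled loaders):
# print(evaluate_transfer("PHEME", "COVID-RV"))        # zero-shot transfer
# print(evaluate_transfer("PHEME", "COVID-RV", 50))    # few-shot transfer
```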
Item Type: Journal Article
Subjects: B Philosophy. Psychology. Religion > BC Logic; H Social Sciences > HM Sociology; Q Science > Q Science (General)
Divisions: Faculty of Science, Engineering and Medicine > Science > Computer Science
Library of Congress Subject Headings (LCSH): Rumor; Rumor in mass media; Truth; Verification (Logic); Deep learning (Machine learning)
Journal or Publication Title: Information Processing & Management
Publisher: Elsevier
ISSN: 0306-4573
Official Date: January 2023
Volume: 60
Number: 1
Article Number: 103116
DOI: 10.1016/j.ipm.2022.103116
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Open Access (Creative Commons)
Date of first compliant deposit: 20 December 2022
Date of first compliant Open Access: 20 December 2022