University of Warwick
Publications service & WRAP

PHEME dataset of rumours and non-rumours

Zubiaga, Arkaitz, Wong Sak Hoi, Geraldine, Liakata, Maria and Procter, Rob (2016) PHEME dataset of rumours and non-rumours. [Dataset]

Research output not available from this repository.

Official URL: https://wrap.warwick.ac.uk/134772

Abstract

Breaking news leads to situations of fast-paced reporting in social media, producing all kinds of updates related to news stories, albeit with the caveat that some of those early updates tend to be rumours, i.e., information with an unverified status at the time of posting. Flagging information that is unverified can be helpful to avoid the spread of information that may turn out to be false. Detection of rumours can also feed a rumour tracking system that ultimately determines their veracity. In this paper we introduce a novel approach to rumour detection that learns from the sequential dynamics of reporting during breaking news in social media to detect rumours in new stories. Using Twitter datasets collected during five breaking news stories, we experiment with Conditional Random Fields as a sequential classifier that leverages context learnt during an event for rumour detection, which we compare with the state-of-the-art rumour detection system as well as other baselines. In contrast to existing work, our classifier does not need to observe tweets querying a piece of information to deem it a rumour, but instead we detect rumours from the tweet alone by exploiting context learnt during the event. Our classifier achieves competitive performance, beating the state-of-the-art classifier that relies on querying tweets with improved precision and recall, as well as outperforming our best baseline with nearly 40% improvement in terms of F1 score. The scale and diversity of our experiments reinforce the generalisability of our classifier.

Item Type: Dataset
Alternative Title: Data for Detection and resolution of rumours in social media : a survey
Subjects: H Social Sciences > HM Sociology
Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Divisions: Faculty of Science, Engineering and Medicine > Science > Computer Science
Type of Data: Observational data
Library of Congress Subject Headings (LCSH): Social media, Natural language processing (Computer science), Data mining
Publisher: University of Warwick, Department of Computer Science
Official Date: 24 October 2016
Dates:
Date | Event
24 October 2016 | Published
Status: Not Peer Reviewed
Publication Status: Published
Media of Output (format): .json
Access rights to Published version: Open Access (Creative Commons)
Copyright Holders: University of Warwick
Description:

The data record consists of a zip archive containing sub-folders organised by event and an accompanying readme file.
The data is structured as follows. Each event has a directory with two subfolders, rumours and non-rumours. Each of these contains folders named with a tweet ID, one per source tweet. Within a tweet's folder, the 'source-tweet' directory holds the tweet itself, and the 'reactions' directory holds the set of tweets responding to that source tweet. Twitter retains the ownership and rights of the content of the tweets.
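
The layout described above can be walked with a short script. The following is a minimal Python sketch, not part of the dataset or its readme. It assumes the archive has been unzipped into a local folder, and that each tweet is stored as a single .json file inside the 'source-tweet' and 'reactions' directories (the record lists .json as the media format but does not spell out the file naming). The function names load_json_files and iter_threads are hypothetical.

```python
import json
from pathlib import Path


def load_json_files(directory):
    """Parse every .json file in a directory; return [] if the directory is missing."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    tweets = []
    for path in sorted(directory.glob("*.json")):
        # Assumption: hidden files (e.g. OS metadata) are not tweets, so skip them.
        if path.name.startswith("."):
            continue
        with path.open(encoding="utf-8") as handle:
            tweets.append(json.load(handle))
    return tweets


def iter_threads(root):
    """Yield one record per tweet folder under root/<event>/<label>/<tweet ID>/."""
    # Only directories are treated as events, so a top-level readme is ignored.
    for event_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for label in ("rumours", "non-rumours"):
            label_dir = event_dir / label
            if not label_dir.is_dir():
                continue
            for thread_dir in sorted(p for p in label_dir.iterdir() if p.is_dir()):
                yield {
                    "event": event_dir.name,
                    "label": label,
                    "tweet_id": thread_dir.name,
                    # Assumption: tweets are .json files in these two directories.
                    "source": load_json_files(thread_dir / "source-tweet"),
                    "reactions": load_json_files(thread_dir / "reactions"),
                }
```

Calling list(iter_threads("path/to/unzipped-archive")), where the path is a placeholder for wherever the archive was extracted, would give one dictionary per source tweet, labelled by event and by rumour or non-rumour status.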

RIOXX Funder/Project Grant:
Project/Grant ID | RIOXX Funder Name | Funder ID
611233 | Seventh Framework Programme | http://dx.doi.org/10.13039/100011102
EP/K000128/1 | [EPSRC] Engineering and Physical Sciences Research Council | http://dx.doi.org/10.13039/501100000266
687847 | H2020 European Research Council | http://dx.doi.org/10.13039/100010663
654024 | H2020 European Research Council | http://dx.doi.org/10.13039/100010663
UNSPECIFIED | Alan Turing Institute | http://dx.doi.org/10.13039/100012338
Contributors:
Contribution | Name | Contributor ID
Contact Person | Zubiaga, Arkaitz | 64180
