University of Warwick
Publications service & WRAP

Making the most of tweet-inherent features for social spam detection on Twitter

Wang, Bo, Zubiaga, Arkaitz, Liakata, Maria and Procter, Rob (2015) Making the most of tweet-inherent features for social spam detection on Twitter. In: 5th Workshop on Making Sense of Microposts, Florence, Italy, 18 May 2015. Published in: Proceedings of the 5th Workshop on Making Sense of Microposts, 1395 pp. 10-16. ISSN 1613-0073.

WRAP_paper_07.pdf - Published Version - Requires a PDF viewer.

Official URL: http://ceur-ws.org/Vol-1395/paper_07.pdf

Abstract

Social spam produces a great amount of noise on social media services such as Twitter, which reduces the signal-to-noise ratio that both end users and data mining applications observe. Existing techniques for social spam detection have focused primarily on the identification of spam accounts by using extensive historical and network-based data. In this paper we focus on the detection of spam tweets, which minimises the amount of data that needs to be gathered by relying only on tweet-inherent features. This enables the application of the spam detection system to a large set of tweets in a timely fashion, potentially applicable in a real-time or near real-time setting. Using two large hand-labelled datasets of tweets containing spam, we study the suitability of five classification algorithms and four different feature sets for the social spam detection task. Our results show that, by using the limited set of features readily available in a tweet, we can achieve encouraging results which are competitive when compared against existing spammer detection systems that make use of additional, costly user features. Our study is the first that attempts to generalise conclusions on the optimal classifiers and sets of features for social spam detection over different datasets.
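
As a rough illustration of what "tweet-inherent" can mean in practice, the sketch below computes a few features from the text of a single tweet, with no lookup of account history or network data. The function name and the specific features (character, word, URL, mention, hashtag and exclamation-mark counts) are assumptions chosen for illustration; the abstract does not list the paper's four feature sets, and this is not the authors' implementation.

```python
import re

# Simple patterns for entities that can be read off the tweet text itself.
# These regexes are illustrative assumptions, not the paper's definitions.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")


def tweet_inherent_features(text: str) -> dict:
    """Return a small dictionary of counts derived only from the tweet text.

    Hypothetical helper: the feature names are made up for this sketch.
    """
    return {
        "n_chars": len(text),
        "n_words": len(text.split()),
        "n_urls": len(URL_RE.findall(text)),
        "n_mentions": len(MENTION_RE.findall(text)),
        "n_hashtags": len(HASHTAG_RE.findall(text)),
        "n_exclamations": text.count("!"),
    }


example = tweet_inherent_features(
    "WIN a FREE phone!!! http://example.com #free @you"
)
```

A dictionary like this could then be passed to any standard classifier; because nothing beyond the tweet itself is fetched, the per-tweet cost stays small, which is the property the abstract relies on for near real-time use.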

Item Type: Conference Item (Paper)
Subjects: Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Divisions: Faculty of Science, Engineering and Medicine > Science > Computer Science
Library of Congress Subject Headings (LCSH): Spam filtering (Electronic mail), Microblogs, Online social networks
Journal or Publication Title: Proceedings of the 5th Workshop on Making Sense of Microposts
ISSN: 1613-0073
Official Date: 2015
Dates: 2015 (Published)
Volume: 1395
Page Range: pp. 10-16
Status: Peer Reviewed
Date of first compliant deposit: 7 April 2016
Date of first compliant Open Access: 7 April 2016
Conference Paper Type: Paper
Title of Event: 5th Workshop on Making Sense of Microposts
Type of Event: Workshop
Location of Event: Florence, Italy
Date(s) of Event: 18 May 2015
