University of Warwick
Publications service & WRAP

On the efficiency of data collection for multiple Naïve Bayes classifiers


Manino, Edoardo, Tran-Thanh, Long and Jennings, Nicholas R. (2019) On the efficiency of data collection for multiple Naïve Bayes classifiers. Artificial Intelligence, 275. pp. 356-378. doi:10.1016/j.artint.2019.06.010

WRAP-On-effiency-data-collection-multiple-Tran-Thanh-2019.pdf - Accepted Version
Embargoed item. Restricted access to Repository staff only - Requires a PDF viewer.
Available under License Creative Commons Attribution Non-commercial No Derivatives 4.0.

Official URL: http://dx.doi.org/10.1016/j.artint.2019.06.010


Abstract

Many classification problems are solved by aggregating the output of a group of distinct predictors. In this respect, a popular choice is to assume independence and employ a Naïve Bayes classifier. When we have not just one but multiple classification problems at the same time, the question of how to assign the limited pool of available predictors to the individual classification problems arises. Empirical studies show that the policies we use to perform such assignments have a strong impact on the accuracy of the system. However, to date there is little theoretical understanding of this phenomenon. To help rectify this, in this paper we provide the first theoretical explanation of the accuracy gap between the most popular policies: the non-adaptive uniform allocation, and the adaptive allocation schemes based on uncertainty sampling and information gain maximisation. To do so, we propose a novel representation of the data collection process in terms of random walks. Then, we use this tool to derive new lower and upper bounds on the accuracy of the policies. These bounds reveal that the tradeoff between the number of available predictors and the accuracy has a different exponential rate depending on the policy used. By comparing them, we are able to quantify the advantage that the two adaptive policies have over the non-adaptive one for the first time, and prove that the probability of error of the former decays at more than double the exponential rate of the latter. Furthermore, we show in our analysis that this result holds both in the case where we know the accuracy of each individual predictor, and in the case where we only have access to a noisy estimate of it.
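The contrast the abstract draws between non-adaptive and adaptive allocation can be illustrated with a small simulation. The sketch below is not the authors' analysis or code. It assumes binary labels, predictors that are all independently correct with the same known probability p, and a simple form of uncertainty sampling that always queries the task whose posterior is closest to 1/2. The function name `simulate` and all parameter values are hypothetical. Under these assumptions each vote shifts a task's Naïve Bayes log-odds by a fixed amount, so the state of each task is a biased random walk over its integer net vote count.

```python
import heapq
import random


def simulate(policy, n_tasks=2000, votes_per_task=10, p=0.7, seed=0):
    """Return the fraction of tasks labelled wrongly under one policy.

    Assumptions (illustrative, not from the paper): binary labels, and
    every predictor is independently correct with the same known
    probability p. Under Naive Bayes, each vote then moves a task's
    log-odds by +/- log(p / (1 - p)), so it is enough to track the
    integer net vote count of each task: a biased random walk.
    """
    rng = random.Random(seed)
    truth = [rng.choice((-1, 1)) for _ in range(n_tasks)]
    net = [0] * n_tasks  # net votes; log-odds = net * log(p / (1 - p))
    # Min-heap keyed on |net|: the most uncertain task sits on top.
    heap = [(0, i) for i in range(n_tasks)]
    heapq.heapify(heap)

    for t in range(n_tasks * votes_per_task):
        if policy == "uniform":
            i = t % n_tasks  # round-robin: equal votes for every task
        elif policy == "uncertainty":
            _, i = heapq.heappop(heap)  # posterior closest to 1/2
        else:
            raise ValueError(f"unknown policy: {policy}")
        # The predictor reports the true label with probability p.
        vote = truth[i] if rng.random() < p else -truth[i]
        net[i] += vote
        if policy == "uncertainty":
            heapq.heappush(heap, (abs(net[i]), i))

    errors = 0
    for k, y in zip(net, truth):
        # Sign of the log-odds; ties (k == 0) are broken at random.
        guess = (k > 0) - (k < 0) or rng.choice((-1, 1))
        errors += guess != y
    return errors / n_tasks


if __name__ == "__main__":
    for name in ("uniform", "uncertainty"):
        print(name, simulate(name))
```

Both policies spend the same total budget of votes. In this toy setting the adaptive policy would be expected to show the lower error rate, in line with the gap described in the abstract, but the sketch does not attempt to reproduce the paper's bounds or its exponential rates.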

Item Type: Journal Article
Subjects: Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Divisions: Faculty of Science > Computer Science
Library of Congress Subject Headings (LCSH): Machine learning, Classification, Semantics -- Data processing, Crowdsourcing, Pattern perception
Journal or Publication Title: Artificial Intelligence
Publisher: Elsevier BV
ISSN: 0004-3702
Official Date: October 2019
Dates:
  Date            Event
  October 2019    Published
  2 July 2019     Available
  30 June 2019    Accepted
Volume: 275
Page Range: pp. 356-378
DOI: 10.1016/j.artint.2019.06.010
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Restricted or Subscription Access
RIOXX Funder/Project Grant:
  Project/Grant ID: EP/I011587/1
  RIOXX Funder Name: Research Councils UK
  Funder ID: http://dx.doi.org/10.13039/501100000690
