Learning fixed-dimension linear thresholds from fragmented data
Goldberg, Paul W. (1999) Learning fixed-dimension linear thresholds from fragmented data. University of Warwick. Department of Computer Science. (Computer science research report). (Unpublished)

PDF (Department of Computer Science Research Report): WRAP_csrr362.pdf (429Kb)
Abstract
We investigate PAC-learning in a situation in which examples (consisting of an input vector and 0/1 label) have some of the components of the input vector concealed from the learner. This is a special case of Restricted Focus of Attention (RFA) learning. Our interest here is in 1-RFA learning, where only a single component of an input vector is given for each example. We argue that 1-RFA learning merits special consideration within the wider field of RFA learning. It is the most restrictive form of RFA learning (so that positive results apply in general), and it models a typical "data-fusion" scenario, where we have sets of observations from a number of separate sensors, but these sensors are uncorrelated sources. Within this setting we study the well-known class of linear threshold functions, the characteristic functions of Euclidean half-spaces. The sample complexity (i.e. sample-size requirement as a function of the parameters) of this learning problem is affected by the input distribution. We show that the sample complexity is always finite, for any given input distribution, but we also exhibit methods for defining "bad" input distributions for which the sample complexity can grow arbitrarily fast. We identify fairly general sufficient conditions for an input distribution to give rise to sample complexity that is polynomial in the PAC parameters ε⁻¹ and δ⁻¹. We give an algorithm (using an empirical ε-cover) whose sample complexity is polynomial in these parameters and the dimension (number of inputs), for input distributions that satisfy our conditions. The run-time is polynomial in ε⁻¹ and δ⁻¹ provided that the dimension is any constant. We show how to adapt the algorithm to handle uniform misclassification noise.
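As an informal illustration of the 1-RFA setting described in the abstract, the sketch below simulates an example source that draws a full input vector, labels it with a linear threshold function, and then reveals only one component together with the label. It is not the report's algorithm. The function names, the uniform input distribution, the weights and threshold, and the choice to let the caller pick the revealed index are all assumptions made for illustration.

```python
import random

def threshold_label(x, weights, threshold):
    """Linear threshold function: 1 if the weighted sum reaches the threshold, else 0."""
    total = sum(w * xi for w, xi in zip(weights, x))
    return 1 if total >= threshold else 0

def one_rfa_example(i, draw_input, weights, threshold, rng):
    """Draw a full input vector but reveal only component i and the 0/1 label."""
    x = draw_input(rng)  # the full vector stays hidden from the learner
    return i, x[i], threshold_label(x, weights, threshold)

def uniform_cube(dimension):
    """Hypothetical input distribution: independent uniform components on [0, 1]."""
    return lambda rng: [rng.random() for _ in range(dimension)]

if __name__ == "__main__":
    rng = random.Random(0)
    draw = uniform_cube(3)
    # Each call corresponds to a fresh example; only one coordinate is ever seen.
    for i in range(3):
        print(one_rfa_example(i, draw, [1.0, 1.0, 1.0], 1.5, rng))
```

Because every call draws a new vector, the learner never sees two components of the same example, which is the "fragmented data" restriction the title refers to.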
Item Type:  Report  

Subjects:  Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software  
Divisions:  Faculty of Science, Engineering and Medicine > Science > Computer Science  
Library of Congress Subject Headings (LCSH):  Supervised learning (Machine learning)  
Series Name:  Computer science research report  
Publisher:  University of Warwick. Department of Computer Science  
Official Date:  10 September 1999  
Number:  Number 362  
DOI:  CSRR362  
Institution:  University of Warwick  
Theses Department:  Department of Computer Science  
Status:  Not Peer Reviewed  
Publication Status:  Unpublished  
Funder:  European Strategic Programme of Research and Development in Information Technology (ESPRIT)  
Grant number:  20244 (ESPRIT)  