Skip to content Skip to navigation
University of Warwick
  • Study
  • |
  • Research
  • |
  • Business
  • |
  • Alumni
  • |
  • News
  • |
  • About

University of Warwick
Publications service & WRAP

Highlight your research

  • WRAP
    • Home
    • Search WRAP
    • Browse by Warwick Author
    • Browse WRAP by Year
    • Browse WRAP by Subject
    • Browse WRAP by Department
    • Browse WRAP by Funder
    • Browse Theses by Department
  • Publications Service
    • Home
    • Search Publications Service
    • Browse by Warwick Author
    • Browse Publications service by Year
    • Browse Publications service by Subject
    • Browse Publications service by Department
    • Browse Publications service by Funder
  • Statistics
  • Help & Advice
University of Warwick

The Library

  • Login

Intelligent feature selection for neural regression : techniques and applications

Tools
- Tools
+ Tools

Zhang, Fu (2012) Intelligent feature selection for neural regression : techniques and applications. PhD thesis, University of Warwick.

[img]
Preview
Text
WRAP_THESIS_Zhang_2012.pdf - Submitted Version

Download (1799Kb) | Preview
Official URL: http://webcat.warwick.ac.uk/record=b2581803~S1

Abstract

Feature Selection (FS) and regression are two important technique categories in Data Mining (DM). In general, DM refers to the analysis of observational datasets to extract useful information and to summarise the data so that it can be more understandable and be used more efficiently in terms of storage and processing. FS is the technique of selecting a subset of features that are relevant to the development of learning models. Regression is the process of modelling and identifying the possible relationships between groups of features (variables). Comparing with the conventional techniques, Intelligent System Techniques (ISTs) are usually favourable due to their flexible capabilities for handling real‐life problems and the tolerance to data imprecision, uncertainty, partial truth, etc. This thesis introduces a novel hybrid intelligent technique, namely Sensitive Genetic Neural Optimisation (SGNO), which is capable of reducing the dimensionality of a dataset by identifying the most important group of features. The capability of SGNO is evaluated with four practical applications in three research areas, including plant science, civil engineering and economics. SGNO is constructed using three key techniques, known as the core modules, including Genetic Algorithm (GA), Neural Network (NN) and Sensitivity Analysis (SA). The GA module controls the progress of the algorithm and employs the NN module as its fitness function. The SA module quantifies the importance of each available variable using the results generated in the GA module. The global sensitivity scores of the variables are used determine the importance of the variables. Variables of higher sensitivity scores are considered to be more important than the variables with lower sensitivity scores. After determining the variables’ importance, the performance of SGNO is evaluated using the NN module that takes various numbers of variables with the highest global sensitivity scores as the inputs. In addition, the symbolic relationship between a group of variables with the highest global sensitivity scores and the model output is discovered using the Multiple‐Branch Encoded Genetic Programming (MBE‐GP). A total of four datasets have been used to evaluate the performance of SGNO. These datasets involve the prediction of short‐term greenhouse tomato yield, prediction of longitudinal dispersion coefficients in natural rivers, prediction of wave overtopping at coastal structures and the modelling of relationship between the growth of industrial inputs and the growth of the gross industrial output. SGNO was applied to all these datasets to explore its effectiveness of reducing the dimensionality of the datasets. The performance of SGNO is benchmarked with four dimensionality reduction techniques, including Backward Feature Selection (BFS), Forward Feature Selection (FFS), Principal Component Analysis (PCA) and Genetic Neural Mathematical Method (GNMM). The applications of SGNO on these datasets showed that SGNO is capable of identifying the most important feature groups of in the datasets effectively and the general performance of SGNO is better than those benchmarking techniques. Furthermore, the symbolic relationships discovered using MBE‐GP can generate performance competitive to the performance of NN models in terms of regression accuracies.

Item Type: Thesis or Dissertation (PhD)
Subjects: Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Library of Congress Subject Headings (LCSH): Data mining, Artificial intelligence, Neural networks (Computer science)
Date: May 2012
Institution: University of Warwick
Theses Department: School of Engineering
Thesis Type: PhD
Publication Status: Unpublished
Supervisor(s)/Advisor: Iliescu, Daciana ; Hines, Evor, 1957-
Extent: 266 leaves : ill., charts
Language: eng
URI: http://wrap.warwick.ac.uk/id/eprint/49639

Request changes to a record

Actions (login required)

View Item View Item

Document Downloads

More statistics for this item...
twitter

Email us: publications@warwick.ac.uk
Contact Details
About Us