Improved Neural Modeling of Real-World Systems Using Genetic Algorithm Based Variable Selection


📝 Original Info

  • Title: Improved Neural Modeling of Real-World Systems Using Genetic Algorithm Based Variable Selection
  • ArXiv ID: 0706.1051
  • Date: 2007-06-08
  • Authors: Not available (no author information is given in the source)

📝 Abstract

Neural network models of real-world systems, such as industrial processes, made from sensor data must often rely on incomplete data. System states may not all be known, sensor data may be biased or noisy, and it is not often known which sensor data may be useful for predictive modelling. Genetic algorithms may be used to help to address this problem by determining the near optimal subset of sensor variables most appropriate to produce good models. This paper describes the use of genetic search to optimize variable selection to determine inputs into the neural network model. We discuss genetic algorithm implementation issues including data representation types and genetic operators such as crossover and mutation. We present the use of this technique for neural network modelling of a typical industrial application, a liquid fed ceramic melter, and detail the results of the genetic search to optimize the neural network model for this application.

📄 Full Content

When modeling a complex system (such as a chemical reactor), it is not generally known a priori which system states are necessary to develop a good model, or which states are observable based upon available sensor technology (although it is often known that many system states are not observable). In addition, there is a greater problem in identifying useful data. Complex dynamic systems such as the chemical reactor may be instrumented with tens, hundreds or even thousands of sensors. The problem with so much sensor information is that most of it will be irrelevant. Worse still, unfiltered incorporation of irrelevant data will adulterate a model, eroding its predictive capabilities.

A key data pretreatment problem is sensor redundancy. It is well known that smaller models are often better models [Sofge92]. This translates to fewer inputs and fewer hidden layer nodes. While it may be nice to have highly redundant data from a large number of sensors, in reality we may need only a few key sensors in order to produce a good model. The problem lies in determining which few sensors to choose and ignoring most of the rest. This is compounded by the fact that, because sensor response characteristics and noise levels differ, the data in aggregate contain a considerable amount of noise and bias.
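To make the link between input count and model size concrete, the following sketch counts the trainable parameters of a network. It assumes a fully connected network with a single hidden layer and one bias per node; the source does not specify the architecture, and the function name and the 4-sensor, 3-node configuration are hypothetical.

```python
def mlp_param_count(n_inputs, n_hidden, n_outputs=1):
    """Weights plus biases in a fully connected net with one hidden layer.

    Assumed architecture for illustration only; not taken from the paper.
    """
    hidden_params = (n_inputs + 1) * n_hidden    # +1 for each hidden node's bias
    output_params = (n_hidden + 1) * n_outputs   # +1 for each output node's bias
    return hidden_params + output_params


full = mlp_param_count(n_inputs=20, n_hidden=10)  # all 20 sensors as inputs
small = mlp_param_count(n_inputs=4, n_hidden=3)   # a hypothetical 4-sensor subset
print(full, small)  # 221 19
```

Under these assumptions, dropping from 20 inputs and 10 hidden nodes to 4 inputs and 3 hidden nodes cuts the parameter count by more than a factor of ten, so far fewer weights have to be fitted from the same amount of data.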

In the example given in this paper, modelling of a liquid fed ceramic melter (LFCM) process is undertaken in order to predict the surface level. The melt chamber is instrumented with 20 thermocouple sensors placed at different sites within the chamber. Each sensor may have a slightly different characteristic response curve due to differences in manufacturing, usage history, etc. Each sensor is also susceptible to some level of noise. We take a time history of data from all 20 sensors, store it in our database, and then use this database to train a neural network model. Some sensors, such as those near the surface in the reactor vessel, may offer fairly high-variance data throughout the process but be largely irrelevant to accurately predicting final product quality. We would like to select a near-optimal set of sensor variables in order to train a neural network model with the greatest predictive accuracy.

A genetic algorithm (GA) is fundamentally a search method used to optimize a complex system whose space of possible solutions is too large to explore fully, so that a true optimal solution cannot practically be located. The GA search procedure is inspired by the rules of natural selection in Darwinian evolution, which suggest that only the fittest members of a group survive and then recombine genetically with other fit members to yield even fitter members, thereby passing their successful characteristics on to the next generation [Holland75]. Less competitive members of the group are discarded or die off and are not recombined, and thus the characteristics that they carry are not propagated. Thus a population “evolves”, with successive generations replacing older ones and more successful members replacing less successful ones. Each member of the population, called a “chromosome”, is represented by a string of “genes”, which are encoded characteristics to be optimized.
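The survive, recombine, and replace cycle described above can be sketched as a short loop. This is a minimal illustration, not the paper's implementation: the function name `evolve`, the bit-string genes, the keep-the-top-half selection rule, single-point crossover, and all default parameter values are assumptions chosen for brevity.

```python
import random


def evolve(fitness, n_genes, pop_size=20, generations=30,
           mutation_rate=0.02, seed=0):
    """Toy GA over bit strings; returns the best chromosome and a fitness history."""
    rng = random.Random(seed)
    # Initial population: random strings of 0/1 genes.
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))      # best score in this generation
        survivors = pop[: pop_size // 2]     # less fit half is discarded
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)  # two fit parents recombine
            cut = rng.randrange(1, n_genes)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Occasional mutation: flip a gene with small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = survivors + children           # population size stays fixed
    return max(pop, key=fitness), history


# Toy fitness (an assumption): count of 1-genes, so the optimum is all ones.
best, history = evolve(fitness=sum, n_genes=16)
print(best, history[0], history[-1])
```

Because the surviving half is carried over unchanged, the best score recorded in `history` can never decrease from one generation to the next in this sketch. Other selection schemes do not necessarily share that property.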

The genes need to be defined for a given application such that finding a better set of genes means finding a better solution to the problem. A GA can perform variable selection if each gene in a chromosome represents an available sensor variable. The fitness of each chromosome is judged by how good the models generated from that combination of variables are, in terms of accuracy and robustness. An initial population of chromosomes is generated by choosing a string length (the number of genes) and randomly assigning a variable to each gene. The GA search is then set in motion, and the chromosomes compete, reproduce, and die off as they are replaced by fitter chromosomes. It is usually desirable to maintain a fixed-size population so that the fitter chromosomes quickly replace the less fit ones. An occasional mutation is introduced so that genes (variables) which may be genuinely useful are not eliminated early, possibly because they were randomly combined with very noisy variables, and then never incorporated again. A population in which this happens is in danger from a lack of genetic variation; to avoid this, a mutation rate is predetermined and mutated chromosomes are introduced into the population at regular intervals during the GA search. Because parameters such as the string length, population size, and mutation rate are application dependent, it is not possible to know beforehand which values will work best. The GA process is automated, with automatic gene sequence selection, model building and discarding, and evaluation of the accuracy and robustness of the models (scoring). Successive generations inherit the best characteristics of the previous generation, while the less valuable characteristics are eliminated.
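The encoding described above, where each gene names one sensor variable, can be sketched as follows. Everything here is illustrative: the helper names, the synthetic data (in which the target depends only on sensors 3 and 7), and above all the fitness function are assumptions. In place of training a neural network, the stand-in fitness fits a one-variable least-squares line to the mean of the selected sensors and scores it by held-out error.

```python
import random

N_SENSORS = 20  # the LFCM example in the text has 20 thermocouples


def random_chromosome(length, rng):
    """Randomly assign a sensor index to each gene."""
    return [rng.randrange(N_SENSORS) for _ in range(length)]


def mutate(chromosome, rate, rng):
    """With probability `rate`, reassign a gene to a random sensor.

    This lets a variable that was dropped early re-enter the population.
    """
    return [rng.randrange(N_SENSORS) if rng.random() < rate else g
            for g in chromosome]


def make_data(n_rows, rng):
    """Synthetic readings (made up): target depends only on sensors 3 and 7."""
    rows = [[rng.gauss(0, 1) for _ in range(N_SENSORS)] for _ in range(n_rows)]
    target = [r[3] + r[7] + rng.gauss(0, 0.05) for r in rows]
    return rows, target


def fitness(chromosome, rows, target):
    """Stand-in score (NOT a neural network): negative held-out MSE of a
    least-squares line fitted to the mean of the selected sensors."""
    x = [sum(r[g] for g in chromosome) / len(chromosome) for r in rows]
    half = len(x) // 2
    xt, yt = x[:half], target[:half]          # first half: fit
    mx, my = sum(xt) / half, sum(yt) / half
    sxx = sum((a - mx) ** 2 for a in xt)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xt, yt))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    errors = [(slope * a + intercept - b) ** 2  # second half: evaluate
              for a, b in zip(x[half:], target[half:])]
    return -sum(errors) / len(errors)


rng = random.Random(1)
rows, target = make_data(200, rng)
population = [random_chromosome(4, rng) for _ in range(10)]
ranked = sorted(population, key=lambda c: fitness(c, rows, target), reverse=True)
print(ranked[0], fitness(ranked[0], rows, target))
```

Scoring the population this way gives the ranking that the selection step needs. Swapping in a trained neural network only requires replacing `fitness` with a function that builds and validates a model on the selected columns.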

Genetic algorithms are often thought of, discussed and implemented using

Reference

This content is AI-processed based on open access ArXiv data.
