This paper details the application of a genetic programming framework to decision tree classification of soil data, with the aim of classifying soil texture. The database contains measurements of soil profile data. We applied GATree to generate the classification decision tree. GATree is a decision tree builder based on Genetic Algorithms (GAs). The idea behind it is simple but powerful: instead of using statistical metrics that are biased towards specific trees, it uses a more flexible, global metric of tree quality that tries to optimize both accuracy and size. GATree offers some features not found in other tree inducers, and it can produce better results for many difficult problems. Experimental results are presented that illustrate how well the approach generates the best decision tree for classifying soil texture in the soil data set.
Deep Dive into Soil Classification Using GATree.
Fundamental analysis involves the analysis of economic data, industry conditions, company fundamentals, and corporate financial statements [5]. Data mining consists of the extraction of interesting, novel knowledge from real-world databases [1]. Nearly boundless effort is expended in analyzing time series of market and company metrics to predict future outcomes and achieve above-average returns. This paper details an application of genetic programming to the problem of obtaining interesting knowledge from a soil dataset [7]. The database consists of measurements of soil profile data from various locations in the Rayalaseema Region. We propose a genetic programming framework for the induction of classification rules from databases. The framework outlines a method for classifying soil texture in the soil data set using a Genetic Algorithm. Experimental results are presented that illustrate how well the approach generates the best decision tree for classifying soil texture in the soil data set.
Of all soil characteristics, soil classification is the most important. It influences many other properties of great significance to land use and management. Soil texture is an important property for agricultural soil classification. It influences the fertility, drainage, water holding capacity, aeration, tillage, and strength of soils.
A set of soil properties is diagnostic for the differentiation of pedons. The differentiating characteristics are the soil properties that can be observed or inferred in the field, or measured in the laboratory. Diagnostic soil horizons (both surface and sub-surface), soil moisture regimes, soil temperature regimes, and the physical, physicochemical and chemical properties of the soils were used as criteria for classifying soils. The soils of various regions are classified into orders, sub-orders, great groups, sub-groups, families and finally series, as per USDA Soil Taxonomy [14]. The texture of the surface horizons varied from sand to silty clay loam, whereas in sub-surface horizons it varied from sand to clay [7].
The solid phase of soil can be divided into mineral matter and organic matter. The mineral particles can be further subdivided into classes based on size. The size classes of soil particles are Sand, Silt, and Clay. The proportion of Sand, Silt, and Clay present in a soil determines its texture.
In this paper the soil data consists of the attributes Depth, Sand, Silt, Clay, Sandbysilt, Sandbyclay, Sandbysiltclay, and TextureClass. The texture in the soil data varies from sand to silty clay loam in the surface horizons, whereas in sub-surface horizons it varies from sand to clay [2]. Table 1 shows the different soil survey symbols.
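The attribute list above can be pictured as one record per soil sample. The sketch below is only an illustration: it assumes that Sandbysilt, Sandbyclay and Sandbysiltclay are the ratios sand/silt, sand/clay and sand/(silt + clay), which the attribute names suggest but the text does not define. The sample values and the class label are made up.

```python
from dataclasses import dataclass


@dataclass
class SoilSample:
    depth: float        # Depth attribute (unit not stated in the text)
    sand: float         # percent sand
    silt: float         # percent silt
    clay: float         # percent clay
    texture_class: str  # TextureClass label (illustrative value below)


def derived_ratios(sample):
    """Compute the three ratio attributes under the ASSUMED definitions.

    A ratio is None when its denominator is zero.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "Sandbysilt": ratio(sample.sand, sample.silt),
        "Sandbyclay": ratio(sample.sand, sample.clay),
        "Sandbysiltclay": ratio(sample.sand, sample.silt + sample.clay),
    }


# Hypothetical sample, not taken from the paper's database.
sample = SoilSample(depth=15.0, sand=60.0, silt=15.0, clay=25.0,
                    texture_class="sandy clay loam")
ratios = derived_ratios(sample)
print(ratios)
```

Returning None for a zero denominator keeps a pure-sand sample from raising an error; how the original data set treats such cases is not stated.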
A Genetic Algorithm (GA) is a method for selecting the most suitable answer among feasible candidates, modeled on Charles Darwin's principle of natural selection [9]. GAs were developed during the 1960s and became popular through the work of John Holland, whose book “Adaptation in Natural and Artificial Systems” was first published in 1975. The GA process mimics natural selection: a candidate solution to the problem of interest is encoded as a string of numbers, the counterpart of a chromosome in biology. Each chromosome contains genes, and each gene stands for a decision variable. First, an initial population of the chosen size is created by assigning gene values at random. Each chromosome is then evaluated with an objective function to obtain its fitness, which measures the suitability of the chromosome, before it enters the selection step that determines which chromosomes contribute to the next generation.
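The steps just described (random initial population, fitness evaluation, selection) can be sketched on a toy problem. This is a minimal, generic GA and not GATree's implementation: the bit-string encoding, tournament selection, one-point crossover, mutation rate, elitism and all parameter values are assumptions chosen for illustration. The objective simply counts the 1-genes in a chromosome.

```python
import random


def one_max(chromosome):
    """Toy objective function: the number of 1-genes."""
    return sum(chromosome)


def tournament(population, fitness, rng, k=3):
    """Pick the fittest of k randomly drawn chromosomes."""
    return max(rng.sample(population, k), key=fitness)


def crossover(parent_a, parent_b, rng):
    """One-point crossover: head of one parent, tail of the other."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(chromosome, rng, rate=0.01):
    """Flip each gene independently with a small probability."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]


def run_ga(fitness, n_genes=20, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    # Initial population: gene values assigned at random.
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = max(population, key=fitness)
        history.append(fitness(best))
        children = [best]  # elitism: the best chromosome always survives
        while len(children) < pop_size:
            child = crossover(tournament(population, fitness, rng),
                              tournament(population, fitness, rng), rng)
            children.append(mutate(child, rng))
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = run_ga(one_max)
print(history[0], history[-1])
```

Because the best chromosome is copied unchanged into each new generation, the recorded best fitness can never decrease from one generation to the next.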
In this paper soil classification is performed using GATree [6], a decision tree builder based on Genetic Algorithms (GAs). The idea behind it is simple but powerful: instead of using statistical metrics that are biased towards specific trees, it uses a more flexible, global metric of tree quality that tries to optimize both accuracy and size. GATree offers some features not found in other tree inducers, and it can produce better results for many difficult problems.
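A global metric that balances accuracy against size can take many forms, and the text does not give GATree's formula. The sketch below assumes one plausible form, in which the squared number of correctly classified instances is scaled down as the tree grows. The function name `tree_fitness` and the parameter `x` are hypothetical; `x` controls how strongly size is penalized.

```python
def tree_fitness(correct, size, x=1000.0):
    """ASSUMED fitness: reward correct classifications, penalize tree size.

    correct: number of training instances the tree classifies correctly
    size:    number of nodes in the tree
    x:       larger values weaken the size penalty
    """
    return (correct ** 2) * x / (size ** 2 + x)


# Two hypothetical trees: a small one and a larger, slightly more accurate one.
small = {"correct": 90, "size": 5}
large = {"correct": 92, "size": 25}

for x in (1000.0, 1_000_000.0):
    print(x,
          tree_fitness(small["correct"], small["size"], x),
          tree_fitness(large["correct"], large["size"], x))
```

With the smaller `x` the compact tree scores higher despite two fewer correct classifications; with the larger `x` the size penalty nearly vanishes and the more accurate tree wins. A single tunable constant of this kind is one way to express a preference between accuracy and size.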
The main screen of GATree (Figure 1) allows us to select an active training dataset and evolve the decision tree. In the main window we can watch the best decision tree as it evolves over time. The right panel shows information about the current status of the evolution process.
GATree uses ARFF as its standard source format. An ARFF file is a simple text file that describes the problem instances and their attributes.
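To make the format concrete, the sketch below builds a small ARFF document as text: a `@relation` line, one `@attribute` line per attribute, and a `@data` section with one comma-separated instance per line. The helper `to_arff` is hypothetical, the attribute names follow the list given earlier (only a subset is used), and the rows and class labels are invented. Which ARFF attribute types GATree accepts is not stated here, so the use of numeric attributes is an assumption.

```python
def to_arff(relation, attributes, rows):
    """Render a minimal ARFF document.

    attributes: list of (name, kind); kind is a type keyword such as
                "numeric" or a list of nominal values.
    rows:       list of instances, one value per attribute.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        if isinstance(kind, (list, tuple)):
            kind = "{" + ",".join(kind) + "}"  # nominal attribute
        lines.append(f"@attribute {name} {kind}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Illustrative data only; these are not records from the paper's database.
attributes = [
    ("Depth", "numeric"),
    ("Sand", "numeric"),
    ("Silt", "numeric"),
    ("Clay", "numeric"),
    ("TextureClass", ["sand", "loam", "clay"]),
]
rows = [
    [15, 90.0, 5.0, 5.0, "sand"],
    [30, 40.0, 40.0, 20.0, "loam"],
    [60, 20.0, 20.0, 60.0, "clay"],
]

arff_text = to_arff("soil", attributes, rows)
print(arff_text)
```

The class labels here contain no spaces on purpose: nominal values with spaces would need quoting in ARFF, which this minimal helper does not handle.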
The statistics tab on the main screen provides several graphs of the evolution process.
These graphs allow us to follow the evolution process in real time and to discover potential problems and trends. As an example, when the Average Fitness of the population tends to be equal to the Fitness of the best Genome then there is little room
…(Full text truncated)…