Model tree based adaption strategy for software effort estimation by analogy

February 23, 2026


📝 Abstract

Background: Adaptation is a crucial step in analogy-based estimation. Current adaptation techniques often use linear size or linear similarity adjustment mechanisms, which are often unsuitable for datasets that have a complex structure with many categorical attributes. Furthermore, nonlinear adaptation techniques such as neural networks and genetic algorithms require considerable user interaction and parameter optimization to configure (for example, network model, number of neurons, activation functions, training functions, mutation, selection, and crossover). Aims: In response to these challenges, this paper proposes a new adaptation strategy that uses Model Tree based attribute distance to adjust estimation by analogy and derive new estimates. A Model Tree has the advantage that it can deal with categorical attributes, minimize user interaction, and improve the efficiency of model learning through classification. Method: Seven well-known datasets were used with 3-fold cross-validation to empirically validate the proposed approach. The proposed method was investigated using K analogies from 1 to 3. Results: Experimental results showed that the proposed approach produced better results than estimation by analogy with linear size adaptation, linear similarity adaptation, ‘regression towards the mean’, and null adaptation. Conclusions: Model Tree could form a useful extension to estimation by analogy, especially for complex datasets with a large number of categorical attributes.


📄 Content

Model Tree Based Adaption Strategy for Software Effort Estimation by Analogy

Mohammad Azzeh
Department of Software Engineering
Applied Science University
Amman, Jordan PO BOX 133
m.y.azzeh@asu.edu.jo


Keywords: Adaptation Strategy, Analogy-based estimation, Model Tree.

I. INTRODUCTION

Estimation by Analogy (EBA) makes a prediction for a new project by retrieving previously completed similar projects that have been encountered and remembered as historical projects [2, 7, 18, 21, 22, 23]. The effort values of the retrieved projects are reused as the proposed prediction for the new project. In a few cases, particularly when the dataset is sufficiently large and exhibits some normal characteristics, the effort of the retrieved project can be reused directly without adaptation [20]. In other cases, the retrieved project is commonly regarded as an initial solution that should be refined to capture the differences between the new and retrieved projects [20].
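The retrieval step described above can be sketched in a few lines. This is a minimal illustration of EBA without adaptation, not the paper's implementation: it assumes numeric features already scaled to [0, 1], Euclidean distance as the similarity measure, and the mean effort of the K closest projects as the estimate. The names `euclidean`, `eba_estimate`, and the sample `history` data are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def eba_estimate(target, history, k=1):
    """Estimate effort for `target` from its k most similar historical projects.

    `history` is a list of (features, effort) pairs. No adaptation is applied:
    the efforts of the retrieved projects are simply averaged.
    """
    ranked = sorted(history, key=lambda project: euclidean(target, project[0]))
    nearest = ranked[:k]
    return sum(effort for _, effort in nearest) / len(nearest)


# Hypothetical historical projects: (scaled features, effort in person-hours).
history = [
    ([0.1, 0.2], 100.0),
    ([0.8, 0.9], 400.0),
    ([0.2, 0.1], 120.0),
]

estimate_k1 = eba_estimate([0.1, 0.25], history, k=1)
estimate_k2 = eba_estimate([0.1, 0.25], history, k=2)
```

With K = 1 the estimate is the effort of the single closest project; larger K averages over more analogies, which is the unadjusted baseline that an adaptation step would then refine.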
Adaptation (synonymously, adjustment) is a mechanism used to capture the differences between the target project and its most similar project(s) and then derive a new estimate [14, 20]. It is an important step in estimation by analogy because it reflects the structure of the target project on the retrieved projects. Figure 1 illustrates the process of adjusted analogy-based estimation. Many adaptation techniques have been proposed in the literature to improve the prediction accuracy of estimation by analogy, such as ‘regression towards the mean’ [11], genetic-based similarity adjustment [6], linear size adjustment [10, 14, 24], and nonlinear adjustment [16].
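As an illustration of what an adaptation step does, the sketch below shows one commonly used form of linear size adjustment, in which each analogue's effort is scaled by the ratio of the target size to the analogue size. The exact formulations in the cited works may differ, so treat this as an assumption; the function name `linear_size_adjust` and the sample values are hypothetical.

```python
def linear_size_adjust(target_size, analogues):
    """Scale each analogue's effort by target_size / analogue_size, then average.

    `analogues` is a list of (size, effort) pairs for the retrieved projects.
    This is one common form of linear size adjustment, assumed here for
    illustration only.
    """
    adjusted = []
    for size, effort in analogues:
        if size <= 0:
            raise ValueError("analogue size must be positive")
        adjusted.append(effort * target_size / size)
    return sum(adjusted) / len(adjusted)


# A target of size 120 with one analogue of size 100 and effort 500.
single = linear_size_adjust(120, [(100, 500)])

# The same target with two analogues.
double = linear_size_adjust(120, [(100, 500), (200, 800)])
```

The adjustment only uses the size attribute, which illustrates why such mechanisms cannot exploit information held in other attributes.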

Figure 1. Process of adjusted analogy based method [16]

The majority of these adjustment mechanisms are linear, such as size adjustment, similarity adjustment and productivity adjustment; they are generally restricted to the size attribute and cannot accept non-numeric attributes [16]. In practice, these approaches are often inefficient because software project datasets often have a complex structure, exhibit non-normal characteristics [2, 3, 16], and contain a large proportion of categorical attributes [3, 8]. Moreover, learning-based adaptation techniques such as genetic algorithms and neural networks are often challenging to apply because they need parameter optimization and configuration that require many user decisions about, for example, network model, number of neurons, activation functions, training functions, mutation, selection, and crossover. In addition, learning and optimization through neural networks and genetic algorithms sometimes take longer to train and may reduce the performance of the model. Therefore, any useful adaptation mechanism should learn from the structure of the historical dataset and should involve categorical attributes, as they contain useful information for improving the accuracy of effort estimation [3, 8]. It should also minimize user interaction and reduce the number of configuration parameters.
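To make the point about categorical attributes concrete, the following sketch computes a per-attribute distance between two projects with mixed attribute types. It is an assumption for illustration, not the paper's definition: numeric attributes use the absolute difference normalized by the attribute's range, and categorical attributes use a 0/1 mismatch. The function `attribute_distances` and the sample projects are hypothetical.

```python
def attribute_distances(a, b, numeric_ranges):
    """Return a dict of per-attribute distances between projects `a` and `b`.

    `a` and `b` map attribute names to values. `numeric_ranges` maps each
    numeric attribute to its (min, max) in the historical dataset; any
    attribute not listed there is treated as categorical.
    """
    distances = {}
    for name in a:
        if name in numeric_ranges:
            low, high = numeric_ranges[name]
            span = high - low
            # Normalized absolute difference; 0.0 if the attribute is constant.
            distances[name] = abs(a[name] - b[name]) / span if span else 0.0
        else:
            # Categorical attribute: 0 when equal, 1 when different.
            distances[name] = 0.0 if a[name] == b[name] else 1.0
    return distances


project_a = {"size": 100, "language": "Java"}
project_b = {"size": 150, "language": "C"}
result = attribute_distances(project_a, project_b, {"size": (50, 250)})
```

A vector of this kind keeps the categorical attributes as usable inputs, whereas a purely size-based adjustment discards them.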
In response to these challenges, the present paper proposes a new flexible adaptation technique based on Model Tree (see Section 3 for more details) using attribute distance values between source historical projects and their closest analogies. In this approach, the conventional EBA procedure
