Machine Learning and value generation in Software Development: a survey
Predicting programming effort
Software effort estimation has received attention since the late 1970s and has been shown to significantly affect a project's workflow and overall success. Programming effort underestimation often leads to missed deadlines and deterioration of software quality; effort overestimation, on the other hand, is one of the reasons for project deceleration caused by budget constraints. Many software effort estimation methods have been proposed to estimate effort accurately as a function of a large number of factors. The most widely employed methods include expert models, logical-statistical models (parametric models such as SLIM and COCOMO; regression analysis), traditional machine learning algorithms (fuzzy logic, genetic algorithms and regression trees) and artificial neural networks. Coding effort is most often estimated in lines of code (LOC), function points (FP), use case points (UCP) or labour hours. This section describes the most common approaches to software development effort estimation (SDEE) in the literature, as well as their characteristics.
The importance of accurate effort predictions and the demand for automation of the estimation process motivated researchers to propose the first parametric models in the early 1980s. These models were tested on software datasets composed of real industrial data from completed projects. According to Srinivasan and Fisher, the three most prominent models are COCOMO, SLIM and Function Points. COCOMO and SLIM rely almost exclusively on source lines of code (SLOC) as their major input, while the function point approach utilises the number of transactions plus a few additional processing characteristics (online updating and transaction rates). Despite being evaluated on the available historical data (the COCOMO dataset), the above models have been shown to suffer from inconsistent performance due to the noisy nature of software datasets. Bayesian Networks (BNs) are a statistical model used for estimating Agile development effort. Dragicevic, Celar and Turic outlined the benefits of BNs, which include the capability to handle the vast uncertainty caused by a shortage of relevant information, the subjective nature of many metrics and the difficulty of gathering them.
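As a concrete illustration of the parametric approach, the basic COCOMO model estimates effort as a power function of program size. The sketch below uses Boehm's published coefficients for the three project classes; the example project size is invented for illustration.

```python
# Basic COCOMO: effort in person-months as E = a * KLOC^b,
# where KLOC is size in thousands of source lines of code.
# Coefficients are Boehm's published values for the three project classes.

COCOMO_COEFFICIENTS = {
    "organic":       (2.4, 1.05),  # small teams, familiar problem domains
    "semi-detached": (3.0, 1.12),  # intermediate size and complexity
    "embedded":      (3.6, 1.20),  # tight hardware/operational constraints
}

def cocomo_effort(kloc: float, mode: str = "organic") -> float:
    """Estimated effort in person-months: E = a * KLOC^b."""
    a, b = COCOMO_COEFFICIENTS[mode]
    return a * kloc ** b

# A hypothetical 32 KLOC organic project:
print(round(cocomo_effort(32.0), 1))  # about 91 person-months
```

Note how the exponent b > 1 encodes the diseconomy of scale that parametric models assume: doubling the code size more than doubles the estimated effort.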
Another common technique for predicting effort is expert estimation, which is suitable when domain knowledge cannot be leveraged by formal models. Despite its popularity, expert estimation exhibits considerable human bias. One example of such a system is Planning Poker, a gamified baseline strategy for SDEE in Agile environments in which developers make estimates by playing numbered cards. In a study by Moharreri et al., Planning Poker was shown to overestimate in 40% of instances and to have a very high MMRE score of 106.8%. Parametric models and expert systems are still widely used in industry and in studies; however, the need for better generalisation and overall performance has driven researchers to apply machine learning methods.
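The MMRE score mentioned above (Mean Magnitude of Relative Error) is the standard accuracy metric in SDEE studies; lower is better, and a value above 100% means estimates were off by more than the actual effort on average. A minimal sketch, with invented actual and estimated values:

```python
# MMRE: mean of |actual - estimated| / actual over all projects.

def mmre(actual, estimated):
    """Mean Magnitude of Relative Error; lower is better."""
    errors = [abs(a - e) / a for a, e in zip(actual, estimated)]
    return sum(errors) / len(errors)

actual    = [10.0, 20.0, 40.0]   # hypothetical actual efforts (hours)
estimated = [15.0, 18.0, 80.0]   # hypothetical Planning Poker estimates
print(f"MMRE = {mmre(actual, estimated):.0%}")  # -> MMRE = 53%
```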
Case-based reasoning (CBR) and decision trees (DTs) have been among the most effective and most researched ML models for SDEE. The results of these models are highly interpretable and are recognised as superior, or at least comparable, to those of parametric models and expert estimation. Wen et al. also asserted that CBR is more suitable than DTs for this task since it works well on smaller datasets, dataset size being one of the biggest limitations in SDEE research. It is worth mentioning that ensemble models combining different methods are often used to achieve even better precision. Moharreri et al. presented experimental evidence that a DT coupled with Planning Poker produces better estimates than either does on its own. Genetic algorithms and fuzzy logic have also been used in ensemble models, primarily for feature selection and for handling imprecise information in the datasets.
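The core of CBR for effort estimation is estimation by analogy: represent each completed project as a feature vector, retrieve the k most similar past cases, and adapt (here, simply average) their efforts. A minimal, dependency-free sketch with invented project data:

```python
# Case-based reasoning for effort estimation, reduced to its essence:
# nearest-neighbour retrieval over past projects plus averaging.
import math

def cbr_estimate(cases, new_project, k=2):
    """cases: list of (features, effort); returns mean effort of k nearest."""
    ranked = sorted(cases, key=lambda c: math.dist(c[0], new_project))
    return sum(effort for _, effort in ranked[:k]) / k

# (KLOC, team experience in years) -> effort in person-months; all invented.
history = [((10, 1), 12.0), ((12, 2), 11.0), ((30, 3), 30.0),
           ((35, 1), 40.0), ((60, 5), 55.0), ((65, 2), 70.0)]

print(cbr_estimate(history, (33, 2)))  # averages the two most similar cases -> 35.0
```

This also shows why CBR tolerates small datasets: it needs no model fitting at all, only a similarity measure and a handful of comparable past cases.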
The idea of Artificial Neural Networks (ANNs, or simply NNs), a model family that has proven its potential and outperformed traditional ML methods in a number of areas, was first proposed in the 1940s, inspired by biological neurons. ANNs are an attractive approach due to their remarkable computational power: the ability to learn nonlinear relations, high parallelism, noise tolerance, and learning and generalisation capabilities. The drawbacks of applying neural networks are a need for large datasets, computational expense and results that are significantly less interpretable than those of traditional machine learning methods. However, there are methods to overcome the interpretability limitation.
A comparative study of techniques such as regression trees, k-nearest neighbours, regression analysis and neural networks applied to software development effort estimation showed that neural networks had the best estimation ability. Further consideration was given to neural networks by various researchers, emphasising their superior capabilities in effort prediction. Thus, neural-network-based models most often provide the best effort estimates compared to traditional ML, and their accuracy increases with the amount of data supplied.
Predicting risks to the project
Several factors can affect and disrupt the software development cycle. Predicting risks is important because it helps to mitigate delays, unforeseen expenses and other dangers to the project. As noted in , software development projects carry more risks than other management projects because they involve more technical uncertainty and complexity. Most developers look for a methodology to minimise the important risks and improve their management, because risk factors affect the success or failure of any project.
Hu et al. identified four main types of risk: schedule risks, where a wrong schedule may derail development even at its very first stage; budget risks, since correct financing is a process that requires the utmost attention; technical risks, as developers making changes or fixes in unfamiliar code will make a relatively large number of mistakes until they get deep into the details of their task, and even if the damage of each mistake is minor, a large number of such mistakes can be critical for the project; and management risks, which may include a bad working environment, insufficient hardware reliability, low programming effectiveness, etc.
Wauters and Vanhoucke proposed a method for continuously assessing schedule risk using support vector regression, which reads periodic earned value management data from the project control environment and produces more reliable time and cost forecasts. The parameters of the support vector machine were tuned using a cross-validation and grid search procedure, after which a large computational experiment was conducted. The results showed that support vector regression outperforms the currently available forecasting methods. Additionally, a robustness experiment was set up to investigate the performance of the proposed method as the discrepancy between the training and test sets grows.
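The tuning procedure described above can be sketched with scikit-learn's grid search over SVR hyperparameters. The earned value data below is randomly generated for illustration; in the cited work the inputs come from periodic project-control measurements, and the grid values here are assumptions.

```python
# Support vector regression tuned by grid search with cross-validation,
# the setup used for continuous schedule-risk forecasting.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(0.5, 1.5, size=(80, 2))        # e.g. schedule/cost performance indices
y = 100 / X[:, 0] + rng.normal(0, 2, size=80)  # synthetic "real project duration" target

param_grid = {"C": [1, 10, 100], "epsilon": [0.1, 1.0], "kernel": ["rbf"]}
search = GridSearchCV(SVR(), param_grid, cv=5)  # 5-fold cross-validation
search.fit(X, y)

print("best parameters:", search.best_params_)
```

Cross-validated grid search matters here because SVR forecasts are quite sensitive to C and epsilon; picking them on the training fit alone would overstate accuracy.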
A wrong distribution of finances will later lead to their unreasonable use and overall project failure. To address this problem and predict risks related to budget and the distribution of finances, Ceylan, Kutlubay and Bener employed regression techniques to detect and identify budget-related software defects. These techniques are used to identify potentially defective software so that corrective action can be taken before the software is released to the production environment. The results of the 'initial system structure' show that the methods produce many faulty defect predictions when the entire dataset is used. In terms of algorithm performance, all of the learning algorithms used in the research showed similar prediction performance, with similar mean squared error values.
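The comparison-by-mean-squared-error setup above can be sketched as follows; the module metrics and defect counts are synthetic stand-ins for the dataset used in the cited study, and the two learners are illustrative choices.

```python
# Fit regression models to software module metrics and compare them
# by mean squared error on a held-out split.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(100, 3))            # e.g. size, complexity, churn metrics
y = 5 * X[:, 1] + rng.normal(0, 0.5, size=100)  # synthetic defect counts

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for model in (LinearRegression(), DecisionTreeRegressor(random_state=0)):
    mse = mean_squared_error(y_te, model.fit(X_tr, y_tr).predict(X_te))
    print(type(model).__name__, round(mse, 2))
```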
Even a small number of technical mistakes can be a critical factor for the project. In , machine learning classifiers have emerged as a way to predict the existence of a bug in a change made to a source code file. The classifier is first trained on software history data and then used to predict bugs. Large numbers of features adversely impact the scalability and accuracy of the approach. This technique is applied to predict bugs in software changes, and the performance of Naive Bayes and support vector machine classifiers is characterised.
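Change-level bug prediction of this kind can be sketched as a text-classification problem: featurise each change (here with a simple bag of words over the change description, a common simplification) and train the two classifiers named above on labelled history. The tiny corpus and labels below are invented for illustration.

```python
# Predicting whether a source code change introduces a bug, using
# Naive Bayes and an SVM over bag-of-words features of the change text.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

changes = ["fix null pointer in parser", "add logging to parser",
           "quick hack for release deadline", "refactor parser tests",
           "workaround for race condition", "update documentation"]
buggy = [1, 0, 1, 0, 1, 0]  # 1 = change was later linked to a bug

X = CountVectorizer().fit_transform(changes)
nb = MultinomialNB().fit(X, buggy)
svm = LinearSVC().fit(X, buggy)

print(nb.predict(X))   # predictions on the training changes themselves
print(svm.predict(X))
```

The bag-of-words step also illustrates the scalability concern in the text: every distinct token becomes a feature, so real change histories quickly produce very high-dimensional, sparse inputs.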
Management risks are among the most pervasive types of risk in software development, because when they exist they often cause the most damage. The study in aimed to predict the risks in software development projects by applying multiple logistic regression, used as a tool to control the software development process. The logistic regression analyses can grade the risk factors and help point out those that were important problems in the development process. These analytic results can inform the creation and development of strategies for the highlighted problems, which are important issues to manage, control and reduce the risk of error.
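The grading idea above can be sketched as follows: binary risk-factor indicators for past projects are regressed on a failure indicator, and the fitted coefficients rank which factors matter most. The factor names and project outcomes below are invented for illustration.

```python
# Grading management risk factors with multiple logistic regression:
# larger positive coefficients indicate factors more associated with failure.
import numpy as np
from sklearn.linear_model import LogisticRegression

factors = ["poor working environment", "unreliable hardware", "low effectiveness"]
X = np.array([[1, 0, 1], [0, 0, 0], [1, 1, 1], [0, 1, 0],
              [1, 0, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1]])
y = np.array([1, 0, 1, 0, 1, 0, 1, 0])  # 1 = project ran into serious trouble

model = LogisticRegression().fit(X, y)
for name, coef in zip(factors, model.coef_[0]):
    print(f"{name}: {coef:+.2f}")
```

In this toy data the first factor co-occurs perfectly with failure, so it receives the largest positive coefficient; on real project data the coefficients provide the graded ranking of risk factors described in the study.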