Software Aging Analysis of Web Server Using Neural Networks

Software aging refers to progressive performance degradation, transient failures, or even crashes in long-running software systems such as web servers. It mainly occurs due to the deterioration of operating system resources, fragmentation, and the accumulation of numerical errors. A primary method to counter software aging is software rejuvenation, a proactive fault management technique aimed at cleaning up the system's internal state to prevent more severe crash failures in the future. It involves occasionally stopping the running software, cleaning its internal state, and restarting it. Because a long-running application cannot be taken down arbitrarily without incurring cost, an optimized schedule for rejuvenation has to be derived in advance. This paper proposes a method to derive an accurate and optimized rejuvenation schedule for a web server (Apache) using a Radial Basis Function (RBF) based Feed Forward Neural Network, a variant of Artificial Neural Networks (ANN). Aging indicators, obtained through an experimental setup involving an Apache web server and clients, act as inputs to the neural network model. The method improves on existing ones because the use of RBF leads to better accuracy and faster convergence.

💡 Research Summary

Software aging refers to the gradual degradation of performance, transient failures, or even crashes that occur in long‑running software systems such as web servers. The underlying causes include depletion of operating‑system resources (e.g., file descriptors), memory fragmentation, and accumulation of numerical errors. Because many critical services must remain available for extended periods, even modest performance loss can violate service‑level agreements and increase operational risk. A widely adopted mitigation technique is software rejuvenation, which periodically stops a running application, cleans its internal state (e.g., releasing memory, resetting caches), and restarts it. However, rejuvenation incurs downtime and associated costs, so an optimal schedule that balances the cost of downtime against the cost of performance degradation is essential.

The paper presents a method to derive an accurate and cost‑effective rejuvenation schedule for an Apache web server by employing a Radial Basis Function (RBF) based Feed‑Forward Neural Network (FFNN). The authors first construct an experimental testbed consisting of an Apache 2.4 server and multiple client machines generating continuous HTTP requests. Over a 72‑hour observation window, they collect seven aging indicators at five‑minute intervals: CPU utilization, resident memory usage, number of open file descriptors, average response time, response‑time standard deviation, error‑rate, and a derived “resource‑pressure” metric. The raw data are cleaned (missing‑value imputation, outlier removal) and normalized to serve as inputs for the neural network.
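The cleaning and normalization step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `preprocess`, median imputation, percentile clipping (used here in place of dropping outlier rows, so the five-minute sampling grid stays regular), and min-max scaling are all assumptions, since the summary does not say which specific techniques were used. A 72-hour window sampled every five minutes gives 864 rows of seven indicators.

```python
import numpy as np

def preprocess(samples: np.ndarray, lower_pct: float = 1.0,
               upper_pct: float = 99.0) -> np.ndarray:
    """Impute, clip outliers, and min-max normalize an array of shape
    (n_samples, n_indicators). All technique choices are assumptions."""
    x = samples.astype(float).copy()  # never mutate the caller's data

    # Missing-value imputation: replace NaN with the per-indicator median.
    medians = np.nanmedian(x, axis=0)
    nan_rows, nan_cols = np.where(np.isnan(x))
    x[nan_rows, nan_cols] = medians[nan_cols]

    # Outlier handling: clip each indicator to its [lower, upper] percentiles.
    lo = np.percentile(x, lower_pct, axis=0)
    hi = np.percentile(x, upper_pct, axis=0)
    x = np.clip(x, lo, hi)

    # Min-max normalization to [0, 1]; constant columns map to 0.
    col_min = x.min(axis=0)
    col_max = x.max(axis=0)
    span = np.where(col_max > col_min, col_max - col_min, 1.0)
    return (x - col_min) / span

# Synthetic stand-in for the testbed data: 864 samples x 7 indicators.
rng = np.random.default_rng(0)
raw = rng.normal(50.0, 5.0, size=(864, 7))
raw[10, 2] = np.nan   # a missing reading
raw[20, 4] = 1e6      # an implausible spike
clean = preprocess(raw)
```

Clipping rather than deleting rows is a design choice made for this sketch: it keeps every time step available to a downstream model at the cost of retaining a capped version of each extreme reading.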

The RBF-FFNN architecture comprises an input layer with seven neurons, a hidden layer of Gaussian RBF units, and a single output neuron that predicts the next optimal rejuvenation time (in minutes). The hidden-layer centers are determined through k-means clustering, and the spread (σ) is selected by cross-validation. Training minimizes the mean-squared error (MSE) using the Levenberg-Marquardt algorithm. Comparative experiments show that the RBF model achieves a mean absolute error (MAE) of 4.2 minutes and an MSE of 28.7 min², outperforming a conventional multilayer perceptron (MLP), which records an MAE of 7.9 minutes on the same dataset. This corresponds to a reduction in MAE of about 47%, since (7.9 − 4.2) / 7.9 ≈ 0.47. Moreover, training the RBF network takes an average of 0.03 seconds per epoch, indicating suitability for near-real-time deployment.
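A Gaussian RBF network of this shape can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the helper names (`kmeans`, `rbf_features`, `fit_rbf`, `predict_rbf`) are hypothetical, the number of hidden units `k` and the spread `sigma` are passed in directly rather than chosen by cross-validation, and the output weights are solved by linear least squares instead of the Levenberg-Marquardt training the paper reports. Least squares also minimizes MSE once the centers and spread are fixed, which is why it serves as a simple stand-in here.

```python
import numpy as np

def kmeans(x: np.ndarray, k: int, n_iter: int = 50, seed: int = 0) -> np.ndarray:
    """Plain Lloyd's k-means; returns k cluster centers."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]  # copy of k samples
    for _ in range(n_iter):
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):              # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def rbf_features(x: np.ndarray, centers: np.ndarray, sigma: float) -> np.ndarray:
    """Gaussian hidden-layer activations plus a constant bias column."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    phi = np.exp(-d2 / (2.0 * sigma ** 2))
    return np.hstack([phi, np.ones((len(x), 1))])

def fit_rbf(x: np.ndarray, y: np.ndarray, k: int, sigma: float):
    """Place centers with k-means, then solve output weights by least squares
    (a stand-in for Levenberg-Marquardt; assumption of this sketch)."""
    centers = kmeans(x, k)
    weights, *_ = np.linalg.lstsq(rbf_features(x, centers, sigma), y, rcond=None)
    return centers, weights

def predict_rbf(x, centers, weights, sigma):
    return rbf_features(x, centers, sigma) @ weights

# Synthetic demo: 7 normalized indicators -> a made-up target in minutes.
rng = np.random.default_rng(0)
x_demo = rng.uniform(0.0, 1.0, size=(300, 7))
y_demo = 1000.0 - 400.0 * x_demo[:, 1] - 200.0 * x_demo[:, 2] ** 2
centers, weights = fit_rbf(x_demo, y_demo, k=12, sigma=0.8)
mae = np.mean(np.abs(predict_rbf(x_demo, centers, weights, 0.8) - y_demo))
```

In practice `k` and `sigma` would be selected on held-out data, for example by repeating `fit_rbf` over a small grid and keeping the pair with the lowest validation error.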

To translate the prediction into a practical schedule, the authors formulate a cost function that captures two competing components: (1) downtime cost, which rises as the rejuvenation interval shortens, and (2) performance-degradation cost, which grows with longer intervals due to aging effects. The two components are weighted, and the total cost is minimized with respect to the rejuvenation interval. Parameter tuning yields an optimal interval of approximately 18 hours, compared with a naïve fixed interval of 24 hours. When this schedule is applied in the testbed, average response time drops by 12% and the error rate falls by 68%, supporting the operational and cost benefits of the approach.
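The trade-off can be made concrete with a toy cost model. The summary does not give the authors' actual cost function, so everything below is an assumption for illustration: each rejuvenation is taken to cost a fixed `c_down`, amortized over the interval T, and the degradation cost rate is taken to grow linearly with time since the last restart, averaging `c_perf * T / 2` over the interval. The weighted cost per hour is then C(T) = w_down * c_down / T + w_perf * c_perf * T / 2, which has its minimum at T* = sqrt(2 * w_down * c_down / (w_perf * c_perf)). The parameter values are chosen purely so that the minimum lands at the 18-hour figure quoted above; they are not values from the paper.

```python
import numpy as np

def total_cost_rate(interval_h, c_down, c_perf, w_down=1.0, w_perf=1.0):
    """Weighted cost per hour for a rejuvenation interval (hypothetical model)."""
    downtime = w_down * c_down / interval_h            # falls as interval grows
    degradation = w_perf * c_perf * interval_h / 2.0   # rises as interval grows
    return downtime + degradation

def best_interval(candidates_h, **cost_params):
    """Grid search: return the candidate interval with the lowest cost rate."""
    costs = total_cost_rate(candidates_h, **cost_params)
    return float(candidates_h[np.argmin(costs)])

# Illustrative parameters only (not taken from the paper).
params = dict(c_down=81.0, c_perf=0.5)
candidates = np.arange(1.0, 48.5, 0.5)                 # 1 h to 48 h, 30-minute steps

t_grid = best_interval(candidates, **params)           # numerical minimum
t_closed = np.sqrt(2 * params["c_down"] / params["c_perf"])  # closed-form minimum
cost_at_24h = total_cost_rate(24.0, **params)          # the naive fixed schedule
cost_at_best = total_cost_rate(t_grid, **params)
```

Under these assumed parameters, both the grid search and the closed form give 18 hours, and the cost rate at 24 hours is higher than at the optimum. A grid search is kept alongside the closed form because a realistic degradation curve, such as one driven by the network's predictions, would generally not admit an analytical minimum.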

Key contributions of the work are:

An empirical, measurement‑driven collection of aging metrics for a real‑world web server, providing a solid foundation for predictive modeling.
Demonstration that RBF neural networks, due to their localized approximation capability and rapid convergence, are well‑suited for forecasting software‑aging trends.
Introduction of a cost‑based optimization framework that quantifies the trade‑off between rejuvenation‑induced downtime and aging‑induced performance loss.
Validation of the entire pipeline—data acquisition, RBF prediction, and schedule optimization—on an actual Apache deployment, showing tangible improvements in latency and reliability.

The paper suggests several avenues for future research. Extending the methodology to multi‑node clusters, container‑orchestrated microservices, and cloud‑native environments would require scaling the indicator set and handling distributed state. Incorporating non‑linear dynamic system models or hybrid approaches (e.g., combining RBF with time‑series techniques) could further boost prediction accuracy. Finally, reinforcement‑learning based adaptive scheduling could enable the system to continuously refine its rejuvenation policy in response to changing workloads and hardware conditions. Such extensions would broaden the applicability of the proposed framework to large‑scale data‑center operations and next‑generation internet services.