Metrics of Risk Associated with Defects Rediscovery

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Software defects rediscovered by a large number of customers affect various stakeholders and may: 1) hint at gaps in a software manufacturer’s Quality Assurance (QA) processes, 2) lead to an overload of a software manufacturer’s support and maintenance teams, and 3) consume customers’ resources, leading to a loss of reputation and a decrease in sales. Quantifying risk associated with the rediscovery of defects can help all of these stakeholders. In this chapter we present a set of metrics needed to quantify the risks. The metrics are designed to help: 1) the QA team to assess their processes; 2) the support and maintenance teams to allocate their resources; and 3) the customers to assess the risk associated with using the software product. The paper includes a validation case study which applies the risk metrics to industrial data. To calculate the metrics we use mathematical instruments like the heavy-tailed Kappa distribution and the G/M/k queuing model.


💡 Research Summary

The paper addresses the problem of software defects that are rediscovered by many customers, a phenomenon that can indicate gaps in quality assurance (QA), overload support and maintenance teams, and erode customer confidence and sales. To turn this qualitative concern into actionable information, the authors propose a suite of seven quantitative risk metrics designed for three stakeholder groups: QA engineers, support/maintenance staff, and customers.

The core of the methodology consists of two statistical models. First, the distribution of the number of rediscoveries per defect is modeled with a compound Kappa distribution, a flexible heavy‑tailed family that can capture both thin‑tailed (exponential) and heavy‑tailed (Pareto‑like) behaviours. The authors compare the Kappa fit against exponential, log‑normal, geometric, and Pareto models on an industrial dataset (thousands of defects, tens of thousands of rediscoveries) and show that the Kappa distribution yields the lowest AIC/BIC and passes goodness‑of‑fit tests. This choice enables accurate estimation of tail probabilities, which are essential for assessing “avalanche” events where a defect triggers a flood of support requests.
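The model-selection step described above can be illustrated with a minimal sketch. It is not the authors' code: it fits only two of the candidate families (exponential and Pareto) by closed-form maximum likelihood and compares their AIC values. The Kappa fit itself is omitted because it needs numerical optimisation. The sample counts are hypothetical, the Pareto scale `x_min = 1` is an assumption, and treating discrete counts as continuous is a simplification.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion; the lower value indicates the better fit."""
    return 2 * n_params - 2 * log_likelihood

def fit_exponential(xs):
    """Closed-form MLE for an exponential model; returns (rate, log-likelihood)."""
    n = len(xs)
    total = sum(xs)
    rate = n / total
    return rate, n * math.log(rate) - rate * total

def fit_pareto(xs, x_min):
    """Closed-form MLE for a Pareto model with known scale x_min (assumed)."""
    n = len(xs)
    sum_logs = sum(math.log(x) for x in xs)
    alpha = n / (sum_logs - n * math.log(x_min))
    loglik = n * math.log(alpha) + n * alpha * math.log(x_min) - (alpha + 1) * sum_logs
    return alpha, loglik

# Hypothetical rediscovery counts per defect (not data from the paper).
counts = [1, 1, 1, 2, 1, 3, 1, 1, 2, 40, 1, 5, 1, 1, 120, 2, 1, 1, 7, 1]

_, ll_exp = fit_exponential(counts)
_, ll_par = fit_pareto(counts, x_min=1)
aic_exp = aic(ll_exp, n_params=1)
aic_par = aic(ll_par, n_params=1)
print(f"AIC exponential: {aic_exp:.1f}, AIC Pareto: {aic_par:.1f}")
```

On this made-up sample the Pareto model obtains the lower AIC, which is the kind of evidence that motivates a heavy-tailed family when a few defects account for most rediscoveries.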

Second, the handling of special‑build requests (patches) is modeled as a G/M/k queue. The arrival process of rediscovery reports is treated as a general (G) inter‑arrival distribution, while the service time (the time to produce, test, and deliver a special build) is assumed exponential (M). The parameter k represents the number of maintenance engineers dedicated to this task. By solving the steady‑state equations of the G/M/k system, the authors obtain key performance indicators such as average waiting time, system utilization, and probability of saturation. This queuing analysis directly informs staffing decisions and service‑level agreement (SLA) planning.
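To give a feel for how such a queue informs staffing, the sketch below computes the mean waiting time for the simpler M/M/k queue using the Erlang C formula. This is a deliberate simplification: it assumes Poisson arrivals rather than the general (G) arrival process used in the paper, so it should be read as a rough approximation and not as the authors' G/M/k solution. The function names and the rates in the example are hypothetical.

```python
import math

def erlang_c(k, arrival_rate, service_rate):
    """Probability that an arriving request has to wait in an M/M/k queue."""
    offered_load = arrival_rate / service_rate
    utilization = offered_load / k
    if utilization >= 1:
        raise ValueError("unstable queue: utilization must be below 1")
    # Terms for states in which at least one engineer is free.
    free_terms = sum(offered_load**n / math.factorial(n) for n in range(k))
    # Term covering all states in which every engineer is busy.
    busy_term = offered_load**k / (math.factorial(k) * (1 - utilization))
    return busy_term / (free_terms + busy_term)

def mean_wait(k, arrival_rate, service_rate):
    """Mean time a request spends waiting before an engineer picks it up."""
    return erlang_c(k, arrival_rate, service_rate) / (k * service_rate - arrival_rate)

# Hypothetical rates: 1 request per day, each engineer completes 1 build per day.
for engineers in (2, 3, 4):
    print(engineers, round(mean_wait(engineers, 1.0, 1.0), 4))
```

Running the loop for increasing `k` shows how quickly the expected wait falls as engineers are added, which is the trade-off the staffing analysis is meant to expose.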

The seven metrics are:

  1. M1 – Expected number of defects rediscovered more than d times. Used by QA to monitor defect severity trends across releases.

  2. M2 – Expected number of defects affecting at least a given percentage x of the customer base. Provides customers with a risk‑based comparison of products.

  3. M3 – Expected total number of rediscoveries for defects whose rediscovery count exceeds a threshold d. Helps support teams anticipate the volume of incoming calls.

  4. M4 – Probability that the total number of rediscoveries in a time window exceeds a spike threshold L. Enables proactive resource allocation and “burst‑handling” strategies.

  5. M5 – Complementary probability that the total number of rediscoveries stays below L. Useful for SLA compliance verification.

  6. M6 – Worst‑case (Value‑at‑Risk‑like) estimate of total rediscoveries for a confidence level α. Gives a quantitative bound for risk‑averse planning.

  7. M7 – Expected waiting time for customers to receive a special build. Directly ties the queuing model to customer experience metrics.
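As a rough illustration of what several of these metrics measure, the sketch below computes empirical analogues directly from observed counts. It does not use the paper's analytical expressions, which rely on the fitted Kappa distribution. All function names and data are hypothetical, and the M2 analogue assumes that each rediscovery comes from a distinct customer.

```python
import math

def defects_above(counts, d):
    """M1 analogue: number of defects rediscovered more than d times."""
    return sum(1 for c in counts if c > d)

def defects_affecting_share(counts, n_customers, x):
    """M2 analogue: defects rediscovered by at least fraction x of customers.
    Assumes one rediscovery per customer per defect."""
    return sum(1 for c in counts if c >= x * n_customers)

def rediscoveries_above(counts, d):
    """M3 analogue: total rediscoveries over defects with count above d."""
    return sum(c for c in counts if c > d)

def exceedance_probability(window_totals, spike_level):
    """M4 analogue: fraction of time windows whose total exceeds spike_level.
    The M5 analogue is one minus this value."""
    return sum(1 for t in window_totals if t > spike_level) / len(window_totals)

def empirical_quantile(window_totals, alpha):
    """M6 analogue: smallest observed total that at least a fraction alpha
    of the windows do not exceed."""
    ordered = sorted(window_totals)
    index = max(0, math.ceil(alpha * len(ordered)) - 1)
    return ordered[index]

# Hypothetical data: rediscoveries per defect and totals per time window.
counts = [1, 1, 2, 3, 5, 40, 120]
window_totals = [10, 20, 30, 40]

print(defects_above(counts, 4))
print(rediscoveries_above(counts, 4))
print(exceedance_probability(window_totals, 25))
```

With real data these counts are noisy in the tail, which is one reason a fitted distribution is preferable to raw frequencies for the rarest, most damaging defects.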

Each metric is expressed analytically using the cumulative distribution function (F) and its inverse (quantile function) of the Kappa‑modeled rediscovery count, together with the parameters of the G/M/k queue. For example, M1 simplifies to N·

