Goodness-of-fit tests for Weibull populations on the basis of records
Record data are used to reduce the time and cost of running experiments (Doostparast and Balakrishnan, 2010). It is important to check the adequacy of the models upon which inferences or actions are based (Lawless, 2003, Chapter 10, p. 465). Only a few works address goodness of fit based on record data. Smith (1988) proposed a form of residual for testing some parametric models. In most cases, however, the variation inherent in graphical summaries is substantial, even when the data are generated by the assumed model, and the eye cannot always determine whether features in a plot are within the bounds of natural random variation. Consequently, formal hypothesis tests are an important part of model checking (Lawless, 2003). In this paper, Kolmogorov-Smirnov and Cramér-von Mises type goodness-of-fit tests for record data are proposed, and a new weighted goodness-of-fit test is suggested. A Monte Carlo simulation study is conducted to derive the percentiles of the proposed statistics. Finally, real data sets are analysed to illustrate the results obtained.
💡 Research Summary
The manuscript addresses a gap in statistical methodology for assessing the goodness-of-fit (GOF) of Weibull distributions when the data are available only as records. Record data consist of successive minima (or maxima) and the number of observations between successive records, denoted \((R_i, K_i)\). Because the total number of observations is random and only extreme values are retained, classical GOF tests such as the Kolmogorov-Smirnov (KS) and Cramér-von Mises (C-M) tests lose power if applied directly.
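To make the \((R_i, K_i)\) notation concrete, here is a minimal Python sketch that extracts lower records from a sequence of observations. The function name `extract_lower_records` is hypothetical, and the counting convention (each \(K_i\) counts the record itself plus the following observations that fail to break it) is an assumption; the paper may define \(K_i\) slightly differently.

```python
from typing import List, Sequence, Tuple


def extract_lower_records(xs: Sequence[float]) -> Tuple[List[float], List[int]]:
    """Return (records, counts) for the lower records of xs.

    Assumed convention: counts[i] is the number of observations from
    records[i] up to, but not including, the next record.
    """
    records: List[float] = []
    counts: List[int] = []
    for x in xs:
        if not records or x < records[-1]:
            records.append(x)  # a new lower record
            counts.append(1)   # the record itself counts as one observation
        else:
            counts[-1] += 1    # observation did not break the current record
    return records, counts


records, counts = extract_lower_records([5.0, 7.0, 3.0, 4.0, 6.0, 1.0])
print(records)  # [5.0, 3.0, 1.0]
print(counts)   # [2, 3, 1]
```

Upper records (successive maxima) follow by flipping the comparison.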
The authors first derive the maximum-likelihood estimators (MLEs) of the Weibull shape \(\alpha\) and scale \(\sigma\) under both the inverse-sampling and random-sampling schemes. The log-likelihood for record data (equation 8) leads to two estimating equations: a closed-form expression for \(\sigma\) (equation 9) and a nonlinear equation for \(\alpha\) (equation 10). The latter must be solved numerically; the authors note that the solution is unique, referencing Lehmann & Casella (1998).
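The two-step structure (closed form for the scale, numerical root for the shape) can be sketched as follows. This is not a transcription of the paper's equations 8 to 10, which are not reproduced in this summary. It assumes lower records, the parametrization \(F(x) = 1 - \exp\{-(x/\sigma)^\alpha\}\), and the standard record likelihood \(\prod_i f(r_i)\,[1 - F(r_i)]^{k_i - 1}\). Under those assumptions the scale satisfies \(\sigma^\alpha = \sum_i k_i r_i^\alpha / m\) and the profile score for \(\alpha\) is strictly decreasing, so plain bisection suffices. The names `profile_score` and `weibull_record_mle` are hypothetical.

```python
import math
from typing import Sequence, Tuple


def profile_score(alpha: float, records: Sequence[float], counts: Sequence[int]) -> float:
    """Derivative of the profile log-likelihood in alpha (assumed model, see lead-in)."""
    m = len(records)
    r_max = max(records)
    logs = [math.log(r) for r in records]
    # Rescale by r_max so the powers stay in [0, 1]; the weighted mean is unchanged.
    weights = [k * (r / r_max) ** alpha for r, k in zip(records, counts)]
    weighted_mean_log = sum(w * l for w, l in zip(weights, logs)) / sum(weights)
    return m / alpha + sum(logs) - m * weighted_mean_log


def weibull_record_mle(records: Sequence[float], counts: Sequence[int]) -> Tuple[float, float]:
    """Return (alpha_hat, sigma_hat); needs at least two distinct positive records."""
    m = len(records)
    if m < 2:
        raise ValueError("at least two records are required")
    lo, hi = 1e-8, 1.0
    while profile_score(hi, records, counts) > 0.0:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):  # bisection on the strictly decreasing score
        mid = 0.5 * (lo + hi)
        if profile_score(mid, records, counts) > 0.0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    r_max = max(records)
    mean_w = sum(k * (r / r_max) ** alpha for r, k in zip(records, counts)) / m
    sigma = r_max * mean_w ** (1.0 / alpha)  # closed-form scale given alpha
    return alpha, sigma


# Made-up illustrative data, not taken from the paper.
alpha_hat, sigma_hat = weibull_record_mle([2.0, 1.0, 0.5], [1, 2, 1])
print(alpha_hat, sigma_hat)
```

Setting \(\alpha = 1\) in the scale formula gives the exponential-model estimate \(\sum_i k_i r_i / m\) under the same assumptions.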
Next, they adapt three GOF statistics to the record‑data context:
- KS-type statistic \(D_n\): the supremum of the absolute difference between the non-parametric maximum-likelihood estimator (NPMLE) of the survival function, \(\hat{\bar F}(x)\), and the hypothesised Weibull survival function \(\bar F_0(x) = 1 - F_0(x)\), which equals the supremum of \(|\hat F(x) - F_0(x)|\) for the corresponding cdfs.
- C-M-type statistic \(W_n^2\): the integrated squared difference \(\int (\hat F - F_0)^2 \, dF_0\).
- A new weighted statistic \(D_{S_n}\): inspired by the Anderson-Darling test, it weights the squared difference by \(1/F_0(x)\), thereby emphasizing discrepancies in the left tail, where record data are most informative.
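Written in display form, the three statistics described in the list read as follows. This only transcribes the verbal definitions given in this summary, with \(\hat F = 1 - \hat{\bar F}\) denoting the NPMLE of the cdf; the computational forms in the paper's equations 15-17 are not reproduced here and may use a different normalisation.

```latex
\begin{align*}
D_n     &= \sup_x \bigl|\hat F(x) - F_0(x)\bigr|, \\
W_n^2   &= \int \bigl(\hat F(x) - F_0(x)\bigr)^2 \, dF_0(x), \\
D_{S_n} &= \int \frac{\bigl(\hat F(x) - F_0(x)\bigr)^2}{F_0(x)} \, dF_0(x).
\end{align*}
```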
The authors provide explicit formulas for these statistics in terms of the record values and the cumulative products \(\hat\Phi_i\) (Proposition 3.1, equations 15-17). They prove (Proposition 3.2) that, conditional on at least two records, the distributions of \(D_n\), \(W_n^2\), and \(D_{S_n}\) do not depend on the true Weibull parameters; consequently, critical values can be obtained by simulating from a standard Weibull distribution with \(\alpha=\sigma=1\).
A Monte‑Carlo study with 100 000 simulated record samples yields percentile tables (Table 1) for the three statistics across a range of significance levels (0.01–0.99) and record counts. These tables enable practitioners to conduct exact GOF tests without resorting to asymptotic approximations.
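A Monte Carlo percentile table of this kind could be produced along the following lines. The sketch assumes lower records under inverse sampling (a fixed number `m` of records, with the last count set to 1) and draws from the standard exponential, which is the Weibull with \(\alpha=\sigma=1\). Records are generated directly: if \(p = F(R_i)\), the waiting time to the next record is geometric with success probability \(p\), and \(F(R_{i+1})\) is uniform on \((0, p)\). All function names are hypothetical, and `placeholder_statistic` is only a stand-in; it is not one of the paper's statistics, whose explicit formulas are not given in this summary.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def _open_unit(rng: random.Random) -> float:
    """Uniform draw on the open interval (0, 1)."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def simulate_record_sample(m: int, rng: random.Random) -> Tuple[List[float], List[int]]:
    """Draw m lower records and their counts from the standard exponential."""
    records: List[float] = []
    counts: List[int] = []
    p = _open_unit(rng)  # F(R_1) is uniform on (0, 1)
    for i in range(m):
        records.append(-math.log1p(-p))  # inverse cdf of the standard exponential
        if i == m - 1:
            counts.append(1)  # inverse sampling stops at the m-th record
        else:
            v = _open_unit(rng)
            # Geometric waiting time (support 1, 2, ...) with success probability p.
            counts.append(1 + int(math.log(v) / math.log1p(-p)))
            p *= _open_unit(rng)  # F(next record) is uniform on (0, p)
    return records, counts


def simulated_percentiles(
    statistic: Callable[[Sequence[float], Sequence[int]], float],
    m: int,
    probs: Sequence[float],
    n_rep: int = 10_000,
    seed: int = 0,
) -> List[float]:
    """Empirical percentiles of `statistic` over n_rep simulated record samples."""
    rng = random.Random(seed)
    values = sorted(statistic(*simulate_record_sample(m, rng)) for _ in range(n_rep))
    return [values[min(n_rep - 1, int(q * n_rep))] for q in probs]


def placeholder_statistic(records: Sequence[float], counts: Sequence[int]) -> float:
    """Stand-in statistic for illustration only: ratio of last to first record."""
    return records[-1] / records[0]


print(simulated_percentiles(placeholder_statistic, m=5, probs=[0.05, 0.5, 0.95], n_rep=2000))
```

Swapping `placeholder_statistic` for an implementation of \(D_n\), \(W_n^2\), or \(D_{S_n}\) (computed with the parameters re-estimated in each replicate) would give the corresponding critical values; the direct generation of counts avoids looping over the very long waiting times that lower records produce.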
The paper also tackles the special case where the Weibull reduces to an exponential distribution (\(\alpha=1\)). Testing \(H_0\colon X\sim\text{Exp}(\sigma)\) against \(H_1\colon X\sim W(\alpha,\sigma)\) is equivalent to testing \(\alpha=1\) versus \(\alpha\neq1\). Since a uniformly most powerful test is unavailable, the authors employ a generalized likelihood-ratio (GLR) approach. They derive the GLR statistic \(\Lambda\) (equation 19) and show that \(-2\log\Lambda\) asymptotically follows a \(\chi^2_1\) distribution, providing a practical critical value \(C^\star\) for a given significance level.
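The mechanics of the GLR test can be sketched in a few lines. As before, the log-likelihood below assumes lower records, the parametrization \(F(x) = 1 - \exp\{-(x/\sigma)^\alpha\}\), and the record likelihood \(\prod_i f(r_i)\,[1 - F(r_i)]^{k_i - 1}\); it is not taken from the paper's equation 19. The function names are hypothetical, and the unrestricted estimates `alpha_hat` and `sigma_hat` are supplied by the caller. The \(\chi^2_1\) tail probability uses the identity \(P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})\).

```python
import math
from typing import Sequence, Tuple


def weibull_record_loglik(alpha: float, sigma: float,
                          records: Sequence[float], counts: Sequence[int]) -> float:
    """Log-likelihood of lower-record data under the assumed Weibull model."""
    m = len(records)
    return (m * math.log(alpha)
            - m * alpha * math.log(sigma)
            + (alpha - 1.0) * sum(math.log(r) for r in records)
            - sum(k * (r / sigma) ** alpha for r, k in zip(records, counts)))


def glr_test(records: Sequence[float], counts: Sequence[int],
             alpha_hat: float, sigma_hat: float) -> Tuple[float, float]:
    """Return (-2 log Lambda, asymptotic chi-square(1) p-value) for H0: alpha = 1."""
    m = len(records)
    # Restricted MLE of the scale under the exponential null (alpha = 1).
    sigma0 = sum(k * r for r, k in zip(records, counts)) / m
    ll_full = weibull_record_loglik(alpha_hat, sigma_hat, records, counts)
    ll_null = weibull_record_loglik(1.0, sigma0, records, counts)
    stat = max(2.0 * (ll_full - ll_null), 0.0)  # guard against rounding below zero
    p_value = math.erfc(math.sqrt(stat / 2.0))  # P(chi2_1 > stat)
    return stat, p_value


# Made-up illustrative data and an approximate unrestricted fit, not from the paper.
records, counts = [2.0, 1.0, 0.5], [1, 2, 1]
alpha_hat = 2.24
sigma_hat = (sum(k * r ** alpha_hat for r, k in zip(records, counts)) / 3) ** (1 / alpha_hat)
print(glr_test(records, counts, alpha_hat, sigma_hat))
```

The null is rejected at level 0.05 when the statistic exceeds the \(\chi^2_1\) upper 5% point, about 3.84. With only a handful of records the asymptotic approximation may be rough, so the p-value should be read as indicative.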
An empirical illustration uses inter-call times from a telephone switching system (48 observations). The complete-data MLE of the exponential scale is \(\hat\sigma_C=0.934\). From the corresponding record data (five records with associated \(K_i\) values), the record-based exponential MLE is \(\hat\sigma_0=1.022\). Assuming a Weibull model, the record-based MLEs are \(\hat\alpha=1.1815\) and \(\hat\sigma=0.8181\). The computed GOF statistics are \(D_n=0.6979\), \(W_n^2=5.5140\), and \(D_{S_n}=8.8604\). None of these exceeds its 5% critical value from Table 1, so the Weibull model is not rejected. The GLR test similarly fails to reject the exponential null, which is consistent with the conclusion that the Weibull distribution adequately describes the data.
In summary, the manuscript delivers a complete methodological toolkit for Weibull GOF testing with record data: (i) MLE derivation tailored to records, (ii) three adapted GOF statistics, including a novel left-tail-sensitive test, (iii) Monte Carlo critical values that do not depend on the unknown parameters, and (iv) a GLR framework for testing exponential versus Weibull alternatives. The work fills a notable void in the literature on censored and record-type data, offering both theoretical justification and practical guidance. Limitations include reliance on numerical solutions for \(\alpha\) and the need for extensive simulation to obtain critical values for very small record counts. Future research could explore analytical approximations for the null distributions, robust estimation under model misspecification, and extensions to other parametric families.