Deriving the term-structure of loan write-off risk under IFRS 9 by using survival analysis: A benchmark study
Arno Botha ∗,a,b, Mohammed Gabru †,a, Marcel Muller ‡,a, and Janette Larney §,a,b

a Centre for Business Mathematics and Informatics & Unit for Data Science and Computing, North-West University, Potchefstroom, South Africa
b National Institute for Theoretical and Computational Sciences (NITheCS), Potchefstroom, South Africa

Abstract

The estimation of marginal loan write-off probabilities is a non-trivial task when modelling the loss given default (LGD) risk parameter in credit risk. We explore two types of survival models in estimating the overall write-off probability over default spell time, where these probabilities form the term-structure of write-off risk in aggregate. These survival models include a discrete-time hazard (DtH) model and a conditional inference survival tree. Both models are compared to a cross-sectional logistic regression model for write-off risk. All of these (first-stage) models are then ensconced in a broader two-stage LGD-modelling approach, wherein a loss severity model is estimated in the second stage. In expanding the model suite, a novel dichotomisation step is introduced for collapsing the write-off probability into a 0/1-value, prior to LGD-calculation. A benchmark study is subsequently conducted amongst the resulting LGD-models. We find that the DtH-model outperforms other two-stage LGD-models admirably across most diagnostics. However, a single-stage LGD-model still had the best results, likely due to the peculiar ‘L-shaped’ LGD-distribution in our data. Ultimately, we believe that our tutorial-style work can enhance LGD-modelling practices when estimating the expected credit loss under IFRS 9.

Keywords— IFRS 9; Loss Given Default (LGD); Write-off; Survival analysis
JEL: C44, C63, G21.
Word count (excluding front matter and appendices): 16,871

Disclosure of interest and declaration of funding: This work is not financially supported by an institution or study grant, and has no conflicts of interest that may have influenced the outcome of this work.

∗ ORCiD: 0000-0002-1708-0153; Corresponding author: arno.spasie.botha@gmail.com
† ORCiD: ; email: mogabru@gmail.com
‡ ORCiD: ; email: marcelcelliers@gmail.com
§ ORCiD: 0000-0003-0091-9917; email: janette.larney@nwu.ac.za

1 Introduction

In banking, a fundamental task is the estimation of the loss associated with a borrower who may default, i.e., credit risk. The level (and evolution) of credit risk dictates the amount by which a financial asset ought to be adjusted regularly, as governed by the International Financial Reporting Standard (IFRS) 9 from the IASB (2014). These regular adjustments are based on a statistical model of the asset’s expected credit loss (ECL). In turn, the ECL represents the probability-weighted sum of cash shortfalls that are expected to be forfeited over a specific time horizon; see IASB (2014, §5.5.17–18, §B5.5.28–31, §B5.5.41–44). This ECL-amount should be unbiased and be determined across a range of possible outcomes that can influence the asset’s value over time. Furthermore, estimating the ECL should consider the time value of money, past events, current conditions, and forward-looking information (e.g., macroeconomic forecasts). Changes in the ECL-amount are then reserved within a central loss provision, which ultimately absorbs future write-off amounts. One of the fundamental risk parameters within an ECL-model is that of the loss given default (LGD), which is the fraction of the outstanding balance at the default point that is expected to be lost.
There are four common methods for estimating this LGD-quantity, as described by Schuermann (2004), Van Gestel and Baesens (2009, pp. 217–222), and Baesens et al. (2016, §10); though we shall focus on the workout method that uses a bank’s own internal data. This method aims to mimic the real-world resolution process through which defaulted loans typically progress, with the goal of nursing the strained relationship between bank and borrower back to health. Finlay (2010, pp. 11–13) and Botha (2021, §2.2) describe a few remedial actions that, if successful, can induce a full recovery in a defaulted loan. In this case, the loan is said to have ‘cured’ from default and a zero loss is typically assigned. However, these remedial actions might fail, and the bank might be left with little choice but to initiate legal proceedings and/or foreclose on any available collateral towards recovering as much of the defaulted debt as soon as possible. The resulting receipts (or recoveries) are then offset against the last-known balance of the loan, whereafter any non-zero remainder is written off as a credit loss. As described by Larney et al. (2025), the workout process therefore culminates in either a cured or written-off outcome, with the remaining cases considered as unresolved (or right-censored). Considering write-offs, the workout method takes as input the resulting series of collected receipts, and calculates the discounted sum thereof over the workout period back to the default point. The percentage change between this discounted sum and the default balance is then defined as the empirical loss rate, or the actual LGD. As discussed and illustrated by Schuermann (2004), Calabrese and Zenga (2010), Loterman et al. (2012), Baesens et al. (2016, §10), and Larney et al. (2025), the distribution of the resulting loss rates typically has two defining characteristics.
These include a heavy right-skewed tail and bimodality in the distribution, with one of the modes always centred at 0 due to zero-loss cures. The extent of the skew would directly depend on the prevalence of zero-loss cured outcomes, which is anecdotally about 70–80% of resolved defaults for residential mortgages, as an example. These characteristics have inspired a particular modelling strategy called two-stage LGD-modelling that has become quite popular; see Leow and Mues (2012) and Gürtler and Hibbeln (2013) for an overview thereof. In this strategy, the LGD of a defaulted loan 𝑖 is decomposed into a write-off component 𝑤_𝑖 and a loss severity component 𝑙_𝑖, each forming a separate ‘stage’. The write-off component 𝑤_𝑖 represents the probability of writing/charging off this 𝑖 given a few input variables 𝒙_𝑖. The loss severity component 𝑙_𝑖 signifies the realised loss rate of 𝑖 in the event of write-off, i.e., the loss given write-off; itself also modelled as a function of input variables. Each component is then estimated separately using a particular modelling technique with input variables, e.g., logistic regression for 𝑤_𝑖 and linear regression for 𝑙_𝑖. Thereafter, the estimates of 𝑤_𝑖 and 𝑙_𝑖 are combined towards producing a final LGD-estimate, reconstituted as LGD = 𝑤_𝑖 · 𝑙_𝑖.

In this paper, we shall define the write-off component 𝑤_𝑖 as a function of time in default, at the very least. For a defaulted loan 𝑖, let 𝑡 = 𝜏_𝑑, …, 𝜏_𝑟 index the time in default from the default point 𝜏_𝑑 up to the resolution time 𝜏_𝑟. The collection of write-off probabilities for 𝑖 is then referred to as the term-structure of write-off risk given inputs 𝒙_𝑖, denoted as 𝒘(𝒙_𝑖) = {𝑤(𝑡, 𝒙_𝑖)} for 𝑡 = 𝜏_𝑑, …, 𝜏_𝑟.
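To make the reconstitution LGD = 𝑤_𝑖 · 𝑙_𝑖 concrete, consider the following minimal sketch in Python (the companion codebase of this paper is in R; this is purely illustrative). All names and numbers are hypothetical.

```python
def two_stage_lgd(write_off_prob: float, loss_severity: float) -> float:
    """Reconstitute the final estimate as LGD = w_i * l_i."""
    assert 0.0 <= write_off_prob <= 1.0 and 0.0 <= loss_severity <= 1.0
    return write_off_prob * loss_severity

# A toy term-structure of write-off risk over default spell time t = 1..5,
# combined with a constant loss severity estimate for the same loan.
term_structure = {t: w for t, w in enumerate([0.05, 0.12, 0.25, 0.18, 0.08], start=1)}
severity = 0.4
lgd_by_t = {t: two_stage_lgd(w, severity) for t, w in term_structure.items()}
print(lgd_by_t[3])  # 0.1
```

Note how the write-off probability varies with default spell time 𝑡 while, in this toy setup, the severity stays fixed; a full two-stage model would estimate both components from input variables 𝒙_𝑖.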
Modelling this term-structure across many defaulted loans is the primary focus of this paper. The premise hereof is that write-off risk changes with 𝑡 in a non-linear fashion, as we shall demonstrate later. If this relationship is ignored, then the estimation of LGD may very well become inaccurate and biased. In turn, this inaccuracy can negatively affect the ECL-amount under IFRS 9, and attenuate a bank’s loss provision. Given such an inaccurate ECL-amount, a bank may unnecessarily hold excess provisions (too high an ECL-amount), which poses an opportunity cost. On the other hand, a bank may also hold too little provision (too low an ECL-amount), thereby risking insolvency. The aim of producing unbiased ECL-estimates under IFRS 9 therefore becomes a matter of prediction accuracy, which is a non-trivial task in LGD-modelling.

In this paper, our objective and main contribution is to explore and model this write-off probability using a few techniques towards building bespoke LGD-models. In particular, we shall investigate a particular class of modelling techniques called survival analysis, which includes discrete-time hazard (DtH) models and conditional inference survival tree (ST) models. We compare both of these more dynamic survival techniques to a classical cross-sectional model for the write-off probability using logistic regression. As far as we know, this particular analysis and the use of DtH and ST models for write-off risk have not yet been explored in the literature, hence our contribution. For each model, we also introduce a novel dichotomisation step wherein the derived write-off probability is first collapsed into a 1 or a 0 value before calculating the LGD. Finally, we embed these models into a wider two-stage LGD-modelling approach, whereafter a loss severity model (given write-off) is built in the second stage.
This loss severity model is itself estimated using a Tweedie compound Poisson generalised linear model (GLM), following some experimentation, as will be defended later. The resulting two-stage LGD-models are then compared to a few simple single-stage GLM-based LGD-models in the interest of completeness. Our model comparison uses a variety of diagnostics, which we believe further differentiates our benchmark study from others. These (reusable) diagnostics include time-dependent measures of discriminatory power, prediction accuracy, and the extent to which predictions agree with observations when aggregated in various ways. One of these aggregation ways is by default spell time 𝑡, thereby forming the overall term-structure of write-off risk; itself a rather unique view inspired by survival analysis. Lastly, we perform a distributional analysis between the empirical and expected distributions of the LGD-predictions emanating from each model.

In conducting our comparative study, we shall use a rich dataset of residential mortgage loans from a large South African bank. This dataset spans January 2007 up to December 2022, during which time mortgages were continuously originated, ultimately containing 653,317 loan accounts. We believe that our work can guide practitioners in modelling the aggregated term-structure of write-off risk over time, thereby enabling greater accuracy in LGD-modelling. In so doing, the central aim of IFRS 9 can be better realised in producing timeous and accurate ECL-estimates towards sizing a bank’s loss provision.

This paper is structured as follows. In Sec. 2, we summarise the literature on two-stage LGD-modelling, with a particular focus on using survival analysis as modelling technique. Basic survival modelling concepts are discussed and illustrated in Sec.
3 towards formulating our setup mathematically, structuring our data, estimating the empirical write-off term-structure that our models shall try to recover, and specifying our models. We then evaluate the various write-off risk models in Sec. 4 across five main diagnostics. These results are complemented in Sec. 5 by a distributional comparison of the downstream LGD-models, whereafter we conclude the study in Sec. 6. Given its rather exotic nature, the basics of ST-models are discussed in the appendix. Other ancillary material in the appendix includes the correct specification of survival data; an illustration of a sampling representativeness measure (i.e., the resolution rate); a description of the selected input variables within our various models; and a short (but novel) optimisation procedure for dichotomising models that output probabilities. Finally, this work is accompanied by an open-source R-codebase, as maintained by Gabru et al. (2026).

2 Reviewing two-stage LGD-modelling and the use of survival analysis

The two-stage approach to LGD-modelling has its genesis in the work of Leow and Mues (2012), who demonstrated the approach using residential mortgage data from the UK. In particular, the authors decomposed the LGD into the product of a repossession model and a haircut model. The repossession probability model estimates the likelihood of repossessing the property as a function of a few input variables, e.g., loan-to-value (LTV), time on book, type of security, and a previous default indicator. The haircut model estimates the ratio between the sale price and the market value of the repossessed asset at default, i.e., the loss severity given repossession. They compared this two-stage model against a single-stage LGD-model based on ordinary least squares (OLS).
The authors found that the two-stage model can faithfully reproduce the empirical LGD-distribution with its peak at 0, whereas the single-stage model struggles to do so. This finding was corroborated by the work of Loterman et al. (2012), who compared 24 different techniques (including single-stage and two-stage approaches) for estimating the LGD. Most curiously, they found that sophisticated single-stage LGD-models (e.g., support vector machines and artificial neural networks) perform similarly to a two-stage LGD-model with simpler linear component models (e.g., logistic and linear regression). Furthermore, their LGD-decomposition within the two-stage approach foregoes the idea of repossession and rather focusses on whether the LGD was either equal to or greater than 0, i.e., a write-off risk component. This convention is a useful generalisation across product type, particularly since they have used six different datasets. For this reason, we too shall adopt a write-off risk component rather than repossession, given its generalisability and the fact that ‘loss’ is typically associated with a write-off event in credit risk modelling.

Originating from the biostatistical literature, survival analysis is a rather powerful class of techniques, as discussed by Singer and Willett (1993), Kleinbaum and Klein (2012), Kartsonaki (2016), and Schober and Vetter (2018). These techniques generally analyse the length of time until reaching some well-defined endpoint, should the event occur. Survival analysis therefore does not only predict the probability of an event occurring, but also its timing. Furthermore, survival analysis is able to use all available data, including those unresolved/right-censored cases. So far, its use in the literature of credit risk modelling has largely been restricted to modelling another risk parameter instead of the LGD, i.e., the probability of default (PD).
In this regard, see Banasik et al. (1999), Stepanova and Thomas (2002), Bellotti and Crook (2009), Bellotti and Crook (2013), Bellotti and Crook (2014), Dirick et al. (2017), Djeundje and Crook (2019), Breeden and Crook (2022), Botha et al. (2025), and Botha and Verster (2026). Nonetheless, and as we shall demonstrate, the use of survival analysis in LGD-modelling is still relatively scant.

2.1. Using survival analysis in LGD-modelling: Cox proportional hazards (CPH) models

One of the earliest works that explored the use of survival analysis in LGD-modelling is that of Witzany et al. (2012). In particular, they fit a Cox proportional hazards (CPH) model, which can leverage unresolved/right-censored defaults. Two other models were developed within a broader comparative study: 1) a single-stage ordinary least squares (OLS) model; and 2) a two-stage model consisting of a logistic regression model for estimating write-off probabilities, coupled with an average loss severity given write-off. Using a few custom diagnostics (e.g., a modified coefficient of determination), they found that the single-stage OLS-model is again outperformed by the two-stage model. Yet both of these models are significantly outperformed by the single-stage CPH-model that directly estimates the LGD as the surviving proportion of the default balance. J. Zhang and Thomas (2012) also compared a few survival models to an OLS-model within both a single-stage and two-stage setup. Their survival models included a CPH-model and an accelerated failure time (AFT) model in estimating the recovery rate. Since the latter AFT-model cannot handle zero-values in the outcome variable, the authors essentially adopted a two-stage approach towards estimating both survival models.
They first had to classify complete write-offs (i.e., a recovery rate of zero) using a separate logistic regression model, whereafter both survival models were built on the non-zero recovery rates. In contrast to Witzany et al. (2012), the authors ultimately achieved mixed results in that the OLS-model performed the best in certain metrics, whilst the reverse is true for survival models in other metrics. Nonetheless, these results bode well for using survival analysis in general, and serve as a premise for our work. The work of Fenech et al. (2016) examined the determinants of loan recovery outcomes using survival analysis on defaulted American commercial loans. The authors compared nonparametric (Kaplan–Meier), semi-parametric (CPH-models), and several parametric Cox-models. They found that the dynamics of debt recovery were best captured when assuming that the baseline hazard function follows a log-logistic distribution. In particular, their results revealed a "hump-shaped" hazard function, which peaked at 23 months post-default, after which the recovery likelihood declined again. Our study will partly corroborate this result in that the write-off probabilities over time have a similar right-skewed shape. Wood and Powell (2017) extended the application of survival analysis in two-stage LGD-modelling, though they swapped write-off risk for repossession risk in the LGD decomposition. Within the repossession component, they developed a series of CPH-models, each following the Fine-Gray fashion in contending with the competing nature of repossession vs cure events. Each CPH-model embeds a different probation period 𝑘 in defining (re)default over loan life. In particular, a defaulted loan that resumes payment is said to cure only after 𝑘 periods have lapsed, during which time default criteria must not apply.
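The spell-delimiting logic behind such a probation period can be sketched as follows. This is a hypothetical Python illustration (the companion codebase of this paper is in R), and all names are illustrative rather than any author's actual implementation.

```python
def default_spells(in_default: list[bool], k: int = 6) -> list[tuple[int, int]]:
    """Return (start, end) month indices of default spells, where a spell
    only ends once k consecutive non-default months have elapsed; shorter
    gaps are treated as noise and merged into the surrounding spell."""
    spells, start, gap = [], None, 0
    for t, flag in enumerate(in_default):
        if flag:
            if start is None:
                start = t
            gap = 0
        elif start is not None:
            gap += 1
            if gap == k:                      # probation satisfied: a true cure
                spells.append((start, t - k))
                start, gap = None, 0
    if start is not None:                     # trailing (right-censored) spell
        spells.append((start, len(in_default) - 1 - gap))
    return spells

# Two default episodes separated by a 3-month gap: merged into one spell
# under k = 6, but treated as two separate spells under k = 2.
history = [False]*2 + [True]*4 + [False]*3 + [True]*2 + [False]*8
print(default_spells(history, k=6))  # [(2, 10)]
print(default_spells(history, k=2))  # [(2, 5), (9, 10)]
```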
The premise of imposing such a probation period within the data is to subdue ‘noise’ in the repayment histories of those high-risk loans, which can otherwise exit and re-enter the default state multiple times over loan life. Put differently, 𝑘 should be sufficiently large so that a new default spell is a truly new and independent spell, instead of being an extension of the previous spell. The authors then performed a simulation study and showed that the cumulative incidence function (of the repossession event) differs substantially over time and across different 𝑘-values, where 𝑘 ∈ {1, 3, 6, 9, 12} months. We similarly contend with a probation period in our data and set 𝑘 = 6 months, though future work can certainly experiment in this regard towards loss-optimising this 𝑘-value. In Joubert et al. (2018b), the authors used CPH-models in modelling the write-off risk component, as motivated by the inherent ability of CPH-models to use right-censored information. They built two separate CPH-models for modelling the survival probability of either the write-off or cure event over time. Both survival curves were combined using the cumulative incidence function within a competing risks setup. These CPH-models were then compared to a logistic regression (LR) model in estimating write-off risk, whereafter all model outputs were multiplied with a simple loss severity model towards obtaining overall LGD-estimates. The authors found that the mean squared error (MSE) of the LGD-values produced by the CPH-model was substantially lower than that of the LR-model, when compared to the empirical LGD-values. The authors unfortunately did not provide other diagnostics of the various models, except for the MSE and decile-graphs between the empirical and expected LGD-values. Nonetheless, the results of Joubert et al.
(2018b) will inform the design of our own comparative study of LGD survival models, which will include an expanded set of model diagnostics. Joubert et al. (2018a) and Joubert et al. (2021) introduced a new approach to using survival analysis in single-stage LGD-modelling. Their approach refines the exposure-weighted approach from Witzany et al. (2012) to one that is weighted by the default balance using frequency weights. Furthermore, their approach can cater for negative cash flows, and it can embed the dynamics of recovering more than the default balance. However, and similar to Witzany et al. (2012) and J. Zhang and Thomas (2012), all of these authors formulate the survival probability 𝑆(𝑡) as the surviving proportion of the default balance up to a given time 𝑡. In so doing, one implicitly assumes that each unit of currency is an independent ‘life’ that has survived the write-off event. But consider that a single recovery (or receipt) 𝑅_𝑡 at time 𝑡 has, in fact, many (inter-dependent) units of currency in its make-up; particularly since they all derive from the same obligor. This variance-related aspect is yet unstudied to the best of our knowledge. Our work differs conceptually in that we explicitly model the time-dependent write-off probability separately from the loss severity, in following a two-stage approach to LGD-modelling. Using unsecured consumer loans, Li et al. (2023) improved LGD prediction by leveraging time-varying scores as input variables that emanate from separate survival models. Their premise is that no consensus exists on the best LGD-modelling technique, and that one should rather spend effort on crafting inputs (such as these scores) towards obtaining better results.
As such, they used a CPH-model in developing application scores, which reflect the PD at the point of loan application. Then, they built a multiplicative hazard (MH) model, which is based on a counting process formulation and generalises the CPH-model towards dealing with recurrent default events. This MH-model produces behavioural scores, which signify the PD at each time point during loan life up to the default point. Both score types are then used as input variables, together with others such as macroeconomic variables, which are consumed within four kinds of LGD-models: 1) Tobit regression; 2) regression trees; 3) logit-transformed regression; and 4) beta regression. While the authors found that the inclusion of these scores resulted in better LGD-models, their performance metrics (e.g., the coefficient of determination) still indicated a general trend of poor performance, with Tobit regression scoring the best results. Based on the known correlation between the PD and the LGD, they ultimately showed that the LGD prediction accuracy benefits from using time-dependent inputs, such as these survival scores.

2.2. Extensions to survival analysis in credit risk modelling

As for extensions, Larney et al. (2023) developed a promotion time cure (PTC) model in estimating the time to write-off amongst competing risks. Emerging from cancer studies, this type of survival model recognises that a certain proportion of defaulted loans will never experience write-off; i.e., they are immune. In particular, the defaulted account is said to have 𝑁 number of unobservable and competing causes (or "carcinogenic cells") for the main write-off event, such that the activation of any of these causes can trigger write-off. Examples of such latent causes may include the distressed borrower’s discretionary expenditure, or undisclosed debts. For 𝑁 ≥ 1 causes, the failure time is then defined as 𝑇 = min(𝑍_1, …
, 𝑍_𝑁), where each 𝑍_𝑗 represents the time required for the 𝑗th cause to be realised. A non-susceptible or cured account is said to have 𝑁 = 0 causes, where 𝑁 is commonly assumed to have a Poisson distribution. Larney et al. (2023) also contended that the time to write-off within LGD-modelling is influenced by latent factors, e.g., unobservable elements of the economy. Accordingly, they combined their gamma PTC-model with a frailty component to account for such factors. Doing so can model the hazard function more flexibly, which may lead to improved prediction accuracy. In fact, they demonstrated the superiority of such a PTC-model with gamma frailty using US corporate loans, having compared it against other parametric frailty types, including a PTC-model with no frailty. These frailty PTC-models can ultimately help characterise the unexplained heterogeneity in write-off times. However, assuming that certain accounts have write-off immunity may itself become an assumption subject to stress during economic crises. This assumption may very well invite criticism from regulators and auditors alike when using PTC-models practically. This waxing and waning of write-off immunity is yet unstudied to the best of our knowledge, and we shall therefore leave PTC-models outside of our study scope for the time being. In Botha and Verster (2026), the authors explored the use of discrete-time hazard (DtH) models for predicting the loan-level PD, which culminated in a data-driven tutorial. Their work included a rigorous survey of survival modelling in credit risk, a rich description of the necessary data structure, DtH-models themselves, and a demonstration thereof using residential mortgages.
They used the familiar generalised linear model (GLM) framework with a logit link function for estimating these DtH-models, which bodes well for the practical adoption of such models. These DtH-models were then assessed using a range of applicable diagnostics, such as time-dependent varieties of both ROC-analyses and Brier scores. Both of these diagnostics were succinctly reformulated for the credit risk modelling context, and implemented in the R programming language. The authors also compared the empirical vs expected term-structure of default risk, or the collection of average default probabilities over spell time. Overall, they found that the predictions of DtH-models agree quite closely with reality, depending on the quality of the input variables, which was itself varied. These results augur well for the use of DtH-models in estimating write-off probabilities, as we shall explore in this paper. As another survival modelling technique, consider a binary survival tree, which extends recursive partitioning methods to right-censored data; i.e., marrying survival analysis with decision trees. Such a tree is grown by recursively splitting the data into two "daughter nodes" for a particular input variable. During this process, a splitting rule is used that maximises the difference in the survival probabilities between these two nodes, as remarked by Frydman and Matuszyk (2022). In so doing, dissimilar cases are pushed apart, and each node eventually contains homogeneous cases with similar survival probabilities over time. This process continues until reaching a saturation point whereat each node contains at least 𝑛_0 > 0 cases, and the most extreme nodes are called terminal nodes (or leaves).
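The node-splitting idea just described can be sketched with a deliberately simplified Python illustration: for one numeric input, find the cut-point that pushes the two daughter nodes' write-off rates furthest apart, subject to a minimum node size 𝑛_0. Note that actual survival trees (and the conditional inference variety in particular) split on statistics computed from censored survival times, not on this raw event-rate difference; all names and data below are hypothetical.

```python
def best_split(x: list[float], event: list[int], n0: int = 2):
    """Return (cut, rate_left, rate_right) for the split that maximises the
    absolute difference in event rates between the two daughter nodes,
    with each node containing at least n0 cases."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = None
    for pos in range(n0, len(x) - n0 + 1):
        left = [event[i] for i in order[:pos]]
        right = [event[i] for i in order[pos:]]
        gap = abs(sum(left)/len(left) - sum(right)/len(right))
        if best is None or gap > best[0]:
            cut = (x[order[pos-1]] + x[order[pos]]) / 2
            best = (gap, cut, sum(left)/len(left), sum(right)/len(right))
    return best[1:]

# Toy data: loans with few months in default are rarely written off,
# those with many months mostly are; the search recovers the boundary.
x = [1, 2, 3, 4, 10, 11, 12, 13]
event = [0, 0, 0, 1, 1, 1, 0, 1]
print(best_split(x, event, n0=2))  # (3.5, 0.0, 0.8)
```

A real tree would apply this search recursively to each daughter node, over all input variables, until the stopping criterion on 𝑛_0 is reached.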
In fact, the authors explored an ensemble-based extension of this technique, called a random survival forest (RSF), which was applied within the context of PD-modelling with competing risks, i.e., default vs early settlement. Using Polish data on car leases, the authors compared this RSF-model to a classical competing risks CPH-model, itself built using the Fine-Gray method. They found that the former model had greater prediction power than the latter classical model, at least based on the time-dependent Brier scores of these models. Another work that leverages survival trees is that of Blumenstock et al. (2022), who developed four survival models amongst two competing risks (default and early settlement) within the context of PD-modelling. These models included a classical CPH-model with/without using the Fine-Gray method, as well as an RSF-model, and a deep learning-based model called "DeepHit". All four models seek to estimate the cumulative incidence function for a particular competing risk. The authors used the time-dependent concordance statistic (Harrell’s 𝑐) as their metric of choice in comparing the prediction performance across the four models. The DeepHit-model outperformed all others, unsurprisingly; however, the RSF-model was outperformed by only a relatively small margin in most of the experiments. This result attests to the robustness of a tree-based method such as RSF in holding its own against the more complex DeepHit-model. The authors also address the typical concern with machine learning models (i.e., their opaqueness and lack of explainability) by using "permutation importance" towards establishing the extent to which model predictions are influenced by individual input variables.
These permutation importance scores evaluate the decrease in model performance after ‘corrupting’ a particular input by adding random noise. Overall, the works of Frydman and Matuszyk (2022) and Blumenstock et al. (2022) demonstrated that advanced tree-based methods (e.g., RSF) can perform admirably in PD-modelling. This success may very well translate to LGD-modelling, or at least elements thereof, though there is yet little research on this aspect. In this regard, Ptak-Chmielewska and Kopciuszewski (2024) are probably the first who developed an RSF-based LGD-model, and compared it to a CPH-model in predicting the survival probability. Their diagnostics differ from ours in that they used the classical concordance 𝑐-statistic and the coefficient of determination (𝑅²), itself calculated using a bootstrap analysis. The authors found that the RSF-model produced a higher 𝑐-statistic, which again bodes well for tree-based methods in LGD-modelling. Therefore, our study too shall include such a tree-based method, and we review the fundamentals of a particular type of survival tree, called a conditional inference survival tree, in Appendix A. The use of survival analysis has also surfaced in other aspects of LGD-modelling. In particular, Jacobs (2024) proposed a modelling framework wherein the ultimate LGD is linked with a time-to-resolution concept using a joint fractional logit-survival model. Their framework produces both unconditional LGD forecasts for the purposes of pricing and capital modelling, and it produces conditional forecasts for loans currently in default. Using European small business loans, Betz et al. (2021) developed a Bayesian hierarchical modelling framework wherein the default resolution time (DRT) is intrinsically linked with the eventual loss severity.
They used a Weibull AFT-model for the DRT and a finite mixture model for the loss severity, thereby allowing for correlated shocks between resolution times and losses. Their results show that longer DRTs are associated with higher loss rates, and that economic crises exacerbate both, which can lead to severe underestimation in standard models (with a bias of up to 20 percentage points). Ultimately, both of these studies showcase the importance of explicitly modelling the link between time and loss severity. Our work is cognisant hereof, and both components within our two-stage LGD-modelling approach will therefore include a time factor. As shown, the use of survival analysis in LGD-modelling is still relatively new within the literature, at least when compared to its maturity in PD-modelling. While the two-stage approach is followed in some cases, its definition/decomposition often involves a repossession component instead of an explicit write-off component, which we believe to be more fundamental to ECL-estimation under IFRS 9. Our comparative study generally follows the design of Joubert et al. (2018b) and Loterman et al. (2012), though we amend both the list of diagnostics and the selection of techniques. In particular, our diagnostics assess various aspects of the models, including their discriminatory power, prediction accuracy, and the extent to which predictions agree with reality when aggregated in various ways. Our study therefore comprises a greater variety of diagnostics, which renders the benchmark study more valuable to practitioner and regulator alike. Regarding techniques, we include a few single-stage LGD-models as a baseline against which a variety of two-stage LGD-models are compared. In this regard, we believe our study to be the first to include DtH-models in this comparison.
While RSF-models were positively compared in previous works, little attention has been paid to one of their classical siblings – the humble and more explainable (conditional inference) survival tree, which we include in the comparison. Lastly, whereas some studies have focused on the competing-risks nature of LGD-modelling, we shall restrict our attention to building cause-specific (write-off) survival models in the interest of simplicity.

3 The estimation of lifetime write-off risk using survival analysis in discrete time

We present some basic concepts and notation in Subsec. 3.1 and discuss the structuring of credit data into a format that is conducive to survival analysis for LGD-modelling. We briefly formulate discrete-time survival analysis in Subsec. 3.2, and present the empirical term-structure of write-off risk, which our eventual modelling will strive to recover. Finally, the techniques are presented in Subsec. 3.3 towards modelling the write-off risk over an account's lifetime. Many of the aforementioned subsections (and appendix material) may be viewed as particular steps within a broader data-driven tutorial to modelling the write-off risk component in LGD-modelling.

3.1. Basic concepts & notation towards structuring credit data for LGD survival modelling

We shall start with some notation towards formalising the use of survival analysis in estimating write-off risk. As discussed by Botha and Verster (2026), the lifetime of a loan can be bifurcated into either performance or default spells, as demonstrated in Fig. 1 for a few hypothetical loans. A spell denotes a multi-period time span during which the repayment performance of a loan is monitored up to the time of the spell's resolution.
A default spell starts at the default time $\tau_d$ and ends at a resolution time $\tau_r > \tau_d$, indexed by $t = \tau_d, \dots, \tau_r$. Such a default spell may cure, and the loan may re-default later during its life, which implies a 'multi-spell' setup in tracking the loan over its lifetime. This multi-spell setup is also known as recurrent survival analysis, as discussed by Willett and Singer (1995) and Jenkins (2005, §1.1), and explored by Botha et al. (2025). Moreover, the cure-outcome competes with the write-off outcome at every time point during a spell's lifetime, which has bearing on the overall modelling of write-off risk.

Fig. 1. Demonstrating the resolution types of default spells over time for a few hypothetical loans.

Consider a portfolio of $N_d$ defaulted loans, wherein any loan $i = 1, \dots, N_d$ may have $j = 1, \dots, n_i$ default spells, where $n_i$ denotes the maximum number of spells for loan $i$. Let $(i,j)$ refer to a specific subject-spell that represents the history of a single default spell $j$ of loan $i$, as accompanied by a spell resolution outcome. Some spells may lack such an outcome, in which case they are said to be right-censored. We denote these right-censored spells with $c_{ij} \in \{0,1\}$ in that $c_{ij} = 1$ for a right-censored spell, and $c_{ij} = 0$ otherwise. The few outcomes into which $(i,j)$ may resolve are coalesced into a single nominal variable $\mathcal{RD}_{ij}$, defined as

$$\mathcal{RD}_{ij} = \begin{cases} 1: \text{Written-off} & \text{if } c_{ij} = 0 \text{ and the write-off criteria apply} \\ 2: \text{Cured} & \text{if } c_{ij} = 0 \text{ and the curing criteria apply} \\ 3: \text{Censored} & \text{if } c_{ij} = 1 . \end{cases} \quad (1)$$

Similar to Botha and Verster (2026) regarding performing spells, we observe a default spell $(i,j)$ from its entry time $\tau_d(i,j) \ge 1$ up to one of two points.
These endpoints include either the resolution time $\tau_r(i,j)$ for $c_{ij} = 0$, or the censoring time $C_{ij} < \tau_r(i,j)$ for $c_{ij} = 1$. We shall contend with the overall spell stop time $\tau_s(i,j)$, which is simply the minimum between $\tau_r(i,j)$ and $C_{ij}$. During an ongoing spell, we measure time discretely using an integer-valued counter variable, called the spell period, denoted by $t_{ij} = \tau_d(i,j), \dots, \tau_s(i,j)$ for spell $j$ of defaulted loan $i$. This notation implies that our data follows the counting process style in using the lexicon from survival analysis, as discussed by Kleinbaum and Klein (2012, pp. 20-23). We shall denote the overall spell age of $(i,j)$ as $T_{ij}$, which represents the observable follow-up time, defined as $T_{ij} = \tau_s(i,j) - \tau_d(i,j)$. The $(i,j)$-part will now be dropped from the notation of certain quantities $\{\tau_d, \tau_s, \tau_r\}$ in the interest of simplicity, though its connection to a particular spell remains implied. Furthermore, the loan period $t_i$ tracks the overall history (or age) of loan $i$, e.g., the variable "time on book". This variable may differ from the spell period $t_{ij}$, which itself measures the time spent in the spell at each point of its duration. Lastly, let $e_{ijt}$ be the event history indicator that flags whether the main (write-off) event transpired at a specific point $t_{ij}$, where zero-values indicate either competing risks or right-censoring. In the words of Singer and Willett (1993), this $e_{ijt}$ may be described as a "chronology of event indicators", since its collection over time represents one of two vectors, which differ only in the last element: either $(0, 0, \dots, 1)$ for a written-off subject-spell, or $(0, 0, \dots, 0)$ for a censored subject-spell.
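To make this encoding concrete, the following minimal sketch builds the chronology of event indicators $e_{ijt}$ for a single subject-spell. Note that the paper's own codebase is in R; this is an illustrative Python translation, and the function name `event_history` is ours.

```python
def event_history(spell_age, written_off):
    """Return the 'chronology of event indicators' e_ijt for one subject-spell.

    spell_age   -- observed follow-up time T_ij in whole periods (>= 1)
    written_off -- True if the spell resolved in the main write-off event;
                   False for cured or right-censored spells
    """
    e = [0] * spell_age
    if written_off:
        e[-1] = 1  # (0, 0, ..., 1): the event fires only in the final period
    return e

# A written-off spell of age 4 vs a censored spell of age 3:
# event_history(4, True)  -> [0, 0, 0, 1]
# event_history(3, False) -> [0, 0, 0]
```

Cured spells receive the all-zero vector as well, consistent with the latent-risks encoding of competing events discussed next.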
Whilst certainly not ideal, our setup encodes competing risks as right-censored cases, which is known as the latent risks approach to handling them; see Putter et al. (2007). From Jenkins (2005, §1.3), survival analysis examines a random non-negative variable $T \ge 0$ that denotes the latent lifetimes of default spells. However, and given the observed lifetimes $T_{ij},\ i = 1, \dots, N_d,\ j = 1, \dots, n_i$, we cannot truly examine the aforementioned latent lifetimes because of the confounding possibilities of censoring and truncation. Put simply, a completed spell $(i,j)$ ending in write-off would suggest $T = T_{ij}$, whereas a right-censored spell suggests $T > T_{ij}$, which hampers the estimation of $T$. Witzany et al. (2012) imagined this problem using a right-angled "triangle of data", wherein increasingly recent cohorts of loans become progressively more right-censored over time. Naturally, our mortgage data is duly affected by right-censoring. A left-truncated spell further complicates the interpretation of $T_{ij}$, since the starting point of $(i,j)$ predates that of the overall sampling window; see Jenkins (2005, §1.2.1) and Kleinbaum and Klein (2012, pp. 132-134). Practically, our mortgage data exhibits extensive left-truncation, and we encode its presence by adjusting the starting time of the first affected spell. Specifically, $\tau_d$ is set to the starting loan age, itself calculated as the difference between the current date and the origination date. These ideas ultimately culminate in the longitudinal credit dataset $\mathcal{D} = \big\{i, t_i, j, t_{ij}, \tau_d, \tau_s, \mathcal{RD}_{ij}, T_{ij}, e_{ijt}\big\}$, which is illustrated in the appendix for a few hypothetical loans. Using the mortgage data, the distributions of observed lifetimes are graphed in Fig. 2 per resolution type $\mathcal{RD}$, which reveal quite a few insights.
Firstly, it is evident that the distributions have different shapes, even though all of them are right-skewed. Consider the distribution of the right-censored outcome $\mathcal{RD} = 3$, which is markedly different from that of the main write-off outcome $\mathcal{RD} = 1$. This difference suggests that the censoring times $C_{ij}$ are independent from the write-off resolution times $\tau_r(i,j)$. In turn, it suggests that censoring is non-informative in that it does not affect the occurrence or timing of the main write-off event. Note that non-informative censoring is a necessary assumption in survival analysis, as discussed by Kleinbaum and Klein (2012, p. 42) and Schober and Vetter (2018). Secondly, Fig. 2 can help us understand the effect and prevalence of competing-risk events, whose occurrence precludes the main event from taking place. E.g., write-off occurred in only about 21% of spells, whilst 71% thereof ended in the competing cure event. Thirdly, the right-skewed nature of the main write-off event affirms the intuition that most default spells are rather short-lived, with only a few surviving beyond 60 months (itself an arbitrarily chosen point for illustration purposes). All of these results render the histogram of failure times a rather useful (and reusable) diagnostic in practice, and we advocate its use.

Fig. 2. Histograms of failure times (or spell ages) for $T_{ij} \le 300$ by resolution type, having used residential mortgage data. Empirical density estimates are overlaid.

In sampling data from the credit dataset $\mathcal{D}$ for survival analysis, we retain the entire spell history over all of its periods $t_{ij} = \tau_d, \dots, \tau_s$, lest the resulting survival estimates become compromised due to missing spell periods.
The row-observations in $\mathcal{D}$ are therefore clustered around a common characteristic – the loan ID – before sampling randomly amongst the resulting $N_d$ clusters. We resample all of the default spells into the training set $\mathcal{D}_T$ across the entire lifetimes of randomly selected loans, which constitutes 70% of $N_d$. This sampling fraction is set by convention, though it can certainly be investigated in future work. The remaining spell histories are relegated into the non-overlapping validation set $\mathcal{D}_V$. This clustered random sampling scheme is also briefly discussed and illustrated by Baesens et al. (2016, §6) within the PD-modelling context. We affirm that the process by which the sets $\mathcal{D}_T$ and $\mathcal{D}_V$ are created is representative of the trends in the main dataset $\mathcal{D}$. This process uses the resolution rate that is calculated for each dataset, as described by Botha and Verster (2026) and detailed in the appendix (Subsec. A.3).

3.2. Estimating the empirical term-structure of write-off risk using discrete-time survival analysis

As discussed by Botha and Verster (2026), we discretise continuous time into a sequence of distinct contiguous intervals, given the typical structure of credit data, and we do so towards formulating discrete-time survival analysis. Let this sequence of unique ordered failure times (excluding censored cases) be defined as $0 = t_{(0)} < t_{(1)} < t_{(2)} < \dots < t_{(k)} < \dots < t_{(m)}$, up to some maximum time point $t_{(m)}$. Within each of the resulting intervals $\big(t_{(k-1)}, t_{(k)}\big]$, one may tally the number of certain events at month-end.
In particular, let $f_k$ represent the number of spells that have failed (or been written-off) at $t_{(k)}$, and let $c_k$ denote the number of spells that became right-censored during said interval. The risk set at $t_{(k)}$ is said to contain the $n_k$ spells that are still at risk of ending immediately prior to $t_{(k)}$, where such spells have a spell age of at least $t_{(k)}$, i.e., $T_{ij} \ge t_{(k)}$. This $n_k$ may be defined as the sum of the failure counts $f_q$ and censoring counts $c_q$ over the remaining intervals from $t_{(k)}$ onwards, as indexed by $q \ge k$; i.e.,

$$n_k = (f_k + c_k) + (f_{k+1} + c_{k+1}) + \dots + (f_m + c_m) = \sum_{q=k}^{m} (f_q + c_q), \quad (2)$$

assuming that $f_0 = 0$ and that $n_0$ is the initial population count. The aforementioned event history indicator $e_{ijt}$ from Subsec. 3.1 can now be more formally defined as $e_{ijk} = \mathbb{I}\big(t_{(k-1)} < T_{ij} \le t_{(k)}\big)$, which equals 1 if subject-spell $(i,j)$ was written-off during $\big(t_{(k-1)}, t_{(k)}\big]$, and 0 otherwise. For more detail on these quantities in discrete-time survival analysis, see Singer and Willett (1993), Jenkins (2005, pp. 15-17, §4.1), Allison (2010, §7), Crowder (2012, pp. 15-16, 57-58, 81-82), Kartsonaki (2016), and Suresh et al. (2022). We now consider the lifetime $T_{ij}$ of each spell $(i,j)$ to be a realisation from a non-negative random variable $T$ that represents the latent lifetimes of default spells. Up to some period $t_{(k)}$, let $F\big(t_{(k)}\big) = \mathbb{P}\big(T \le t_{(k)}\big)$ denote the cumulative lifetime distribution, i.e., the probability of experiencing write-off during the long time frame $\big(t_{(0)}, t_{(k)}\big]$. The complement thereof, $S\big(t_{(k)}\big) = 1 - F\big(t_{(k)}\big) = \mathbb{P}\big(T > t_{(k)}\big)$, represents the classical survivor function. An associated probability mass function exists, $f\big(t_{(k)}\big) = \mathbb{P}\big(T = t_{(k)}\big)$, which represents the marginal write-off probability, i.e., the probability of $T$ assuming a specific event time.
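The backward accumulation of Eq. 2 can be sketched as follows. This is an illustrative Python translation rather than the paper's R code, and the helper name `risk_sets` is ours.

```python
def risk_sets(f, c):
    """Compute risk-set sizes n_k from per-interval tallies (Eq. 2).

    f -- failure (write-off) counts f_1..f_m per interval
    c -- right-censoring counts c_1..c_m per interval
    Returns n_1..n_m where n_k = sum over q >= k of (f_q + c_q).
    """
    m = len(f)
    n, running = [0] * m, 0
    for k in range(m - 1, -1, -1):  # accumulate from the last interval backwards
        running += f[k] + c[k]
        n[k] = running
    return n

# Ten spells in total: 3 fail and 1 is censored in interval 1, and so on:
risk_sets([3, 2, 1], [1, 2, 1])  # -> [10, 6, 2]
```

The initial risk set equals the population count $n_0$, and each later risk set shrinks by the failures and censorings of the preceding intervals.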
In discrete time, and drawing inspiration from Jenkins (2005, pp. 17-20), Crowder (2012, pp. 15-16), and Suresh et al. (2022), we relate $f\big(t_{(k)}\big)$ to $S\big(t_{(k)}\big)$ as

$$f\big(t_{(k)}\big) = \mathbb{P}\big(T = t_{(k)}\big) = \mathbb{P}\big(t_{(k-1)} < T \le t_{(k)}\big) = S\big(t_{(k-1)}\big) - S\big(t_{(k)}\big). \quad (3)$$

By convention, zero-length lifetimes are not possible, such that $S\big(t_{(0)}\big) = 1$, while $f(t) = 0$ whenever $t$ does not equal any of the ordered failure times $t_{(k)}$. The survivor function $S\big(t_{(k)}\big)$ also features in another useful quantity, i.e., the discrete hazard $h\big(t_{(k)}\big)$. This quantity may be interpreted as the proportion of the risk set just prior to $t_{(k)}$ that was written-off during the contiguous interval $\big(t_{(k-1)}, t_{(k)}\big]$, i.e., exiting the spell during the $k$th interval. In following Jenkins (2005, pp. 17-20), Crowder (2012, pp. 15-16), and Botha and Verster (2026), the hazard function is the conditional write-off probability, having survived hitherto, and is expressed in discrete time as

$$h\big(t_{(k)}\big) = \mathbb{P}\big(t_{(k-1)} < T \le t_{(k)} \,\big|\, T > t_{(k-1)}\big) = 1 - \frac{S\big(t_{(k)}\big)}{S\big(t_{(k-1)}\big)}, \quad \text{with } 0 \le h\big(t_{(k)}\big) \le 1. \quad (4)$$

Naturally, this conditional write-off probability $h\big(t_{(k)}\big)$ is related to the marginal variant thereof, $f\big(t_{(k)}\big)$, as

$$f\big(t_{(k)}\big) = S\big(t_{(k-1)}\big) \cdot h\big(t_{(k)}\big) \implies h\big(t_{(k)}\big) = \frac{f\big(t_{(k)}\big)}{S\big(t_{(k-1)}\big)}. \quad (5)$$

Most importantly, the collection $f\big(t_{(0)}\big), \dots, f\big(t_{(m)}\big)$ constitutes the empirical term-structure of write-off risk over spell time, which is the focus of this paper.

Fig. 3. The empirical term-structure of write-off risk, as constituted by the discrete event probabilities $f(t),\ t = t_{(1)}, \dots, t_{(m)}$ over spell time. Its estimation relies upon the KM-estimator from Eq. 6 using residential mortgage data. A LOESS-smoother with a 95% confidence interval is overlaid merely to summarise the visual trend.
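The chain from tallies to the term-structure via Eqs. 3-5 can be sketched as follows. This is an illustrative Python rendering under our own naming (`term_structure`), not the paper's R implementation; the risk sets $n_k$ are taken as given.

```python
def term_structure(f, n):
    """Map failure counts f_k and risk-set sizes n_k to the survivor function,
    discrete hazards, and marginal write-off probabilities (Eqs. 3-5)."""
    S_prev, S, h, f_marg = 1.0, [], [], []   # S(t_0) = 1 by convention
    for f_k, n_k in zip(f, n):
        h_k = f_k / n_k                      # estimated discrete hazard h_k = f_k / n_k
        S_k = S_prev * (1.0 - h_k)           # product-limit survivor update
        f_marg.append(S_prev - S_k)          # Eq. 3; equals S(t_{k-1}) * h_k (Eq. 5)
        S.append(S_k)
        h.append(h_k)
        S_prev = S_k
    return S, h, f_marg

# Continuing the tally example with risk sets n = (10, 6, 2):
S, h, f_marg = term_structure([3, 2, 1], [10, 6, 2])
```

By construction, the marginal probabilities $f\big(t_{(k)}\big)$ plus the final survivor value sum to one, so the collection `f_marg` is exactly the (empirical) term-structure of write-off risk.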
In estimating these survival quantities, consider the well-known Kaplan-Meier (KM) estimator from Kaplan and Meier (1958) in estimating $S\big(t_{(k)}\big)$. This KM-estimator is defined as

$$\hat{S}\big(t_{(k)}\big) = \prod_{s:\, t_{(s)} \le t_{(k)}} \left(1 - \frac{f_s}{n_s}\right) = \prod_{s:\, t_{(s)} \le t_{(k)}} (1 - h_s), \quad (6)$$

where $h_k = f_k / n_k$ is the estimated discrete hazard during the $k$th time interval, as discussed by Crowder (2012, pp. 15, 55, 77, 81) and Kartsonaki (2016). Given estimates of $S(t),\ t = t_{(0)}, \dots, t_{(m)}$, one may use Eq. 3 to derive the empirical term-structure of write-off risk, i.e., the collection $\big\{f\big(t_{(k)}\big)\big\}_{k=1}^{m}$. It is this collection that can form the empirical (or 'actual') baseline term-structure against which expected varieties thereof, as produced by competing models, can eventually be compared. In fact, we illustrate this empirical term-structure in Fig. 3 over spell time using the same residential mortgage data. It is evident that the probabilities become increasingly unstable at greater spell ages (i.e., towards the right-hand side of the graph), which is attributed to the increasing sparsity of data during those periods. The event probability is also lower during extremely early ages, which we believe attests to internal write-off policies that have not yet taken effect due to ongoing collection efforts. Furthermore, the right-skewed nature of this curve suggests that earlier failure (write-off) is generally more probable than later failure. The credit system is therefore prone to "wear-in", such that the hazard rate decreases over time, which is similar to infant mortality studies, as described by Crowder (2012, p. 14). The "hump-shaped" hazard function that underpins Fig. 3 also corroborates the work of Fenech et al.
(2016), i.e., a right-skewed log-logistic distribution had the best fit.

3.3. Three competing models for estimating write-off risk

We shall now formulate our various models in Sections 3.3.1-3.3.3 for estimating the write-off probability as a function of a set of input variables $\boldsymbol{x}_{ij}$ specific to each spell $(i,j)$. These competing models include a classical (cross-sectional) logistic regression (LR) model, a more dynamic discrete-time hazard (DtH) model, and a conditional inference survival tree (ST). There are also two versions of the LR and DtH models: a Type A and a Type B, as explained in Subsec. 3.3.4.

3.3.1. A logistic regression (LR) model

A classical cross-sectional dataset is used for this modelling technique, whereby each row in the modelling dataset represents a single subject-spell $(i,j)$. By implication, the data is structured as $\big\{i, j, y_{ij}, \boldsymbol{x}_{ij}\big\}$, where $y_{ij} \in \{0,1\}$ denotes the written-off outcome of each $(i,j)$, i.e., $y_{ij} = 1$ for those cases for which $\mathcal{RD}_{ij} = 1$. Each $y_{ij}$-value is a realisation from an underlying Bernoulli random variable $Y_{ij}$ with its conditional mean denoted as $\mu_{ij} = \mathbb{E}\big(Y_{ij} \,\big|\, \boldsymbol{x}_{ij}\big)$. Furthermore, the vector $\boldsymbol{x}_{ij} = \big(x_{ij1}, \dots, x_{ijp}\big)$ represents the $p$ input variables of each $(i,j)$. These inputs may include both account-level and portfolio-level information, as well as macroeconomic covariates; though all inputs are observed only once per spell at its start. In formalising our LR-model, we employ the well-known generalised linear models (GLM) framework, and define its linear predictor as $\eta_{ij} = \alpha + \boldsymbol{\beta}^{\mathrm{T}} \boldsymbol{x}_{ij}$, where $\boldsymbol{\beta} = \{\beta_1, \dots, \beta_p\}$ is a vector of estimable coefficients.
This linear predictor is then modelled using the logit link function

$$g(\mu_{ij}) = \log\left(\frac{\mu_{ij}}{1 - \mu_{ij}}\right) = \eta_{ij} \implies \log\left(\frac{w(\boldsymbol{x}_{ij})}{1 - w(\boldsymbol{x}_{ij})}\right) = \eta_{ij}, \quad (7)$$

where $w(\boldsymbol{x}_{ij})$ denotes the conditional write-off probability $\mathbb{P}\big(Y_{ij} = 1 \,\big|\, \boldsymbol{x}_{ij}\big)$ given the inputs $\boldsymbol{x}_{ij}$ of each $(i,j)$. The values within $\boldsymbol{\beta}$, together with that of the intercept $\alpha$, are found by maximising the log-likelihood function, as implemented within the glm() function in the R programming language; see script 4c in the accompanying codebase.

3.3.2. A discrete-time hazard (DtH) model

As described in Subsec. A.2, the modelling dataset for this technique is in the counting process style, whereby each record represents a point in the history of the subject-spell $(i,j)$. In estimating our DtH-model, we reuse the previous GLM-framework with a logit link function, which follows the work of Singer and Willett (1993) and Botha and Verster (2026). Accordingly, the discrete-time hazard probability from Eq. 4 is specified as

$$h\big(t \,\big|\, \boldsymbol{E}_{ij}, \boldsymbol{x}_{ij}, \boldsymbol{x}_{ij}(t), \boldsymbol{x}(t)\big) = \frac{1}{1 + \exp\Big(-\big[\boldsymbol{\alpha}^{\mathrm{T}} \boldsymbol{E}_{ij} + \boldsymbol{\beta}^{\mathrm{T}} \boldsymbol{x}_{ij} + \boldsymbol{\gamma}^{\mathrm{T}} \boldsymbol{x}_{ij}(t) + \boldsymbol{\delta}^{\mathrm{T}} \boldsymbol{x}(t)\big]\Big)}. \quad (8)$$

In Eq. 8, the vector $\boldsymbol{E}_{ij} = \big(E_{ij1}, \dots, E_{ijm}\big)$ contains indicator variables, which flag a particular period $t \in \{1, \dots, m\}$ during the discretely-valued history of $(i,j)$, up to the observed maximum $m$. These period indicators are accompanied by the estimable coefficients $\boldsymbol{\alpha} = \{\alpha_1, \dots, \alpha_m\}$, which, together with the period indicators, represent the baseline hazard. Moreover, the vector $\boldsymbol{\beta} = \big(\beta_1, \dots, \beta_p\big)$ contains estimable coefficients for the $p$ spell-level time-fixed input variables, as denoted by $\boldsymbol{x}_{ij} = \big(x_{ij1}, \dots, x_{ijp}\big)$. All of these particular inputs are measured at the start of each $(i,j)$. Similarly, the coefficients $\boldsymbol{\gamma} = \big(\gamma_1, \dots,$
$\gamma_{p'}\big)$ accompany the $p'$ time-dependent variables $\boldsymbol{x}_{ij}(t) = \big(x_{ij1}(t), \dots, x_{ijp'}(t)\big)$ of each $(i,j)$ over time, whereas the coefficients $\boldsymbol{\delta} = \big(\delta_1, \dots, \delta_{p^*}\big)$ are associated with the $p^*$ portfolio-level time-dependent variables $\boldsymbol{x}(t) = \big(x_1(t), \dots, x_{p^*}(t)\big)$, e.g., macroeconomic variables. These various coefficient vectors $\{\boldsymbol{\alpha}, \boldsymbol{\beta}, \boldsymbol{\gamma}, \boldsymbol{\delta}\}$ are found using the same glm() function in R by maximising the log-likelihood function, though given a different data structure. Using Eq. 8, we formulate two DtH-models that differ only in the breadth of their input spaces (or sets of input variables): a basic and an advanced DtH-model, labelled respectively as DtH-Basic and DtH-Advanced. As will be explained later, the full variable selection process is followed for the DtH-Advanced model. A smaller subset of this input space is chosen for the DtH-Basic model, based on expert judgement. See the 4b script series in the accompanying codebase for more details.

3.3.3. A conditional inference survival tree (ST) model

A conditional inference survival tree (ST) is fit using data that is structured very similarly to that of the LR-model. In particular, each row again represents a single spell $(i,j)$, though we also observe the spell age $T_{ij}$, which implies the dataset $\big\{i, j, y_{ij}, T_{ij}, \boldsymbol{x}_{ij}\big\}$. We use the ctree() function from the partykit R-package from Hothorn and Zeileis (2015) in fitting an ST-model. In creating two daughter nodes, an ST-model uses a splitting criterion that is based on the log-rank score statistic, which quantifies the association between the survival outcome $(T_{ij}, y_{ij})$ and each candidate input variable. This statistic arises from a permutation-based test of the null hypothesis, which itself asserts independence between the survival response and the input variable.
In turn, the variable selection procedure becomes unbiased and is not driven by the scale or number of possible cut points of an input variable. Only the variable $x_{ij}$ with the strongest association with the survival response is selected for further splitting. Assuming that such a variable is numeric, the cut point $c$ is selected that maximises the corresponding log-rank statistic, thereby partitioning the data into two daughter nodes, i.e., $x_{ij} < c$ and $x_{ij} \ge c$. A Kaplan-Meier curve is then fit from the observations within each respective terminal node. From this survivor function, the corresponding discrete hazard and marginal write-off probabilities are derived analogously to Subsec. 3.2. For more details on conditional inference ST-models, see the appendix (Subsec. A.1). Regarding the hyperparameters of our ST-model, we followed a manual tuning process using the validation set $\mathcal{D}_V$. This process is based on selecting the values of the hyperparameters that optimised various diagnostics, as will be discussed in Sec. 4. The final hyperparameters include the following: a maximum tree depth of four; a minimum of 1000 observations needed to attempt a split; and a minimum of 50 observations within a terminal node. We also impose a minimum criterion of 99% (i.e., $1 - p$-value) when maximising the splitting criterion; i.e., a split is only attempted when the $p$-value of the overarching permutation test is smaller than 1%. For more details on the fitting process, see script 4f in the accompanying codebase. Furthermore, we provide the ST-model with the same input space as the DtH-Advanced model. However, the resulting tree contains only a few of these input variables, chiefly due to the specified tree depth. See Subsec. A.5 for details on the final input space.

3.3.4.
Dichotomising the probabilistic models into discrete classifiers: Type A and B

Each write-off risk model produces a probability score as output, which is denoted by $w(t, \boldsymbol{x}_i) \in [0,1]$ for defaulted loan $i$ with inputs $\boldsymbol{x}_i$ at spell time $t$. These probability scores are typically multiplied with an estimate of the loss severity towards calculating the LGD. A model that outputs such 'raw' probability scores shall be called a Type A model. One can, however, dichotomise these scores into 0/1-decisions using a cut-off value $c$, such that the indicator function $\mathbb{I}\big(w(t, \boldsymbol{x}_i) > c\big)$ outputs 1 if $w(t, \boldsymbol{x}_i) > c$, and 0 otherwise. Doing so can better attune each resulting LGD-model to the underlying empirical LGD-distribution, as will be shown later. We shall call such a dichotomised model $\mathbb{I}\big(w(t, \boldsymbol{x}_i) > c\big)$ a Type B model. In finding this cut-off value $c$, consider first the sensitivity and specificity from ROC-analysis, as examined by Fawcett (2006). Sensitivity is the probability of a true positive (or a write-off event correctly predicted as such), which is defined as $q(c) = \mathbb{P}\big(w(t, \boldsymbol{x}_i) > c \,\big|\, \mathcal{C}_1\big)$, where $\mathcal{C}_1$ is the positive class, i.e., an observed write-off event. Similarly, specificity is the probability of a true negative (or a non-event correctly predicted as such), which in turn is defined as $p(c) = \mathbb{P}\big(w(t, \boldsymbol{x}_i) \le c \,\big|\, \mathcal{C}_0\big)$, where $\mathcal{C}_0$ is the negative class. These two components do not necessarily carry the same weight in classification problems, especially so under IFRS 9, where a bank would rather be over-provided than under-provided in its loss provisions. Put differently, the misclassification cost differs between false positives and false negatives, and the latter should be penalised to a greater extent in the interest of risk prudence.
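Ahead of the formal criterion defined next, the cost-weighted cut-off search can be sketched as follows. This is an illustrative Python version under our own naming; the paper's actual implementation is the R-function GenYoudenIndex() with a differential-evolution optimiser, whereas a plain grid search suffices for illustration.

```python
def gyi_cutoff(scores_pos, scores_neg, a, phi, grid):
    """Grid-search the cut-off c maximising the Generalised Youden Index
    J_a(c) = q(c) + (1 - phi)/(a * phi) * p(c) - 1, where q is the sensitivity
    on the positive (write-off) class and p the specificity on the negative class."""
    def J(c):
        q = sum(s > c for s in scores_pos) / len(scores_pos)   # sensitivity q(c)
        p = sum(s <= c for s in scores_neg) / len(scores_neg)  # specificity p(c)
        return q + (1 - phi) / (a * phi) * p - 1
    return max(grid, key=J)
```

Setting $a = (1-\phi)/\phi$ makes the weight on $p(c)$ equal to one, whereupon the criterion collapses to the classic Youden index $J(c) = q(c) + p(c) - 1$; larger $a$-values shrink the specificity weight and thereby penalise false positives less.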
Accordingly, let $a > 0$ denote a cost multiple (or ratio) of committing a false negative relative to a false positive, and let $\phi$ denote the estimated prevalence of the $\mathcal{C}_1$-event, i.e., $\mathbb{P}(\mathcal{C}_1)$. Both of these quantities are used in the Generalised Youden Index (GYI) function, which quantifies the prediction power of a model for a given $c$-value amidst differing misclassification costs. As introduced and discussed by Geisser (1998), Kaivanto (2008), and Schisterman et al. (2008), we define the GYI for a given $c$ as

$$J_a(c) = q(c) + \frac{1 - \phi}{a\phi} \cdot p(c) - 1, \quad (9)$$

whereupon the optimal cut-off $c^*$ is given by $c^* = \operatorname{arg\,max}_c\, J_a(c)$. By setting $a = (1 - \phi)/\phi$ in Eq. 9, the respective weights of $q(c)$ and $p(c)$ become equal. Though as $a$ increases, the contribution (or weight) of the specificity term $p(c)$ decreases. Since $p(c)$ reflects the avoidance of false positives, a reduced weighting on $p(c)$ implies that the penalty on false positives decreases. A larger $a$-value therefore produces a lower $c^*$-value, which translates to a more 'liberal' model that is more likely to render $\mathcal{C}_1$-predictions than $\mathcal{C}_0$-predictions. Lastly, the GYI is implemented within the bespoke R-function GenYoudenIndex(), which itself uses the JDEoptim() function from the DEoptimR package; see the R-codebase maintained by Gabru et al. (2026). We also devise in the appendix (Subsec. A.4) a short procedure for optimising $a$ within the GYI. This procedure is based on minimising the MAE between the empirical and expected term-structures, the latter of which emanates from a dichotomised model given a corresponding $a$-value.

3.4.
Formulating single-stage vs two-stage LGD-models: our model universe

While the focus of this study is on write-off risk, we think it prudent to compare the resulting LGD-estimates from the various models in the interest of completeness. But deriving these LGD-estimates requires a loss severity component, in following the two-stage approach to LGD-modelling. For simplicity, we shall reuse the GLM-framework in regressing the empirical loss rates (given write-off) $l_i,\ i = 1, \dots, n$ onto a set of input variables. Future work can most certainly refine the estimation of these loss rates from data using more sophisticated models. Following some experimentation with response distributions and appropriate link functions, we select the compound Poisson (CP) distribution for our GLM, together with the (canonical) log link function. In particular, the response variable $Y_i$ is assumed to follow a Tweedie distribution, i.e., $Y_i \sim \text{Tweedie}(\mu_i, \phi, p)$, where $\mu_i$ is the mean, $\phi$ is the dispersion parameter, and $1 < p < 2$ is the power parameter that determines the CP-structure. This choice is motivated by the fact that the resulting GLM can handle zero-inflated data (such as zero-valued loss rates) rather flexibly, as discussed by Y. Zhang (2013). Its use is quite popular in actuarial science, wherein the number of insurance claims can be zero, as well as in estimating the (possibly zero-valued) amount of precipitation during a given period when modelling rainfall. In fitting a Tweedie CP-GLM, we use the cpglm() function from the cplm package in R. Previous work by Botha and Verster (2026) has shown the merit of developing a baseline model with an input space that is deliberately stripped bare, which can aid comparisons.
We adopt the same design and develop a basic DtH model (labelled DtH-Basic) that contains but a time factor, in addition to a few rudimentary variables, as selected using expert judgement and statistical significance; see Subsec. A.5. This DtH-Basic model is improved upon by its more advanced sibling (labelled DtH-Advanced), whose variable selection is more rigorous, as will be explained later. The expectation is that the DtH-Advanced model should outperform its more simplistic sibling in predicting write-off risk, purely by virtue of having a more enriched input space. Both of these DtH-models are also dichotomised, thereby availing Types A and B of each model. In following the two-stage LGD-modelling approach, our model universe therefore includes eight model components: 2x LR-models (A & B), 2x DtH-Basic models (A & B), 2x DtH-Advanced models (A & B), 1x ST-model, and the loss severity model, a Tweedie CP-GLM. We summarise these models in Fig. 4 for easy reference.

Fig. 4. Differentiating the model universe between two-stage and single-stage LGD-modelling approaches.

The two-stage LGD-modelling approach can easily be compared to a single-stage approach, wherein the zero-loss cures are included in the modelling dataset. The overall LGD-distribution is then modelled directly as a function of a few input variables. We believe it prudent to do so in positioning our work as a reference point amongst other works that have pursued similar comparisons of LGD-models, such as that of Loterman et al. (2012).
Accordingly, and to keep things simple, we reuse the GLM-framework and build two competing single-stage LGD-models: 1) one with a Gaussian distribution and an identity link function; and 2) one with the same Tweedie CP-distribution and log link function as in the aforementioned two-stage approach. Both of these single-stage LGD-models form part of our model universe, as summarised in Fig. 4. The justification for choosing the latter Tweedie CP-GLM is that it should theoretically be able to contend with the probability mass at zero, i.e., the zero-loss cures. The overall expectation is that we can replicate the results from previous studies that showed the two-stage approach to outperform the single-stage approach in general, without using more sophisticated machine learning techniques within a single-stage approach.

4 An empirical comparison of competing models for write-off risk

In building our models, variable selection is performed using an interactive thematic selection process, which is explained as follows. Within a theme, e.g., "macroeconomic variables: inflation", specific questions are posed to be investigated, e.g., "which lag order of a macroeconomic variable is the ‘best’?". Each question is answered using a variety of model diagnostics, including statistical significance testing, domain knowledge, correlation studies, and goodness-of-fit, as measured using Akaike's Information Criterion (AIC) and McFadden's pseudo 𝑅² from Menard (2000). For a set of thematically-chosen input variables, we build single-factor models and then evaluate them using the same diagnostics, thereby generating insights on the ‘best’ inputs. This process is complemented by running a stepwise forward selection, as explained by James et al. (2013, §6), whereafter the final list of input variables is manually curated using expert judgement.
This thematic selection process is chiefly followed in estimating our DtH-Advanced model, though the resulting insights are reused in building the other models. See Subsec. A.5 for a complete list of input variables per model, whilst the details of this selection process are contained within the accompanying R-codebase maintained by Gabru et al. (2026). As for evaluating our various models, we shall now discuss and present the results of five main diagnostics. Firstly, we analyse the discriminatory power of our LR-models using the receiver operating characteristic (ROC), as discussed by Fawcett (2006). Each ROC-analysis is then summarised into the area under the curve (AUC) statistic, where greater values indicate greater discriminatory power. For the survival models, a time-dependent ROC-analysis (tROC) can be conducted given hazard predictions at a particular spell time point 𝑡 = 𝑡(0), …, 𝑡(𝑚). This type of ROC-analysis generalises the classical one to a context wherein right-censored cases are adequately treated, as discussed by Heagerty et al. (2000) and Bansal and Heagerty (2018). Since the degree of right-censoring varies across 𝑡, the discriminatory power of a survival model will itself vary over 𝑡. As such, the two elements of a tROC-graph can be expressed as functions of both 𝑡 and the cut-off 𝑝_𝑐: the true positive rate 𝑇*(𝑡, 𝑝_𝑐) and the false positive rate 𝐹*(𝑡, 𝑝_𝑐). Under the "cumulative case/dynamic control" framework of Bansal and Heagerty (2018), both of these quantities are formulated using the conditional survival probability at 𝑡, itself estimated using the nearest neighbour estimator from Akritas (1994). For a gentler introduction to tROC-analysis, see Botha et al. (2025) and Botha and Verster (2026), who have reformulated tROC-analysis within the PD-modelling context using performing spells.
This particular spell formulation is similar to default spells within our LGD-modelling context, and so we shall use the exact same setup in conducting tROC-analyses.

(a) DtH-Basic A; (b) DtH-Advanced A; (c) Survival tree
Fig. 5. Evaluating the discriminatory power of three competing survival models by using the clustered tROC-extension to ROC-analysis at specific time points 𝑡 ∈ {3, 6, 24, 36}. The results per model (Type A-series) are shown respectively in panels (a)-(c).

Regarding the implementation of such a tROC-analysis, we use the bespoke tROC.multi() function that is defined in the accompanying R-codebase. The chosen time points for this tROC-analysis are selected using expert judgement and include 𝑡 ∈ {6, 12, 24, 48} months in default. Similar to an ROC-analysis, a tROC-analysis can be summarised into a time-dependent AUC-value (tAUC); one for each spell time point 𝑡. As shown in Fig. 5, the DtH-Advanced model clearly has greater tAUC-values than those of the DtH-Basic model across all time frames. Although not shown, this trend holds true for the Type B model variants as well, albeit at lower tAUC-levels across all 𝑡. We believe this outperformance of the DtH-Advanced model over the DtH-Basic model attests to the former model having a more comprehensive set of input variables, which underscores the importance of feature engineering. Furthermore, the ST-model achieved a second-in-class performance: though it outperformed the DtH-Basic model, it did not best the DtH-Advanced model.
This finding is interesting, since one would have expected better results from a more sophisticated technique such as survival trees, especially when it shares the same input space as the DtH-Advanced model.

Fig. 6. The time-dependent Brier score (tBS) over spell time 𝑡 per survival model (Type A-series). The integrated Brier score (IBS) is annotated per survival model in summarising the tBS-values over time.

Secondly, and in evaluating the prediction accuracy of a survival model over time, one can consider the time-dependent Brier score (tBS), as discussed by Graf et al. (1999) and Suresh et al. (2022), and implemented by Botha and Verster (2026) within the PD-modelling context. As a measure of prediction error at a particular spell time 𝑡, the tBS calculates the average squared difference between the predicted survival probabilities and the observed outcomes amidst right-censoring. A lower tBS-value at a particular 𝑡 indicates lower prediction error and hence greater model performance. A key advantage of the tBS is its model-agnostic nature, in that it relies only on the predicted survival probabilities. We use the bespoke tBrierScore() function within the accompanying R-codebase, and graph these tBS-values of each survival model over 𝑡 in Fig. 6. The results show that the more complex DtH-Advanced model achieves lower tBS-values over all 𝑡, relative to those of the DtH-Basic model. For both models, the increasing trend in the tBS over 𝑡 attests to the increasing difficulty of estimating survival probabilities accurately at later spell times, largely due to dwindling sample sizes at those periods. The ST-model achieved tBS-values that are largely similar to those of the DtH-Basic model, which corroborates the surprise finding of the previous tROC-diagnostic.
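The tBS and its time-aggregate can be sketched as follows. This is a deliberately simplified, stdlib-only illustration, not the bespoke tBrierScore() R-function: spells censored before 𝑡 are simply excluded, rather than reweighted with the inverse-probability-of-censoring weights of Graf et al. (1999), and the data layout is an assumption.

```python
def tbs(t, spells, s_hat):
    """Simplified time-dependent Brier score at spell time t.
    `spells` is a list of (duration, event) pairs, with event=1 for
    write-off; `s_hat(t, k)` gives the predicted survival probability
    of spell k at time t."""
    errs = []
    for k, (dur, event) in enumerate(spells):
        if dur <= t and event == 1:     # written off by t: survival outcome is 0
            errs.append((s_hat(t, k) - 0.0) ** 2)
        elif dur > t:                   # still at risk at t: survival outcome is 1
            errs.append((s_hat(t, k) - 1.0) ** 2)
        # else: censored before t -> dropped in this sketch (no IPCW weighting)
    return sum(errs) / len(errs)

def ibs(times, spells, s_hat):
    """Integrated Brier score using a uniform weight over spell time."""
    return sum(tbs(t, spells, s_hat) for t in times) / len(times)
```

Note that a constant prediction of Ŝ(𝑡) = 0.5 yields a tBS of exactly 0.25 at every 𝑡, which is the origin of the IBS < 0.25 rule of thumb from Graf et al. (1999) discussed below.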
Moreover, the maximum period 𝑡* over which to calculate the tBS should be meaningfully chosen, particularly given the increasing prevalence of right-censored cases as 𝑡 → 𝑡*. Our Kaplan-Meier analysis (itself reviewed in Subsec. 3.2) suggested that the vast majority of the dataset is exhausted as 𝑡 → 120 months in default, though even this period is considered to be extremely long in practice. Having used expert judgement, our inset graph in Fig. 6 shows the tBS over a more realistic workout period of 48 months in default. However, the divergence in prediction errors between the DtH-Advanced and DtH-Basic/ST models seems to worsen exponentially over 𝑡. This result corroborates our previous tROC-results regarding discriminatory power. That is, the DtH-Basic and ST models seem to produce predictions that are less accurate than those of the DtH-Advanced model. The tBS-values can be aggregated over 𝑡 into a singular value called the integrated Brier score (IBS). Lower IBS-values are deemed superior, in that the predictions of the corresponding survival model agree with reality to a greater extent. Calculating such an IBS-value requires choosing a time-dependent weighting function by which the tBS-values are blended and summed together. We select a uniform weight over spell time in the interest of simplicity, as implemented in the erstwhile tBrierScore() function. For interpreting an IBS-value, Graf et al. (1999) noted a simple rule of thumb of IBS < 0.25, having made a plausible assumption about the survival probability. In particular, the authors reasoned that, in the absence of any information, one may assign an average survival probability of Ŝ(𝑡) = 0.5 to all subject-spells at a given 𝑡.
The corresponding tBS-value of this ‘model’ would then be 0.25, which can be used as an upper limit of sorts when interpreting any other IBS. Accordingly, and when examining the inset graph in Fig. 6, it is clear that the IBS-values of both the DtH-Basic model (0.531) and the ST-model (0.563) breach this rule of thumb. In contrast, the IBS of the DtH-Advanced model (0.162) is smaller by far, which attests to the model's superior accuracy when producing predictions. Our third diagnostic is a comparison of term-structures, which is described as follows. All of our write-off risk models contain at least one time factor of sorts: time in default spell for the survival models, and default spell age for the LR-model. This model design allows for the derivation of the aggregate write-off probabilities over spell time 𝑡 = 𝑡(1), …, 𝑡(𝑚), i.e., the write-off term-structure. The empirical term-structure is derived using the event probabilities emanating from a Kaplan-Meier analysis, as discussed and illustrated previously in Subsec. 3.2. Against this backdrop, we compare in Fig. 7 the expected term-structures as produced by our models, where the average write-off probability across all applicable spells is obtained for each 𝑡. Evidently, the term-structure of the LR-model in Fig. 7b diverges dramatically from the empirical term-structure (green) and even differs in shape; a fact that is true for both model Types A and B. This divergence can be quantified by calculating the MAE between any pair of line graphs, with the MAE being 0.952% (LR Type A) and 2.880% (LR Type B) respectively. On the other end of the spectrum, both of the DtH-Advanced models in Fig. 7a approximate the empirical term-structure the closest, with the MAE being 0.162% (Type A) and 0.257% (Type B) respectively. The survival tree scores a close second place, with an MAE of 0.167%.
It also appears to under-predict the empirical term-structure over most 𝑡, which is certainly not risk-prudent. Nonetheless, these results underscore the ability of the survival models to produce predictions that can evolve over time as a default spell unfolds, and which agree more closely with reality. In contrast, and although simplistic, a basic cross-sectional LR-model cannot so easily contend with the element of time in rendering accurate predictions in aggregate.

(a) Best fitting models; (b) Worst fitting models
Fig. 7. The empirical vs expected term-structures of write-off risk over spell time 𝑡 per model type. Natural splines are fitted in summarising the overall trend. The MAE between the empirical and each expected term-structure is annotated. The best fitting models are shown in panel (a), whereas panel (b) contains the worst fitting models.

As for our fourth diagnostic, we compare the cross-sectional LR-models to the dynamic survival models using the classical AUC-statistic and Brier scores. Given the time-dependent nature of survival models, one would need to select a particular time point when comparing them to the LR-model. We choose the point 𝑡 = 44 at which the median survival probability (50%) occurs, based on a Kaplan-Meier analysis of the survival probability. The following results are obtained, as summarised in Table 1 across the aforementioned two diagnostics.

Table 1: A summary of various diagnostics by modelling technique, having assessed the survival models at 𝑡 = 44 months in default. Underlined figures indicate the best-in-class model.

Model              AUC      Brier score
LR: A              86.29%   5.48
LR: B              74.83%   6.55
DtH-Basic: A       61.18%   1.57
DtH-Basic: B       49.55%   6.4
DtH-Advanced: A    97.15%   0.47
DtH-Advanced: B    51.30%   6.62
Survival tree      75.75%   1.48

We note the following points for the Type A models. Firstly, it is by now unsurprising (though still important to note) that the DtH-Advanced model outperforms its basic counterpart yet again. Secondly, the LR-model performs better than the DtH-Basic model, but still underperforms against the DtH-Advanced model, despite the LR-model having a very similar set of inputs. In comparing Type A models versus their Type B counterparts, the results suggest that dichotomisation can erode model performance. All Type B models achieved similar Brier scores (6.4-6.62), which are orders of magnitude worse than those of the Type A models. However, we note that dichotomisation inherently leads to information loss, and so the degradation in diagnostics is not entirely surprising. Dichotomisation is itself a relatively standard practice for cross-sectional probabilistic classifier models, including our LR-model. Yet dichotomising the predictions from a hazard model amidst censored observations is largely unstudied, at least as far as we know. This fact may very well explain the relatively poor performance of the Type B DtH-models. Future research can certainly focus on this aspect (censored observations) when dichotomising the output of hazard models using some threshold. Lastly, the ST-model performed admirably across both diagnostics, though it did not outperform the DtH-Advanced model, which corroborates the model ranking in our previous diagnostics.

Our fifth and final diagnostic is based on assessing the discriminatory power over time using classical ROC-analysis. While the AUC-statistic in Table 1 is calculated across the entire sample, one may also partition the sample by resolution date, and then reperform ROC-analysis within each date-based partition.
The AUC is then calculated at each time point, which can be used in assessing the discriminatory power of each model over time; see Fig. 8 for the Type A-series of models. The results show that the LR-model's discriminatory power is not only the worst over time (with a through-the-cycle [TTC] mean AUC-value of 61.29%), but also varies the most. In contrast, the DtH-Advanced model performs the best (with a TTC-mean AUC-value of 97.36%) and achieves a remarkably stable set of AUC-values over time. The DtH-Basic and ST models produced similar AUC-values over time, with TTC-means of 70.03% and 75.64% respectively. That said, there is a wide disparity between the results of both models and those of the DtH-Advanced model. Although not shown, we reperform the same analysis for the Type B-series of models, purely in the interest of completeness. The trends reverse in that the LR-model now achieves the greatest AUC over time with a TTC-mean of 67.61%, followed by the DtH-Basic (51.07%) and the DtH-Advanced (46.62%) models. However, we believe it best not to ascribe too much meaning to these Type B results, given the problematic dichotomisation of survival models.

5 Comparing single-stage and two-stage LGD-models

In evaluating the various LGD-models, we opt for a simple distributional analysis wherein the empirical distribution of realised LGD-values is compared with its expected counterpart, as produced by each LGD-model. The premise hereof is that the distribution of the predicted LGD-values ought to resemble that of the realised LGD-values as closely as possible. In analysing the similarity between the two distributions, we employ the well-known Kolmogorov-Smirnov (KS) test statistic, as well as two divergence measures from Zeng (2013): Kullback-Leibler (KL) and Jensen-Shannon (JS).
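As a hedged sketch of these two divergence measures, the following stdlib-only functions compute KL and JS between two distributions discretised onto the same bins; the binning choice is an assumption here, and the paper's exact formulation follows Zeng (2013):

```python
from math import log

def kl(p, q):
    """Kullback-Leibler divergence between two binned distributions,
    with 0 * log(0) taken as 0 (empty bins of p are skipped)."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    """Jensen-Shannon divergence: a symmetrised and bounded variant of
    the KL divergence, measured against the mixture m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Unlike the KL divergence, the JS divergence is symmetric in its arguments and remains finite even when one distribution places zero mass where the other does not, which is convenient when comparing histograms with sparse bins.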
Smaller values in any of these measures indicate greater similarity between two distributions (empirical vs expected), and hence a more accurate model. We show the distributional analyses in Figs. 9-10, and note the following three main points.

Fig. 8. AUC-values over time (resolution date) by model type for the Type A-series. The through-the-cycle (TTC) means of the AUC-values are annotated, with 95% confidence intervals. The black line (70%) represents an acceptable level of the AUC over time, based on industry convention.

Firstly, and as shown in Fig. 10d, the underlying loss severity model within the two-stage approach struggles to capture the salient aspects of the empirical distribution of realised loss rates. These aspects include the two modes near zero and one, as well as the slight right-skew. The model itself achieved a coefficient of determination of only 𝑅² = 22.45%, which is relatively weak in explaining the variance. While the focus of our work remains on write-off risk models, the influence of the loss severity component is unmistakable within the composite LGD-model. Accordingly, the underperformance of the two-stage LGD-models is noticeable across most similarity metrics, at least when compared to their single-stage counterparts. For example, the Tweedie CP-GLM model achieved a KL of 0.0055 and a JS of 0.0019 (see Fig. 9b), which are much smaller (i.e., better) than those metrics of the best-in-class two-stage Type A LGD-model (see Fig. 10a); the DtH-Advanced Type A model has a KL of 0.7349 and a JS of 0.0581. This result is corroborated by the fact that the mean loss rates of the single-stage LGD-models (8.6% and 8.0% respectively) are much closer to the empirical mean of about 10%, relative to those means of the two-stage LGD-models.
However, we posit that this result is highly dependent on the shape of the underlying empirical LGD-distribution, which resembles an ‘L’-shape in our case rather than the typical ‘U’-shape of most LGD-distributions. Modelling an ‘L’-shaped LGD-distribution proves to be demonstrably difficult, largely as a result of an abundance of zero-valued cures. For the two-stage LGD-modelling approach to triumph over its single-stage sibling, one would likely have to produce a fairly decent loss severity model as well (at least better than at present), which should match the quality of the write-off risk models.

(a) Single-stage: Gaussian GLM; (b) Single-stage: Tweedie CP-GLM; (c) Two-stage: LR A; (d) Two-stage: LR B; (e) Two-stage: DtH-Basic A; (f) Two-stage: DtH-Basic B
Fig. 9. LGD-distributions of empirical vs expected loss rates per model. Distributional diagnostics include the Kolmogorov-Smirnov (KS) test statistic, the Kullback-Leibler (KL) divergence, and the Jensen-Shannon (JS) divergence. The mean expected loss rate is overlaid (in red) per model and in each inset graph.

(a) Two-stage (write-off risk): DtH-Advanced A; (b) Two-stage (write-off risk): DtH-Advanced B; (c) Two-stage (write-off risk): Survival tree; (d) Two-stage (loss severity): Tweedie CP-GLM
Fig. 10. LGD-distributions of empirical vs expected loss rates per model (continued).

Secondly, it would appear from Figs. 9-10 that the DtH-Advanced Type A model outperforms the other two-stage models (Type A). In particular, the former model scores a KL-statistic that improves more than three-fold upon that of the LR Type A model (see Fig. 9c), and more than four-fold in the JS-statistic.
We believe that this result underscores the dynamicity of survival models over their cross-sectional counterparts, which ultimately renders the LGD-predictions more accurate. Interestingly, the more exotic ST-model (see Fig. 10c) did not outperform either of the two DtH-Advanced models across the KL and JS statistics. Thirdly, and as troublesome as it is, dichotomisation did improve the similarity metrics for both the LR and DtH-Basic models. As an example, the KL of the LR-model strengthened almost 15 times from 2.4707 to 0.1652 (see Fig. 9d) following dichotomisation, whereas the DtH-Basic model's KL improved almost nine times from 6.2057 to 0.7198 (see Fig. 9f). However, we achieve the opposite result for the DtH-Advanced models, in that dichotomisation yielded greater dissimilarity between the distributions. It is possible that this result attests yet again to the issue of dichotomising the output of survival models, as previously discussed in Sec. 4. Nonetheless, it seems that greater accuracy in LGD-predictions can be attained via dichotomisation, especially so for the cross-sectional LR-model.

6 Conclusion

It is notoriously difficult to model the LGD risk parameter in credit risk modelling. This is particularly true when considering the two aspects that characterise most LGD-distributions: bimodality and skewed tails. These two aspects can complicate the direct modelling of the realised LGD-values as a function of a set of input variables (or predictors). As a result, these aspects have inspired a two-stage LGD-modelling approach in the literature, whereby the LGD is commonly decomposed into a write-off risk component and a loss severity (given write-off) component.
Each component can then be separately modelled, whereafter the two model outputs are multiplied together in forming an LGD-estimate. Various studies have highlighted the success of the two-stage LGD-modelling approach. However, we note that the majority of these studies have contended with a so-called ‘U-shaped’ LGD-distribution, where the two modes are respectively located at the tail-ends of the distribution. In contrast, our dataset exhibits more of an ‘L-shaped’ LGD-distribution, with one major mode located at zero (representing the cures) and an extremely minor mode at one. We believe that this shape may be more characteristic of a secured portfolio (e.g., mortgages) in general, whereas a U-shape better reflects an unsecured portfolio (e.g., credit cards). Regardless, a peculiar distributional shape will of course have consequences for the success of any particular modelling strategy. In turn, an inappropriate modelling strategy may introduce unnecessary bias into the ECL-estimates under IFRS 9, which can compromise a bank's loss provisions. We have explored the use of survival analysis in modelling the write-off risk component, having used South African mortgage data. One of the most important input variables within these survival models is the time spent in a default spell. Its inclusion allows us to approximate the empirical term-structure of write-off risk, i.e., the collection of write-off probabilities over default spell time 𝑡. This empirical term-structure is itself generated using the Kaplan-Meier (KM) estimator from survival analysis, thereby leveraging right-censored observations in constructing an ‘actual/empirical’ curve that represents reality. By itself, this KM-based method already represents a more efficient use of data.
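The KM-based derivation of this empirical term-structure, together with the MAE-diagnostic against a model-implied curve, can be sketched as follows. This is a hedged, stdlib-only illustration: the `(duration, event)` data layout is an assumption, and cures are treated as right-censored (event = 0), per the paper's setup.

```python
def km_term_structure(spells):
    """Empirical write-off term-structure from right-censored default
    spells. `spells` is a list of (duration, event) pairs, with event=1
    for write-off and event=0 for censoring (e.g., cure). Returns the
    discrete KM hazard h(t) = d_t / n_t at each observed spell time,
    i.e., the marginal write-off probability given survival to t."""
    hazards = {}
    for t in sorted({d for d, _ in spells}):
        at_risk = sum(1 for d, _ in spells if d >= t)            # n_t
        events = sum(1 for d, e in spells if d == t and e == 1)  # d_t
        hazards[t] = events / at_risk
    return hazards

def mae(empirical, expected):
    """Mean absolute error between two term-structures on a shared time
    grid, used as the agreement diagnostic between the empirical and
    model-implied (expected) curves."""
    return sum(abs(empirical[t] - expected[t]) for t in empirical) / len(empirical)
```

Censored spells still contribute to the risk sets n_t up to their last observed time, which is the sense in which the KM-based method uses the data more efficiently than discarding unresolved spells.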
The output of our various models (including the survival models) was then duly aggregated and compared to this empirical term-structure, itself forming a reusable and simple diagnostic for assessing dynamic LGD-models. Our results showed that a particular type of survival model, the discrete-time hazard (DtH) model, outperformed other two-stage contenders across most metrics, particularly when comparing the expected vs the empirical term-structures. Other metrics included time-dependent varieties of both discriminatory power (tROC-analysis) and the Brier score (tBS). As inspired by previous studies, we ensconced our write-off risk models within a two-stage LGD-modelling approach, such that we could conduct a benchmark study amongst different modelling techniques. Doing so required building a bespoke loss severity model, which was itself estimated using a Tweedie compound Poisson (CP) GLM. In fact, a Tweedie CP-GLM also drove one of our two candidate single-stage LGD-models, with the other being a Gaussian GLM. However, in modelling the loss severity within the two-stage approach, we only achieved a mediocre fit to the data. The downstream results showed that a single-stage LGD-model (CP-GLM) outperformed all of the two-stage LGD-models, including the more exotic conditional inference survival tree (ST) model. Though rather unexpected, we ascribe this result to the peculiar shape of the LGD-distribution. The implication is that a two-stage LGD-modelling approach can only triumph over its single-stage counterpart when both of its components perform adequately, which is not the case at present. Another result is that the dichotomisation of the write-off probability into 0/1-values improved the accuracy of most two-stage LGD-models. In so doing, zero-valued cures can be predicted more accurately, and the resulting mode at zero within the LGD-distribution is better captured.
Overall, our work serves both as a benchmark study and a tutorial on modelling the write-off risk component in LGD-modelling, especially when considering the ancillary material in the appendix. Future research can focus on an alternative way of handling competing risks within our survival models (i.e., the cured outcomes). Doing so can theoretically reduce the latent bias introduced by such risks, which are currently treated as right-censored observations. For example, each cause (or outcome, such as write-off and cure) actually has a cumulative incidence function (CIF) in the competing risks literature, where such a CIF is denoted as 𝐹_𝑗(𝑡) and signifies the probability of experiencing failure from a specific cause 𝑗 by time 𝑡. One may estimate each cause-specific CIF by using the Aalen-Johansen estimator, which accounts for the possibility of other causes occurring first. Another research avenue is experimenting with machine learning forms of survival analysis, e.g., random survival forests. Doing so may deliver greater performance than what we have obtained with the slightly less exotic ST-model. Future effort may also be dedicated to building models that more accurately predict the loss severity component. This route may very well redeem the two-stage LGD-modelling strategy, especially when coupled with decent write-off risk models, such as those that we have developed in this study. One can also explore the appropriate dichotomisation of the output from survival models within the present context, which we believe is currently understudied.

A Appendix

In Subsec. A.1, we review the fundamentals of a particular type of survival tree, which includes its in-depth formulation and application in the R programming language. An illustration is given in Subsec.
A.2 of how one should structure the underlying credit data towards conducting survival analysis in estimating the write-off term-structure. In affirming the representativeness of resampled datasets, the resolution rate is defined and illustrated in Subsec. A.3. We formulate a brief optimisation procedure in Subsec. A.4 for finding the cost multiple 𝑎 within the Generalised Youden Index (GYI), which is used in dichotomising our model outputs. Lastly, Subsec. A.5 contains a description of the input variables within our models.

A.1. An overview of conditional inference survival trees

Survival trees extend recursive binary partitioning (RBP) methods to right-censored time-to-event data. In particular, the covariate (or input) space is recursively partitioned into increasingly homogeneous subgroups, such that observations within a terminal node exhibit similar behaviour in the subsequent survival distributions. In credit risk applications, survival trees can serve as a flexible tool for uncovering heterogeneity in write-off timing (the response variable) when given various input variables. These trees can help with discovering complex interactions and non-linear effects, without imposing restrictive (semi-)parametric assumptions such as proportional hazards. For an overview of survival trees in the credit risk context, see Frydman and Matuszyk (2022). In this study, we focus on survival trees constructed within the conditional inference framework of Hothorn et al. (2006), abbreviated as HHZ-trees and implemented in the partykit R-package from Hothorn and Zeileis (2015). The construction of such HHZ-trees deliberately separates the variable selection step from the subsequent splitting step, wherein the space of the selected variable is split into two regions. Separating these two steps avoids the selection bias that is typically associated with classical RBP-methods.
To this point, a typical RBP-method jointly optimises a splitting criterion over all input variables and across all admissible cut-points. This implies the disproportionate selection of those inputs with many potential split points (due to different measurement scales), or of those categorical inputs with many categories. This selection bias exists even when the association between such inputs and the outcome is weak. While post-estimation tree pruning is commonly employed to mitigate overfitting, it does not address this inherent selection bias and instead introduces additional tuning parameters, as discussed by Hothorn et al. (2006). In contrast, HHZ-trees assess the association between each input and the outcome using the conditional distribution of a linear test statistic, as derived using a permutation-based test procedure. The optimal binary split is only determined after identifying the variable that exhibits the strongest association with the outcome according to this statistical criterion, thereby eliminating the aforementioned selection bias. We shall now discuss various aspects of HHZ-trees over the next few subsections.

A.1.1. Broad steps for implementing recursive binary partitioning (RBP) using censored data

Let T_ij denote the spell duration for default spell j ∈ {1, ..., n_i} of loan i ∈ {1, ..., N_p}, where n_i denotes the maximum number of spells endured by loan i, and N_p is the total number of loans. Let δ_ij ∈ {0, 1} be the event indicator such that T_ij represents the write-off time if δ_ij = 1. Otherwise, if δ_ij = 0, then T_ij signifies the right-censoring time. Together, T_ij and δ_ij form the bivariate survival response Y_ij = (T_ij, δ_ij) with a sample space of 𝒴.
For notational convenience, let us re-index the n observed spells generically by k = 1, ..., n, which are treated as independent observations for tree construction, yielding Y_k = (T_k, δ_k). We shall investigate the conditional distribution of this response Y = (Y_1, ..., Y_n) given the observations of m input variables X = (X_1, ..., X_m); themselves measured at arbitrary scales and taken from the input (or covariate) sample space 𝒳 = 𝒳_1 × ... × 𝒳_m. In a tree-based framework, the conditional distribution D(Y | X) of a generic response Y_k is assumed to depend on a function f of the inputs, i.e., D(Y_k | X_k) = D(Y_k | f(X_1k, ..., X_mk)). This function f maps each input vector to one of r regions (or terminal nodes), thereby partitioning the input space into the disjoint cells B_1, ..., B_r such that 𝒳 = ∪_{l=1}^{r} B_l. Within each terminal node B_l, the conditional distribution D(Y | X ∈ B_l) is assumed to be homogeneous, as discussed by Hothorn et al. (2006). Ultimately, our dataset D_n that we will use in tree construction is defined as D_n = {(Y_k, X_1k, ..., X_mk); k = 1, ..., n}.

Consider the case weights w = (w_1, ..., w_n) that are associated with the n observations. Observations belonging to a particular node B_l will receive positive weights, whereas observations outside of B_l will receive weight zero. The following are then generic steps towards implementing RBP within the HHZ-framework of Hothorn et al. (2006). Firstly, and given w, we test the global hypothesis of independence between any of the m covariates and the response Y. If we reject this hypothesis, then we select the v*th input with the strongest association with Y, as measured by the corresponding test statistic (itself explained later). Secondly, choose a set A* ⊂ 𝒳_{v*} for splitting 𝒳_{v*} into two disjoint sets A* and 𝒳_{v*} \ A*.
Membership to the resulting two child nodes is defined via updated case weights respective to each node, i.e.,

w_left,k = w_k · I(X_{v*,k} ∈ A*)  and  w_right,k = w_k · I(X_{v*,k} ∉ A*)  for all k = 1, ..., n,   (10)

where I(·) is the indicator function. Steps 1–2 are then recursively repeated with the modified case-weight vectors w_left and w_right until the global null hypothesis of independence can no longer be rejected at a pre-specified nominal level α.

A.1.2. Identifying the 'best' input variable using a log-rank type test statistic within the HHZ-framework

In Step 1 of constructing an HHZ-tree, Hothorn et al. (2006) advised that one would generally need to assess whether any information about the response Y is contained in the m input variables, i.e., performing variable selection. Consider the following m partial null hypotheses

H_0^v : D(Y | X_v) = D(Y)  for v = 1, ..., m,

each of which asserts distributional independence between Y and the input variable X_v within the current parent node, as identified and represented by the case weights w. From these partial hypotheses, one can form the global null hypothesis H_0 = ∩_{v=1}^{m} H_0^v, which states that Y is jointly independent of all inputs. However, we shall restrict our attention to the partial hypotheses since they form the basis of variable selection within an HHZ-tree. In testing one of these partial hypotheses H_0^v, Hothorn et al. (2006) and Fu and Simonoff (2017) proposed that the association between Y and the input variable X_v is tested using linear test statistics T_v(D_n, w) of the form

T_v(D_n, w) = vec( Σ_{k=1}^{n} w_k · g_v(X_vk) · h(Y_k)ᵀ ) ∈ ℝ^{p_v q}.   (11)

In Eq. 11, the 'vec' operator stacks the columns of the resulting p_v × q matrix into a (p_v q)-dimensional column vector.
The function g_v : 𝒳_v → ℝ^{p_v} is a non-random transformation of X_v reflecting its measurement scale, where p_v denotes the dimension of the transformed input, i.e., the length of the vector produced by g_v(·). This function is commonly chosen to be the identity g_v(X_v) = X_v for continuous inputs, though other choices certainly exist, such as binning schemes or basis expansions (e.g., spline bases). Furthermore, h : 𝒴 × 𝒴ⁿ → ℝ^q is an influence function of the responses Y_1, ..., Y_n that depends on the observed sample (i.e., the 𝒴ⁿ argument) and is treated as fixed when assessing the distribution of the test statistic under the null hypothesis H_0^v. As for choosing the influence function h in Eq. 11, it is common to use the log-rank score statistic within a survival analysis setting, as described by Fu and Simonoff (2017), given the appealing non-parametric nature of this statistic. Having ordered the distinct event times as t_(1) < ... < t_(s) < ... < t_(S), we define d_s as the number of observed write-off events at time t_(s) and let n_s denote the size of the risk set immediately prior to time t_(s). The log-rank score associated with each bivariate survival observation Y_k = (T_k, δ_k) is then given by

h(Y_k) = δ_k − Σ_{s : t_(s) ≤ T_k} d_s / n_s.   (12)

Each score h(Y_k) represents the difference between the observed event indicator δ_k and its expected value under the null hypothesis that all observations share a common survival distribution; i.e., the survival response is independent of any input variable before splitting. In particular, the hazard contributions d_s / n_s are calculated using the global (pooled) risk sets {n_s}_{s=1}^{S} formed prior to any splitting. These contributions are then summed across all event times t_(s) ≤ T_k, thereby yielding the expected number of write-off events for the kth observation under the aforementioned null model.
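As an illustrative aid, the log-rank scores of Eq. 12 are easily computed from the pooled risk sets. The sketch below is not the paper's implementation (which is R-based); it merely assumes plain vectors of spell durations T and event indicators δ:

```python
import numpy as np

def logrank_scores(T, delta):
    """Compute h(Y_k) = delta_k - sum_{s: t_(s) <= T_k} d_s / n_s per Eq. 12,
    using the pooled (pre-split) risk sets."""
    T = np.asarray(T, dtype=float)
    delta = np.asarray(delta, dtype=int)
    event_times = np.unique(T[delta == 1])  # ordered distinct event times t_(s)
    if event_times.size == 0:               # no events: expected counts are zero
        return delta.astype(float)
    d = np.array([np.sum((T == t) & (delta == 1)) for t in event_times])  # d_s
    n = np.array([np.sum(T >= t) for t in event_times])                   # n_s
    cumhaz = np.cumsum(d / n)               # cumulative hazard contributions
    # expected events for observation k: cumulative hazard at the last t_(s) <= T_k
    idx = np.searchsorted(event_times, T, side="right") - 1
    expected = np.where(idx >= 0, cumhaz[np.clip(idx, 0, None)], 0.0)
    return delta - expected
```

As a sanity check, the scores always sum to zero within the pooled sample, consistent with the two-sample identity used later in Subsec. A.1.3.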
Each h(Y_k) is essentially an 'observed-minus-expected' contribution, which underpins the classical log-rank score paradigm. We then compute a weighted sum of these log-rank scores from Eq. 12 for each candidate input variable X_v within the current parent node, thereby calculating the linear statistic

T_v = Σ_{k=1}^{n} w_k · g_v(X_vk) · h(Y_k).   (13)

This statistic T_v measures the association (or degree of covariation) between the survival response Y and each transformed input g_v(X_v) within the current node. However, Hothorn et al. (2006) explained that the joint distribution between Y and X_v is generally unspecified under the null hypothesis H_0^v of independence. As such, the sampling distribution of T_v cannot be derived analytically under H_0^v since no parametric model is assumed, which prohibits directly assessing whether the observed value of T_v is unusually large. As a remedy, one can use a permutation test procedure to approximate this sampling distribution. Under H_0^v, assume that the survival outcomes Y_k, k = 1, ..., n, and hence the log-rank scores h(Y_k), are exchangeable with respect to X_v amongst the observations with positive case weights within the current node. That is, any re-ordering of these scores amongst the fixed observations is equally likely to occur under H_0^v. Consequently, one can obtain the conditional distribution of T_v under H_0^v by repeatedly permuting the response values. In this context, 'permuting the response' means that the log-rank scores h(Y_k) are randomly re-allocated amongst the observations of X_v within the current parent node, whilst keeping the input values X_vk and case weights w_k fixed. Doing so destroys the association between Y and X_v, though it still preserves the marginal distributions of both.
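The permutation step just described can be sketched as follows. Note that this is a hypothetical Monte-Carlo illustration, not the partykit internals: the analytic moments of Hothorn et al. (2006) are replaced by an empirical permutation null, g_v is taken as the identity, and the weight-restricted permutation scheme is simplified:

```python
import numpy as np

def perm_test(x, scores, w=None, n_perm=2000, seed=1):
    """Approximate the permutation distribution of T_v = sum_k w_k x_k h(Y_k)
    (Eq. 13, g_v = identity) by re-allocating the log-rank scores h(Y_k)
    amongst the fixed inputs x_k, then standardise T_v against that null."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, float); scores = np.asarray(scores, float)
    w = np.ones_like(x) if w is None else np.asarray(w, float)
    T_obs = np.sum(w * x * scores)                       # observed linear statistic
    T_null = np.array([np.sum(w * x * rng.permutation(scores))
                       for _ in range(n_perm)])          # permutation null draws
    z = (T_obs - T_null.mean()) / T_null.std(ddof=1)     # standardise, cf. Eq. 14
    # two-sided Monte-Carlo p-value with the usual +1 correction
    pval = (1 + np.sum(np.abs(T_null - T_null.mean())
                       >= abs(T_obs - T_null.mean()))) / (n_perm + 1)
    return z, pval
```

With a strongly associated input (e.g., x equal to the scores themselves), the standardised statistic is large and the permutation p-value is small, mirroring the variable-selection logic below.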
By repeatedly recalculating Eq. 13 after each such permutation, one obtains the permutation distribution L(T_v | X_v, w) of T_v within a parent node, without requiring parametric assumptions. However, it is not necessary to enumerate all permutations of the response relative to the input variable X_v, which can quickly become computationally infeasible. In particular, Hothorn et al. (2006) provided expressions for the conditional mean and variance of the linear statistic T_v, denoted respectively by μ_v and σ_v² for the univariate case where T_v is a scalar. Both μ_v and σ_v² are evaluated under the null hypothesis H_0^v of independence between the survival response Y and the transformed input g_v(X_v). The statistic T_v can therefore be standardised as

Z_v = (T_v − μ_v) / σ_v,   (14)

which is asymptotically normal under H_0^v. Such standardisation allows for calculating a corresponding p-value, thereby measuring all inputs on the same scale and allowing for unbiased variable selection. Specifically, we calculate the p-value of Z_v from Eq. 14 under H_0^v as p_v = 2Φ(−|Z_v|), where Φ(·) is the standard normal cumulative distribution function. Ultimately, larger absolute values of Z_v, and hence smaller values of p_v, serve as greater evidence against H_0^v. The variable X_{v*} with the smallest p-value p_{v*} is then selected for splitting in Step 2 of the HHZ-framework. More formally, and given a pre-specified significance level α, we have v* = argmin_{v=1,...,m} p_v, provided that p_{v*} ≤ α; otherwise, the algorithm stops and the current node becomes terminal.

A.1.3. Split point selection within a chosen input variable

Step 2 in the HHZ-framework from Hothorn et al. (2006) concerns split point estimation for a chosen input variable X_{v*}, whereby its domain 𝒳_{v*} is partitioned into two disjoint regions.
Consider now all admissible subsets A ⊂ 𝒳_{v*}, where admissibility depends on the variable type. For example, if X_{v*} is an ordered (numeric) variable, then admissible splits are restricted to threshold-type partitions of the form A = {x ∈ 𝒳_{v*} : x ≤ c} for some cut-point c. If X_{v*} is a categorical variable with K levels, then admissible splits correspond to all non-empty, proper subsets of the category set, i.e., A ⊂ {1, 2, ..., K} with A ≠ ∅ and A ≠ {1, 2, ..., K}. Admissibility may also impose further constraints, such as a minimum number of observations within a node. Each candidate subset A induces two non-empty child nodes, corresponding to A and its complement Aᶜ = 𝒳_{v*} \ A, which can again be represented via updated case weights as in Eq. 10. For each admissible subset A, a special case of the linear statistic from Eq. 11 is then computed as the weighted sum of the log-rank scores within one of the two induced subgroups, i.e.,

T_{v*}(A) = Σ_{k=1}^{n} w_k · I(X_{v*,k} ∈ A) · h(Y_k).   (15)

This statistic corresponds to the classical two-sample log-rank statistic that compares two survival distributions: one computed from the observations with X_{v*,k} ∈ A, and one calculated from those in the complement set Aᶜ. It suffices to calculate the statistic for one subgroup only (instead of calculating it for both groups) since the weighted log-rank scores sum to zero within the current parent node, i.e., T_{v*}(A) + T_{v*}(Aᶜ) = 0. Following the calculation of T_{v*}(A) from Eq. 15 for a given split A, we proceed again to a permutation test procedure, as in Step 1.
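For an ordered input, the search over admissible threshold splits can be sketched as below; this is again an illustrative Monte-Carlo stand-in for the analytic moments of Hothorn et al. (2006), not the partykit implementation, with a minimum node size as the only admissibility constraint:

```python
import numpy as np

def best_threshold_split(x, scores, min_node=5, n_perm=1000, seed=42):
    """Evaluate T(A) = sum_k 1{x_k <= c} h(Y_k) over all admissible cut-points c,
    standardise each against a shared Monte-Carlo permutation null, and return
    the cut-point with the largest |Z| (cf. Eqs. 15-17)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, float); scores = np.asarray(scores, float)
    n = len(x)
    # one set of permuted score vectors, reused across all candidate splits
    perms = np.array([rng.permutation(scores) for _ in range(n_perm)])
    best_c, best_absz = None, -np.inf
    for c in np.unique(x)[:-1]:                 # proper (non-degenerate) splits only
        ind = (x <= c).astype(float)
        if not (min_node <= ind.sum() <= n - min_node):
            continue                             # admissibility: minimum node size
        T_obs = ind @ scores                     # Eq. 15 with unit case weights
        T_null = perms @ ind                     # same statistic under permuted responses
        z = (T_obs - T_null.mean()) / T_null.std(ddof=1)
        if abs(z) > best_absz:
            best_absz, best_c = abs(z), c
    return best_c, best_absz
```

On a toy input whose low values carry all the positive log-rank scores, the selected cut-point falls exactly at the boundary between the two groups.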
The aim is to approximate the conditional permutation distribution L(T_{v*}(A) | X_{v*}, w) of the linear statistic under the null hypothesis H_0^{v*} of independence between X_{v*} and the survival response Y. As before, this permutation distribution is obtained under H_0^{v*} by randomly permuting Y (or equivalently, the log-rank scores h(Y_k)) relative to the fixed input values X_{v*,k} and fixed case weights w_k within the current parent node. Hothorn et al. (2006) provided analytic expressions for estimating the conditional mean and variance of T_{v*}(A) under H_0^{v*} for a given split A, denoted respectively as μ_{v*}(A) and σ²_{v*}(A). The test statistic of each candidate split A is then standardised as

Z_{v*}(A) = (T_{v*}(A) − μ_{v*}(A)) / σ_{v*}(A).   (16)

Finally, the optimal subset A* is chosen as

A* = argmax_A |Z_{v*}(A)|,   (17)

which is the split that yields the most extreme two-sample log-rank statistic. Put differently, amongst all admissible binary partitions of 𝒳_{v*}, the algorithm selects the subset A* that exhibits the largest standardised deviation away from the centre of its conditional permutation distribution.

A.1.4. Estimating the survivor function within each terminal node

Having induced an HHZ-tree as described by Hothorn et al. (2006), each terminal node B_l defines a region of the input space within which the survival distribution is assumed to be homogeneous. As such, and using the observations within a given B_l, the conditional survivor function S_l(t) is estimated non-parametrically using the Kaplan–Meier (KM) estimator. Let I_l = {k ∈ {1, ..., n} : X_k ∈ B_l} denote the index set of observations sorted into node B_l.
Within each B_l, we define d_ls as the number of write-off events at ordered failure time t_(s) amongst the observations in B_l. Similarly, let n_ls signify the size of the local risk set just prior to t_(s) within B_l. The quantities d_ls and n_ls are then respectively defined as

d_ls = Σ_{k ∈ I_l} I(T_k = t_(s), δ_k = 1)  and  n_ls = Σ_{k ∈ I_l} I(T_k ≥ t_(s)).

Now consider estimating the survival probability P(T > t | X ∈ B_l) within node B_l, where T denotes a random variable that represents the non-negative lifetimes of default spells. The node-specific KM-estimator of this survival probability is then specified as

Ŝ_l(t) = Π_{t_(s) ≤ t} ( 1 − d_ls / n_ls ).   (18)

Using Eq. 18, one may then derive the hazard rates ĥ_l(t) and the associated write-off event probabilities f̂_l(t) within node B_l, which are respectively estimated as

ĥ_l(t) = 1 − Ŝ_l(t) / Ŝ_l(t−1)  and  f̂_l(t) = Ŝ_l(t−1) · ĥ_l(t).

A.1.5. Practical implementation of an HHZ-tree in the R programming language

For completeness and reproducibility, we provide an R-based implementation of an HHZ-tree using the partykit R-package from Hothorn and Zeileis (2015). This implementation estimates the HHZ-based survival tree that is described in Subsubsec. 3.3.3, using a few dummy input variables. Note however that an HHZ-tree simultaneously evaluates the association between the response and multiple candidate inputs within each node, which implies a multiple hypothesis testing problem. Within the HHZ-framework, this multiplicity is addressed by testing the global null hypothesis of independence between the response and all inputs using multiplicity-adjusted p-values, such as those produced by a Bonferroni correction. In the interest of simplicity, our implementation also uses a Bonferroni correction when evaluating the permutation-based p-values across inputs.
Variable selection proceeds by choosing the input with the smallest adjusted p-value, provided that it satisfies the significance threshold implied by mincriterion = 0.99.

    modSurvTree <- ctree(Surv(DefSpell_Age, DefSpell_Event) ~ Balance + pmnt_method +
                           InterestRate_Nom + M_DTI_Growth_6,
                         data = datTrain,
                         control = ctree_control(mincriterion = 0.99, minsplit = 1000,
                                                 minbucket = 50, testtype = "Bonferroni",
                                                 maxdepth = 4))

A.2. Illustrating the necessary data structure for LGD survival modelling

Consider the data structure in Table 2 of the longitudinal credit dataset D = {i, t_i, j, t_ij, τ_d, τ_s, R^D_ij, T_ij, e_ijt}, as defined in Subsec. 3.1. We identify each row using (i, j, t_ij) as the composite key across the month-end observations t_ij within a particular default spell (i, j). Loan 1 (i = 1) had a single default spell that ended in write-off at time t_i = 6, whereas Loan 2 (i = 2) became right-censored at t_i = 14, which presumably coincides with the study-end. Loan 3 (i = 3) had two default spells; the first spell cured while the second spell ended in write-off, having spent two and three months in default respectively. Loan 4 (i = 4) had a delayed entry (i.e., it was left-truncated) such that observation only started at month t_i = 13, which is why its default entry-time is duly adjusted, assuming that it was still in default prior to observation. It cured T_ij = 3 months later at t_i = 15, followed by two successive default spells, the last of which became right-censored at time t_i = 41.
Note that these examples ignore the imposition of any probation period within the default definition, simply in the interest of brevity.

Table 2: Illustrating the structure of the raw panel dataset D, filtered for default spells. The alternating grey-shaded rows indicate loan-level history, while the alternating colour-shaded cells signify the different default spell-level histories of each loan; the remaining unshaded cells denote period-level information. Inspired by Botha et al. (2025).

| Loan i | Period t_i | Spell number j | Spell period t_ij | Default time τ_d | Stop time τ_s | Resolution type R^D_ij | Spell age T_ij | Event e_ij |
|---|---|---|---|---|---|---|---|---|
| 1 | 5  | 1 | 1  | 0  | 2  | 1: Write-off | 2 | 0 |
| 1 | 6  | 1 | 2  | 0  | 2  | 1: Write-off | 2 | 1 |
| 2 | 12 | 1 | 1  | 0  | 3  | 3: Censored  | 3 | 0 |
| 2 | 13 | 1 | 2  | 0  | 3  | 3: Censored  | 3 | 0 |
| 2 | 14 | 1 | 3  | 0  | 3  | 3: Censored  | 3 | 0 |
| 3 | 6  | 1 | 1  | 0  | 2  | 2: Cured     | 2 | 0 |
| 3 | 7  | 1 | 2  | 0  | 2  | 2: Cured     | 2 | 0 |
| 3 | 24 | 2 | 1  | 0  | 3  | 1: Write-off | 3 | 0 |
| 3 | 25 | 2 | 2  | 0  | 3  | 1: Write-off | 3 | 0 |
| 3 | 26 | 2 | 3  | 0  | 3  | 1: Write-off | 3 | 1 |
| 4 | 13 | 1 | 13 | 12 | 15 | 2: Cured     | 3 | 0 |
| 4 | 14 | 1 | 14 | 12 | 15 | 2: Cured     | 3 | 0 |
| 4 | 15 | 1 | 15 | 12 | 15 | 2: Cured     | 3 | 0 |
| 4 | 24 | 2 | 1  | 0  | 4  | 2: Cured     | 4 | 0 |
| 4 | 25 | 2 | 2  | 0  | 4  | 2: Cured     | 4 | 0 |
| 4 | 26 | 2 | 3  | 0  | 4  | 2: Cured     | 4 | 0 |
| 4 | 27 | 2 | 4  | 0  | 4  | 2: Cured     | 4 | 0 |
| 4 | 40 | 3 | 1  | 0  | 2  | 3: Censored  | 2 | 0 |
| 4 | 41 | 3 | 2  | 0  | 2  | 3: Censored  | 2 | 0 |

A.3. A diagnostic measure for testing the representativeness of resampled LGD datasets

As discussed by Botha and Verster (2026), the training and validation sets {D_T, D_V} should not exhibit undue sampling bias over time. We can measure this bias using the resolution rate, which can be calculated for either set and compared to that of the panel dataset D. Consider now that D can be partitioned into a series of non-overlapping monthly spell cohorts D(t′) over reporting time t′ ∈ {t′_1, ..., t′_l, ..., t′_n}, e.g., Jan-2008 to Dec-2022.
Let r_ψ(t′_l, D) denote the rate at which spells resolve at any time t′_l into a specified type ψ within a given dataset D. The type ψ ∈ {1, 2, 3} refers respectively to write-off (1), cured (2), or censored (3) outcomes. Suppose that n_t′ denotes the size of D(t′) at t′. Now suppose that D(t′) contains all spells that commonly stop at t′, i.e., the so-called 'cohort-end' definition from Botha et al. (2025), which resolves to the resolution date. We then define the resolution rate of type ψ at each t′ as

r^D_ψ(t′, D) = (1 / n_t′) Σ_{(i,j) ∈ D(t′)} I(R^D_ij = ψ)  for all D(t′) ⊂ D and for ψ ∈ R^D,   (19)

where I(·) is an indicator function, and R^D is a nominal-valued random variable with realisations R^D_ij for i = 1, ..., N_d and j = 1, ..., n_i. This resolution rate is indeed similar to the one introduced by Botha and Verster (2026) for performing spells. We can now check the sets {D_T, D_V} for time-dependent sampling bias using Eq. 19. In particular, the resolution rates r^D_ψ(t′, D), r^D_ψ(t′, D_T), and r^D_ψ(t′, D_V) are duly calculated and compared over t′ towards screening for large discrepancies. This exercise is formalised by using the MAE-based average discrepancy (AD) measure as a diagnostic from Botha and Verster (2026), which is expressed between any two non-overlapping sets D_1 and D_2 as

AD: r̄^D_ψ(D_1, D_2) = (1 / n_t′) Σ_{t′} | r^D_ψ(t′, D_1) − r^D_ψ(t′, D_2) |  for all t′ and for ψ ∈ R^D.   (20)

Smaller AD-values signify greater representativeness between two subsampled sets. One can calculate the AD-measure for various combinations of our datasets, which would include r̄^D_ψ(D, D_T), r̄^D_ψ(D, D_V), and r̄^D_ψ(D_T, D_V). We demonstrate and compare the write-off resolution rate (ψ = 1) in Fig. 11 across different samples of data.
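A short pandas-based sketch of Eqs. 19–20 follows; it is illustrative only (the paper's codebase is R-based), and the column names cohort (for t′) and res_type (for R^D_ij) are hypothetical:

```python
import numpy as np
import pandas as pd

def resolution_rate(df, psi):
    """Per-cohort resolution rate r_psi(t'): the share of spells in cohort D(t')
    that resolve into type psi (1: write-off, 2: cured, 3: censored); Eq. 19."""
    return df.groupby("cohort")["res_type"].apply(lambda s: np.mean(s == psi))

def avg_discrepancy(df1, df2, psi):
    """MAE-based average discrepancy (AD) between the resolution-rate series
    of two datasets, taken over their common cohorts; Eq. 20."""
    r1, r2 = resolution_rate(df1, psi), resolution_rate(df2, psi)
    r1, r2 = r1.align(r2, join="inner")  # compare matching cohorts only
    return np.mean(np.abs(r1 - r2))
```

For example, two samples whose write-off rates agree in one cohort (0.5 vs 0.5) but differ in another (0.25 vs 0) yield an AD of (0 + 0.25)/2 = 0.125.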
These rates clearly track major macroeconomic phenomena such as the 2008 financial crisis and the Covid-19 pandemic, during which the resolution rates spiked dramatically. For the 2008 crisis, there is clearly a delayed effect in the resolution rate, which is simply a function of the lengthy workout process of defaults. The last spike in the resolution rates during the 2022–2023 period is ascribed to the increases in the policy rate, as set by the central bank in response to rising inflation at the time. This effect caused instalments to become unaffordable for many borrowers, thereby inducing default and straining collection efforts. Nevertheless, one can observe that all rates are reasonably close to one another, with an AD-value of about 0.8% between D and D_T. Based on these results, we consider the resampled sets as fairly representative of D, which bodes well for the generalisation ability of the eventual models beyond the training data.

Fig. 11. Comparing the resolution rates of type ψ = 1 (write-off) over time across the various datasets, having used residential mortgage data. The MAE-based AD-measure from Eq. 20 summarises the discrepancies over time for each dataset-pair. Inspired by Botha and Verster (2026).

A.4. A short procedure for optimising the cost multiple a within the Generalised Youden Index using the MAE

In dichotomising a given probability score w(t, x_i) into a 0/1-decision, consider the indicator function I(w(t, x_i) > c*) that outputs 1 if w(t, x_i) > c*, and 0 otherwise. The optimal cut-off c* is obtained by finding the maximum value of the Generalised Youden Index J_a(c) from Eq. 9 across a range of possible cut-off values c ∈ [0, 1], given the entire set of probability scores across i = 1, ..., N.
Having dichotomised the underlying probabilistic model using c*, we calculate the empirical and expected dichotomised term-structures. The empirical term-structure is estimated as the collection of the event rates f(t) across failure times t = t_(1), ..., t_(m), where f(t) is itself estimated using Kaplan–Meier analysis; see Subsec. 3.2. The expected term-structure is found by first obtaining the hazard predictions ĥ(t, x_i) from a fitted model given a set of input variables x_i = [E_ij, x_ij, x_ij(t), x(t)], as explained in Subsec. 3.3. These hazard predictions are then related to the corresponding event rates f(t, x_i) = Ŝ(t−1, x_i) · ĥ(t, x_i), where Ŝ(t, x_i) is the predicted account-level survival probability from Eq. 6. The expected term-structure is then the aggregated form of these event rates f(t, x_i), denoted by f_E(t) and expressed using the mean as

f_E(t) = (1 / |D_t|) Σ_{i ∈ D_t} f(t, x_i)  for t = t_(1), ..., t_(m),   (21)

where each D_t is a set that contains those subject-spells that have survived up to t. Similarly, the dichotomised version of the expected term-structure given c* is

f*_E(t, c*) = (1 / |D_t|) Σ_{i ∈ D_t} I( f(t, x_i) > c* )  for t = t_(1), ..., t_(m).   (22)

Between each collection {f(t_(k))}_{k=1}^{m} and {f*_E(t_(k), c*)}_{k=1}^{m}, we calculate the MAE given a particular cost multiple a and the resulting c* as

MAE(a) = (1/m) Σ_{k=1}^{m} | f(t_(k)) − f*_E(t_(k), c*) |,   (23)

where c* = argmax_c J_a(c). Eq. 23 is then repeatedly calculated using a range of a-values, thereby resulting in a collection of corresponding MAE-values. The minimum hereof should indicate the chosen a-value, thereby concluding the optimisation procedure.
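The procedure above can be sketched as follows. Since Eq. 9 lies outside this excerpt, the sketch assumes a cost-weighted Youden form J_a(c) = a·Se(c) + Sp(c) − 1, chosen merely because it reproduces the reported behaviour of c* decreasing as a increases; the function names and arguments are hypothetical:

```python
import numpy as np

def youden_cutoff(scores, labels, a, grid):
    """c* = argmax_c J_a(c), under the assumed form J_a(c) = a*Se(c) + Sp(c) - 1."""
    labels = np.asarray(labels, bool); scores = np.asarray(scores, float)
    best_c, best_j = grid[0], -np.inf
    for c in grid:
        pred = scores > c                                   # 0/1-decision at cut-off c
        se = np.mean(pred[labels]) if labels.any() else 0.0     # sensitivity
        sp = np.mean(~pred[~labels]) if (~labels).any() else 0.0  # specificity
        j = a * se + sp - 1
        if j > best_j:
            best_j, best_c = j, c
    return best_c

def mae_for_a(a, scores, labels, f_emp, f_rates, grid):
    """MAE between the empirical term-structure f(t) and the dichotomised
    expected term-structure f*_E(t, c*) of Eqs. 22-23, for one cost multiple a.
    f_rates holds, per period t, the event rates f(t, x_i) of surviving spells."""
    c_star = youden_cutoff(scores, labels, a, grid)
    f_dich = np.array([np.mean(f_t > c_star) for f_t in f_rates])  # Eq. 22
    return np.mean(np.abs(np.asarray(f_emp) - f_dich)), c_star     # Eq. 23
```

Looping mae_for_a over a grid of a-values and taking the minimising a then mirrors the optimisation described above.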
We show the results of the aforementioned optimisation procedure in Fig. 12. For computational expediency, limits are imposed on the search space of c for each model during the optimisation. The upper bound hereof is found following a distributional analysis of the underlying event rates f(t, x_i), i = 1, ..., N per model, whereafter the upper bound is chosen to equal the 99% quantile of the event rates. Accordingly, these limits are c ∈ [0, 0.025] for the DtH-Basic model, c ∈ [0, 0.3] for the DtH-Advanced model, and c ∈ [0, 0.4] for the LR-model. We shall now discuss the optimisation results for each model and present the corresponding c*-threshold.

Fig. 12. Optimisation results of the cost multiple a in minimising the MAE between the empirical and expected term-structure. The latter emanates from a dichotomised model, having imposed a particular cut-off c*; itself obtained from calculating the Generalised Youden Index J_a given a. The encircled points indicate the chosen a-values.

The MAE generally increases as a increases in Fig. 12, whilst the corresponding c*-value decreases as a increases. Instead of occurring at a single point, the minimum MAE-values present across a range of a-values for both the DtH-Basic and LR models. However, the resulting term-structures are unrealistic when imposing the corresponding c*-values at any of these a-values. In particular, these term-structures appear as flat zero-valued lines over most of t. This result implies that all of the model predictions are zero-valued and that the corresponding c*-threshold is too high, despite yielding a minimum MAE-value in the downstream term-structure diagnostic. Moreover, such severe under-prediction of the event probability would not be in the interest of risk prudence.
As a remedy, we chose the lowest a-value (and corresponding c*-threshold) that started to yield a more credible term-structure. In this rather manual process, we used expert judgement in adjusting the optimisation results such that the resulting term-structure will at least over-predict the event probability across most of t. Accordingly, this process yielded a = 38 for the DtH-Basic model with a corresponding threshold of c* = 0.0152, and a = 12 for the LR-model with an associated threshold of c* = 0.0651. Credible term-structures, as shown in Fig. 7b, are obtained using both of these c*-choices. For the DtH-Advanced model, the minimum MAE is achieved at a = 1 and the corresponding threshold is c* = 0.295. This c*-choice also yielded a realistic term-structure, as shown in Fig. 7a, and no special adjustment is required.

A.5. A description of selected input variables within each LGD-model

In Table 3, the selected input variables of the finalised LGD-models are described. This description includes a mapping between variables and the specific LGD-models, whilst relegating the fitting procedure and its diagnostics to the codebase maintained by Gabru et al. (2026), purely in the interest of brevity.

Table 3: The selected input variables mapped across the various LGD-models. Write-off risk models include the single-stage Gaussian-GLM (1s-Gaussian), single-stage Compound-Poisson-GLM (1s-CP), two-stage logistic regression (2s-LR), two-stage DtH-basic (2s-DtH-Bas), two-stage DtH-advanced (2s-DtH-Adv), and the two-stage survival tree (2s-ST). The loss severity is modelled with a Compound-Poisson-GLM (LS-CP). Subscripts [a] denote loan account-level variables, [p] are portfolio-level inputs, and [m] represent macroeconomic covariates.
Variable | Description | Models
AgeToTerm_Avg [p] | Mean value across the portfolio of the ratio between a loan's age and its term. | 1s-Gaussian; LS-CP
AgeToTerm [a] | Ratio between a loan's age and its term. | 1s-CP
ArrearsDir_3 [a] | The trending direction of the arrears, qualitatively obtained by comparing the current arrears-level to that of 3 months ago, binned as: 1) increasing; 2) milling; 3) decreasing (reference); and 4) missing. | LS-CP
Arrears_Med [p] | Median-imputed amount in arrears. | 1s-Gaussian; 1s-CP
ArrearsToBal_Avg_1 [p] | Mean value across the portfolio of the ratio between the arrears amount and the outstanding balance, lagged by 1 month. | 1s-CP; 2s-LR
Balance_Real_1 [a] | Inflation-adjusted outstanding balance of the loan, lagged by 1 month. | 1s-Gaussian; 1s-CP; 2s-LR; 2s-DtH-Adv; 2s-ST; LS-CP
DefSpell_Age [a] | Default spell age (months). | 1s-Gaussian; 1s-CP; 2s-LR
DefSpell_Age_Mean [a] | Mean default spell age (months) across the portfolio at the time. | 1s-Gaussian
DefaultStatus_Avg [p] | Fraction of the portfolio in default. | 1s-Gaussian
DefaultStatus_Avg_12 [p] | 12-month lagged version of DefaultStatus_Avg. | 1s-CP; 2s-LR; 2s-DtH-Adv; 2s-ST
g0_Delinq_1 [a] | Delinquency-level (1-month lag), or the number of payments in arrears as measured by the 𝑔0-measure; see Botha et al. (2021). | 2s-DtH-Bas; 2s-DtH-Adv
g0_Delinq_Avg [p] | Non-defaulted average delinquency 𝑔0 across the portfolio at the time. | 1s-CP; 2s-LR; 2s-DtH-Adv; 2s-ST
g0_Delinq_Any_Avg_1 [p] | Non-defaulted fraction of the portfolio with any degree of delinquency beyond 𝑔0 = 0, lagged by 1 month. | 2s-DtH-Bas
g0_Delinq_Any_Avg_12 [p] | 12-month lagged version of g0_Delinq_Any_Avg. | 1s-Gaussian
g0_Delinq_Num [a] | Number of times that the 𝑔0-measure changed in value over loan life so far. | 1s-Gaussian; 1s-CP; 2s-LR
g0_Delinq_SD_6 [a] | The sample standard deviation of the 𝑔0-measure [g0_Delinq] over a rolling 6-month window. | 1s-Gaussian; LS-CP
Instalment_Real [a] | Inflation-adjusted expected instalment of the loan. | LS-CP
InterestRate_Nominal [a] | Nominal interest rate per annum of a loan. | 1s-Gaussian; 1s-CP; 2s-LR; 2s-DtH-Adv
InterestRate_Mar_Med_2 [p] | Median value across the portfolio of the nominal interest rates of loans at the time, lagged by 2 months. | 2s-LR; 2s-DtH-Adv; 2s-ST
M_DebtToIncome_3 [m] | Debt-to-Income: average household debt expressed as a percentage of household income per quarter, interpolated monthly, lagged by 3 months. | LS-CP
M_DebtToIncome_6 [m] | 6-month lagged version of the DTI growth rate, M_DebtToIncome. | 2s-LR; 2s-ST
M_DebtToIncome_12 [m] | 12-month lagged version of the DTI growth rate, M_DebtToIncome. | 2s-DtH-Adv
M_Inflation_Growth_3 [m] | Year-on-year growth rate in the inflation index (Consumer Price Index [CPI]) per month, lagged by 3 months. | 2s-LR
M_Inflation_Growth_9 [m] | 9-month lagged version of the CPI growth rate, M_Inflation_Growth. | 2s-DtH-Bas
M_Inflation_Growth_12 [m] | 12-month lagged version of the CPI growth rate, M_Inflation_Growth. | 1s-Gaussian; 2s-DtH-Adv
M_RealGDP_Growth [m] | Year-on-year growth rate in the 4-quarter moving average of real GDP per quarter, interpolated monthly, lagged by 12 months. | 1s-CP; 2s-LR
M_RealIncome_Growth_9 [m] | Year-on-year growth rate in the 4-quarter moving average of real income per quarter, interpolated monthly, lagged by 9 months. | 1s-CP; 2s-DtH-Adv
M_RealIncome_Growth_12 [m] | 12-month lagged version of real income growth, M_RealIncome_Growth. | LS-CP
M_Repo_Rate_2 [m] | Prevailing repurchase (or policy) rate set by the South African Reserve Bank (SARB), lagged by 2 months. | 2s-LR
M_Repo_Rate_6 [m] | 6-month lagged version of M_Repo_Rate. | 1s-Gaussian
M_Repo_Rate_9 [m] | 9-month lagged version of M_Repo_Rate. | 1s-CP
M_Repo_Rate_12 [m] | 12-month lagged version of M_Repo_Rate. | 2s-DtH-Adv
NewLoans_Pc [a] | Fraction of the portfolio that constitutes new loans. | 1s-CP; 2s-DtH-Adv; LS-CP
PayMethod [a] | A categorical variable designating different payment methods: 1) debit order (reference); 2) salary; 3) payroll or cash; and 4) missing. | 2s-LR; 2s-DtH-Adv; 2s-ST; LS-CP
Principal_Real [a] | Inflation-adjusted principal loan amount. | 2s-LR; 2s-DtH-Adv; LS-CP
PrevDefaults [a] | Indicating whether the loan experienced previous default spells. | 1s-Gaussian; 1s-CP; 2s-LR; 2s-ST; LS-CP
SpellNum_Bn [a] | The current default spell number, or total number of visits to the default state over loan life, binned as ("1", "2", "3", "4+") spells. | 1s-CP; 2s-LR; 2s-DtH-Bas; 2s-DtH-Adv; 2s-ST
TimeSpell [a] | Logarithm of the time spent in a default spell towards embedding the baseline hazard. | 2s-DtH-Bas
TimeSpell_Bn [a] | Binned version of the time spent in a default spell. | 2s-DtH-Adv
TimeSpell*SpellNum_Bn [a] | An interaction effect between the logarithm of the time spent in a default spell and SpellNum_Bn. | 2s-DtH-Adv

References

1. Akritas, M. G. (1994). Nearest neighbor estimation of a bivariate distribution under random censoring. The Annals of Statistics, 1299–1327. https://www.jstor.org/stable/2242227
2. Allison, P. D. (2010). Survival Analysis Using SAS: A Practical Guide (2nd). SAS Institute. https://support.sas.com/en/books/authors/paul-allison.html
3. Baesens, B., Rösch, D., & Scheule, H. (2016). Credit risk analytics: Measurement techniques, applications, and examples in SAS. John Wiley & Sons.
4. Banasik, J., Crook, J. N., & Thomas, L. C. (1999). Not if but when will borrowers default. Journal of the Operational Research Society, 50(12), 1185–1190. https://doi.org/10.1057/palgrave.jors.2600851
5. Bansal, A., & Heagerty, P. J. (2018). A tutorial on evaluating the time-varying discrimination accuracy of survival models used in dynamic decision making. Medical Decision Making, 38(8), 904–916. https://doi.org/10.1177/0272989X18801312
6. Bellotti, T., & Crook, J. (2009). Credit scoring with macroeconomic variables using survival analysis. Journal of the Operational Research Society, 60(12), 1699–1707. https://doi.org/10.1057/jors.2008.130
7. Bellotti, T., & Crook, J. (2013). Forecasting and stress testing credit card default using dynamic models. International Journal of Forecasting, 29(4), 563–574. https://doi.org/10.1016/j.ijforecast.2013.04.003
8. Bellotti, T., & Crook, J. (2014). Retail credit stress testing using a discrete hazard model with macroeconomic factors. Journal of the Operational Research Society, 65(3), 340–350. https://doi.org/10.1057/jors.2013.91
9. Betz, J., Kellner, R., & Rösch, D. (2021). Time matters: How default resolution times impact final loss rates. Journal of the Royal Statistical Society Series C: Applied Statistics, 70(3), 619–644.
10. Blumenstock, G., Lessmann, S., & Seow, H.-V. (2022). Deep learning for survival and competing risk modelling. Journal of the Operational Research Society, 73(1), 26–38. https://doi.org/10.1080/01605682.2020.1838960
11. Botha, A. (2021). A procedure for loss-optimising the timing of loan recovery under uncertainty [Doctoral dissertation, University of Pretoria]. https://doi.org/10.13140/RG.2.2.12015.30888/2
12. Botha, A., Beyers, C., & De Villiers, P. (2021). Simulation-based optimisation of the timing of loan recovery across different portfolios. Expert Systems with Applications, 177. https://doi.org/10.1016/j.eswa.2021.114878
13. Botha, A., & Verster, T. (2026). Approaches for modelling the term-structure of default risk under IFRS 9: A tutorial using discrete-time survival analysis. International Journal of Data Science and Analytics. https://doi.org/10.48550/arXiv.2507.15441
14. Botha, A., Verster, T., & Scheepers, B. (2025). Towards modelling lifetime default risk: Exploring different subtypes of recurrent event Cox-regression models. arXiv. https://doi.org/10.48550/arXiv.2505.01044
15. Breeden, J. L., & Crook, J. (2022). Multihorizon discrete time survival models. Journal of the Operational Research Society, 73(1), 56–69. https://doi.org/10.1080/01605682.2020.1777907
16. Calabrese, R., & Zenga, M. (2010). Bank loan recovery rates: Measuring and nonparametric density estimation. Journal of Banking & Finance, 34(5), 903–911. https://doi.org/10.1016/j.jbankfin.2009.10.001
17. Crowder, M. (2012). Multivariate survival analysis and competing risks. CRC Press.
18. Dirick, L., Claeskens, G., & Baesens, B. (2017). Time to default in credit scoring using survival analysis: A benchmark study. Journal of the Operational Research Society, 68(6), 652–665. https://doi.org/10.1057/s41274-016-0128-9
19. Djeundje, V. B., & Crook, J. (2019). Dynamic survival models with varying coefficients for credit risks. European Journal of Operational Research, 275(1), 319–333. https://doi.org/10.1016/j.ejor.2018.11.029
20. Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861–874. https://doi.org/10.1016/j.patrec.2005.10.010
21. Fenech, J. P., Yap, Y. K., & Shafik, S. (2016). Modelling the recovery outcomes for defaulted loans: A survival analysis approach. Economics Letters, 145, 79–82. https://doi.org/10.1016/j.econlet.2016.05.015
22. Finlay, S. (2010). The management of consumer credit: Theory and practice (Second). Palgrave Macmillan.
23. Frydman, H., & Matuszyk, A. (2022). Random survival forest for competing credit risks. Journal of the Operational Research Society, 73(1), 15–25. https://doi.org/10.1080/01605682.2020.1759385
24. Fu, W., & Simonoff, J. S. (2017). Survival trees for left-truncated and right-censored data, with application to time-varying covariate data. Biostatistics, 18(2), 352–369. https://doi.org/10.1093/biostatistics/kxw047
25. Gabru, M., Muller, M., & Botha, A. (2026). Deriving the term-structure of loan write-off risk under IFRS 9 using survival analysis [source code]. https://doi.org/10.5281/zenodo.18982288
26. Geisser, S. (1998). Comparing two tests used for diagnostic or screening purposes. Statistics & Probability Letters, 40(2), 113–119. https://doi.org/10.1016/S0167-7152(98)00067-4
27. Graf, E., Schmoor, C., Sauerbrei, W., & Schumacher, M. (1999). Assessment and comparison of prognostic classification schemes for survival data. Statistics in Medicine, 18(17-18), 2529–2545. https://doi.org/10.1002/(SICI)1097-0258(19990915/30)18:17/18<2529::AID-SIM274>3.0.CO;2-5
28. Gürtler, M., & Hibbeln, M. (2013). Improvements in loss given default forecasts for bank loans. Journal of Banking & Finance, 37(7), 2354–2366. https://doi.org/10.1016/j.jbankfin.2013.01.031
29. Heagerty, P. J., Lumley, T., & Pepe, M. S. (2000). Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics, 56(2), 337–344. https://doi.org/10.1111/j.0006-341X.2000.00337.x
30. Hothorn, T., Hornik, K., & Zeileis, A. (2006). Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics, 15(3), 651–674. https://doi.org/10.1198/106186006X133933
31. Hothorn, T., & Zeileis, A. (2015). partykit: A modular toolkit for recursive partytioning in R. The Journal of Machine Learning Research, 16(1), 3905–3909. https://jmlr.org/papers/v16/hothorn15a.html
32. IASB. (2014). International Financial Reporting Standard (IFRS) 9: Financial instruments. IFRS Foundation: International Accounting Standards Board (IASB). London. https://www.ifrs.org/issued-standards/list-of-standards/ifrs-9-financial-instruments/
33. Jacobs, M. (2024). Modeling ultimate loss-given-default and time-to-resolution on corporate debt. Available at SSRN 4738268.
34. James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning: With applications in R. Springer. https://doi.org/10.1007/978-1-4614-7138-7
35. Jenkins, S. P. (2005). Survival analysis (Vol. 42). Citeseer.
36. Joubert, M., Verster, T., & Raubenheimer, H. (2018a). Default weighted survival analysis to directly model loss given default. South African Statistical Journal, 52(2), 173–202. https://hdl.handle.net/10520/EJC-10cdc036ea
37. Joubert, M., Verster, T., & Raubenheimer, H. (2018b). Making use of survival analysis to indirectly model loss given default. ORiON, 34(2), 107–132. https://doi.org/10.5784/34-2-588
38. Joubert, M., Verster, T., Raubenheimer, H., & Schutte, W. D. (2021). Adapting the default weighted survival analysis modelling approach to model IFRS 9 LGD. Risks, 9(6), 103.
39. Kaivanto, K. (2008). Maximization of the sum of sensitivity and specificity as a diagnostic cutpoint criterion. Journal of Clinical Epidemiology, 61(5), 517–518. https://doi.org/10.1016/j.jclinepi.2007.10.011
40. Kaplan, E. L., & Meier, P. (1958). Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53(282), 457–481. Retrieved March 2, 2023, from http://www.jstor.org/stable/2281868
41. Kartsonaki, C. (2016). Survival analysis. Diagnostic Histopathology, 22(7), 263–270. https://doi.org/10.1016/j.mpdhp.2016.06.005
42. Kleinbaum, D. G., & Klein, M. (2012). Survival analysis: A self-learning text (3rd ed.). Springer. https://doi.org/10.1007/978-1-4419-6646-9
43. Larney, J., Allison, J. S., Grobler, G. L., & Smuts, M. (2023). Modelling the time to write-off of non-performing loans using a promotion time cure model with parametric frailty. Mathematics, 11(10), 2228. https://doi.org/10.3390/math11102228
44. Larney, J., Botha, A., Grobler, G. L., & Raubenheimer, H. (2025). A cost of capital approach to determining the LGD discount rate. South African Actuarial Journal, 25(1), 93–117. https://doi.org/10.4314/saaj.v25i1.4
45. Leow, M., & Mues, C. (2012). Predicting loss given default (LGD) for residential mortgage loans: A two-stage model and empirical evidence for UK bank data. International Journal of Forecasting, 28(1), 183–195. https://doi.org/10.1016/j.ijforecast.2011.01.010
46. Li, A., Li, Z., & Bellotti, A. (2023). Predicting loss given default of unsecured consumer loans with time-varying survival scores. Pacific-Basin Finance Journal, 78, 101949. https://doi.org/10.1016/j.pacfin.2023.101949
47. Loterman, G., Brown, I., Martens, D., Mues, C., & Baesens, B. (2012). Benchmarking regression algorithms for loss given default modeling. International Journal of Forecasting, 28(1), 161–170. https://doi.org/10.1016/j.ijforecast.2011.01.006
48. Menard, S. (2000). Coefficients of determination for multiple logistic regression analysis. The American Statistician, 54(1), 17–24. https://doi.org/10.1080/00031305.2000.10474502
49. Ptak-Chmielewska, A., & Kopciuszewski, P. (2024). Random survival forests and Cox regression in loss given default estimation. Journal of Credit Risk. https://doi.org/10.21314/JCR.2024.003
50. Putter, H., Fiocco, M., & Geskus, R. B. (2007). Tutorial in biostatistics: Competing risks and multi-state models. Statistics in Medicine, 26(11), 2389–2430. https://doi.org/10.1002/sim.2712
51. Schisterman, E. F., Faraggi, D., Reiser, B., & Hu, J. (2008). Youden index and the optimal threshold for markers with mass at zero. Statistics in Medicine, 27(2), 297–315. https://doi.org/10.1002/sim.2993
52. Schober, P., & Vetter, T. R. (2018). Survival analysis and interpretation of time-to-event data: The tortoise and the hare. Anesthesia and Analgesia, 127(3), 792–798. https://doi.org/10.1213/ANE.0000000000003653
53. Schuermann, T. (2004). What do we know about loss given default? [Wharton Financial Institutions Center Working Paper].
54. Singer, J. D., & Willett, J. B. (1993). It's about time: Using discrete-time survival analysis to study duration and the timing of events. Journal of Educational Statistics, 18(2), 155–195. https://doi.org/10.2307/1165085
55. Stepanova, M., & Thomas, L. (2002). Survival analysis methods for personal loan data. Operations Research, 50(2), 277–289. https://doi.org/10.1287/opre.50.2.277.426
56. Suresh, K., Severn, C., & Ghosh, D. (2022). Survival prediction models: An introduction to discrete-time modeling. BMC Medical Research Methodology, 22(1), 1–18. https://doi.org/10.1186/s12874-022-01679-6
57. Van Gestel, T., & Baesens, B. (2009). Credit risk management: Basic concepts. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199545117.001.0001
58. Willett, J. B., & Singer, J. D. (1995). It's déjà vu all over again: Using multiple-spell discrete-time survival analysis. Journal of Educational and Behavioral Statistics, 20(1), 41–67. https://doi.org/10.2307/1165387
59. Witzany, J., Rychnovsky, M., & Charamza, P. (2012). Survival analysis in LGD modeling. European Financial and Accounting Journal, 7(1), 6–27. https://doi.org/10.18267/j.efaj.12
60. Wood, R., & Powell, D. (2017). Addressing probationary period within a competing risks survival model for retail mortgage loss given default. Journal of Credit Risk, 13(3). https://doi.org/10.21314/JCR.2017.228
61. Zeng, G. (2013). Metric divergence measures and information value in credit scoring. Journal of Mathematics, 2013. https://doi.org/10.1155/2013/848271
62. Zhang, J., & Thomas, L. C. (2012). Comparisons of linear regression and survival analysis using single and mixture distributions approaches in modelling LGD. International Journal of Forecasting, 28(1), 204–215. https://doi.org/10.1016/j.ijforecast.2010.06.002
63. Zhang, Y. (2013). Likelihood-based and Bayesian methods for Tweedie compound Poisson linear mixed models. Statistics and Computing, 23(6), 743–757. https://doi.org/10.1007/s11222-012-9343-7