Our societies are increasingly dependent on services supplied by computers and their software. New technology only exacerbates this dependence by increasing the number, performance, and degree of autonomy and inter-connectivity of software-empowered computers and cyber-physical "things", which translates into unprecedented scenarios of interdependence. As a consequence, guaranteeing the persistence-of-identity of individual and collective software systems and software-backed organisations becomes an important prerequisite for sustaining the safety, security, and quality of the computer services supporting human societies. Resilience is the term used to refer to the ability of a system to retain its functional and non-functional identity. In this article we conjecture that a better understanding of resilience may be reached by decomposing it into ancillary constituent properties, in the same way as better insight into system dependability was obtained by breaking it down into sub-properties. Three of the main sub-properties of resilience proposed here refer respectively to the ability to perceive environmental changes; to understand the implications introduced by those changes; and to plan and enact adjustments intended to improve the system-environment fit. A fourth property characterises the way the above abilities manifest themselves in computer systems. The four properties are then analysed in three families of case studies, each consisting of three software systems that embed different resilience methods. Our major conclusion is that reasoning in terms of resilience sub-properties may help reveal the characteristics and limitations of classic methods and tools meant to achieve system and organisational resilience. We conclude by suggesting that our method may be a prelude to meta-resilient systems, that is, systems able to optimally adjust their own resilience with respect to changing environmental conditions.
Computer systems are not dissimilar from critical infrastructures in which different mutually dependent components (in fact, infrastructures themselves) contribute to the emergence of an intended service. Thus, in computer systems the software infrastructure relies on the quality of the hardware infrastructure; conversely, hypothetically perfect hardware would not result in the intended service without a correspondingly healthy software infrastructure. Software resilience refers to the robustness of the software infrastructure and may be defined as the trustworthiness of a software system to adapt itself so as to absorb and tolerate the consequences of failures, attacks, and changes within and without the system boundaries. Just as a resilient body is one that "subjected to an external force is able to recover its size and shape, following deformation" (Harris, 2005), software is said to be resilient when it is able to recover its functional and non-functional characteristics (its "identity") following failures, attacks, and environmental changes. Just as critical infrastructures call for organisational resilience, mission- and business-critical computer systems call for software resilience. Understanding and mastering software resilience is a key prerequisite for designing effective services for complex and ever-changing deployment environments such as ubiquitous and pervasive environments.
In what follows we consider resilience as a collective property, i.e., a property better captured by considering a number of sub-properties, each of which focuses on a particular aspect of the whole. This method was successfully used by Laprie and others to better characterise dependability as a collective property described by constituent attributes: availability, reliability, safety, maintainability, and others (Laprie, 1985; Laprie, 1995). A major result of applying this decomposition method is that it provides us with a “base” of attributes with which one may describe, and delimit, the characteristics of existing methods, systems, and algorithms. Thanks to Laprie’s efforts we can now describe more precisely the behaviours of a computer entity as being, e.g., reliable, but not available; or safe, but not reliable; and so forth. In turn, this makes it more apparent whether a certain solution matches an intended mission or a given hypothesised environment. It also helps in reasoning about the consequences of erroneous deployments, or of a drift between the system assumptions and the actual environmental conditions (De Florio, 2010).
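The idea of describing an entity against a base of attributes, and then checking that description against a mission, can be illustrated with a minimal sketch. The `Profile` class and the `unmet` helper are hypothetical names introduced only for this illustration; they do not come from Laprie's work or from this paper. Treating each attribute as simply present or absent is a simplifying assumption.

```python
from dataclasses import dataclass

# Attribute names follow the dependability attributes listed in the text.
ATTRIBUTES = ("availability", "reliability", "safety", "maintainability")


@dataclass(frozen=True)
class Profile:
    """Hypothetical model: the subset of the base that an entity provides."""
    provided: frozenset

    def describe(self) -> str:
        # Render the profile in the "reliable, but not available" style,
        # using attribute nouns and a "not" prefix for absent attributes.
        parts = []
        for name in ATTRIBUTES:
            parts.append(name if name in self.provided else f"not {name}")
        return ", ".join(parts)


def unmet(profile: Profile, required: frozenset) -> frozenset:
    """Return the attributes a mission requires but the entity lacks."""
    return required - profile.provided


# An entity that is reliable but not available, deployed for a mission
# that (by assumption) requires both reliability and availability.
entity = Profile(frozenset({"reliability"}))
mission = frozenset({"reliability", "availability"})

print(entity.describe())
print(sorted(unmet(entity, mission)))
```

An empty result from `unmet` would suggest that, under this coarse model, the solution matches the intended mission; a non-empty result names the attributes where the deployment assumptions are not met.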
The aim of the current paper is to apply to resilience the same method that Laprie and others applied to dependability. Our conjecture and hope is that similar insight may ensue from this, and that our sub-properties may prove to constitute an effective resilience base, namely a set of independent and complementary variables useful for reasoning about the qualities and the shortcomings of a given resilient entity with respect to its deployment conditions.
The rest of the current paper is structured as follows. In Section 2 we define the terms and introduce the main method of our discussion and our base of orthogonal sub-properties of resilience. In Section 3 we use our base of sub-properties to describe and characterise three families of systems. Each family consists of a non-resilient member and two adaptive or resilient reformulations from our current and past research activity. In that section we also show how discussing the resilience of a (software) system in terms of its constituent properties helps expose several important aspects of those systems, including their complexity, predictability, and dependence on design assumptions. We conjecture that this may provide designers with a convenient tool to reason about the cost-effectiveness of resilient methods and approaches. In turn, this feature may pave the way towards future autonomic meta-resilient systems, namely systems able to self-optimise their own resilience with respect to variable environmental conditions. Our conclusions are finally drawn in Section 4.
In this section we introduce the main terms and the method of our discussion. Our starting point is behavioural and extends the classic works of Rosenblueth (Rosenblueth et al., 1943) and Boulding (Boulding, 1956). The focus here is not on general systems behaviour but rather on resilient behaviours, i.e., those behaviours that are meant to guarantee the functional and non-functional identity of the system at hand. In what follows we first define resilience and then characterise four classes of behaviours that are typical of resilient systems. Finally, we present four independent properties that, we conjecture, collectively describe the major aspects of resilience.
Resilience is defined in (Meyer, 2009) as the ability to tolerate (or even profit from) the onset of unanticipated changes and environmental conditions that might otherw