Industrial Computing Systems: A Case Study of Fault Tolerance Analysis
Fault tolerance is a key factor in the design of industrial computing systems. In practice, however, these systems, like every commercial product, face tight financial constraints, and they must remain operational for as long as possible to stay commercially attractive. This work analyzes the instantaneous failure rate of such systems at the end of their lifetime. On the basis of this analysis, we identify the effect of a critical increase in the system failure rate and the basic condition under which it occurs. We then determine a maintenance schedule that helps avoid this effect and extends the system lifetime in fault-tolerant mode.
💡 Research Summary
The paper addresses the reliability and fault‑tolerance design of industrial computing systems, focusing on the critical period at the end of their useful life when the instantaneous failure rate begins to rise sharply. Starting from the well‑known bathtub curve, the authors model the time‑dependent failure rate λ(t) of individual modules using a Weibull distribution with shape parameter β > 1 to capture wear‑out behavior. The system is assumed to consist of N identical modules arranged in a 1‑out‑of‑N fault‑tolerant architecture, a common pattern in programmable logic controllers (PLCs) and supervisory control and data acquisition (SCADA) installations. By applying reliability block diagram theory and a continuous‑time Markov model, the overall system failure rate λ_sys(t) is derived as λ_sys(t) = C(N, 1)·λ(t)·
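As a rough illustration of the per-module wear-out model mentioned in the summary, the sketch below evaluates the standard two-parameter Weibull hazard rate λ(t) = (β/η)·(t/η)^(β−1) and the matching reliability R(t) = exp(−(t/η)^β). The scale parameter η (`eta`), the function names, and the numeric values are assumptions made for this example; they are not taken from the paper, and the sketch does not reproduce the system-level expression λ_sys(t).

```python
import math


def weibull_hazard(t: float, beta: float, eta: float) -> float:
    """Instantaneous failure rate of a single module under a Weibull model.

    beta is the shape parameter (beta > 1 gives a rising, wear-out hazard);
    eta is an assumed scale parameter in the same time unit as t.
    """
    if t <= 0 or beta <= 0 or eta <= 0:
        raise ValueError("t, beta and eta must all be positive")
    return (beta / eta) * (t / eta) ** (beta - 1)


def weibull_reliability(t: float, beta: float, eta: float) -> float:
    """Probability that a single module survives up to time t."""
    if t < 0 or beta <= 0 or eta <= 0:
        raise ValueError("t must be non-negative; beta and eta must be positive")
    return math.exp(-((t / eta) ** beta))


if __name__ == "__main__":
    # Hypothetical values: shape 2.0 (wear-out), scale 10 time units.
    for t in (5.0, 10.0, 20.0):
        rate = weibull_hazard(t, beta=2.0, eta=10.0)
        surv = weibull_reliability(t, beta=2.0, eta=10.0)
        print(f"t={t:5.1f}  hazard={rate:.3f}  reliability={surv:.3f}")
```

With β > 1 the hazard grows with t, which is the end-of-life behavior the summary describes; with β = 1 the same function reduces to a constant failure rate 1/η.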
📜 Original Paper Content