Quantifying the resiliency of fail-operational real-time networked control systems

Arpan Gujarati, Mitra Nasri, Björn B. Brandenburg

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

4 Citations (Scopus)

Abstract

In time-sensitive, safety-critical systems that must be fail-operational, active replication is commonly used to mitigate transient faults that arise due to electromagnetic interference (EMI). However, designing an effective and well-performing active replication scheme is challenging since replication conflicts with the size, weight, power, and cost constraints of embedded applications. To enable a systematic and rigorous exploration of the resulting tradeoffs, we present an analysis to quantify the resiliency of fail-operational networked control systems against EMI-induced memory corruption, host crashes, and retransmission delays. Since control systems are typically robust to a few failed iterations, e.g., one missed actuation does not crash an inverted pendulum, traditional solutions based on hard real-time assumptions are often too pessimistic. Our analysis reduces this pessimism by modeling a control system's inherent robustness as an (m, k)-firm specification. A case study with an active suspension workload indicates that the analytical bounds closely predict the failure rate estimates obtained through simulation, thereby enabling a meaningful design-space exploration, and also demonstrates the utility of the analysis in identifying non-trivial and non-obvious reliability tradeoffs.
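To make the (m, k)-firm idea in the abstract concrete, the following is a minimal sketch, not the paper's analysis. It assumes the common reading of an (m, k)-firm specification: at least m of every k consecutive control iterations must succeed. The function name `satisfies_mk_firm` and the boolean encoding of iteration outcomes are hypothetical choices made for illustration.

```python
from collections import deque
from typing import Iterable


def satisfies_mk_firm(outcomes: Iterable[bool], m: int, k: int) -> bool:
    """Check a trace of iteration outcomes against an (m, k)-firm constraint.

    Assumption (not taken from the paper): the constraint holds when every
    window of k consecutive iterations contains at least m successes.
    Traces shorter than k contain no full window and are treated as satisfying.
    """
    if k < 1 or not 0 <= m <= k:
        raise ValueError("require k >= 1 and 0 <= m <= k")

    window = deque(maxlen=k)  # sliding window over the last k outcomes
    successes = 0             # number of successes currently in the window

    for ok in outcomes:
        ok = bool(ok)
        if len(window) == k:
            # The oldest outcome is evicted by the append below.
            successes -= window[0]
        window.append(ok)
        successes += ok
        if len(window) == k and successes < m:
            return False
    return True


if __name__ == "__main__":
    trace = [True, False, True, True, False, True]
    # A (2, 3)-firm specification tolerates the isolated misses in this trace.
    print(satisfies_mk_firm(trace, m=2, k=3))
    # A hard real-time reading corresponds to m == k: no miss is tolerated.
    print(satisfies_mk_firm(trace, m=3, k=3))
```

Setting m equal to k recovers the hard real-time assumption that the abstract describes as often too pessimistic; smaller m encodes the control system's tolerance for a few failed iterations.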

Original language: English
Title of host publication: 30th Euromicro Conference on Real-Time Systems, ECRTS 2018
Editors: Sebastian Altmeyer
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik
ISBN (Electronic): 9783959770750
Publication status: Published - 1 Jun 2018
Externally published: Yes
Event: 30th Euromicro Conference on Real-Time Systems, ECRTS 2018 - Barcelona, Spain
Duration: 3 Jun 2018 → 6 Jun 2018

Publication series

Name: Leibniz International Proceedings in Informatics, LIPIcs
Volume: 106
ISSN (Print): 1868-8969

Conference

Conference: 30th Euromicro Conference on Real-Time Systems, ECRTS 2018
Country/Territory: Spain
City: Barcelona
Period: 3/06/18 → 6/06/18

Keywords

  • Networked control systems
  • Probabilistic analysis
  • Reliability analysis
