Towards a Safe and Latency-Aware Fault-tolerant Scheduling Technique for Multi-rate Task Chains

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

In safety-critical real-time systems such as autonomous cars, fault tolerance is essential for system reliability, but it can increase end-to-end latency and hinder schedulability. This paper presents a novel, safe, and latency-aware fault-tolerant scheduling technique for multi-rate task chains. A naive use of traditional fault-tolerance mechanisms, such as checkpointing and re-execution with recovery blocks, can violate end-to-end latency requirements. Our technique uses recovery blocks but leverages inherent task redundancies in multi-rate task chains (where data producers and consumers have different periods) to reduce the need for recovery. Moreover, it determines the priority of the recovery blocks such that the end-to-end latency of the task chain is reduced in the presence of transient faults. Our experiments show that our technique significantly improves schedulability and reduces data age compared to the state-of-the-art checkpointing method. For instance, for systems with 4 to 16 cores and 10 to 40 tasks, we achieve up to 6 times higher schedulability and reduce data age by 21% under various fault levels.
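The redundancy that the abstract refers to can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes a simplified model that the abstract does not state: one producer and one consumer released synchronously at time 0, integer periods, the producer publishing its output at the end of its period, and the consumer reading the latest published value at its release. The function names `consumed_producer_jobs` and `skippable_producer_jobs` are hypothetical. Under these assumptions, a producer job whose output is overwritten before any consumer reads it would not need a recovery block.

```python
from math import lcm


def consumed_producer_jobs(producer_period: int, consumer_period: int) -> set[int]:
    """Indices of producer jobs (within one hyperperiod) that some consumer job reads.

    Assumed model (illustrative only): producer job k publishes at time
    (k + 1) * producer_period; a consumer job reads the latest published
    value at its release time.
    """
    hyperperiod = lcm(producer_period, consumer_period)
    jobs_per_hyperperiod = hyperperiod // producer_period
    consumed = set()
    # Consumer releases in (0, hyperperiod]; the release at `hyperperiod`
    # stands for the release at 0 in steady state.
    for release in range(consumer_period, hyperperiod + 1, consumer_period):
        latest = release // producer_period - 1
        # Index -1 is the last producer job of the previous hyperperiod.
        consumed.add(latest % jobs_per_hyperperiod)
    return consumed


def skippable_producer_jobs(producer_period: int, consumer_period: int) -> set[int]:
    """Producer jobs whose output is never read in the assumed model."""
    hyperperiod = lcm(producer_period, consumer_period)
    all_jobs = set(range(hyperperiod // producer_period))
    return all_jobs - consumed_producer_jobs(producer_period, consumer_period)


print(skippable_producer_jobs(10, 30))  # {0, 1}
print(skippable_producer_jobs(20, 30))  # {1}
```

With a producer period of 10 and a consumer period of 30, only the third producer job in each hyperperiod is read, so a transient fault in either of the first two would not affect the consumer in this model. When the producer is slower than the consumer, every producer job is read and no job is skippable.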
Original language: English
Title of host publication: RTNS '24
Subtitle of host publication: Proceedings of the 32nd International Conference on Real-Time Networks and Systems
Place of Publication: New York
Publisher: Association for Computing Machinery, Inc
Pages: 25-36
Number of pages: 12
ISBN (Electronic): 979-8-4007-1724-6
Publication status: Published - 3 Jan 2025
Event: 32nd International Conference on Real-Time Networks and Systems, RTNS 2024 - Porto, Portugal
Duration: 6 Nov 2024 to 8 Nov 2024

Conference

Conference: 32nd International Conference on Real-Time Networks and Systems, RTNS 2024
Abbreviated title: RTNS 2024
Country/Territory: Portugal
City: Porto
Period: 6/11/24 to 8/11/24
