Towards optimal resource allocation in partial-fault tolerant applications

N. Bansal, R. Bhagwan, N. Jain, Y. Park, D.S. Turaga, C. Venkatramani

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    22 Citations (Scopus)

    Abstract

    We introduce Zen, a new resource allocation framework that assigns application components to node clusters to achieve high availability for partial-fault tolerant (PFT) applications. These applications have the characteristic that under partial failures, they can still produce useful output though the output quality may be reduced. Thus, the primary goal of resource allocation for PFT applications is to prevent, delay, or minimize the impact of failures on the application output quality. This paper is the first to approach this resource allocation problem from a theoretical perspective, and obtains a series of results regarding component assignments that provide the highest service availability under the constraints imposed by the application data flow graph and the hosting clusters. We show that (1) even simple versions of this resource allocation problem are NP-Hard, (2) a 2-approximate polynomial-time algorithm works for tree topologies, and (3) a simple greedy component placement performs well in practice for general application topologies. We implement a system prototype to study the application availability achieved by Zen compared to failure-oblivious placement, replication, and Zen+replication. Our experimental results show that three PFT applications achieve significant data output quality and availability benefits using Zen.
    Original language: English
    Title of host publication: Proceedings 27th IEEE International Conference on Computer Communications (INFOCOM 2008, Phoenix AZ, USA, April 13-18, 2008)
    Publisher: Institute of Electrical and Electronics Engineers
    Pages: 1319-1327
    ISBN (Print): 978-1-4244-2025-4
    Publication status: Published - 2008
    Event: 27th Conference on Computer Communications (IEEE INFOCOM 2008), 13–18 April 2008, Phoenix, AZ, United States
    Duration: 13 Apr 2008 → 18 Apr 2008
    http://www.ieee-infocom.org/2008

    Conference

    Conference: 27th Conference on Computer Communications (IEEE INFOCOM 2008), 13–18 April 2008, Phoenix, AZ
    Abbreviated title: IEEE INFOCOM 2008
    Country: United States
    City: Phoenix, AZ
    Period: 13/04/08 → 18/04/08
