High performance computing for haplotyping: models and platforms

Andrea Tangherloni, Leonardo Rundo, Simone Spolaor, Marco S. Nobile, Ivan Merelli, Daniela Besozzi, Giancarlo Mauri, Paolo Cazzaniga, Pietro Liò

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

4 Citations (Scopus)

Abstract

The reconstruction of the haplotype pair for each chromosome is a hot topic in Bioinformatics and Genome Analysis. In Haplotype Assembly (HA), all heterozygous Single Nucleotide Polymorphisms (SNPs) have to be assigned to exactly one of the two chromosomes. In this work, we outline the state-of-the-art on HA approaches and present an in-depth analysis of the computational performance of GenHap, a recent method based on Genetic Algorithms. GenHap was designed to tackle the computational complexity of the HA problem by means of a divide-et-impera strategy that effectively leverages multi-core architectures. In order to evaluate GenHap’s performance, we generated different instances of synthetic (yet realistic) data exploiting empirical error models of four different sequencing platforms (namely, Illumina NovaSeq, Roche/454, PacBio RS II and Oxford Nanopore Technologies MinION). Our results show that the processing time generally decreases along with the read length, involving a lower number of sub-problems to be distributed on multiple cores.

Original language: English
Title of host publication: Euro-Par 2018
Subtitle of host publication: Parallel Processing Workshops - Euro-Par 2018 International Workshops, Revised Selected Papers
Editors: Gabriele Mencagli, Dora B. Heras
Publisher: Springer
Pages: 650-661
Number of pages: 12
ISBN (Electronic): 978-3-030-10549-5
ISBN (Print): 978-3-030-10548-8
DOIs: https://doi.org/10.1007/978-3-030-10549-5_51
Publication status: Published - 1 Jan 2019
Externally published: Yes
Event: 24th International Conference on Parallel and Distributed Computing, Euro-Par 2018 - Turin, Italy
Duration: 27 Aug 2018 - 28 Aug 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11339 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 24th International Conference on Parallel and Distributed Computing, Euro-Par 2018
Country: Italy
City: Turin
Period: 27/08/18 - 28/08/18

Fingerprint

Haplotype
High Performance
Chromosomes
Computing
Chromosome
Nanopores
Bioinformatics
Nucleotides
Polymorphism
Nanopore
Single nucleotide Polymorphism
Error Model
Empirical Model
Computational complexity
Genes
Genetic algorithms
Leverage
Model
Sequencing
Divides

Keywords

  • Future-generation sequencing
  • Genome Analysis
  • Haplotype Assembly
  • High Performance Computing
  • Master-Slave paradigm

Cite this

Tangherloni, A., Rundo, L., Spolaor, S., Nobile, M. S., Merelli, I., Besozzi, D., ... Liò, P. (2019). High performance computing for haplotyping: models and platforms. In G. Mencagli, & D. B. Heras (Eds.), Euro-Par 2018: Parallel Processing Workshops - Euro-Par 2018 International Workshops, Revised Selected Papers (pp. 650-661). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11339 LNCS). Springer. https://doi.org/10.1007/978-3-030-10549-5_51
Tangherloni, Andrea ; Rundo, Leonardo ; Spolaor, Simone ; Nobile, Marco S. ; Merelli, Ivan ; Besozzi, Daniela ; Mauri, Giancarlo ; Cazzaniga, Paolo ; Liò, Pietro. / High performance computing for haplotyping : models and platforms. Euro-Par 2018: Parallel Processing Workshops - Euro-Par 2018 International Workshops, Revised Selected Papers. editor / Gabriele Mencagli ; Dora B. Heras. Springer, 2019. pp. 650-661 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{3069e099d83e42b5a69f5d690cafb90d,
title = "High performance computing for haplotyping: models and platforms",
abstract = "The reconstruction of the haplotype pair for each chromosome is a hot topic in Bioinformatics and Genome Analysis. In Haplotype Assembly (HA), all heterozygous Single Nucleotide Polymorphisms (SNPs) have to be assigned to exactly one of the two chromosomes. In this work, we outline the state-of-the-art on HA approaches and present an in-depth analysis of the computational performance of GenHap, a recent method based on Genetic Algorithms. GenHap was designed to tackle the computational complexity of the HA problem by means of a divide-et-impera strategy that effectively leverages multi-core architectures. In order to evaluate GenHap’s performance, we generated different instances of synthetic (yet realistic) data exploiting empirical error models of four different sequencing platforms (namely, Illumina NovaSeq, Roche/454, PacBio RS II and Oxford Nanopore Technologies MinION). Our results show that the processing time generally decreases along with the read length, involving a lower number of sub-problems to be distributed on multiple cores.",
keywords = "Future-generation sequencing, Genome Analysis, Haplotype Assembly, High Performance Computing, Master-Slave paradigm",
author = "Andrea Tangherloni and Leonardo Rundo and Simone Spolaor and Nobile, {Marco S.} and Ivan Merelli and Daniela Besozzi and Giancarlo Mauri and Paolo Cazzaniga and Pietro Li{\`o}",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/978-3-030-10549-5_51",
language = "English",
isbn = "978-3-030-10548-8",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer",
pages = "650--661",
editor = "Gabriele Mencagli and Heras, {Dora B.}",
booktitle = "Euro-Par 2018",
address = "Germany",
}

Tangherloni, A, Rundo, L, Spolaor, S, Nobile, MS, Merelli, I, Besozzi, D, Mauri, G, Cazzaniga, P & Liò, P 2019, High performance computing for haplotyping: models and platforms. in G Mencagli & DB Heras (eds), Euro-Par 2018: Parallel Processing Workshops - Euro-Par 2018 International Workshops, Revised Selected Papers. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11339 LNCS, Springer, pp. 650-661, 24th International Conference on Parallel and Distributed Computing, Euro-Par 2018, Turin, Italy, 27/08/18. https://doi.org/10.1007/978-3-030-10549-5_51

High performance computing for haplotyping : models and platforms. / Tangherloni, Andrea; Rundo, Leonardo; Spolaor, Simone; Nobile, Marco S.; Merelli, Ivan; Besozzi, Daniela; Mauri, Giancarlo; Cazzaniga, Paolo; Liò, Pietro.

Euro-Par 2018: Parallel Processing Workshops - Euro-Par 2018 International Workshops, Revised Selected Papers. ed. / Gabriele Mencagli; Dora B. Heras. Springer, 2019. p. 650-661 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11339 LNCS).

TY  - GEN
T1  - High performance computing for haplotyping
T2  - models and platforms
AU  - Tangherloni, Andrea
AU  - Rundo, Leonardo
AU  - Spolaor, Simone
AU  - Nobile, Marco S.
AU  - Merelli, Ivan
AU  - Besozzi, Daniela
AU  - Mauri, Giancarlo
AU  - Cazzaniga, Paolo
AU  - Liò, Pietro
PY  - 2019/1/1
Y1  - 2019/1/1
N2  - The reconstruction of the haplotype pair for each chromosome is a hot topic in Bioinformatics and Genome Analysis. In Haplotype Assembly (HA), all heterozygous Single Nucleotide Polymorphisms (SNPs) have to be assigned to exactly one of the two chromosomes. In this work, we outline the state-of-the-art on HA approaches and present an in-depth analysis of the computational performance of GenHap, a recent method based on Genetic Algorithms. GenHap was designed to tackle the computational complexity of the HA problem by means of a divide-et-impera strategy that effectively leverages multi-core architectures. In order to evaluate GenHap’s performance, we generated different instances of synthetic (yet realistic) data exploiting empirical error models of four different sequencing platforms (namely, Illumina NovaSeq, Roche/454, PacBio RS II and Oxford Nanopore Technologies MinION). Our results show that the processing time generally decreases along with the read length, involving a lower number of sub-problems to be distributed on multiple cores.
AB  - The reconstruction of the haplotype pair for each chromosome is a hot topic in Bioinformatics and Genome Analysis. In Haplotype Assembly (HA), all heterozygous Single Nucleotide Polymorphisms (SNPs) have to be assigned to exactly one of the two chromosomes. In this work, we outline the state-of-the-art on HA approaches and present an in-depth analysis of the computational performance of GenHap, a recent method based on Genetic Algorithms. GenHap was designed to tackle the computational complexity of the HA problem by means of a divide-et-impera strategy that effectively leverages multi-core architectures. In order to evaluate GenHap’s performance, we generated different instances of synthetic (yet realistic) data exploiting empirical error models of four different sequencing platforms (namely, Illumina NovaSeq, Roche/454, PacBio RS II and Oxford Nanopore Technologies MinION). Our results show that the processing time generally decreases along with the read length, involving a lower number of sub-problems to be distributed on multiple cores.
KW  - Future-generation sequencing
KW  - Genome Analysis
KW  - Haplotype Assembly
KW  - High Performance Computing
KW  - Master-Slave paradigm
UR  - http://www.scopus.com/inward/record.url?scp=85061719471&partnerID=8YFLogxK
U2  - 10.1007/978-3-030-10549-5_51
DO  - 10.1007/978-3-030-10549-5_51
M3  - Conference contribution
AN  - SCOPUS:85061719471
SN  - 978-3-030-10548-8
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 650
EP  - 661
BT  - Euro-Par 2018
A2  - Mencagli, Gabriele
A2  - Heras, Dora B.
PB  - Springer
ER  - 

Tangherloni A, Rundo L, Spolaor S, Nobile MS, Merelli I, Besozzi D et al. High performance computing for haplotyping: models and platforms. In Mencagli G, Heras DB, editors, Euro-Par 2018: Parallel Processing Workshops - Euro-Par 2018 International Workshops, Revised Selected Papers. Springer. 2019. p. 650-661. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-030-10549-5_51