ePrints

From scripted HPC-based NGS pipelines to workflows on the cloud

Lookup NU author(s): Dr Jacek Cala, Yaobo Xu, Dr Paolo Missier


Abstract

In this paper we describe our initial experiences in the Cloud-e-Genome project with moving the whole-exome sequencing pipeline from a scripted HPC-based solution to a workflow enactment system running in the cloud. We discuss the shortcomings of the existing script-based approach and list the benefits that a workflow-based solution can provide. Despite the effort involved in wrapping all required tools as workflow blocks, and the restrictions of the dataflow model used to represent workflows, we expect the migration to significantly improve the current state of the pipeline. Our target is to enable flexibility, traceability and reproducibility of the solution, so that it can better fit the evolution of the tools, the data and the pipeline itself, and allow us to run it at a national scale. This work will become the foundation for a more complete system that includes variant filtering and interpretation for diagnostic purposes.


Publication metadata

Author(s): Cala J, Xu YB, Wijaya EA, Missier P

Publication type: Conference Proceedings (inc. Abstract)

Publication status: Published

Conference Name: 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)

Year of Conference: 2014

Pages: 694-700

Online publication date: 08/07/2014

Acceptance date: 01/01/1900

Publisher: IEEE

URL: http://dx.doi.org/10.1109/CCGrid.2014.128

DOI: 10.1109/CCGrid.2014.128

ISBN: 9781479927845

