Incremental workflow improvement through analysis of its data provenance

NU author(s): Dr Paolo Missier
Author(s): Missier P
Editor(s): Buneman, P., Freire, J.
Publication type: Conference Proceedings (inc. Abstract)
Conference Name: 3rd USENIX Workshop on the Theory and Practice of Provenance (TaPP)
Conference Location: Heraklion, Crete, Greece
Year of Conference: 2011
Legacy Date: 20-21 June 2011
Repeated executions of resource-intensive workflows over a large number of runs are commonly observed in e-science practice. We explore the hypothesis that, in some cases, provenance traces recorded for past runs of a workflow can be used to make future runs more efficient. This investigation is an initial step into the systematic study of the role that provenance analysis can play in the broader context of self-managing software systems. We have tested our hypothesis on a concrete case study involving a Chemical Engineering workflow deployed on a cloud infrastructure, where we can measure the cost of its repeated execution. Our approach involves augmenting the workflow with a feedback loop in which incremental analysis of the provenance of past runs is used to control some of the workflow steps in subsequent executions. We present initial experimental results and hint at future improvements as part of ongoing work.
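
The feedback loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the record fields (`step`, `config`, `useful`, `cost`), the `ProvenanceSummary` class, the `min_runs` threshold and the skip rule are all hypothetical, chosen only to show how a summary updated from each new trace could control steps in later runs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceRecord:
    """Hypothetical provenance record for one step invocation."""
    step: str
    config: str
    useful: bool   # assumed flag: did the invocation contribute to the result?
    cost: float    # assumed cost unit, e.g. seconds of cloud compute


class ProvenanceSummary:
    """Per-(step, config) counters, updated from one trace at a time."""

    def __init__(self):
        self.invocations = defaultdict(int)
        self.useful = defaultdict(int)

    def update(self, trace):
        # Incremental analysis: only the newest run's trace is scanned.
        for rec in trace:
            key = (rec.step, rec.config)
            self.invocations[key] += 1
            self.useful[key] += int(rec.useful)

    def should_run(self, step, config, min_runs=3):
        # Illustrative rule: keep running until there is enough evidence,
        # then skip invocations that have never been useful.
        key = (step, config)
        if self.invocations.get(key, 0) < min_runs:
            return True
        return self.useful.get(key, 0) > 0


def run_workflow(plan, summary, execute):
    """Run the planned steps the summary allows, then feed the trace back."""
    trace = [execute(step, config)
             for step, config in plan
             if summary.should_run(step, config)]
    summary.update(trace)
    return trace


def demo():
    """Four runs of a two-step plan in which config 'B' is never useful."""
    plan = [("simulate", "A"), ("simulate", "B")]

    def execute(step, config):
        return TraceRecord(step, config, useful=(config == "A"), cost=1.0)

    summary = ProvenanceSummary()
    return [sum(rec.cost for rec in run_workflow(plan, summary, execute))
            for _ in range(4)]
```

In this toy setting, `demo()` returns the per-run cost: the first three runs execute both invocations, and the fourth skips the one that past traces showed to be unproductive. A real deployment would need a domain-specific notion of "useful" and a way to revisit skipped steps when inputs change.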