Scientific workflows have recently emerged as a new paradigm for representing and managing complex distributed scientific computations, and they are used to accelerate the pace of scientific discovery. In many disciplines, individual workflows are large and complicated because of the large quantities of data involved. As a result, workflow construction is difficult or even impossible when relevant domain knowledge is missing or when the workflows require collaboration across multiple domains. Recent efforts by the scientific workflow community to capture provenance on a large scale present a new opportunity: using provenance to provide recommendations while scientific workflows are being built. This paper presents a provenance-based method for mining models of scientific workflows, including their data and control dependencies. The mining result can either suggest parts of others' workflows for consideration or make familiar parts of a workflow easily accessible, thus providing recommendation support for scientific workflow composition.
Title of host publication: Proceedings of the 2011 IEEE World Congress on Services (SERVICES 2011, Washington DC, USA, July 4-9, 2011)
Publisher: Institute of Electrical and Electronics Engineers
Published: 2011