Bowers, Shawn; McPhillips, Timothy; Riddle, Sean; Anand, M.; Ludaescher, Bertram. 2008. Kepler/pPOD: Scientific workflow and provenance support for assembling the tree of life. Proceedings of the International Provenance and Annotation Workshop, LNCS. Vol: 5272. Pages 70-78.
The complexity of scientific workflows for analyzing biological data creates a number of challenges for current workflow and provenance systems. This complexity is due in part to the nature of scientific data (e.g., heterogeneous, nested data collections) and the programming constructs required for automation (e.g., nested workflows, looping, pipeline parallelism). To address these challenges, we present an extended version of the Kepler scientific workflow system tailored for the systematics community. Our system combines novel approaches for representing scientific data, modeling and automating complex analyses, and recording and browsing associated provenance information.