National Center for Ecological Analysis and Synthesis

Search Results

  1. Publication

    Actor-oriented design of scientific workflows

    Scientific workflows are becoming increasingly important as a unifying mechanism for interlinking scientific data management, analysis, simulation, and visualization tasks. Scientific workflow systems are problem-solving environments, supporting scientists in the creation and execution of scientific workflows. While current systems permit the creation of executable workflows, conceptual modeling and design of scientific workflows has largely been neglected.

  2. Publication

    Towards automatic generation of semantic types in scientific workflows

    Scientific workflow systems are problem-solving environments that allow scientists to automate and reproduce data management and analysis tasks. Workflow components include actors (e.g., queries, transformations, analyses, simulations, visualizations), and datasets which are produced and consumed by actors. The increasing number of such components creates the problem of discovering suitable components and of composing them to form the desired scientific workflow. In previous work we proposed the use of semantic types (annotations relative to an ontology) to solve these problems.

  3. Publication

    A calculus for propagating semantic annotations through scientific workflow queries

    Scientific workflows facilitate automation, reuse, and reproducibility of scientific data management and analysis tasks. Scientific workflows are often modeled as dataflow networks, chaining together processing components (called actors) that query, transform, analyze, and visualize scientific datasets. Semantic annotations relate data and actor schemas with conceptual information from a shared ontology, to support scientific workflow design, discovery, reuse, and validation in the presence of thousands of potentially useful actors and datasets.

  4. Publication

    Enabling scientific workflow reuse through structured composition of dataflow and control-flow

  5. Publication

    A model for user-oriented data provenance in pipelined scientific workflows

  6. Publication

    Provenance in collection-oriented workflows

    We describe a provenance model tailored to scientific workflows based on the collection-oriented modeling and design paradigm. Our implementation within the Kepler scientific workflow system captures the dependencies of data and collection creation events on preexisting data and collections, and embeds these provenance records within the data stream. A provenance query engine operates on self-contained workflow traces representing serializations of the output data stream for particular workflow runs. We demonstrate this approach in our response to the first provenance challenge.

  7. Publication

    Project histories: Managing data provenance across collection-oriented scientific workflow runs

    While a number of scientific workflow systems support data provenance, they primarily focus on collecting and querying provenance for single workflow runs. Scientific research projects, however, typically involve (1) many interrelated workflows (where data from one or more workflow runs are selected and used as input to subsequent runs) and (2) tasks between workflow runs that cannot be fully automated. This paper addresses the need for recording data dependencies across multiple workflow runs and accommodating data management activities performed between runs.

  8. Publication

    Kepler/ppod: Scientific workflow and provenance support for assembling the tree of life

    The complexity of scientific workflows for analyzing biological data creates a number of challenges for current workflow and provenance systems. This complexity is due in part to the nature of scientific data (e.g., heterogeneous, nested data collections) and the programming constructs required for automation (e.g., nested workflows, looping, pipeline parallelism). We present an extended version of the Kepler scientific workflow system to address these challenges, tailored for the systematics community.

  9. Publication

    A conceptual modeling framework for expressing observational data semantics

    Observational data (i.e., data that record observations and measurements) play a key role in many scientific disciplines. Observational data, however, are typically structured and described in ad hoc ways, making their discovery and integration difficult. The wide range of data collected, the variety of ways the data are used, and the needs of existing analysis applications make it impractical to define “one-size-fits-all” schemas for most observational data sets. Instead, new approaches are needed to flexibly describe observational data for effective discovery and integration.