Global Grid Forum 2004 Special Issue of Concurrency and
Computation: Practice and Experience
Original Call can be found at:
Special Journal Issue for Grid
Workflow 2004
Original Workshop is at
http://www.extreme.indiana.edu/groc/ggf10-ww/index.html
- 854: Choreography for the Grid: Towards Fitting BPEL to the
Resource Framework
- Abstract: The inherent heterogeneity of the Grid demands
the ability to specify choreographies in a portable manner. This ensures that a
choreography, once specified, can be deployed and executed in every workflow
system within a Grid environment. BPEL is likely to have the corresponding
broad support in the industry. In order to become first-class citizens in the
Grid, choreographies have to comply with the resource framework. We suggest
steps to make BPEL compliant with the resource framework. As a result, features
of BPEL like extended transactions will be available in a Grid
environment.
- Frank Leymann
- University of Stuttgart, Germany & IBM Software Group
- Email: Frank.Leymann@informatik.uni-stuttgart.de
- Received June 13 2004, Revised February 14 2005, Accepted April
27 2005
- Full Paper:
../CCPEwebresource/c8545to872workflow/c854leymann/c854Leymann.pdf
- FINAL VERSION
Choreography
for the Grid: Towards Fitting BPEL to the Resource Framework
- 855: Refactoring Service-Based Systems: How to Avoid
Trusting a Workflow Service
- Abstract: Workflow processes in distributed systems often
span different organizations and security domains, and have security
requirements, such as restricting access to data or ensuring that process
constraints are observed. These requirements are usually managed by the
workflow component, because of the close association between this sub-system
and the processes it enacts; however, high quality security mechanisms and
complex functionality are difficult to combine, so designers and users of
workflow systems are faced with a tradeoff between security and functionality,
which is unable to provide confidence in the security implementation. This
paper resolves that tension by showing that process security can be enforced
outside the workflow component. Separating security and process functionality
in this way improves the quality of security protection, because it is
implemented by standard system mechanisms; it also allows the workflow
component to be deployed as a service, rather than a privileged system
component. To make this change of design philosophy accessible outside the
security community it is documented as a collection of refactorings, which
include problem templates that identify risky security practice, and target
patterns that provide solutions. Two worked examples show that these patterns
can be combined to implement practical process requirements.
- Howard Chivers
- Department of Computer Science, University of York, Heslington,
York, YO10 5DD, UK.
- Email: chive@cs.york.ac.uk
- Received June 1 2004, Revised March 3 2005, Accepted April 27
2005
- Full Paper:
../CCPEwebresource/c8545to872workflow/c855chivers/c855refactoring.pdf
- FINAL VERSION
Refactoring
Service-Based Systems: How to Avoid Trusting a Workflow Service and
Extras
- 856: A Grid Workflow Infrastructure
- Abstract: In this paper we propose a Grid Workflow
Infrastructure, which serves as the base for specifying and executing
collaborative interactive workflows within computational grids. The
infrastructure is based on the Open Grid Services Architecture (OGSA) and
leverages the concepts of the Business Process Execution Language for Web
Services (BPEL4WS). Using OGSA enables us to exploit advanced Grid features
such as factories, lifecycle management and notifications. Extending BPEL4WS
into a Grid-enabled workflow language has the advantage that basic workflow
functionalities, which are similar for Grid and Web Services, do not have to be
developed again. The result is a state-of-the-art Grid Workflow Infrastructure
that was developed within a relatively short period. The main building blocks
of the infrastructure are the specification of the Grid Workflow Execution
Language (GWEL) notation and the implementation of a Grid workflow execution
engine, using Globus Toolkit 3 (GT3) technology, for processing e-Science
specific workflows specified in GWEL documents. The workflow engine itself is a
high-level Grid Service, hence automatically Grid aware, that can be used
within any GT3 environment.
- Dieter Cybok
- msg systems ag, Munich, Germany
- Email: dcy@gmx.net
- Received May 28 2004, Revised April 3 2005, Accepted April 27
2005
- Full Paper:../CCPEwebresource/c8545to872workflow/c856cybok/c856GWI_GGF10.pdf
- FINAL VERSION
A
Grid Workflow Infrastructure
- 857: Work-flow applications in PROGRESS and GridLab
environments
- Abstract: Nowadays, end users and application developers
are interested in building complex computing experiments consisting of
thousands of tasks and services which have to be executed or dynamically
created in distributed computing environments. Furthermore, users are able to
define many different kinds of relationships and dependencies among tasks
in advance, for instance as a flow of data input and output files which must be
transmitted in order to appropriately set up a task's environment. This
kind of computing experiment is often called a work-flow, which
simply denotes a flow of execution tasks on which static or dynamic precedence
constraints have been defined by the user. These precedence constraints, e.g.
defined as a directed acyclic graph schema, result in a set of multiple
parallel and sequential tasks which have to be managed by a meta-scheduling
system to enforce ordering. Given the complexity of the distributed
infrastructures on which work-flow experiments have to be performed
efficiently, some additional mechanisms beyond the meta-scheduling system are
required as well, e.g. data replica and management services. All these services
together should build a consistent and robust middleware layer and ultimately
support the whole process of work-flow execution. Based on such well-defined
middleware services it is possible to provide higher level APIs and various
graphic access layers. Both higher level APIs and access layers should be
designed and implemented in a way that allows end users to create even
very complex work-flow experiments in a friendly fashion. Therefore, various
graphic tools, portals and GUIs are nowadays as desirable for end users as
the underlying middleware services. Research on grid
work-flows has also been undertaken at the Poznan Supercomputing and Networking
Center. In this paper we present the results of this work, which was
carried out within the PROGRESS and GridLab projects.
- Michal Kosiedowski, Krzysztof Kurowski, Cezary Mazurek, Jarek
Nabrzyski, Juliusz Pukacki
- Poznan Supercomputing and Networking Center
- Email: kat@man.poznan.pl
- Received June 1 2004, Revised 23 March 2005, Accepted April 27
2005
- Full Paper:
../CCPEwebresource/c8545to872workflow/c857kosiedowski/c857poznan_workflow.pdf
- FINAL VERSION
Workflow
applications in GridLab and PROGRESS projects
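As an illustration of the precedence-constraint idea in the abstract of paper 857 above, here is a minimal sketch, assuming the work-flow is given as a directed acyclic graph of tasks. It is not the PROGRESS or GridLab implementation; the function name `schedule_levels` and the task names in the usage note are hypothetical. The sketch groups tasks into successive waves, where every task in a wave has all of its predecessors in earlier waves and the tasks within one wave could run in parallel.

```python
from collections import defaultdict


def schedule_levels(tasks, precedes):
    """Group tasks into waves that respect precedence constraints.

    tasks    -- iterable of task names
    precedes -- iterable of (before, after) pairs: 'before' must finish
                before 'after' may start
    Returns a list of waves (lists of task names, sorted for determinism).
    Raises ValueError if the constraints contain a cycle.
    """
    tasks = list(tasks)
    indegree = {t: 0 for t in tasks}
    successors = defaultdict(list)
    for before, after in precedes:
        successors[before].append(after)
        indegree[after] += 1

    # Tasks with no unfinished predecessors are ready to run.
    ready = sorted(t for t in tasks if indegree[t] == 0)
    levels = []
    done = 0
    while ready:
        levels.append(ready)
        done += len(ready)
        next_ready = []
        for t in ready:
            for s in successors[t]:
                indegree[s] -= 1
                if indegree[s] == 0:
                    next_ready.append(s)
        ready = sorted(next_ready)

    # Any task never released must be part of a cycle.
    if done != len(tasks):
        raise ValueError("precedence constraints contain a cycle")
    return levels
```

For example, with a hypothetical staging task followed by two independent simulations and a final merge, the sketch yields three waves: the staging task, then both simulations together, then the merge. A real meta-scheduler would additionally have to handle dynamic constraints, data transfer and failures, which this sketch ignores.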
- 858: Grid-Flow: A Grid-Enabled Scientific Workflow System
with a Petri Net-Based Interface
- Abstract: In recent years, advances in computer
technologies have enabled scientists to explore research issues in their domain
at scales greater and finer than ever before. The availability of efficient
data collection and analysis tools presents researchers with vast opportunities
to process heterogeneous data within a distributed environment. To support the
opportunities enabled by massive computation, a suitable scientific workflow
system is needed to help the users to manage data and programs, and to design
reusable procedures of scientific experimental tasks. In this paper, the design
and prototype implementation of a scientific workflow infrastructure, called
Grid-Flow, is presented. Grid-Flow assists researchers in specifying scientific
experiments using a Petri Net-based interface. The Grid-Flow infrastructure is
designed as a Service Oriented Architecture (SOA) with multi-layer component
models. The contributions of Grid-Flow are as follows: 1) a new, light-weight,
programmable Grid workflow language, Grid Flow Description Language (GFDL), is
provided to describe the workflow process in a Grid environment; 2) a Petri
Net-based user interface, based on the Generic Modeling Environment (GME), is
demonstrated to help the user design the workflow process with a Petri Net
model; and 3) a program integration component of the Grid-Flow system is
presented to integrate all possible programs into the system.
- Zhijie Guan, Francisco Hernandez, Purushotham Bangalore, Jeff
Gray, Anthony Skjellum, Vijay Velusamy, Yin Liu
- Department of Computer and Information Sciences University of
Alabama at Birmingham
- Email: zhijie@ardra.hpcl.cis.uab.edu
- Received June 2 2004, Revised 15 March 2005, Accepted 27 April
2005
- Full Paper:
../CCPEwebresource/c8545to872workflow/c858guan/c858GGF10Grid-Flow.pdf
- FINAL VERSION
Grid-Flow:
A Grid-Enabled Scientific Workflow System with a Petri Net-Based
Interface
- 859: Considerations on Constraint Modeling in Grid
Application Workflow Descriptions
- Abstract: The Grid is emerging as a specialized
distributed computation standard of unprecedented power and scope, promising to
turn commodity networks and computers into commodity computation. The Grid
concept has already been proven useful for science in many applications and
substantial infrastructure already exists or is being planned. Data processing
on the Grid ranges from tightly coupled computation within a single application
instance using standard interfaces such as MPI to multiple filtering or data
processing applications with large data flows between them. At the same time,
the requirements on such aggregate processing jobs must be coordinated among
many individual researchers or research groups within a Virtual Organization
(VO) or between multiple Grids. Especially in the case of workflows containing
a moderate number of application steps and service invocations, it becomes a
daunting task to check that all of the input parameters and software
configurations conform to decisions made at the collaboration level. It is
useful to have a language that is able to specify constraints on the parameters
of the individual workflow steps that bring them into line with collaborative
decisions coherently across the entire workflow, possibly even dynamically as
late decisions are being made about execution environment. In the paper, it
will be shown that:
- Collections of constraints can be gathered into documents
called contexts that function as operators on existing workflow graphs. An
algebra of contexts supporting composition can help different subgroups within
a VO work together through constraint sharing. Decomposition of contexts can
allow for variance of constraints simultaneously across several different
categories.
- Constraint expressions and contexts form an interesting and
hitherto largely unexplored area of data provenance. Knowledge of the
constraints implies that it is possible not only to know the values of
application input parameters, but also why they were set in particular ways.
- A web services infrastructure can support the distribution of
constraints and allow constraints to be applied to workflows late, so
that constraint resolution can be delayed and can become part of the job
planning process.
The techniques developed here will find fruitful application
in the aspects of organizing input parameters and constraints and in the
aspects of sharing and enforcing collaborative decisions about those
constraints. A partial implementation of these ideas already exists in a
workflow building tool called MCRunjob for the Compact Muon Solenoid (CMS)
experiment, a High Energy Physics experiment based at the European Center for
Nuclear Research (CERN) in Geneva, Switzerland.
- Greg Graham, Anzar Afaq, David Evans, Gerald Guglielmo, Eric
Wicklund, Peter Love
- Fermi National Accelerator Laboratory, Batavia, IL, 60510-0500,
USA;
- Email: ggraham@fnal.gov
- Received June 2 2004, Revised 15 March 2005, Accepted 27 April
2005
- Full Paper:
../CCPEwebresource/c8545to872workflow/c859graham/c859GGF10-GGraham.pdf
- FINAL VERSION
Contextual
Constraint Modeling in Grid Application Workflows
- 860: ScyFlow: An Environment for the Visual Specification
and Execution of Scientific Workflows
- Abstract: With the advent of grid technologies, scientists
and engineers are building more and more complex applications to utilize
distributed grid resources. The core grid services provide a path for accessing
and utilizing these resources in a secure and seamless fashion. However, what
the scientists need is an environment that will allow them to specify their
application runs at a high organizational level, and then support efficient
execution across any given set or sets of resources. We have been designing and
implementing ScyFlow, a dual-interface architecture (both GUI and API) that
addresses this problem. The scientist/user specifies the application tasks
along with the necessary control and data flow, and monitors and manages the
execution of the resulting workflow across the distributed resources. In this
paper, we utilize two scenarios to provide the details of the two modules of
the project, the visual editor and the runtime workflow engine.
- Karen M. McCann, Maurice Yarrow, Adrian DeVivo, Piyush
Mehrotra
- NASA Ames Research Center Mail: MS T27A-1, NASA Ames Research
Center, Moffett Field, CA 94035; Computer Sciences Corporation
- Email: Piyush.Mehrotra@nasa.gov
- Received June 3 2004, Revised 18 March 2005, Accepted 27 April
2005
- Full Paper:
../CCPEwebresource/c8545to872workflow/c860mehrota/c860ScyGate_article.pdf
- FINAL VERSION
ScyFlow:
An Environment for the Visual Specification and Execution of Scientific
Workflows
- 861: Automatic Grid Workflow Based on Imperative
Programming Languages
- Abstract: GRID superscalar is a GRID programming
environment that enables the parallel execution of sequential
applications in computational Grids. The run-time library automatically builds
a task data-dependence graph of the application and can be seen as an
implicit workflow system. The current interface supports C/C++ and Perl
applications. The run-time library is based on Globus Toolkit 2.x using GRAM
and GSIFTP services. In this document we describe the GRID superscalar basics,
emphasizing those aspects most related to Grid workflow, especially the
flexibility of using an imperative language to describe the application.
- Raul Sirvent, Josep M. Perez, Rosa M. Badia, and Jesus
Labarta
- CEPBA-IBM Research Institute, UPC, SPAIN
- Email: rosab@ac.upc.es
- Received June 4 2004, Revised 15 March 2005, Accepted 27 April
2005
- Full Paper:
../CCPEwebresource/c8545to872workflow/c861badia/c861rosab.pdf
- FINAL VERSION
Automatic
Grid Workflow Based on Imperative Programming Languages and
Extras
- 862: GAUGE: Grid Automation and Generative Environment
- Abstract: The Grid has proven to be a successful paradigm
for distributed computing. However, constructing applications that exploit all
the benefits that the Grid offers is still not optimal for both inexperienced
and experienced users. Recent approaches to solving this problem employ a
high-level abstract layer to ease the construction of applications for
different Grid environments. These approaches help facilitate construction of
Grid applications, but they are still tied to specific programming languages or
platforms. A new approach is presented in this paper that uses concepts of
domain-specific modeling (DSM) to build a high-level abstract layer. With DSM,
the users are able to model Grid applications without being bound to specific
programming languages or platforms. An additional benefit of DSM is the
capability to generate software artifacts for various Grid environments. This
paper presents the Grid Automation and Generative Environment (GAUGE). The goal
of GAUGE is to automate the generation of Grid applications to allow
inexperienced users to exploit the Grid fully. At the same time, GAUGE provides
an open framework that experienced users can build upon and extend to
tailor their applications to particular Grid environments or specific
platforms. GAUGE employs domain-specific modeling techniques to accomplish this
challenging task.
- Francisco Hernandez, Purushotham Bangalore, Jeff Gray, Zhijie
Guan, Kevin Reilly
- Department of Computer and Information Sciences University of
Alabama at Birmingham
- Email: puri@cis.uab.edu
- Received June 1 2004, Revised 16 March 2005, Accepted 27 April
2005
- Full Paper:
../CCPEwebresource/c8545to872workflow/c862hernandez/c862Hernandez%20-%20GAUGE.pdf
- FINAL VERSION
GAUGE:
Grid Automation and Generative Environment
- 863: Programming Scientific and Distributed Workflow with
Triana Services
- Abstract: In this paper, we discuss a real-world
application scenario that uses three distinct types of workflow within the
Triana problem solving environment: serial scientific workflow for the data
processing of gravitational wave signals; job submission workflows that execute
Triana services on a testbed; and monitoring workflows that examine and modify
the behaviour of the executing application. We briefly describe the Triana
distribution mechanisms and the underlying architectures that we can support.
Our middleware independent abstraction layer, called the GAP, enables us to
advertise, discover and communicate with Web and P2P Services. We show how
gravitational wave search algorithms have been implemented to distribute both
the search computation and data across the European GridLab testbed, using a
combination of Web Services, Globus interaction and P2P infrastructures.
- David Churches, Gabor Gombas, Andrew Harrison, Jason Maassen,
Craig Robinson, Matthew Shields, Ian Taylor, Ian Wang
- School of Physics & Astronomy, Cardiff University; Laboratory
of Parallel and Distributed Systems, MTA SZTAKI; School of Computer Science,
Cardiff University; Department of Computer Science, Vrije Universiteit,
Amsterdam; Schools of Physics & Astronomy and Computer Science, Cardiff
University
- Email: matthew.shields@astro.cf.ac.uk
- Received July 7 2004, Revised 11 March 2005, Accepted 27 April
2005
- Full Paper:
../CCPEwebresource/c8545to872workflow/c863shields/c863CPandE_TrianaWorkflow.pdf
- FINAL VERSION
Programming
Scientific and Distributed Workflow with Triana Services
- 864: What makes workflows work in an opportunistic
environment?
- Abstract: In this paper, we examine the issues of workflow
mapping and execution in opportunistic environments such as the grid. As
applications become ever more complex, the process of choosing the appropriate
resources and successfully executing the application components becomes ever
more difficult. In this paper, we focus on the interplay between a workflow
mapping component that plans the high-level resource assignments and the
workflow executor that oversees the component execution. We concentrate
particularly on issues of data management and we draw from the experiences with
mapping and execution systems: Pegasus, DAGMan and Stork.
- Ewa Deelman, Tevfik Kosar, Carl Kesselman, Miron Livny
- USC Information Science Institute, Marina Del Rey, CA; Computer
Sciences Department, University of Wisconsin, Madison, WI
- Email: deelman@isi.edu
- Received June 1 2004, Revised 15 March 2005, Accepted 27 April
2005
- Full Paper:
../CCPEwebresource/c8545to872workflow/c864deelan/c864deelman.pdf
- FINAL VERSION
What
makes workflows work in an opportunistic environment?
- 865: A question of scale: Bringing an existing bio-science
workflow engine to the grid
- Abstract: We describe mine-it, the tool used at iBioS
to visually model and process bioinformatic workflows, which has been a joint
effort between iBioS and phase-it (now Europroteome, [www.europroteome.org]).
Although very successful in its early stages, we were soon confronted with many
issues of scale. Here, we present different strategies to overcome these issues
and a final pragmatic approach to migrate existing models and tools to a
standardized, grid-enabled environment. We evaluate three methods:
- porting the application to a J2EE-Architecture and using JMS
to connect to compute-agents,
- encapsulating workflows, distributing them using standard
job-schedulers and making these workflows accessible as Web services,
- and extending this Web services approach to more fully embrace
emerging grid standards
We will show the pros and cons of these approaches and from
there derive some general guidelines on how to adapt legacy workflow systems to
a grid environment.
- Stefan Frank, Josh Moore, and Roland Eils
- intelligent Bioinformatics Systems German Cancer Research
Center
- Email: s.frank@dkfz.de
- Received June 1 2004; WITHDRAWN as no contact with authors
- Full Paper:../CCPEwebresource/c8545to872workflow/c865frank/c865Frank-and-Moore_A-Question-Of-Scale.pdf
- 866: Taverna: Lessons in creating a workflow environment
for the life sciences
- Abstract: Life sciences research is based on individuals,
often with diverse skills, assembled into research groups. These groups use
their specialist expertise to address scientific problems. The in silico
experiments undertaken by these research groups can be viewed as workflows
involving the co-ordinated use of analysis programs and information
repositories that may be globally distributed. With regard to Grid computing,
the requirements relate to the sharing of analysis and information resources
rather than sharing computational power. The Taverna project has developed a
toolkit for the composition and execution of workflows for the life sciences
community. This experience paper describes lessons learnt during the
development of Taverna, in particular areas highlighted by initial use cases,
and also where translating technological solutions into benefits for scientists
has proved harder than expected. A common theme in these lessons is the
importance of understanding how workflows fit into the scientists' experimental
context. The lessons reflect an evolving understanding of life scientists'
requirements on a workflow environment, which is relevant to other areas of
data intensive and exploratory science.
- Tom OINN, Mark GREENWOOD, Matthew ADDIS, Justin FERRIS, Kevin
GLOVER, Carole GOBLE, Duncan HULL, Darren MARVIN, Peter LI, Phillip LORD,
Matthew R. POCOCK, Martin SENGER, Anil WIPAT and Chris WROE
- EMBL European Bioinformatics Institute, Hinxton, Cambridge, CB10
1SD, UK; IT Innovation Centre, University of Southampton, SO16 7NP, UK; School
of Computer Science and Information Technology, University of Nottingham, NG8
1BB, UK; School of Computer Science, University of Manchester, M13 9PL, UK;
School of Computing Science, University of Newcastle, NE1 7RU, UK
- Email: markg@cs.man.ac.uk
- Received June 1 2004, Revised 30 March 2005, Accepted 27 April
2005
- Full Paper:
../CCPEwebresource/c8545to872workflow/c866taverna/c866taverna-ccpe-revised-9july.pdf
- FINAL VERSION
Taverna:
Lessons in creating a workflow environment for the life sciences
- 867: User Tools and Languages for Graph-based Grid
Workflows
- Abstract: One of the main objectives of Grid computing is
the abstraction from the hardware infrastructure as well as hiding
implementation details of software components from the user. A modern Grid
infrastructure should enable the user not only to execute single tasks on
specified hardware resources but also to compose and execute complex Grid
applications on distributed, heterogeneous and unreliable hardware resources
without having to take care of lower-level details. With the Grid, a unified
infrastructure is becoming available that makes it possible to host computational
resources and use them on demand, but also to combine them and organize
dataflow between them. For the latter purpose, the concept of Grid workflow has
emerged, which describes patterns of control and dataflow between Grid
resources, including, apart from software components and data sources,
human actors participating in interactions. Several techniques have been
established in the Grid community in order to define the workflow of Grid jobs.
A very promising approach from the view of the unskilled user is
the usage of graphs for this purpose. While graphs are primarily mathematical
abstract entities, they possess very intuitive ways of visualization that can
be handled easily even by non-expert users. The main limitation of graphs,
however, is the fact that they may become very large when used to model
complex workflows. In this case, a hierarchical graph definition that allows
graph coarsening and refinement may be a solution. Many Grid workflow
approaches build on a special subclass of graphs, the directed acyclic
graphs (DAGs), which are easy to implement but restrict the kinds of
workflows that can be modeled. The aim of this paper is not to give a broad
overview of workflow description languages and tools in general but it will
rather describe user tools and workflow schemes developed in the Fraunhofer
Resource Grid (FhRG) as exemplary solutions. In contrast to other workflow
approaches which usually are based on directed acyclic graphs, the FhRG
workflow is built on the more expressive formalism of Petri nets. Dynamic
workflow graph refinement is introduced as a powerful technique to transform
abstract workflow graphs into the concrete ones needed for execution and to
automatically add fault tolerance to complex workflows. The Fraunhofer Resource
Grid is a Grid initiative of several Fraunhofer institutes funded by the German
federal ministry of education and research with the main objective to develop
and to implement a stable and robust Grid infrastructure within the
Fraunhofer-Gesellschaft, to integrate available resources, and to provide
internal and external users with an easy-to-use interface for controlling
distributed applications and services in the Grid environment. The component
environment supports loosely coupled software components where each software
component represents an executable file that reads input files and writes
output files. We call the execution of such a software component an atomic job. We
plan to include Grid Service invocations as atomic jobs in future releases of
the FhRG framework in order to make it OGSA compatible. We will distribute most
of the software developed within the Fraunhofer Resource Grid using an Open
Source License (GNU GPL) under the label eXeGrid.
- Andreas Hoheisel
- Fraunhofer Institute for Computer Architecture and Software
Technology (FIRST) Kekuléstr. 7, D-12489 Berlin, Germany
- Email: andreas.hoheisel@first.fraunhofer.de
- Received June 1 2004, Revised 12 April 2005, Accepted 27 April
2005
- Full Paper:
../CCPEwebresource/c8545to872workflow/c867hoheisel/c867Hoheisel_original.pdf
- FINAL VERSION
User
Tools and Languages for Graph-based Grid Workflows
- 868: Implementing BPEL4WS: The Architecture of a BPEL4WS
Implementation.
- Abstract: BPEL4WS (BPEL in short) is a business process
definition language built natively on top of the Web services application
model. BPEL provides a workflow oriented composition model for Web services
applications, and is thus a central piece in the heavily componentized service
oriented computing model. BPEL results from the merger of two distinct process
metamodels (the process algebra model of XLANG and the graph oriented model of
WSFL) into a coherent and powerful framework. Implementing BPEL for
this reason presents significant challenges to middleware developers. This paper
discusses those challenges and describes the design and architecture of the
BPWS4J runtime, a full implementation of the BPEL4WS 1.1 specification.
- Francisco Curbera, Rania Khalaf, William A. Nagy, and Sanjiva
Weerawarana
- IBM T.J. Watson Research Center
- Email: rkhalaf@watson.ibm.com
- Received June 1 2004, Revised 24 April 2005, Accepted 27 April
2005
- Full Paper:
../CCPEwebresource/c8545to872workflow/c868curbera/c868ImplementingBPEL4WS.pdf
- FINAL VERSION
Implementing
BPEL4WS: The Architecture of a BPEL4WS Implementation.
- 869: Scientific Workflow Management and the Kepler
System
- Abstract: Many scientific disciplines are now data and
information driven, and new scientific knowledge is often gained by scientists
putting together data analysis and knowledge discovery pipelines. A
related trend is that more and more scientific communities realize the benefits
of sharing their data and computational services, and are thus contributing to
a distributed data and computational community infrastructure (a.k.a. the
Grid). However, this infrastructure is only a means to an end and
scientists ideally should be bothered little with its existence. The goal is
for scientists to focus on development and use of what we call scientific
workflows. These are networks of analytical steps that may involve, e.g.,
database access and querying steps, data analysis and mining steps, and many
other steps including computationally intensive jobs on high performance
cluster computers. In this paper we describe characteristics of and
requirements for scientific workflows as identified in a number of our
application projects. We then elaborate on Kepler, a particular scientific
work-flow system, currently under development across a number of scientific
data management projects. We describe some key features of Kepler and its
underlying Ptolemy II system, planned extensions, and areas of future research.
Kepler is a community driven, open source project, and we always welcome
related projects and new contributors to join.
- Bertram Ludäscher, Ilkay Altintas, Chad Berkley, Dan
Higgins, Efrat Jaeger, Matthew Jones, Edward A. Lee, Yang Zhao
- San Diego Supercomputer Center, UC San Diego; National Center for
Ecological Analysis and Synthesis, UC Santa Barbara; Department of Electrical
Engineering and Computer Sciences, UC Berkeley
- Email: ludaesch@sdsc.edu
- Received June 1 2004, Revised 6 April 2005, Accepted 27 April
2005
- Full Paper:
../CCPEwebresource/c8545to872workflow/c869kepler/c869kepler-swf.pdf
- FINAL VERSION
Scientific
Workflow Management and the Kepler System
- 870: Toward a Search Architecture for Software
Components
- Abstract: We describe a system that uses
Workflow graphs, which model Grid applications, to compute a kind of static
importance value that will be used as a measure of the quality of
each application. The idea is rather simple: the more an application is
referred to by other applications, the more important this application is
considered. Note that this concept is very close to the well known PageRank
measure used by Google to rank the pages it stores.
- Fabrizio Silvestri, Diego Puppin, Domenico Laforenza
- HPC-Lab ISTI-CNR, Italy
- Email: fabrizio.silvestri@isti.cnr.it
- Received June 1 2004, Revised 15 March 2005, Accepted 27 April
2005
- Full Paper:
../CCPEwebresource/c8545to872workflow/c870silvestri/silvestriGRIDLE.pdf
- FINAL VERSION
Toward
a Search Architecture for Software Components
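The ranking idea in the abstract of paper 870 above (an application referred to by many others is more important) can be sketched as a PageRank-style power iteration. This is a minimal illustration under stated assumptions, not the authors' system: the function name `importance`, the damping factor of 0.85, the fixed iteration count and the application names in the usage note are all hypothetical choices.

```python
def importance(references, damping=0.85, iterations=50):
    """PageRank-style importance over an application reference graph.

    references -- dict mapping an application name to the list of
                  applications it refers to (assumed input format)
    Returns a dict mapping each application to a score; scores sum to 1.
    """
    apps = sorted(set(references) |
                  {t for targets in references.values() for t in targets})
    n = len(apps)
    rank = {a: 1.0 / n for a in apps}

    for _ in range(iterations):
        # Every application receives a small baseline share.
        new_rank = {a: (1.0 - damping) / n for a in apps}
        for a in apps:
            targets = references.get(a, [])
            if targets:
                # Pass this application's score on to what it refers to.
                share = damping * rank[a] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # An application that refers to nothing spreads its
                # score evenly, so the total stays constant.
                share = damping * rank[a] / n
                for t in apps:
                    new_rank[t] += share
        rank = new_rank
    return rank
```

For instance, if two hypothetical applications both refer to a third one, the third receives the highest score and the two referring applications tie. A production ranking would typically iterate until convergence rather than for a fixed number of rounds.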
- 871: On Using BPEL Extensibility to Implement OGSI and WSRF
Grid Workflows
- 872: Overview of Workflow
- Abstract:
- Dennis Gannon and Workshop Organizers
- Indiana University
- Email: gannon@indiana.edu
- Received: Not Available yet
- Full Paper: