A Lightweight Middleware Monitor for Distributed Scientific Workflows.
ABSTRACT Monitoring the execution of distributed tasks within a scientific workflow is difficult and is frequently handled manually. This work presents a lightweight middleware monitor to design and control the parallel execution of tasks in a distributed scientific workflow. The middleware can be coupled to a workflow management system. Its implementation is evaluated with the Kepler workflow management system by adding new modules that control and monitor the distributed execution of tasks. These middleware modules were added to a bioinformatics workflow to monitor parallel BLAST executions. Results show potential for high-performance process execution while preserving the original features of the workflow.
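The abstract describes a monitor that tracks parallel task executions (e.g., BLAST runs) launched by a workflow system. As a minimal sketch of that idea only, and not the paper's actual middleware, the following hypothetical Python code submits tasks in parallel and records each completion event as it happens:

```python
# Hypothetical sketch: monitoring parallel task execution.
# run_task and monitor_parallel are illustrative names, not the paper's API.
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_task(task_id):
    # Stand-in for a distributed task such as a BLAST invocation.
    return f"task-{task_id}: done"

def monitor_parallel(task_ids, workers=4):
    """Submit tasks in parallel and record each completion as it occurs."""
    events = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(run_task, t): t for t in task_ids}
        for fut in as_completed(futures):
            # A real monitor would log or report this event to the workflow system.
            events.append(fut.result())
    return events

results = monitor_parallel(range(3))
```

A real middleware would replace the in-memory event list with status reporting back to the workflow engine, which is the behavior the paper evaluates inside Kepler.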
Conference Proceeding: Kepler: an extensible system for design and execution of scientific workflows
ABSTRACT: Most scientists conduct analyses and run models in several different software and hardware environments, mentally coordinating the export and import of data from one environment to another. The Kepler scientific workflow system provides domain scientists with an easy-to-use yet powerful system for capturing scientific workflows (SWFs). SWFs are a formalization of the ad-hoc process that a scientist may go through to get from raw data to publishable results. Kepler attempts to streamline the workflow creation and execution process so that scientists can design, execute, monitor, re-run, and communicate analytical procedures repeatedly with minimal effort. Kepler is unique in that it seamlessly combines high-level workflow design with execution and runtime interaction, access to local and remote data, and local and remote service invocation. SWFs are superficially similar to business process workflows but have several challenges not present in the business workflow scenario. For example, they often operate on large, complex and heterogeneous data, can be computationally intensive and produce complex derived data products that may be archived for use in reparameterized runs or other workflows. Moreover, unlike business workflows, SWFs are often dataflow-oriented as witnessed by a number of recent academic systems (e.g., DiscoveryNet, Taverna and Triana) and commercial systems (Scitegic/Pipeline-Pilot, Inforsense). In a sense, SWFs are often closer to signal-processing and data streaming applications than they are to control-oriented business workflow applications. Proceedings of the 16th International Conference on Scientific and Statistical Database Management; 07/2004
ABSTRACT: P-GRADE provides a high-level graphical environment to develop parallel applications transparently both for parallel systems and the Grid. P-GRADE supports the interactive execution of parallel programs as well as the creation of a Condor, Condor-G or Globus job to execute parallel programs in the Grid. In P-GRADE, the user can generate either PVM or MPI code according to the underlying Grid where the parallel application should be executed. PVM applications generated by P-GRADE can migrate between different Grid sites and as a result P-GRADE guarantees reliable, fault-tolerant parallel program execution in the Grid. The GRM/PROVE performance monitoring and visualisation toolset has been extended towards the Grid and connected to a general Grid monitor (Mercury) developed in the EU GridLab project. Using the Mercury/GRM/PROVE Grid application monitoring infrastructure, any parallel application launched by P-GRADE can be remotely monitored and analysed at run time even if the application migrates among Grid sites. P-GRADE supports workflow definition and co-ordinated multi-job execution for the Grid. Such workflow management can provide parallel execution at both the inter-job and intra-job level. An automatic checkpoint mechanism for parallel programs supports the migration of parallel jobs inside the workflow, providing a fault-tolerant workflow execution mechanism. The paper describes all of these features of P-GRADE and their implementation concepts. Journal of Grid Computing 01/2003; 1:171-197.
Conference Proceeding: Decentralized Execution of Event-Driven Scientific Workflows
ABSTRACT: Scientific workflows (SWF) are traditionally coordinated and executed in a centralized fashion. This creates a single point of failure, forms a scalability bottleneck, and often leads to too much message traffic routed back to the coordinator. We have developed PADRES, a content-based publish/subscribe platform that serves as a runtime environment for the decentralized execution, control, and monitoring of SWF. Publish/subscribe is a natural paradigm for event-driven applications such as SWF management, as the loosely-coupled nature of publishers and subscribers relieves the coordinator from maintaining client connection and capability information. PADRES has been developed with features inspired by the requirements of SWF management. Its unique features include an expressive subscription language, composite subscription processing support, a rule-based matching and routing mechanism, a query-based historic data access mechanism, and support for the decentralized execution of SWFs specified in XML. Proceedings of the 2006 IEEE Services Computing Workshops (SCW 2006), 18-22 September 2006, Chicago, Illinois, USA; 01/2006
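The PADRES abstract rests on content-based publish/subscribe: events are delivered to subscribers whose predicates match the event's attributes, so no central coordinator tracks connections. The following is a minimal, hypothetical Python sketch of that matching idea; the Broker class and predicate format are illustrative assumptions, not the PADRES subscription language:

```python
# Hypothetical sketch of content-based publish/subscribe matching,
# in the spirit of PADRES; not its actual API or subscription language.
class Broker:
    def __init__(self):
        self.subs = []  # list of (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        # predicate maps attribute names to required values,
        # e.g. {"class": "JOB_DONE", "workflow": "wf1"}
        self.subs.append((predicate, callback))

    def publish(self, event):
        # Deliver the event to every subscriber whose predicate
        # matches the event's content (attribute/value pairs).
        for predicate, callback in self.subs:
            if all(event.get(k) == v for k, v in predicate.items()):
                callback(event)

received = []
broker = Broker()
broker.subscribe({"class": "JOB_DONE", "workflow": "wf1"}, received.append)
broker.publish({"class": "JOB_DONE", "workflow": "wf1", "job": "blast"})
broker.publish({"class": "JOB_DONE", "workflow": "wf2", "job": "align"})
```

Only the first event matches the subscription, so only it is delivered. PADRES extends this basic pattern with composite subscriptions, rule-based routing across brokers, and historic data queries, as the abstract notes.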