Conference Paper

Fine-Grained Profiling for Data-Intensive Workflows.

DOI: 10.1109/CCGRID.2010.29
Conference: 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, CCGrid 2010, 17-20 May 2010, Melbourne, Victoria, Australia
Source: DBLP

ABSTRACT Profiling is an effective dynamic analysis approach for investigating complex applications. ParaTrac is a user-level profiler that uses file system and process tracing techniques for data-intensive workflow applications. ParaTrac helps users refine the orchestration of workflows in two respects. First, its profiles of I/O characteristics enable users to quickly identify bottlenecks in the underlying I/O subsystems. Second, ParaTrac can exploit fine-grained data-process interactions in workflow execution to help users understand, characterize, and manage realistic data-intensive workflows. Experiments that thoroughly profile the Montage workflow demonstrate that ParaTrac scales to tracing events from thousands of processes and is effective in guiding fine-grained workflow scheduling or improvements to workflow management systems.

  • ABSTRACT: Workflows have been used to model repeatable tasks or operations in a number of different industries, including manufacturing and software. In recent years, workflows are increasingly used in distributed resources and web services environments through resource models such as grid and cloud computing. These workflows often have disparate requirements and constraints that need to be accounted for during workflow orchestration. In this paper, we present workflow examples from different domains including bioinformatics and biomedical, weather and ocean modeling, and astronomy, detailing their data and computational requirements.
  • ABSTRACT: Researchers working on the planning, scheduling and execution of scientific workflows need access to a wide variety of scientific workflows to evaluate the performance of their implementations. We describe basic workflow structures that are composed into complex workflows by scientific communities. We provide a characterization of workflows from five diverse scientific applications, describing their composition and data and computational requirements. We also describe the effect of the size of the input datasets on the structure and execution profiles of these workflows. Finally, we describe a workflow generator that produces synthetic, parameterizable workflows that closely resemble the workflows that we characterize. We make these workflows available to the community to be used as benchmarks for evaluating various workflow systems and scheduling algorithms.
    Third Workshop on Workflows in Support of Large-Scale Science (WORKS 2008); 12/2008
  • ABSTRACT: Workflows for e-Science is divided into four parts, which represent four broad but distinct areas of scientific workflows. In the first part, Background, we introduce the concept of scientific workflows and set the scene by describing how they differ from their business workflow counterpart. In Part II, Application and User Perspective, we provide a number of scientific examples that currently use workflows
    01/2007: pages 1-8; Springer London, ISBN: 978-1-84628-519-6
