Scientific Workflow Systems for 21st Century e-Science, New Bottle or New Wine?

📝 Original Info

  • Title: Scientific Workflow Systems for 21st Century e-Science, New Bottle or New Wine?
  • ArXiv ID: 0808.3545
  • Date: 2008-08-27
  • Authors: Yong Zhao, Ioan Raicu, Ian Foster

📝 Abstract

With advances in e-Science and the growing complexity of scientific analyses, more and more scientists and researchers are relying on workflow systems for process coordination, derivation automation, provenance tracking, and bookkeeping. While workflow systems have been in use for decades, it is unclear whether scientific workflows can or even should build on existing workflow technologies, or whether they require fundamentally new approaches. In this paper, we analyze the status and challenges of scientific workflows, investigate existing technologies as well as emerging languages, platforms, and systems, and identify the key challenges that workflow systems for e-Science must address in the 21st century.

📄 Full Content

Scientific Workflow Systems for 21st Century, New Bottle or New Wine?
Invited Short Paper

Yong Zhao (1), Ioan Raicu (2), Ian Foster (2,3,4)
(1) Microsoft Corporation, Redmond, WA, USA
(2) Department of Computer Science, University of Chicago, Chicago, IL, USA
(3) Computation Institute, University of Chicago, Chicago, IL, USA
(4) Math & Computer Science Division, Argonne National Laboratory, Argonne, IL, USA
yozha@microsoft.com, iraicu@cs.uchicago.edu, foster@mcs.anl.gov

Abstract

With advances in e-Science and the growing complexity of scientific analyses, more and more scientists and researchers are relying on workflow systems for process coordination, derivation automation, provenance tracking, and bookkeeping. While workflow systems have been in use for decades, it is unclear whether scientific workflows can or even should build on existing workflow technologies, or whether they require fundamentally new approaches. In this paper, we analyze the status and challenges of scientific workflows, investigate existing technologies as well as emerging languages, platforms, and systems, and identify the key challenges that workflow systems for e-Science must address in the 21st century.

  1. Introduction

Scientific workflows have become increasingly popular in modern scientific computation as more and more scientists and researchers rely on workflow systems to conduct their daily analysis and discovery. With advances in both scientific instrumentation and simulation, the volume of scientific data is growing exponentially each year. Such large data sizes, combined with the growing complexity of data analysis procedures and algorithms, have rendered traditional manual processing and exploration unfavorable compared with modern in silico processes automated by scientific workflow systems (SWFS). While the term workflow means different things in different contexts, we find that SWFS are generally applied to the following aspects of scientific computation: 1) describing complex scientific procedures, 2) automating data derivation processes, 3) high performance computing (HPC) to improve throughput and performance, and 4) provenance management and query.
Workflows are not a new concept; they have been around for decades. A number of coordination languages and systems were developed in the 1980s and 1990s [1,7], and they share many characteristics with workflow systems: they describe individual computation components, their ports and channels, and the data and event flow between them, and they coordinate the execution of the components, often on parallel computing resources. Furthermore, business process management systems have been developed and invested in for years; there are many mature commercial products and industry standards such as BPEL [2]. In the scientific community there are also many emerging systems for scientific programming and computation [5,22].
Before we jump into developing yet another workflow system, a fundamental question to ask is whether we can use existing technologies or whether we should invent new languages and systems to achieve the four aspects mentioned earlier, which are essential to scientific workflow systems. This paper identifies the challenges to workflow development in the context of scientific computation, presents an overview of some existing technologies and emerging systems, and discusses opportunities for addressing these challenges.
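
To make the four aspects above concrete, the following is a minimal Python sketch of a toy workflow engine. It is not taken from the paper or from any of the systems it surveys: the class names Task and Workflow, the methods add, run, and lineage, and the three example steps are all hypothetical and chosen only for illustration. The sketch covers describing a procedure as named components with data flowing between them, automating the derivation in dependency order, and recording and querying provenance. It runs tasks sequentially, so the HPC aspect is deliberately left out.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Any, Callable


@dataclass
class Task:
    """One workflow component: a function plus the upstream tasks feeding it."""
    name: str
    func: Callable[..., Any]
    inputs: tuple = ()  # names of upstream tasks (hypothetical "ports")


@dataclass
class Workflow:
    """Toy engine (illustrative only): describes, runs, and records a workflow."""
    tasks: dict = field(default_factory=dict)
    provenance: list = field(default_factory=list)

    def add(self, name, func, inputs=()):
        """Describe one step of the procedure and where its data comes from."""
        self.tasks[name] = Task(name, func, tuple(inputs))

    def run(self):
        """Automate the derivation: run each task after its dependencies."""
        graph = {t.name: set(t.inputs) for t in self.tasks.values()}
        results = {}
        for name in TopologicalSorter(graph).static_order():
            task = self.tasks[name]
            args = [results[dep] for dep in task.inputs]
            results[name] = task.func(*args)
            # Record which inputs produced which output.
            self.provenance.append(
                {"task": name, "inputs": list(task.inputs), "output": results[name]}
            )
        return results

    def lineage(self, name):
        """Query the provenance log for every upstream task of `name`."""
        recorded = {rec["task"]: rec["inputs"] for rec in self.provenance}
        seen, stack = [], list(recorded[name])
        while stack:
            dep = stack.pop()
            if dep not in seen:
                seen.append(dep)
                stack.extend(recorded[dep])
        return seen


# Hypothetical three-step analysis: load raw values, calibrate, summarize.
wf = Workflow()
wf.add("load", lambda: [3.0, 1.0, 2.0])
wf.add("calibrate", lambda xs: [x * 2 for x in xs], inputs=["load"])
wf.add("mean", lambda xs: sum(xs) / len(xs), inputs=["calibrate"])
results = wf.run()
upstream_of_mean = wf.lineage("mean")
```

After the run, results holds the output of every step and upstream_of_mean lists the tasks that contributed to the final value. A production system would additionally need persistent provenance storage, fault handling, and parallel dispatch, none of which this sketch attempts.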

  1. Multi-core processor architectures

Software development has enjoyed a free ride on performance gains as chipmakers continue to follow Moore’s Law, doubling the number of transistors packed into a minuscule space. Little consideration was given to code parallelization, since it was not essential for the average computer user until recently, when single-core performance growth stagnated and multi-core processors emerged on the market in 2005.
Because of the limits on effectively increasing processor clock frequency, hardware manufacturers started to physically reorganize chips into what we call the multi-core architecture [10], which links several microprocessor cores together on the same semiconductor. Manufacturers including Intel, AMD, IBM, and Sun have released dual-core, quad-core, eight-core, and 64-threaded processors in the past few years [13,21]. Given that 128-threaded SMP systems are a reality today [21], it is reasonable to assume that 1024 CPU cores/threads or more per SMP system will be available in the next decade.
The new multi-core architecture will force radical changes in software design and development. We are already seeing a significant increase in research interest in concurrency, parallelism, and multi-core software development. The number of multiprocessor research papers has increased sharply since 2001, surpassing the peaks of all previous years [10]. Con

…(Full text truncated)…
