Sergei Gorlatch

University of Münster, Muenster, North Rhine-Westphalia, Germany

Publications (167), 16.17 Total impact

  • ABSTRACT: Next-generation sequencing has a large potential in HIV diagnostics, and genotypic prediction models have been developed and successfully tested in recent years. However, albeit being highly accurate, these computational models lack computational efficiency to reach their full potential. In the current study, we demonstrate the use of Graphics Processing Units (GPUs) in combination with a computational prediction model for HIV tropism. Our new model named gCUP, parallelized and optimized for GPU, is highly accurate and can classify more than 175,000 sequences/second on an NVIDIA GeForce GTX 460. The computational efficiency of our new model is the next step to enable next-generation sequencing technologies to reach clinical significance in HIV diagnostics. Moreover, our approach is not limited to HIV tropism prediction, but can also be easily adapted to other settings, e.g. drug resistance prediction. Availability: The source code can be downloaded at http://www.heiderlab.de CONTACT: d.heider@wz-straubing.de.
    Bioinformatics (Oxford, England). 08/2014;
  • ABSTRACT: Algorithmic skeletons simplify software development: they abstract typical patterns of parallelism and provide their efficient implementations, allowing the application developer to focus on the structure of algorithms, rather than on implementation details. This becomes especially important for modern parallel systems with multiple graphics processing units (GPUs) whose programming is complex and error-prone, because state-of-the-art programming approaches like CUDA and OpenCL lack high-level abstractions. We define a new algorithmic skeleton for allpairs computations which occur in real-world applications, ranging from bioinformatics to physics. We develop the skeleton’s generic parallel implementation for multi-GPU systems in OpenCL. To enable the automatic use of the fast GPU memory, we identify and implement an optimized version of the allpairs skeleton with a customizing function that follows a certain memory access pattern. We use matrix multiplication as an application study for the allpairs skeleton and its two implementations and demonstrate that the skeleton greatly simplifies programming, saving up to 90% of lines of code as compared to OpenCL. The performance of our optimized implementation is up to 6.8 times higher as compared with the generic implementation and is competitive with the performance of a manually written optimized OpenCL code.
    International Journal of Parallel Programming 08/2014; 42(4). · 0.40 Impact Factor
  • ABSTRACT: Massively Multi-player Online Games (MMOG) are characterized by intensive interactions between many simultaneous users and real-time demands on Quality of Service (QoS). Other examples of similar, real-time online interactive applications include various virtual environments, as well as multi-pupil e-learning and simulation-based training courses. A highly desirable enhancement for MMOG is the use of mobile devices for accessing the game application. However, the limited computing power of mobile devices is an obstacle for implementing computation-intensive parts of MMOG, in particular graphics processing, on mobile devices. This paper proposes a novel runtime system for mobile MMOG and other similar applications that moves computation-intensive tasks, including graphics processing, from the mobile devices to Cloud resources. We report experimental results of our runtime system using a realistic multi-player online game.
    IEEE INFOCOM 2014 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS); 04/2014
  • ABSTRACT: We consider a challenging class of highly interactive virtual environments, also known as Real-Time Online Interactive Applications (ROIA). Popular examples of ROIA include multi-player online computer games, e-learning and training applications based on real-time simulations, among others. An emerging enhancement for ROIA is the use of mobile devices for accessing the application (mobile ROIA). However, the limited computing power of mobile devices is an obstacle for implementing computation-intensive parts of ROIA, in particular graphics processing, on mobile devices. This paper proposes a runtime system for mobile ROIA that moves computation-intensive tasks, including graphics processing, from the mobile devices to Cloud resources. We report experimental results of our runtime system using a multi-player online game with real-world characteristics.
    2014 2nd IEEE International Conference on Mobile Cloud Computing, Services, and Engineering (MobileCloud); 04/2014
  • Sergei Gorlatch, Tim Humernbrum, Frank Glinka
    ABSTRACT: Real-Time Online Interactive Applications (ROIA), e.g., multiplayer online games and simulation-based e-learning, are emerging internet applications that make high Quality of Service (QoS) demands on the underlying network. These demands depend on the number of users and the actual application state and, therefore, vary at runtime. Traditional networks have very limited possibilities of influencing the network behaviour to meet the dynamic QoS demands, such that most ROIA use the underlying network on a best-effort basis. The emerging architecture of Software-Defined Networking (SDN) decouples the control and forwarding logic from the network infrastructure, making the network behaviour programmable for applications. This paper analyses ROIA requirements on the underlying network and describes the specification of an SDN Northbound API that allows ROIA applications to specify their dynamic network requirements and to meet them using SDN networks.
    2014 International Conference on Computing, Networking and Communications (ICNC); 02/2014
  • Parallel Processing Letters 01/2014; 24(3).
  • Michel Steuwer, Sergei Gorlatch
    The Journal of Supercomputing 01/2014; 69(1):25-33. · 0.92 Impact Factor
  • Michel Steuwer, Sergei Gorlatch
    ABSTRACT: Application development for modern high-performance systems with Graphics Processing Units (GPUs) relies on low-level programming approaches like CUDA and OpenCL, which leads to complex, lengthy and error-prone programs. In this paper, we present SkelCL – a high-level programming model for systems with multiple GPUs and its implementation as a library on top of OpenCL. SkelCL provides three main enhancements to the OpenCL standard: 1) computations are conveniently expressed using parallel patterns (skeletons); 2) memory management is simplified using parallel container data types; 3) an automatic data (re)distribution mechanism allows for scalability when using multi-GPU systems. We use a real-world example from the field of medical imaging to motivate the design of our programming model and we show how application development using SkelCL is simplified without sacrificing performance: we were able to reduce the code size in our imaging example application by 50% while introducing only a moderate runtime overhead of less than 5%.
    Procedia Computer Science 01/2013; 18:749–758.
  • Michel Steuwer, Sergei Gorlatch
    Parallel Computing Technologies - 12th International Conference, PaCT 2013, St. Petersburg, Russia, September 30 - October 4, 2013. Proceedings; 01/2013
  • Philipp Kegel, Michel Steuwer, Sergei Gorlatch
    ABSTRACT: Modern computer systems become increasingly distributed and heterogeneous by comprising multi-core CPUs, GPUs, and other accelerators. Current programming approaches for such systems usually require the application developer to use a combination of several programming models (e.g., MPI with OpenCL or CUDA) in order to exploit the system’s full performance potential. In this paper, we present dOpenCL (distributed OpenCL)—a uniform approach to programming distributed heterogeneous systems with accelerators. dOpenCL allows the user to run unmodified existing OpenCL applications in a heterogeneous distributed environment. We describe the challenges of implementing the OpenCL programming model for distributed systems, as well as its extension for running multiple applications concurrently. Using several example applications, we compare the performance of dOpenCL with MPI + OpenCL and standard OpenCL implementations.
    Journal of Parallel and Distributed Computing 01/2013; 73(12):1639–1648. · 1.12 Impact Factor
  • ABSTRACT: Real-time Online Interactive Applications (ROIA) like multiplayer online games usually work in a persistent environment (also called virtual world) which continues to exist and evolve also while the user is offline and away from the application. This paper deals with storing persistent data of real-time interactive applications in modern relational databases. We describe a preliminary design of the Entity Persistence Module (EPM) middleware which liberates the application developer from writing and maintaining complex and error-prone, application-specific code for persistent data management.
    Intelligent Software Methodologies, Tools and Techniques (SoMeT), 2013 IEEE 12th International Conference on; 01/2013
  • ABSTRACT: Application programming for GPUs (Graphics Processing Units) is complex and error-prone, because the popular approaches - CUDA and OpenCL - are intrinsically low-level and offer no special support for systems consisting of multiple GPUs. The SkelCL library offers pre-implemented recurring computation and communication patterns (skeletons) which greatly simplify programming for single- and multi-GPU systems. In this paper, we focus on applications that work on two-dimensional data. We extend SkelCL by the matrix data type and the MapOverlap skeleton which specifies computations that depend on neighboring elements in a matrix. The abstract data types and a high-level data (re)distribution mechanism of SkelCL shield the programmer from the low-level data transfers between the system’s main memory and multiple GPUs. We demonstrate how the extended SkelCL is used to implement real-world image processing applications on two-dimensional data. We show that both from a productivity and a performance point of view it is beneficial to use the high-level abstractions of SkelCL.
    Euro-Par 2012: Parallel Processing Workshops - BDMC, CGWS, HeteroPar, HiBB, OMHI, Paraphrase, PROPER, Resilience, UCHPC, VHPC; 08/2012
  • Michel Steuwer, Philipp Kegel, Sergei Gorlatch
    ABSTRACT: Application programming for GPUs (Graphics Processing Units) is complex and error-prone, because the popular approaches - CUDA and OpenCL - are intrinsically low-level and offer no special support for systems consisting of multiple GPUs. The SkelCL library presented in this paper is built on top of the OpenCL standard and offers pre-implemented recurring computation and communication patterns (skeletons) which greatly simplify programming for multi-GPU systems. The library also provides an abstract vector data type and a high-level data (re)distribution mechanism to shield the programmer from the low-level data transfers between the system’s main memory and multiple GPUs. In this paper, we focus on the specific support in SkelCL for systems with multiple GPUs and use a real-world application study from the area of medical imaging to demonstrate the reduced programming effort and competitive performance of SkelCL as compared to OpenCL and CUDA. Besides, we illustrate how SkelCL adapts to large-scale, distributed heterogeneous systems in order to simplify their programming.
    2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW); 05/2012
  • Philipp Kegel, Michel Steuwer, Sergei Gorlatch
    ABSTRACT: Modern computer systems are becoming increasingly heterogeneous by comprising multi-core CPUs, GPUs, and other accelerators. Current programming approaches for such systems usually require the application developer to use a combination of several programming models (e.g., MPI with OpenCL or CUDA) in order to exploit the full compute capability of a system. In this paper, we present dOpenCL (Distributed OpenCL) -- a uniform approach to programming distributed heterogeneous systems with accelerators. dOpenCL extends the OpenCL standard, such that arbitrary computing devices installed on any node of a distributed system can be used together within a single application. dOpenCL allows moving data and program code to these devices in a transparent, portable manner. Since dOpenCL is designed as a fully-fledged implementation of the OpenCL API, it allows running existing OpenCL applications in a heterogeneous distributed environment without any modifications. We describe in detail the mechanisms that are required to implement OpenCL for distributed systems, including a device management mechanism for running multiple applications concurrently. Using three application studies, we compare the performance of dOpenCL with MPI+OpenCL and a standard OpenCL implementation.
    2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW); 05/2012
  • Michel Steuwer, Philipp Kegel, Sergei Gorlatch
    ABSTRACT: Application programming for modern heterogeneous systems which comprise multiple accelerators (multi-core CPUs and GPUs) is complex and error-prone. Popular approaches, like OpenCL and CUDA, are low-level and offer no support for the two most complicated issues: 1) programming multiple GPUs within a stand-alone computer, and 2) managing distributed systems that integrate several such computers. In particular, distributed systems require application developers to use a mix of different programming models, e.g., MPI together with OpenCL or CUDA. We propose a uniform approach based on OpenCL for programming both stand-alone and distributed systems with GPUs. The approach implementation is based on two parts: 1) the SkelCL library for high-level application programming on heterogeneous stand-alone computers with multi-core CPUs and multiple GPUs, and 2) the dOpenCL middleware for transparent execution of OpenCL programs on several stand-alone computers connected over a network. Both SkelCL and dOpenCL are built on top of the OpenCL standard which ensures their high portability across different kinds of processors and GPUs. The dOpenCL middleware extends OpenCL, such that arbitrary computing devices (multi-core CPUs and GPUs) in a distributed system can be used within a single application, with data and program code moved to these devices transparently. The SkelCL library offers a set of pre-implemented patterns (skeletons) of parallel computation and communication which greatly simplify programming for multi-GPU systems. The library also provides an abstract vector data type and a high-level data (re)distribution mechanism to shield the programmer from the low-level data transfers between a system's main memory and multiple GPUs. This paper describes dOpenCL and SkelCL and illustrates how they are used to simplify programming of heterogeneous distributed systems with accelerators.
    New Trends in Software Methodologies, Tools and Techniques -- Proceedings of the Eleventh SoMeT'12; 01/2012
  • ABSTRACT: This paper focuses on providing an overview of the research challenges that have been identified toward the end of the S-Cube network in the area of service engineering. These challenges concern the need for agility and dynamicity of the development process for service-based applications, the importance of focusing on proper approaches to support migration of legacy applications into service-based applications and the role of humans and of teams of humans in service-based applications.
    Proceedings of the ICSE 2012 Workshop on European Software Services and Systems Research -- Results and Challenges (S-Cube); 01/2012
  • ABSTRACT: The paper studies an emerging class of Real-Time Online Interactive Applications (ROIA), including multi-player online computer games, e-learning and training applications based on real-time simulations, etc. ROIA combine the challenge of the scalability and real-time user interactivity with the problem of efficient and economic utilization of resources for a changing number of users. To address these challenges, we develop a dynamic resource management system which implements load balancing for ROIA on Clouds. We illustrate three different load-balancing actions and report experimental results on bringing a multi-player online game to the Cloud.
    01/2012;
  • ABSTRACT: Skeleton programming can, to a large degree, solve the portability problem in multi- and many-core programming. We present SkePU and SkelCL, two recent skeleton programming systems that support multi-GPU systems, and demonstrate their usage and efficiency with several concrete application programs.
    Programming Multi-core and Many-core Computing Systems, edited by Sabri Pllana, Fatos Xhafa, 01/2012: chapter Skeleton Programming for Portable Many-Core Computing; Wiley-Blackwell, ISBN: 0470936908
  • ABSTRACT: We describe how the generic Lifecycle Model developed in the S-Cube project for the design and management of service-based applications (SBA) can be utilized in the context of Cloud Computing. In particular, we focus on the fact that the Infrastructure-as-a-Service approach enables the development of Real-Time Online Interactive Applications (ROIA), which include multi-player online computer games, interactive e-learning and training applications and high-performance simulations in virtual environments. We illustrate how the Lifecycle Model expresses the major design and execution aspects of ROIA on Clouds by addressing the specific characteristics of ROIA: a large number of concurrent users connected to a single application instance, enforcement of Quality of Service (QoS) parameters, adaptivity to changing loads, and frequent real-time interactions between users and services. We describe how our novel resource management system RTF-RMS implements concrete mechanisms that support the developer in designing adaptable ROIA on Clouds according to the different phases of the Lifecycle Model. Our experimental results demonstrate the influence of the proposed adaptation mechanisms on the application performance.
    Proceedings of the 2011 international conference on Service-Oriented Computing; 12/2011
  • ABSTRACT: We consider a challenging class of highly interactive virtual environments, also known as Real-Time Online Interactive Applications (ROIA). Popular examples of ROIA include multi-player online computer games, e-learning and training applications based on real-time simulations, etc. ROIA combine high demands on the scalability and real-time user interactivity with the problem of efficient and economic utilization of resources, which is difficult to achieve due to the changing number of users. We address these challenges by developing the dynamic resource management system RTF-RMS which implements load balancing for ROIA on Clouds. We illustrate how RTF-RMS chooses between three different load-balancing actions and implements Cloud resource allocation. We report experimental results on the load balancing of a multi-player online game in a Cloud environment using RTF-RMS.
    Proceedings of the 2011 international conference on Parallel Processing; 08/2011

Publication Stats

1k Citations
16.17 Total Impact Points

Institutions

  • 1970–2014
    • University of Münster
      • Department of Mathematics and Computer Sciences
      • Institute for Geoinformatics
      • Institute of Computer Science
      Muenster, North Rhine-Westphalia, Germany
  • 2006
    • Delft University Of Technology
      • Faculty of Electrical Engineering, Mathematics and Computer Sciences (EEMCS)
      Delft, South Holland, Netherlands
    • The University of Aizu
      Fukushima, Fukushima, Japan
  • 1994–2006
    • Universität Passau
      • Department of Informatics and Mathematics
      Passau, Bavaria, Germany
  • 2001–2003
    • Technische Universität Berlin
      • School IV Electrical Engineering and Computer Science
      Berlin, Berlin, Germany
  • 2000
    • VU University Amsterdam
      • Department of Mathematical & Computer Science
      Amsterdam, North Holland, Netherlands