
Michail Alvanos
- PhD
- Researcher at The Cyprus Institute
About
30 Publications
4,206 Reads
149 Citations
Additional affiliations
January 2014 - December 2014
OnApp Limited
Position
- Researcher
Description
- Conducted research on emerging, power-efficient micro-server architectures as part of the Euroserver project. Ported paravirtualized Xen to the ZedBoard. Backported a number of drivers to the 2.6.23 kernel for the ZedBoard.
October 2010 - November 2013
IBM Canada Labs
Position
- CAS Student
Description
- Visiting student: three visits of three months each.
Publications (30)
The study of atmospheric chemistry-climate interactions is one of today's great computational challenges. Advances in the architecture of Graphics Processing Units (GPUs) in both raw computational power and memory bandwidth sparked the interest for General-Purpose computing on graphics accelerators in scientific applications. However, the introduct...
This paper presents an application of GPU accelerators in Earth system modeling. We focus on atmospheric chemical kinetics, one of the most computationally intensive tasks in climate–chemistry model simulations. We developed a software package that automatically generates CUDA kernels to numerically integrate atmospheric chemical kinetics in the gl...
The global climate model ECHAM/MESSy Atmospheric Chemistry (EMAC) is a modular global model that simulates climate change and air quality scenarios. The application includes different sub-models for the calculation of chemical species concentrations, their interaction with land and sea, and the human interaction. The paper presents a source-to-sour...
This paper presents an application of GPU accelerators in Earth system modelling. We focus on atmospheric chemical kinetics, one of the most computationally intensive tasks in climate-chemistry model simulations. We developed a software package that automatically generates CUDA kernels to numerically integrate atmospheric chemical kinetics in the g...
Virtualization of server hardware is a commonly used practice to provide scalable resource management. In order to meet a variety of emerging technology trends, a novel, super-lightweight Type I Hypervisor architecture is essential. The proposed ‘Microvisor’ architecture is significant for a number of reasons.
1. Resource utilization efficiency. Mi...
Attribution-NonCommercial-ShareAlike 4.0 International
KPP Fortran to CUDA source-to-source pre-processor: Each CPU process that offloads to the GPU requires a chunk of GPU VRAM, dependent on the number of species and reaction constants in the MECCA mechanism. The number of GPUs per node and the VRAM available in each GPU dictate the to...
Significant progress has been made in the development of programming languages and tools that are suitable for hybrid computer architectures that group several shared-memory multicores interconnected through a network. This paper addresses important limitations in the code generation for partitioned global address space (PGAS) languages. These lang...
Programs written in Partitioned Global Address Space (PGAS) languages can access any location of the entire address space via standard read/write operations. However, the compiler has to create the communication mechanisms, and the runtime system has to use synchronization primitives, to ensure the correct execution of the programs. However, PGAS progra...
Partitioned Global Address Space (PGAS) languages are a popular alternative when building applications to run on large scale parallel machines. Unified Parallel C (UPC) is a well known PGAS language that is available on most high performance computing systems. Good performance of UPC applications is often one important requirement for a system acqu...
An illustrative embodiment of a computer-implemented process for shared data prefetching and coalescing optimization versions a loop containing one or more shared references into an optimized loop and an un-optimized loop, transforms the optimized loop into a set of loops, and stores shared access associated information of the loop using a prologue...
Partitioned Global Address Space (PGAS) languages appeared to address programmer productivity in large scale parallel machines. The main goal of a PGAS language is to provide the ease of use of the shared-memory programming model with the performance of MPI. Unified Parallel C programs containing all-to-all communication can suffer from shared access...
Organizations of all sizes, whether in business or government, are recognizing the strategic role of data and the huge challenge they have to derive value from it. Consumer businesses want to better understand and engage with their customers. Logistics ...
The goal of Partitioned Global Address Space (PGAS) languages is to improve programmer productivity in large scale parallel machines. However, PGAS programs may have many fine-grained shared accesses that lead to performance degradation. Manual code transformations or compiler optimizations are required to improve the performance of programs with f...
Future multi-core processors will necessitate exploitation of fine-grain, architecture-independent parallelism from applications to utilize many cores with relatively small local memories. We use c264, an end-to-end H.264 video encoder for the Cell processor based on x264, to show that exploiting fine-grain parallelism remains challenging and requi...
This article evaluates the scalability and productivity of six parallel programming models for heterogeneous architectures, and finds that task-based models using code and data annotations require the minimum programming effort while sustaining nearly best performance. However, achieving this result requires both extensions of programming models to...
Increasing the number of cores in modern CPUs is the main trend for improving system performance. A central challenge is the runtime support that multi-core systems ought to use for sustaining high performance and scalability without increasing disproportionally the effort required by the programmer. In this work we present Tagged Procedure Calls (...
Modern multicore processors with explicitly managed local memories, such as the Cell Broadband Engine (Cell), constitute in many ways a significant departure from traditional high performance CPU designs. A main issue in evaluating the features and limitations of modern multicore CPUs is appropriate workloads. In this work we first port an avail...
Modern multi-core processors with explicitly managed local memories, such as the Cell Broadband Engine (Cell), constitute in many ways a significant departure from traditional high performance CPU designs. Such CPUs on the one hand bear the potential of higher performance in certain application domains and on the other hand require extensive application...