Shoaib Akram
Australian National University (ANU), Research School of Computer Science

Doctor of Philosophy
Assistant Professor, Australian National University

About

29 Publications
1,341 Reads
239 Citations
Additional affiliations
March 2010 - June 2012
Foundation for Research and Technology - Hellas
Position: Researcher
Description: Data-centric applications and infrastructure; supported by a Marie Curie Fellowship

Publications (29)
Article
Computer architects extensively use simulation to steer future processor research and development. Simulating large-scale multicore processors is extremely time-consuming and is sometimes impossible because of simulation infrastructure limitations. This paper proposes scale-model simulation, a novel methodology to predict large-scale multicore syst...
Preprint
Full-text available
Managed analytics frameworks (e.g., Spark) cache intermediate results in memory (on-heap) or storage devices (off-heap) to avoid costly recomputations, especially in graph processing. As datasets grow, on-heap caching requires more memory for long-lived objects, resulting in high garbage collection (GC) overhead. On the other hand, off-heap caching...
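The on-heap versus off-heap trade-off described in this abstract is visible directly in Spark's caching API. A minimal sketch in Java, assuming a Spark SQL job (the dataset path and off-heap budget below are made up for illustration and are not from the paper):

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.storage.StorageLevel;

    public class CacheExample {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                .appName("cache-example")
                // Off-heap caching requires an explicit budget outside the JVM heap.
                .config("spark.memory.offHeap.enabled", "true")
                .config("spark.memory.offHeap.size", "8g")
                .getOrCreate();

            Dataset<Row> edges = spark.read().parquet("hdfs:///graphs/edges");

            // On-heap caching: cached objects live in the JVM heap and are
            // subject to garbage collection as the dataset grows.
            edges.persist(StorageLevel.MEMORY_ONLY());

            // Off-heap caching: data is kept serialized outside the GC-managed
            // heap, trading (de)serialization cost for lower GC pressure.
            // (A dataset can only be persisted at one level, hence commented out.)
            // edges.persist(StorageLevel.OFF_HEAP());

            edges.count();  // an action materializes the cache
        }
    }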
Preprint
Full-text available
In our information-driven societies, full-text search is ubiquitous. Search is memory-intensive. Quickly searching massive corpora requires building indices, which consumes big volatile heaps. Search is storage I/O-intensive. Limited main memory necessitates writing large partial indices on non-volatile storage, where they finally live in merged fo...
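A rough sense of the index-building pipeline this abstract refers to, sketched with Apache Lucene in Java (Lucene is used here only as a familiar example engine, not necessarily the system studied in the paper; the path, buffer size, and field names are illustrative). Documents are analyzed into an in-memory buffer on the heap; when the buffer fills, a partial index segment is flushed to storage, and segments are later merged:

    import java.nio.file.Paths;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.FSDirectory;

    public class IndexExample {
        public static void main(String[] args) throws Exception {
            IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());
            cfg.setRAMBufferSizeMB(512);  // heap budget before flushing a partial segment
            try (IndexWriter writer =
                     new IndexWriter(FSDirectory.open(Paths.get("/tmp/lucene-index")), cfg)) {
                Document doc = new Document();
                doc.add(new TextField("body", "full-text search is ubiquitous", Field.Store.NO));
                writer.addDocument(doc);
                writer.forceMerge(1);     // merge partial segments that live on storage
            }
        }
    }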
Article
Intel Optane memory offers non-volatility, byte addressability, and high capacity. It suits managed workloads that prefer large main memory heaps. We investigate Optane as the main memory for managed (Java) workloads, focusing on performance scalability. As the workload (core count) increases, we note Optane’s performance relative to DRAM. A few wo...
Article
Full-text available
Emerging workloads in cloud and data center infrastructures demand high main memory bandwidth and capacity. Unfortunately, DRAM alone is unable to satisfy contemporary main memory demands. High-bandwidth memory (HBM) uses 3D die-stacking to deliver 4–8× higher bandwidth. HBM has two drawbacks: (1) capacity is low, and (2) soft error rate is high. H...
Article
Emerging non-volatile memory (NVM) technologies offer greater capacity than DRAM. Unfortunately, production NVM exhibits high latency and low write endurance. Hybrid memory combines DRAM and NVM to deliver greater capacity, low latency, high endurance, and low energy consumption. Write-rationing garbage collection mitigates NVM wear-out by placing h...
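As a crude illustration of the write-rationing idea (a toy policy of my own, not the paper's collector): the placement decision boils down to steering frequently written objects toward DRAM and rarely written ones toward NVM, which limits NVM wear:

    class WriteRationingSketch {
        enum Space { DRAM, NVM }

        // Frequently written ("hot") objects stay in DRAM; rarely written
        // ("cold") objects can live in NVM without wearing it out quickly.
        static Space place(long observedWrites, long hotThreshold) {
            return observedWrites >= hotThreshold ? Space.DRAM : Space.NVM;
        }
    }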
Conference Paper
Emerging non-volatile memory (NVM) technologies offer greater capacity than DRAM. Unfortunately, production NVM exhibits high latency and low write endurance. Hybrid memory combines DRAM and NVM to deliver greater capacity, low latency, high endurance, and low energy consumption. Write-rationing garbage collection mitigates NVM wear-out by plac...
Conference Paper
Semiconductor scaling trends are steering hardware towards greater heterogeneity. Heterogeneous processors and memories promise efficiency and scalability. Software must take advantage of hardware heterogeneity. We ask the question, "what is the right software layer to abstract the complexity of heterogeneous hardware?" Historically, the OS is the...
Article
Non-volatile memories (NVM) offer greater capacity than DRAM but suffer from high latency and low write endurance. Hybrid memories combine DRAM and NVM to form scalable memory systems with the promise of high capacity, low energy consumption, and high endurance. Automatically managing hybrid NVM-DRAM memories to achieve their promise without changi...
Preprint
Non-volatile memory (NVM) has the potential to disrupt the boundary between memory and storage, including the abstractions that manage this boundary. Researchers comparing the speed, durability, and abstractions of hybrid systems with DRAM, NVM, and disk to traditional systems typically use simulation, which makes it easy to evaluate different hard...
Article
This paper proposes RPPM, an analytical performance model that, based on a microarchitecture-independent profile of a multithreaded application, predicts its performance on a previously unseen multicore platform. RPPM breaks up multithreaded program execution into epochs based on synchronization primitives, and then predicts per-epoch active execut...
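A much-simplified sketch of the epoch idea (an illustration only; it omits how RPPM actually predicts each thread's active time from the microarchitecture-independent profile): execution is split into epochs at synchronization points, each epoch is bounded by its slowest thread, and epoch times sum to total time:

    class EpochModelSketch {
        // Total predicted time = sum over epochs of the slowest thread's
        // predicted active time in that epoch, since synchronization lets
        // an epoch end only when the last thread arrives.
        static double predictSeconds(double[][] perEpochPerThreadSeconds) {
            double total = 0.0;
            for (double[] epoch : perEpochPerThreadSeconds) {
                double slowest = 0.0;
                for (double t : epoch) slowest = Math.max(slowest, t);
                total += slowest;
            }
            return total;
        }
    }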
Conference Paper
Emerging Non-Volatile Memory (NVM) technologies offer high capacity and energy efficiency compared to DRAM, but suffer from limited write endurance and longer latencies. Prior work seeks the best of both technologies by combining DRAM and NVM in hybrid memories to attain low latency, high capacity, energy efficiency, and durability. Coarse-grained h...
Article
Emerging Non-Volatile Memory (NVM) technologies offer high capacity and energy efficiency compared to DRAM, but suffer from limited write endurance and longer latencies. Prior work seeks the best of both technologies by combining DRAM and NVM in hybrid memories to attain low latency, high capacity, energy efficiency, and durability. Coarse-grained h...
Conference Paper
Emerging Non-Volatile Memory (NVM) technologies offer more capacity and energy efficiency than DRAM, but their write endurance is lower and latency is higher. Hybrid memories seek the best of both worlds — scalability, efficiency, and performance — by combining DRAM and NVM. Our work proposes modifying a standard managed language runtime to allocat...
Conference Paper
In this talk, I will discuss new opportunities for improving the performance and efficiency of managed language applications running on heterogeneous hardware. I will first introduce a new performance predictor for multithreaded managed applications that enables accurately estimating the execution time of a managed application at a different freque...
Article
Making modern computer systems energy-efficient is of paramount importance. Dynamic Voltage and Frequency Scaling (DVFS) is widely used to manage the energy and power consumption in modern processors; however, for DVFS to be effective, we need the ability to accurately predict the performance impact of scaling a processor’s voltage and frequency. N...
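A first-order sketch of why such prediction is possible (a textbook-style simplification, not the predictor proposed in the paper): split measured execution time into a compute component that scales with clock frequency and an off-chip memory component that is roughly frequency-insensitive:

    class DvfsSketch {
        // Compute time scales with 1/frequency; off-chip memory time is
        // approximately unchanged when only the core frequency changes.
        static double predictSeconds(double computeSeconds, double memorySeconds,
                                     double baseGHz, double targetGHz) {
            return computeSeconds * (baseGHz / targetGHz) + memorySeconds;
        }
    }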
Article
While hardware is evolving toward heterogeneous multicore architectures, modern software applications are increasingly written in managed languages. Heterogeneity was born of a need to improve energy efficiency; however, we want the performance of our applications not to suffer from limited resources. How best to schedule managed language applicati...
Conference Paper
Single-ISA heterogeneous multi-cores consisting of small (e.g., in-order) and big (e.g., out-of-order) cores dramatically improve energy- and power-efficiency by scheduling workloads on the most appropriate core type. A significant body of recent work has focused on improving system throughput through scheduling. However, none of the prior work has...
Conference Paper
Full-text available
Today, there is increased interest in understanding the impact of data-centric applications on compute and storage infrastructures as datasets are projected to grow dramatically. In this paper, we examine the storage I/O behavior of twelve data-centric applications as the number of cores per server grows. We configure these applications with realis...
Article
Full-text available
Building scalable back-end infrastructures for data-centric applications is becoming important. Applications used in data-centres have complex, multilayer software stacks and are required to scale to a large number of nodes. Today, there is increased interest in improving the efficiency of such software stacks. In this paper, we examine the efficie...
Article
Full-text available
Interconnection networks for multicore processors are traditionally designed to serve a diversity of workloads. However, different workloads or even different execution phases of the same workload may benefit from different interconnect configurations. In this paper, we first motivate the need for workload-adaptive interconnection networks. Subsequ...
Conference Paper
Full-text available
Interconnection networks for multicore processors are designed in a generic way to serve a diversity of workloads. For multicore processors, there is a considerable opportunity to achieve an improvement in performance by implementing interconnects which adapt to different program phases and to a variety of workloads. This paper proposes one such in...
Article
Full-text available
In this work we examine the relative overhead of the operating system and using virtual machines on I/O intensive applications. We use real applications to calculate the cycles and energy per I/O using simple models, based on actual measurements. Our results indicate that the OS can cost up to 60% in terms of energy spent per I/O ope...
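One plausible reading of the "simple models" mentioned above is normalizing measured CPU cycles and energy by the number of completed I/O operations over a measurement interval; a toy version (the variable names are mine, not the paper's):

    class PerIoSketch {
        // CPU cycles attributed to each I/O over a measurement interval.
        static double cyclesPerIo(double cpuUtilization, double clockHz,
                                  double intervalSeconds, long iosCompleted) {
            return (cpuUtilization * clockHz * intervalSeconds) / iosCompleted;
        }

        // Energy (Joules) attributed to each I/O over the same interval.
        static double energyPerIo(double avgPowerWatts, double intervalSeconds,
                                  long iosCompleted) {
            return (avgPowerWatts * intervalSeconds) / iosCompleted;
        }
    }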
