Resit Sendag

University of Rhode Island | URI · Department of Electrical, Computer, and Biomedical Engineering

About

61
Publications
7,685
Reads
438
Citations
Citations since 2017
8 Research Items
96 Citations


Publications (61)
Conference Paper
Full-text available
The Vehicle Routing Problem (VRP) is fundamental to logistics operations. Finding optimal solutions for VRPs related to large, real-world operations is computationally expensive. Genetic algorithms (GA) have been used to find good solutions for different types of VRPs but are slow to converge. This work utilizes high-performance computing (HPC) pla...
Article
Lookup operations for in-memory databases are heavily memory bound, because they often rely on pointer-chasing linked data structure traversals. They also have many branches that are hard-to-predict due to random key lookups. In this study, we show that although cache misses are the primary bottleneck for these applications, without a method for el...
Article
Full-text available
Indirect memory accesses have irregular access patterns that limit the performance of conventional software and hardware-based prefetchers. To address this problem, we propose the Array Tracking Prefetcher (ATP), which tracks array-based indirect memory accesses using a novel combination of software and hardware. ATP is first configured by special...
Article
Full-text available
Lookup operations for in-memory databases are heavily memory-bound because they often rely on pointer-chasing linked data structure traversals. They are also branch heavy with branches that are hard-to-predict due to random key lookups. In this study, we show that although cache misses are the primary bottleneck for these applications, without a me...
Article
This article presents position statements and a question-and-answer session by panelists at the Fourth Workshop on Computer Architecture Research Directions. The subject of the debate was the use of field-programmable gate arrays versus GPUs in datacenters.
Article
This article presents position statements and a question-and-answer session by panelists at the 4th Workshop on Computer Architecture Research Directions. The subject of the debate was proprietary versus free and open instruction set architectures.
Article
This article presents position statements and a question-and-answer session by panelists at the 4th Workshop on Computer Architecture Research Directions. The subject of the debate was new technologies and their impact on future architectures.
Article
After over two decades of extensive research on branch prediction, branch mispredictions are still an important performance/power bottleneck for today's aggressive processors. In our prior work, to further understand the causes for mispredictions, we presented a source-code based classification of branch mispredictions extending the prior work on p...
Conference Paper
Full-text available
Current processors employ aggressive prediction mechanisms to improve performance and reduce power. It is increasingly important to understand and quantify a program's dynamic behavior to effectively design next-generation prediction mechanisms. In this paper, we develop algorithms and mechanisms inspired by DNA discovery tools to analyze and quant...
Conference Paper
In recent years, privacy management has become one of the most complex processes in the connected world. Fundamental technologies like GPS, cellular communications, and the Internet have become essential equipment in the modern vehicle. Subsequently, the vehicle became part of this connected world, wherein data are constantly sent and received. Acc...
Conference Paper
Full-text available
This paper explores the performance and energy efficiency of CUDA-enabled GPUs and multi-core SIMD CPUs using a set of kernels and full applications. Our implementations efficiently exploit both SIMD and thread-level parallelism on multi-core CPUs and the computational capabilities of CUDA-enabled GPUs. We discuss general optimization techniques fo...
Article
Full-text available
This paper covers the design and FPGA-based prototyping of a full-featured multi-core platform for use in computer architecture research studies. Existing platforms for performing studies include software simulators and hardware-assisted simulators, but there are no modular full-hardware platforms designed to measure a wide range of performance met...
Article
In this panel discussion from the 2009 Workshop on Computer Architecture Research Directions, David August and Keshav Pingali debate whether explicitly parallel programming is a necessary evil for applications programmers, assess the current state of parallel programming models, and discuss possible routes toward finding the programming model for t...
Article
Simulation is an indispensable tool for evaluation and analysis throughout the development cycle of a computer system, and even after the computer system is built. How simulation should evolve as the complexity of computer systems continues to grow is an open question and the subject of this panel from the 2009 Workshop on Computer Architecture Res...
Conference Paper
Full-text available
Branch prediction accuracy remains critical for high performance and low power. Prior work has studied causes of branch mispredictions in order to provide insights into how better branch predictors can be designed. However, most of the previous works have only considered run-time classification of branch mispredictions, leaving a large number...
Conference Paper
Software simulators remain several orders of magnitude slower than the modern microprocessor architectures they simulate. Although various reduced-time simulation tools are available to accurately help pick truncated benchmark simulation, they either come with a need for offline analysis of the benchmarks initially or require many iterative runs of...
Conference Paper
Full-text available
Although high branch prediction accuracy is necessary for high performance, it typically comes at the cost of larger predictor tables and/or more complex prediction algorithms. Unfortunately, large predictor tables and complex algorithms require more chip area and have higher power consumption, which precludes their use in embedded processors. As a...
Article
The core of current-generation high-performance multiprocessor systems is out-of-order execution processors with aggressive branch prediction. Despite their relatively high branch prediction accuracy, these processors still execute many memory instructions down mispredicted paths. Previous work that focused on uniprocessors showed that these wrong-...
Article
Due to the long simulation time of the reference input set, computer architects often use reduced time simulation techniques to shorten the simulation time. However, what has not yet been thoroughly evaluated is the accuracy of these techniques relative to the reference input set and with respect to each other. To rectify this deficiency, this pape...
Article
Today, with the increasing popularity of multicore processors, one approach to optimizing the processor's performance is to reduce the execution times of individual applications running on each core by designing and implementing more powerful cores. Another approach, which is the polar opposite of the first, optimizes the processor's performance by...
Article
One of the primary concerns for microprocessor designers has always been balancing power and thermal management while minimizing performance loss. Rather than generate solutions to this dilemma, the advent of multicore chips has raised a host of new challenges. This discussion with Pradip Bose and Kanad Ghose, excerpted from a 2007 CARD Workshop Pa...
Article
How can we ensure that platform hardware, firmware, and software work in concert to withstand rapidly evolving security threats? Architectural innovations bring performance gains but can also create new security vulnerabilities. In this panel discussion, from the 2007 Workshop on Computer Architecture Research Directions, we assess the current stat...
Article
Full-text available
In this paper, we propose a new class of branch predictors, complementary branch predictors, which can be easily added to any branch predictor to improve the overall prediction accuracy. This mechanism differs from conventional branch predictors in that it focuses only on mispredicted branches. As a result, this mechanism has the advantages of scal...
Conference Paper
Full-text available
To reduce the simulation time to a tractable amount or due to compilation (or other related) problems, computer architects often simulate only a subset of the benchmarks in a benchmark suite. However, if the architect chooses a subset of benchmarks that is not representative, the subsequent simulation results will, at best, be misleading or, at wor...
Article
Full-text available
The viability of bus interconnection models is explored, using the multiple-valued logic (MVL) paradigm to reduce the cost and energy consumption of off-chip and on-chip address, data and instruction buses within system-on-a-chip platforms. Data can be transferred over the buses using ternary, balanced ternary or quaternary number systems, rather t...
Conference Paper
Full-text available
High-performance multiprocessor systems built around out-of-order processors with aggressive branch predictors execute many memory references that turn out to be on a mispredicted branch path. Previous work that focused on uniprocessors showed that these wrong-path memory references may pollute the caches by bringing in data that are not needed on...
Article
Full-text available
Uniprocessor studies have shown that wrong-path memory references pollute the caches by bringing in data that are not needed for the correct execution path and by evicting useful data or instructions. Additionally, they also increase the amount of cache and memory traffic. On the positive side, however, they may have a prefetching effect for loads...
Conference Paper
Full-text available
In this paper, we propose three novel cache models using the multiple-valued logic (MVL) paradigm to reduce the cache data storage area and cache energy consumption for embedded systems. Multiple-valued caches have significant potential for compact and power-efficient cache array design. The cache models differ from each other depending on whether they...
Article
The speculated execution of threads in a multithreaded architecture, plus the branch prediction used in each thread execution unit, allows many instructions to be executed speculatively, that is, before it is known whether they are actually needed by the program. In this study, we examine how the load instructions executed on what turn out to be incorr...
Article
Full-text available
Due to the simulation time of the reference input set, architects often use alternative simulation techniques. Although these alternatives reduce the simulation time, what has not been evaluated is their accuracy relative to the reference input set, and with respect to each other. To rectify this deficiency, this paper uses three methods to charact...
Conference Paper
Full-text available
Address correlation is a technique that links the addresses that reference the same data values. Using a detailed source-code level analysis, a recent study (1) revealed that different addresses containing the same data can often be correlated at run-time to eliminate on-chip data cache misses. In this paper, we study the upper-bound performance of...
Conference Paper
Full-text available
Concurrent multithreaded architectures exploit both instruction-level and thread-level parallelism through a combination of branch prediction and thread-level control speculation. The resulting speculative issuing of load instructions in these architectures can significantly impact the performance of the memory hierarchy as the system exploits high...
Conference Paper
Full-text available
Pointer-intensive and sparse numerical computations typically display irregular memory access behavior. This work presents a mathematical model, called the Self-tuning Adaptive Predictor (SAP), to characterize the behavior of load instructions in procedures with pointer-based data structures by using procedure call boundaries as the fundamental sam...
Article
Full-text available
Concurrent multithreaded architectures exploit both instruction-level and thread-level parallelism in application programs. A single-threaded sequencing mechanism needs speculative execution beyond conditional branches in order to exploit more instruction-level parallelism. In addition, an aggressive multithreaded architecture should also use threa...
Conference Paper
Full-text available
As the degree of instruction-level parallelism in superscalar architectures increases, the gap between processor and memory performance continues to grow, requiring more aggressive techniques to increase the performance of the memory system. Several data prefetching techniques have been proposed for hiding the latency of main memory accesses, all of...
Article
Full-text available
Value reuse improves a processor's performance by dynamically caching the results of previous instructions into the value reuse table and reusing those results to bypass the execution of future instructions that have the same opcode and input operands. However, replacing the least recently used entries with the results of the current instructions c...
Article
Full-text available
Concurrent multithreaded architectures exploit both instruction-level and thread-level parallelism through a combination of branch prediction and thread-level control speculation. The resulting speculative issuing of load instructions in these architectures can significantly impact the performance of the memory hierarchy as the system exploits high...
Article
Full-text available
Relatively little background work has been done to examine the miss behavior of all static and dynamic load instructions, especially in the context of the entire program. This study addresses this gap in knowledge by presenting the whole-program (as opposed to sampling) profiling results for load behavior. Specifically, this study confirms the conc...
Article
Full-text available
We investigate a program phenomenon, Address Correlation, which links addresses that reference the same data. This work shows that different addresses containing the same data can often be correlated at run-time to eliminate a load miss or a partial hit. For ten of the SPEC CPU2000 benchmarks, 57 to 99% of all L1 data cache load misses, and 4 to 85% of...
Article
Full-text available
Mathematical modeling is an important tool for understanding and improving the memory referencing behavior of programs. For some programs, such as scientific codes operating on dense arrays or matrices, memory accesses exhibit strong regularity. However, pointer-intensive and sparse numerical computations typically display irregular memory acce...
Conference Paper
Full-text available
Value reuse improves a processor’s performance by dynamically caching the results of previous instructions and reusing those results to bypass the execution of future instructions that have the same opcode and input operands. However, continually replacing the least recently used entries could eventually fill the value reuse table with instructions...
Conference Paper
Full-text available
Value reuse improves a processor's performance by dynamically caching the results of previous instructions and reusing those results to bypass the execution of future instructions that have the same opcode and input operands. However, continually replacing the least recently used entries could eventually fill the value reuse table with instructions...
Conference Paper
As the degree of instruction-level parallelism in superscalar architectures increases, the gap between processor and memory performance continues to grow, requiring more aggressive techniques to increase the performance of the memory system. We propose a new technique, which is based on the wrong-path execution of loads far beyond instruction fetch-...
Conference Paper
Full-text available
This paper considers the problem of routing and wavelength assignment (RWA) in optical passive star networks with non-uniform traffic load. The problem can be considered as designing a logical topology over an optical passive star physical topology with a given non-uniform traffic. The approach uses the bipartite graphs and the concept of time and...
Conference Paper
Full-text available
In this study, scattering of plane electromagnetic waves at the junction formed by a PEC half-plane and a half-plane with anisotropic conductivity is investigated. By using the Fourier transform technique the problem is formulated into a matrix Wiener-Hopf system and an exact closed-form solution is obtained for the most general case by factorizing...
Article
Full-text available
The performance of a processor is limited by the specific bottlenecks that a benchmark exposes while running on that processor. Since the quantification of these bottlenecks can be extremely time-consuming, our prior work proposed using the Plackett and Burman design as a statistically-rigorous, but time-efficient method of determining the processo...
Article
Full-text available
In this paper, we explore the potential of bus interconnection models using the Multiple-Valued Logic paradigm to reduce the power consumption of on-chip address and data buses within embedded SoC platforms. Data is sent over the buses using a radix-r number system (i.e., ternary, balanced ternary, or quaternary) rather than binary. This allows more co...
Article
Full-text available
Out-of-order execution processors with aggressive branch prediction are the core of current-generation high-performance multiprocessor systems. Despite their relatively high branch prediction accuracies, these processors still execute many memory instructions on the mispredicted path. These wrong-path memory references pollute the caches and increa...
Article
Thesis (Ph. D.)--University of Minnesota, 2003. Includes bibliographical references (leaves 107-114).
