Rodolfo Azevedo
State University of Campinas (UNICAMP) · Departamento de Sistemas de Computação
PhD in Computer Science
About
131 Publications
33,906 Reads
1,089 Citations
Introduction
Additional affiliations
August 2002 - present
Publications (131)
Efficient data movement between nodes in a data center is essential for optimal performance of distributed workloads. With advancements in computing interconnection and memory, new opportunities have emerged. We propose a novel inter-node architecture and protocol called Flexible Memory Units (FMU) that uses optically disaggregated memory. FMUs can...
Automated grading systems (autograders) assist the process of teaching in introductory programming courses (CS1). However, the sole focus on correctness can obfuscate the assessment of other characteristics present in code. In this work, we investigated whether code deemed correct by an autograder was developed with characteristics that indicated pot...
Perceptual Learning Modules (PLMs) are a variation of Perceptual Learning based on multiple-choice questionnaires. There is successful research on the use of PLMs in math and flight training. The possibility of designing and adopting PLMs in Introductory Programming Courses (CS1) is still an open area of study. The goal of this study is to test...
The design and administration of methodological interventions is a possible way to address dropout and failure rates in undergraduate introductory programming courses (CS1). However, to implement such strategies, it is necessary to identify how CS1 courses are organized and offered to students. In this work, we analyzed 225 syllabi from CS1 courses...
This article describes an experience report on the Interdisciplinary Bachelor's degree in Information Technology, offered in distance-learning mode by the Universidade Virtual do Estado de São Paulo (Univesp). During the course's construction process, in addition to curricular guidelines and regulations, examples of other interdisciplinary bachelor's programs were observed...
The use of automatic grading systems (autograders) supports the teaching of introductory programming courses (CS1). However, the focus on correctness can obscure the detection of other problems present in the code. In this work, we investigated whether code deemed correct by an autograder exhibited behaviors that could indicate...
Misconceptions in Correct Code (MC³) are undesirable programming behaviors, in terms of the learning objectives, that students exhibit in code that generates the correct output. We manually analyzed 2441 student submissions from a Python CS1 course, which were corrected by an automatic grading system (autograder), and identified 45 MC³, divided into 8...
The Uhlenbeck-Ford (UF) model was originally proposed for the theoretical study of imperfect gases, given that all its virial coefficients can be evaluated exactly, in principle. Here, in addition to computing the previously unknown coefficients B11 through B13, we assess its applicability as a reference system in fluid-phase free-energy calculatio...
Recent advances in integrated photonics enable the implementation of reconfigurable, high-bandwidth, and low energy-per-bit interconnects in next-generation data centers. We propose and evaluate an Optically Connected Memory (OCM) architecture that disaggregates the main memory from the computation nodes in data centers. OCM is based on micro-ring...
This work aims at the study and development of techniques to assist students and instructors of introductory programming courses (CS1) at universities. These courses, for which studies report high dropout rates, are increasingly relevant in the academic environment, being offered not only to computing students...
Approximate computing techniques enable significant improvements in energy efficiency by producing potentially incorrect outputs for a small subset of inputs of a given application. Approximations introduced at the hardware level, in particular, may be applicable in multiple scenarios and offer high power savings. Integrating and evaluating approxi...
A misconception is a common misunderstanding that students may have about a specific topic. The identification, documentation, and validation of misconceptions is a long and time-consuming work, usually carried out using iterative cycles of students answering open-ended questionnaires, interviews with instructors and students, exam analysis, and di...
DRAM manufacturers have been prioritizing memory capacity, yield, and bandwidth for years, while trying to keep the design complexity as simple as possible. DRAM chips do not carry out any computation or other important functions, such as security. Processors implement most of the existing security mechanisms that protect the system against securit...
Finding methods to recognize, in advance, students with misconceptions and to help guide them is essential. This project, associated with the research of Dr. Ricardo Caceffo, both supervised by Dr. Rodolfo Azevedo, aims to create a tool for diagnosing students' comprehension problems in introductory computing courses. We studied...
The purpose of this research is to identify the misconceptions held by undergraduate students when taking introductory CS1 courses using Python. The methodology of this work consisted of interviews with instructors of previous sections of an introductory CS1 course in Python at Unicamp, and through the analysis of past exams. As a result of this wo...
Value prediction improves instruction-level parallelism in superscalar processors by breaking true data dependencies. Although this technique can significantly improve overall performance, most state-of-the-art value prediction approaches require high hardware cost, which is the main obstacle to their wide adoption in current processors. To t...
Increasing industry interest in the optimization of inter-GPU communication has motivated this work to explore new ways to enable peer-to-peer access. Specifically, this paper investigates how reconfigurable optical links between GPUs in multi-GPU servers can allow for minimized memory transfer latencies for given machine learning applications. Sil...
Interpretation is a simple and portable technique that enables emulation of instruction set architectures (ISAs). Even though other techniques, such as binary translation, still provide superior emulation performance, interpreters are easier to implement and debug. Because of that, they are often used to complement binary translation techniques or...
The use of hardware to perform part of CPU processing functions is a consolidated practice that produces good results in terms of power and performance when applied in embedded systems. This paper describes the changes in the processor architecture to embed the functions of a microkernel to boost the performance of task-based systems (TBS). Part of...
We present our experience in a Computer Science (CS) introductory course, where three teaching practices were implemented and compared: lecture-based learning, problem-based learning, and peer instruction. We chose Information Systems, a first-term undergraduate course, for this study. It overviews a variety of topics in CS, such as algorithms, da...
Raising awareness of the environmental impact of energy generation and consumption has been a recent concern of contemporary society worldwide. Underlying the awareness of energy consumption is an intricate network of perception and social interaction that can be mediated by technology. In this paper we argue that issues regarding energy, environme...
On-chip photonics has gained attention in computer architecture research for high-speed processor communication networks. Recent developments in optical fabrication techniques and data buffering offer new opportunities for processor systems. In this work, we evaluate a processor with a full optical memory system as main memory. We build it using re...
Many-core systems are commonplace in the electronic consumer market. Thus, complex benchmark suites have been modeled for shared memory (SM) architectures, since it is easier to develop applications (threads) for many-core systems. Shared memory presents a scalability limitation due to the number of memory accesses. One way to mitigate the SM limitations...
Instruction Set Simulators (ISSs) play a critical role in the design cycle of embedded systems. However, as ISSs evolve and increase in complexity, not only new bugs might be introduced but also old latent bugs might be revealed. Finding these bugs based on the simulator output might be a challenging task. This paper presents HybridVerifier, a nove...
New non-volatile memory technologies, collectively known as NVMs, promise to rival DRAM in the contest for the main-memory technology of choice. NVMs make it possible, for example, to manipulate persistent data without using transient copies of it. Despite this, NVMs are still unable to offer a...
Energy consumption constraints have become a critical issue in Multiprocessor Systems on Chip (MPSoC) designs. Whereas processor performance comes with a high power cost, there is an increasing interest in exploring the trade-off between power and performance, taking into account the target application domain. Dynamic Voltage and Frequency Scaling...
Frequent value locality is a type of locality based on the observation that a small set of values is accessed very frequently. Several works have exploited it to construct different architectural schemes, such as memory and cache designs or bus and network optimizations. Although these previous works consider different criteria to establish what is...
Recent design methodologies and tools aim at enhancing design productivity by providing a software development platform before defining the final MPSoC architecture details. Motivated by the lack of MPSoC virtual platform prototyping environments integrating both scalable hardware and software to create and evaluate new methodologies and tools, we...
A Concept Inventory (CI) is a set of multiple-choice questions used to reveal students' misconceptions related to some topic. Each available choice (besides the correct one) is a distractor that is carefully developed to address a specific misunderstanding, a wrong student thought. In computer science introductory programming courses, the develo...
The modern embedded market massively relies on RISC processors. The code density of such processors directly affects memory usage, an expensive resource. Solutions to mitigate this issue include code compression techniques and ISAs extensions with reduced instructions bit-width, such as Thumb2 and MicroMIPS. This paper proposes a 16-bit extension t...
To perform performance and energy-efficiency exploration in MPSoC designs, a system-level simulation infrastructure is needed that provides resources to evaluate energy consumption in the early stages of the design. This article presents an extension of a framework for MPSoC designs that provides support for scalability...
This paper presents an alternative methodology to a traditional high-level power estimation flow, that enables the fast gathering of switching activity from SystemC RTL descriptions. The proposed methodology requires minimum effort from the designer, while reducing the steps necessary to obtain switching activity information, and requires only a C+...
In the last years, several approaches were proposed to improve embedded systems performance by extending base processors to fit specific applications performance demands. Although the majority of the contributions focus specially on the architectural challenges, which range from completely reconfigurable hardware to custom ASICs, every work faces a...
In this paper we propose an early idea about a new hardware transactional memory system that implements snapshot isolation (SI) using logs. With this scheme, we avoid specific costly hardware resources (multiversion memory) to keep the snapshots, and maintain the advantages of snapshot isolation (low abort rates). In our proposal, the aborting proc...
The Active Learning Model (ALM) is an educational model which proposes that students should participate, along with the teacher, as direct agents of their learning process. Computer systems created to implement and support the ALM are known as Classroom Response Systems (CRS). The CRS, usually supported by traditional pen-based Tablet PCs, allow th...
The latest real-time graphics processors on the market are implementations based on the programmable stream processor model. This work presents the internals of these processors and the programming languages that were created to develop software for them, focusing on the NVIDIA 8 series and the CUDA programming model. With this new programming mod...
Microprocessor manufacturers typically keep old instruction sets in modern processors to ensure backward compatibility with legacy software. The introduction of newer extensions to the ISA increases the design complexity of microprocessor front-ends, exacerbates the consumption of precious on-chip resources (e.g., silicon area and energy), and dema...
Harnessing the flexibility and scaling features of the cloud can open up opportunities to address some relevant research problems in scientific computing. Nevertheless, cloud-based parallel programming models need to address some relevant issues, namely communication overhead, workload balance, and fault tolerance. Programming models, which work well...
Recent design methodologies and tools aim at enhancing the design productivity by providing a software development platform before defining the final MPSoC architecture details. However, the simulation can only be efficiently performed when using a modeling and simulation engine that supports the system behavior description in a high abstraction le...
Phase-Change Memory (PCM) is a new memory technology and a possible replacement for DRAM, whose scaling limitations require new lithography technologies. Despite being promising, PCM has limited endurance (its cells withstand roughly 10^8 bit-flips before failing), which prompted the adoption of Error Correction Techniques (ECTs). However, previous li...
The constructivist theory indicates that knowledge is not something finished and complete; instead, individuals must construct it through interaction with the physical and social environment. Active Learning is a methodology designed to support constructivism through the involvement of students in their learning process, allowing th...
Due to projections of high scalability, Phase-Change Memory (PCM) is seen as a new main memory for computer systems. In fact, PCM may even replace DRAM, whose scaling limitations require new lithography technologies that are still unknown. On the downside, PCM has low endurance compared with DRAM, i.e., on average, a PCM cell can only withsta...
Transactional Memory (TM) is a recent synchronization mechanism that aims both to ease the development of concurrent applications and to provide performance. Most works on TM focus on performance evaluation, neglecting other metrics such as energy consumption. Previous works analyzed the energy efficiency of...
The design complexity of integrated circuits requires techniques that automate and ease common tasks, allowing developers to keep up with the rapid growth and demand of the industry. This paper presents acSynth, an integrated framework for development and synthesis based on ArchC ADL descriptions and introduces a new power characterization method a...
Zombie is an endurance management framework that enables a variety of error correction mechanisms to extend the lifetimes of memories that suffer from bit failures caused by wearout, such as phase-change memory (PCM). Zombie supports both single-level cell (SLC) and multi-level cell (MLC) variants. It extends the lifetime of blocks in working memor...
Microprocessor designers such as Intel and AMD implement old instruction sets in their modern processors to ensure backward compatibility with legacy code. In addition to old backward-compatibility instructions, new extensions are constantly introduced to add functionality. In this way, the size of the IA-32 ISA is growing at a fast pace, reachin...
The vast majority of Internet services available today rely on the computing capabilities of data centers. The load in a data center varies according to the time of day, week and year, thus it is important to be able to optimize the use of computing resources dynamically and continuously. This paper presents a framework for modeling and simulating...
This paper presents acSynth, an ArchC framework for energy characterization and simulation. Based on Tiwari’s Method, a subject processor is characterized in an affordable time and the information is fed into acSynth to bring architecture level power analysis. The framework provides power reports and energy profiling. The experimental results show...
Transactional memory (TM) is a new synchronization mechanism devised to simplify parallel programming, thereby helping programmers to unleash the power of current multicore processors. Although software implementations of TM (STM) have been extensively analyzed in terms of runtime performance, little attention has been paid to an equally important...
Contemporary SoC design involves the proper selection of cores from a reference platform. Such selection implies the design exploration of CPUs, which requires simulation platforms with high performance and flexibility. Applying retargetable instruction-set simulation tools in this environment can simplify the design of new architectures. The incr...
Commodity processors, which are prevalent in Internet-based data centers, do not have internal sensors for monitoring energy consumption. Such processors usually feature performance counters which can be used to indirectly estimate power consumption [1]. The usual approach in those studies is to derive linear power models based on the usage numbers...
This work presents a proposal for the development of an educational tool based on pen-based mobile devices, aiming to enable the construction of a collaborative educational environment grounded in the active learning model. Experiments were carried out to measure the interaction of...
Virtual platforms are of paramount importance for design space exploration and their usage in early software development and verification is crucial. In particular, enabling accurate and fast simulation is specially useful, but such features are usually conflicting and tradeoffs have to be made. In this paper we describe how we integrated TLM commu...
Systems-on-chip (SoCs) result from the evolution of VLSI technology and the growth of integrated circuit complexity. As it happens each time design complexity impairs the expected time-to-market, the quest for higher productivity involves abstraction, reuse, automation, and exploration. This chapter reviews how such notions are handled at the Elect...
The main goal of this book is to enable Electronic System Level (ESL) research based on an open-source infrastructure. Two key components in this infrastructure are SystemC, as a hardware and system description language, and ArchC, as an architecture description language. In order to make it possible for readers who are not familiar with SystemC an...
Although SystemC is considered the most promising language for SoC functional modeling, it does not come with built-in power modeling capabilities. This chapter presents PowerSC, a power estimation framework which instruments SystemC for power characterization, modeling and estimation. Since it is entirely based on SystemC, PowerSC allows consisten...
This chapter will guide you through the process of designing a virtual platform. We start with a very simple Hello World example and expand it to a dual-core platform to serve as a basis for an MP3 decoder platform that is explained and implemented in the second part of the chapter. The MP3 decoder design starts with profiling as the basis for a...
The rise of SoCs caused a paradigm shift on system design flow. The TLM methodology was created in the search for a new paradigm that could allow design representation at an intermediate level of abstraction between paper specification and RTL models. This chapter introduces the Transaction Level Modeling (TLM) design methodology. Its main goals ar...
This chapter presents a guideline to help design new processors with ArchC. Instead of covering the syntax and semantics of the language, as Chap. 2 does, we will focus on the design flow of a new processor description, starting from the basic architectural declaration, followed by instruction declaration and implementation. After this basic st...
This book intends to provide grounds for further research on electronic system level design (ESL), by means of open-source artifacts and tools, thereby stimulating the unconstrained deployment of new concepts, tools, and methodologies. It devises ESL design from the pragmatic perspective of a SystemC-based representation, by showing how to build an...
The design of new architectures can be simplified with the use of retargetable instruction set simulation tools, which can validate the decisions in the design exploration cycle with high flexibility and reduced cost. The increasing system complexity makes the traditional approach to simulation inefficient for today's architectures. The compiled si...
The shift towards multicore processors and the well-known drawbacks imposed by lock-based synchronization have forced researchers to devise new alternatives for building concurrent software, of which transactional memory is a promising one. This work presents a comprehensive study on the energy consumption of a state-of-the-art STM (Software Transa...
RISC processors can be used to face the ever-increasing demand for performance required by embedded systems. Nevertheless, this solution comes with the cost of poor code density. Alternative encodings for instruction sets, such as MIPS16 and Thumb, represent an effective approach to deal with this drawback. This article proposes to apply a new enco...
Traditional software transactional memory designs are targeted towards performance and therefore little is known about their impact on energy consumption. We provide, in this paper, a comprehensive energy analysis of a standard STM design and propose novel scratchpad-based energy-aware STM design strategies. Experimental results collected throug...
The image foresting transform (IFT) is a general tool for the design of image processing operators based on dynamic programming. Silicon image forest transform (SIFT) is a fast 8-bit data architecture for IFT-based operators in FPGA. It can implement queue-based methods such as morphological reconstructions, watershed transforms, shape saliences, d...