Fernanda Lima Kastensmidt

Federal University of Rio Grande do Sul | UFRGS · Institute of Informatics

PhD


279
Publications
38,961
Reads
4,710
Citations
Additional affiliations
March 2005 - October 2016
Federal University of Rio Grande do Sul
Position
  • Professor (Associate)


Publications (279)
Article
This work investigates the impact of neutron and heavy ion radiation-induced soft errors on Arm Cortex-M Systems-on-Chip and proposes a fault injection methodology designed for the early assessment of these effects on the embedded processors. Our methodology is then employed to assess the effectiveness of software design exploration and implementin...
Article
This work investigates the impacts of neutron-induced soft errors on the reliability of aerial image classification neural networks running on a softcore GPU implemented in an SRAM-based FPGA. We designed and trained fixed-point and floating-point all-convolutional neural networks to classify four-channel aerial images from the SAT-6 dataset, extra...
Article
This work investigates selective mitigation techniques to improve the reliability of a configurable open-source softcore GPU implemented in an SRAM-based FPGA against configuration memory faults. It explores the unhardened and hardened reliability curves of isolated groups of the target GPU's modules to guide the decision of the best candidates for...
Poster
Full-text available
Advances in integrated circuit design have produced technologies that are more robust to radiation effects. However, soft error susceptibility remains a significant challenge because of the demand for higher operating frequencies and low-power circuits, as well as manufacturing process variability. This work explores different transistor arrangem...
Chapter
In this chapter, we have covered some of the general challenges in the physical stress of electronic devices. Using particle beams to evaluate the reliability of devices has several benefits, such as a realistic error rate and a realistic error model. Additionally, thanks to the accelerated beam, a statistically significant amount of data c...
Conference Paper
Full-text available
This work analyzes the efficiency of transistor folding combined with diffusion splitting in improving the Single-Event Transient robustness of digital circuits. Results show that these techniques can reduce cross-section and increase threshold Linear Energy Transfer.
Conference Paper
This work investigates the vulnerability of an image classification engine under accelerated heavy-ion irradiation. The engine is based on an all-convolutional neural network trained with the GTSRB traffic sign recognition benchmark and embedded into a 28 nm SRAM-based FPGA.
Article
NanoXplore is the pioneering European vendor of ITAR-free radiation-hardened SRAM-based FPGAs. This work is the first to explore dynamic SEE tests in the NG-Medium FPGA device. The reliability-performance analysis of an embedded unmitigated design is performed under heavy ion-induced errors. Moreover, the improvements of additional user le...
Conference Paper
This work investigates how the approximate computing paradigm can be exploited to provide low-cost fault tolerant architectures. In particular, we focus on the implementation of Approximate Triple Modular Redundancy (ATMR) designs using the precision reduction technique. The proposed method is applied to two benchmarks and a multitude of ATMR desig...
Article
In this paper we experimentally and analytically evaluate the reliability of two state-of-the-art neural networks for linear regression and pattern recognition (Multi-Layer Perceptron and Single-Layer Perceptron) implemented in a System on Chip composed of an FPGA and a microprocessor. We have considered, for each neural network, three different ac...
Conference Paper
Full-text available
This work evaluates the SET response of FinFET-based Majority Voter circuits under the process-variability impact of metal-gate Work-Function Fluctuation (WFF). Results show that SET pulse width is expected to increase under WFF effects. Moreover, results show that the relative standard deviation of the SET pulses can increase from 50% up to 80% ac...
Conference Paper
Full-text available
Traditional CMOS technology has reached its limit in the deep submicron era. Hence, advanced technology nodes require novel device structures and new materials to overcome the challenges faced when dealing with planar devices for nanocircuits. As technology scales down, the circuits are becoming more susceptible to the increase of the uncertainty d...
Conference Paper
This paper evaluates the efficiency and performance impact of a dual-core lockstep as a method for fault tolerance running on top of FreeRTOS applications. The method was implemented on a dual-core ARM Cortex-A9 processor embedded into the Zynq-7000 APSoC. Fault injection experiments show that the method can mitigate up to 63% on the FreeRTOS appli...
Article
This work shows the impact of low LET (Linear Energy Transfer) heavy ions on the reliability of 28-nm Bulk SRAM (Static Random Access Memory) cells from Artix-7 FPGA (Field-Programmable Gate Array). Irradiation tests on the ground showed significant differences in the MBU (Multiple Bit Upset) cross section of configuration (CRAM) and Block (BRAM) m...
Article
Radiation-induced soft errors are an ever-increasing concern for the microelectronics industry in providing reliable VLSI systems at advanced technology nodes. Most redundancy-based methodologies adopt majority voters to ensure fault masking. This paper presents a comparative analysis of different majority voter designs in 7 nm FinFET...
Article
All Programmable System-on-Chip (APSoC) devices are designed to provide higher overall programmable flexibility and system performance at lower costs. Such characteristics make APSoCs very suitable and attractive for critical environments, such as the one encountered in the accelerators chain of the European Organization for Nuclear Research (CERN)...
Article
This paper presents an approach based on software-based fault tolerance techniques applied at low abstraction level to detect SEU faults in register files of Graphics Processing Units. SEU faults have a major influence on such architectures, especially affecting register files and cache memory. In order to harden the system's register files, softwa...
Article
This paper presents an analysis of the efficiency of traditional fault tolerance methods on parallel systems running on top of Linux OS. It starts by studying the occurrence of software errors in systems presenting different levels of complexity, from sequential bare-metal to parallel Linux applications. Then two traditional fault tolerance mechani...
Article
In this paper, we investigate the impact of register file errors on the reliability of modern embedded microprocessors through fault-injection and heavy-ion experiments. Additionally, we evaluate how different levels of compiler optimization modify the usage and failure probability of a processor register file. We select six representative benchmarks, eac...
Article
SRAM-based FPGAs are attractive for critical applications due to their reconfiguration capability, which allows the design to be adapted in the field under different upset rate environments. High-Level Synthesis (HLS) is a powerful method to explore different design architectures in FPGAs. In this paper, the HLS tool from Xilinx is used to generate...
Conference Paper
This paper explores the use of dual-core lockstep as a fault-tolerance solution to increase the dependability in hard-core processors embedded in APSoCs. As a case study, we designed and implemented an approach based on lockstep to protect a dual-core ARM Cortex-A9 processor embedded into Zynq-7000 APSoC. Experimental results show the effectiveness...
Conference Paper
This paper investigates the use of Triple Modular Redundancy (TMR) in hardware accelerator designs described in the C programming language and synthesized by High-Level Synthesis (HLS). A setup composed of a soft-core processor and a matrix multiplication design protected by TMR and embedded into an SRAM-based FPGA was analyzed under accumulated bit-f...
Conference Paper
In this paper we present an extended fault injection approach to configuration memory of SRAM-based FPGAs consisting of inter frame many bits upsets to be used as an evaluation tool for attack detection capability and countermeasure effectiveness in security sensitive design modules. The work presented in this paper is twofold. First, we present th...
Conference Paper
Artificial Neural Networks (ANNs) have gained considerable interest in clustering, pattern recognition, function approximation, and many other applications due to their capability to process data in parallel. Moreover, ANNs can also be used to accelerate parts of an algorithm in which the data can be approximated using NPUs (Neural Processing...
Conference Paper
The power density of integrated circuits increases with technology scaling, so the need to implement low-power designs is growing. The clock gating technique is typically employed to reduce the dynamic power consumption in digital integrated circuits. However, the use of this approach could affect the reliability of the device in the pres...
Article
Because of technology scaling, the soft error rate has been increasing in digital circuits, which affects system reliability. Therefore, modern processors, including VLIW architectures, must have means to mitigate such effects to guarantee reliable computing. In this scenario, our work proposes three low overhead fault tolerance approaches based on...
Article
The increasing system complexity of FPGA-based hardware designs and shortening of time-to-market have motivated the adoption of new designing methodologies focused on addressing the current need for high-performance circuits. High-Level Synthesis (HLS) tools can generate Register Transfer Level (RTL) designs from high-level software programming lan...
Article
This paper presents an analysis of the occurrence of software errors in parallel applications using POSIX Threads (Pthreads) versus their OpenMP counterparts and sequential versions. All cases were tested on the ARM Cortex-A9 dual-core processor that is embedded in many commercial SoCs available on the market. The OVP simulator platform is used to i...
Article
ARM processors are leaders in embedded systems, delivering high-performance computing, power efficiency, and reduced cost. For this reason, there is considerable interest in their use in the aerospace industry. However, the use of sub-micron technologies has increased the sensitivity to radiation-induced transient faults. Thus, the mitigation of soft...
Article
All Programmable System-on-Chip (APSoC) devices are designed to provide higher overall system performance and programmable flexibility at lower power consumption and costs. Although modern commercial APSoCs offer a plethora of advantages, they are prone to experience Single Event Upsets. We investigate the impact of using different system architect...
Conference Paper
SRAM-based FPGAs are attractive for critical applications due to their reconfiguration capability, which allows the design to be adapted in the field under different upset rate environments. High-Level Synthesis (HLS) is a powerful method to explore different design architectures in FPGAs. In this paper, we analyze four different design architecture...
Conference Paper
This work analyzes the timing vulnerability factor (TVF) of a master-slave D flip-flop in nanometer technologies in the presence of bit-flips in pipeline structures. The focus of the work is to determine how much the TVF is impacted by different operating frequencies, distinct technology nodes, and different path delays...
Chapter
Critical applications must rely on fault-tolerant systems in order to guarantee an error-free execution since the cost of a system fault can be paid in terms of millions of dollars or, even worse, in terms of human lives. In this context, Dynamic Partial Reconfiguration (DPR) enables a more optimized and reliable usage of state-of-the-art Xilinx SR...
Chapter
This chapter describes a neutron-induced Single Event Effect test in a commercial Mixed-Signal Programmable System-on-Chip FPGA from Microsemi. The main objective is to investigate the digital and analog parts reliability for critical application projects. The case-study circuit is a data acquisition system that uses analog blocks, buses and interf...
Chapter
Fault injection by emulation is a well-known method to analyze the reliability of a circuit. SRAM-based FPGAs provide the hardware infrastructure to implement fault injectors taking advantage of dynamic partial reconfiguration. This chapter presents the details of a Multiple Fault Injection Platform and the analysis of the configuration memory upse...
Chapter
There is increasing interest in the aerospace industry in making systems more flexible and reducing their cost. In this regard, FPGAs offer several advantages as a low-cost platform for deploying customized systems. However, the use of sub-micron technologies has increased their sensitivity to radiation-induced transient faults. Therefore, the miti...
Chapter
The Triple Modular Redundancy technique is mostly used to mask transient faults in circuits operating in dependable systems. The generalization of this technique (known as nMR) allows the use of more than three redundant copies of the circuit to increase the reliability under multiple faults. The main drawback of nMR is its high power consumption, whic...
Chapter
This book introduces the concepts of soft errors in FPGAs and GPUs. The chapters cover radiation effects in FPGAs, fault-tolerant techniques for FPGAs, use of COTS FPGAs in aerospace applications, experimental data of FPGAs under radiation, FPGA embedded processors under radiation, and fault injection in FPGAs. Since dedicated parallel processing a...
Book
This book introduces the concepts of soft errors in FPGAs, as well as the motivation for using commercial, off-the-shelf (COTS) FPGAs in mission-critical and remote applications, such as aerospace. The authors describe the effects of radiation in FPGAs, present a large set of soft-error mitigation techniques that can be applied in these circuits, a...
Article
Performance benchmarks have been used over the years to compare different systems. These benchmarks can be useful for researchers trying to determine how changes to the technology, architecture, or compiler affect the system’s performance. No such standard exists for systems deployed into high radiation environments, making it difficult to assess w...
Article
Radiation effects such as soft errors are the major threat to the reliability of SRAM-based FPGAs. This work analyzes the effectiveness in correcting soft errors of a novel scrubbing technique using internal frame redundancy called Frame-level Redundancy Scrubbing (FLR-scrubbing). This correction technique can be implemented in a coarse grain TMR d...
Article
Software-based techniques offer several advantages to increase the reliability of processor-based systems at very low cost, but they cause performance degradation and an increase of the code size. To meet constraints in performance and memory, we propose SETA, a new control-flow software-only technique that uses assertions to detect errors affectin...
Conference Paper
Full-text available
Increasing chip power densities, allied with continuous technology shrinking, are making emerging multiprocessor embedded systems more vulnerable to soft errors. Due to the high cost and design time inherent to board-based fault injection approaches, more appropriate and efficient simulation-based fault injection frameworks become crucial to guarantee the...
Article
Redundancy is the most popular technique to add fault tolerance at system level to electronic systems. Redundancy with hardware and software diversity of digital computers is currently employed in safety-critical applications, such as spacecraft and commercial aircraft, to increase the reliability of such systems. This work presents a...
Conference Paper
We investigate how different system architectures on an APSoC, including memory organization, communication schemes, and the use of hard and soft cores in the same context, affect the final system failure rate.
Conference Paper
The reliability of modern devices such as All Programmable SoC (APSoC) devices against soft errors has been decreasing with constant technology scaling, due to the reduction of transistor size and reduced supply voltage. This work presents static tests performed with heavy-ion and proton irradiation in the Xilinx Zynq-7000 APSoC to measure the sensitivity...
Conference Paper
A set of software-based techniques to detect soft errors in embedded ARM processors at low cost is presented. Fault injection results show high fault coverage at performance and memory overheads lower than those of state-of-the-art techniques.
Conference Paper
The third dimension is becoming an attractive solution to integrate components in a single integrated circuit. Therefore, 3D Networks-on-Chip (NoCs) are usually adopted to provide fast connections between the layers by using Through-Silicon-Vias (TSVs). However, many challenges during the 3D manufacturing phase are making the circuits more vulnerab...
Article
The use of Triple Modular Redundancy (TMR) with majority voters can guarantee 100% single fault masking coverage for a given circuit against transient faults. However, this methodology presents a minimum area overhead of 200% compared to the original circuit. In order to considerably reduce the area overhead without significantly compromising the f...
Conference Paper
Because of technology scaling, the soft error rate has been increasing in digital circuits, which in turn affects system reliability. Therefore, modern processors, including VLIW architectures, must have means to mitigate such effects to guarantee reliable computation. In this scenario, our work proposes two new low overhead fault tolerance approac...
Conference Paper
Soft errors are becoming a major concern in integrated circuits fabricated in nanometer technology working in dependable applications. The goal of this paper is to determine the dependency of soft errors in integrated circuits on their operating frequency and the variety of delays in the combinational logic paths. Each circuit flip-flop has a different...
Conference Paper
Full-text available
Recent advances in silicon technology have allowed the integration of complex systems in a single chip. Nowadays, Field Programmable Gate Array (FPGA) devices are composed not only of the programmable fabric but also of hard-core processors, dedicated processing block interfaces to various peripherals, on-chip bus structures and analog blocks. Am...
Conference Paper
N-Modular Redundancy (NMR) with majority voters has been widely used to increase reliability. While bit-voters perform bit-by-bit comparisons, which is the most basic and fastest voting scheme, word-voters consider all bits in parallel to determine the final output, increasing data integrity but also area. This paper proposes to merge the adv...
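The difference between the two voting schemes named in this abstract can be illustrated with a minimal Python sketch for three redundant copies. This is not the voter proposed in the paper; the function names `bit_voter` and `word_voter` and the 4-bit example values are assumptions introduced here for illustration only.

```python
from typing import Optional


def bit_voter(a: int, b: int, c: int) -> int:
    """Bitwise majority: each output bit is the majority of the three input bits."""
    return (a & b) | (a & c) | (b & c)


def word_voter(a: int, b: int, c: int) -> Optional[int]:
    """Whole-word majority: return the word that at least two copies agree on,
    or None when all three copies differ (no majority exists)."""
    if a == b or a == c:
        return a
    if b == c:
        return b
    return None


# Hypothetical example: the fault-free word is 0b1010.
# Copy a has bit 0 flipped, copy b has bit 1 flipped, copy c is intact.
a, b, c = 0b1011, 0b1000, 0b1010

bit_result = bit_voter(a, b, c)    # upsets in different bit positions are masked
word_result = word_voter(a, b, c)  # no two words match, so disagreement is flagged
```

In this sketch the bit-voter always produces an output, possibly a word that matches none of the copies, whereas the word-voter refuses to produce one when no two copies agree. That is one way to read the abstract's remark about data integrity.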
Article
SRAM-based FPGAs are attractive for many high-reliability applications at ground level due to their high density and configurability. However, due to their high sensitivity to neutron-induced soft errors, the FPGA configuration memory bits may suffer unexpected bit-flips and consequently critical errors may occur. To cope with this problem, authors have pro...
Conference Paper
Full-text available
This paper analyses the nature of software-based fault tolerance techniques and the influence of their overheads to determine an efficient strategy for applying those techniques in a selective way. Several considerations that have to be taken into account are presented in this work. These include an analysis of fault coverage and overheads when sel...
Conference Paper
Full-text available
The use of Triple Modular Redundancy (TMR) with majority voters can guarantee full single fault masking coverage for a given circuit against transient faults. However, it presents a minimum area overhead of 200% compared to the original circuit. In order to reduce area overhead drastically without compromising significantly the fault coverage, TMR...
Conference Paper
TMR is the most widely used technique to increase the reliability of SRAM-based FPGAs used in safety-critical applications. In this paper we evaluate experimentally the realistic effectiveness of several TMR schemes implemented with different levels of granularity. We measure and compare the dynamic cross-section of the TMRd circuits as well as num...
Article
Full-text available
There is an increasing concern to reduce the cost and overheads during the development of reliable systems. Selective protection of the most critical parts of the systems represents a viable solution to obtain a high level of reliability at a fraction of the cost. In particular, to design a selective fault mitigation strategy for processor-based systems...
Article
Modern Systems-on-Chip (SoCs) and embedded electronic devices work at very high frequencies, which has the side effect of increasing the power dissipation and, consequently, the silicon die temperature. The presented radiation experiments on a 28 nm FPGA-based SoC demonstrate that the temperature variation caused by a higher operating frequency...
Chapter
Different fault tolerance techniques can be applied to FPGAs according to their type of configuration technology, architecture and target operating environment. This chapter will present a set of fault mitigation techniques for SRAM-, Flash-, and antifuse-based FPGAs and a test methodology to characterize those FPGAs under radiation. Results from neutr...
Article
In this paper, we propose a method that combines dedicated test designs, readback and bitstream comparisons to investigate soft errors in a nanoscale SRAM-based FPGA under photoelectric stimulation. A static test is performed to analyze the dependence of SEUs on supply voltage. Static cross-section and threshold energy are presented. Dynamic test is acco...
Article
This paper explores the concept of Design Diversity Redundancy applied to SRAM-based FPGAs as a proposal to decrease failure rate. A 32-bit MIPS RISC processor was protected by coarse grain Triple Modular Redundancy (TMR) and by Diverse TMR (DTMR). Experimental results under neutron flux radiation show that DTMR can reduce by 40% the Failure in Tim...
Article
As semiconductor technology advances, transistors shrink and become more susceptible to upsets. In certain fields, such as space applications, multiple faults may occur at the same time. Traditional fault-tolerance techniques, such as N-Modular Redundancy (NMR) with majority voters, have been used to increase system reliability. Voters c...
Conference Paper
Full-text available
Soft errors are a major concern in aerospace applications. Software-based fault-tolerance techniques offer several advantages to increase the reliability of these applications if a microprocessor or microcontroller is utilized. However, the protection of the data-flow implies data and instruction redundancy, which brings a significant increase in exe...
Conference Paper
The susceptibility of SRAM-based FPGAs to soft errors increases with each technology node due to the reduction of transistor size, the reduction of voltage supply and the increase of density of devices. This work presents the actual impact of voltage reductions for neutron-induced soft errors in SRAM-based FPGAs. We run neutron radiation experiment...
Conference Paper
The Triple Modular Redundancy technique is mostly used to mask transient faults in circuits operating in dependable systems. The generalization of this technique (known as nMR) allows the use of more than three redundant copies of the circuit to increase the reliability under multiple faults. The main drawback of nMR is its high power consumption, whic...
Chapter
This chapter introduces the main technical terms used in this text, describes the microprocessor architecture used as a case study, and discusses background information required for a better understanding of the topics in this book.
Chapter
As stated in the previous chapters, software-based techniques are unable to detect all faults affecting the control flow, while hardware-based techniques cannot protect processors without at least doubling their area. On the other hand, combined into hybrid techniques, they can not only increase their detection rates, but also be optimized i...