Jing-Yang Jou

Jing-Yang Jou
National Chiao Tung University | NCTU · Department of Electronics Engineering

About

152
Publications
12,886
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
2,344
Citations

Publications

Publications (152)
Article
Multiprocessors have become the main architecture trend in modern systems due to the superior performance; nevertheless, the power consumption remains a critical challenge. Global power management (GPM) aims at dynamically finding the power state combination that satisfies the power budget constraint while maximizing the overall performance (or vic...
Article
The recently proposed GPGPU architecture has added a multi-level hierarchy of shared cache to better exploit the data locality of general purpose applications. The GPGPU design philosophy allocates most of the chip area to processing cores, and thus results in a relatively small cache shared by a large number of cores when compared with conventiona...
Article
Nowadays, computer systems are limited by the power and memory wall. As the Dynamic Random Access Memory (DRAM) has dominated the power consumption in modern devices, developing power-saving approaches on DRAM has become more and more important. Among several techniques on different abstract levels, scheduling-based power management policies can be...
Article
Deploying the Shared Last-Level Cache (SLLC) is an effective way to alleviate the memory bottleneck in modern throughput processors, such as GPGPUs. A commonly used scheduling policy of throughput processors is to render the maximum possible thread-level parallelism. However, this greedy policy usually causes serious cache contention on the SLLC an...
Article
SystemVerilog provides powerful language constructs for verification, and one of them is the covergroup functional coverage model. This model is designed as a complement to assertion verification, that is, it has the advantage of defining cross-coverage over multiple coverage points. In this article, a coverage-driven verification (CDV) approach is...
Article
Dynamic power management has become an imperative design factor to attain the energy efficiency in modern systems. Among various power management schemes, learning-based policies that are adaptive to different environments and applications have demonstrated superior performance to other approaches. However, they suffer the scalability problem for m...
Article
The notion of multiple constant multiplication (MCM) is extensively adopted in digital signal processing (DSP) applications such as finite impulse filter (FIR) designs. A set of adders is utilized to replace regular multipliers for the multiplications between input data and constant filter coefficients. Though many algorithms have been proposed to...
Conference Paper
The demand of high performance and low power has increased the importance of power efficiency in multi-core systems. In modern multi-core architectures, DRAM has dominated the power consumption and therefore reordering based DRAM scheduling has been intensively studied to reduce the power. However, the benefit of reordering is not fully explored by...
Conference Paper
On-chip shared cache is effective to alleviate the memory bottleneck in modern many-core systems, such as GPGPUs. However, when scheduling numerous concurrent threads on a GPGPU, a cache capacity agnostic scheduling scheme could lead to severe cache contention among threads and thus significant performance degradation. Moreover, the diverse working...
Conference Paper
Improving functional coverage efficiently in a verification environment based on constrained random simulation could be a difficult task, since some design states are hard to be reached by random input patterns. On the other hand, manually crafting direct test patterns may be time consuming. In this paper, a functional test pattern generation (FTPG...
Article
In deep-submicron era, wire delay is becoming a bottleneck while pursuing higher system clock speed. Several distributed register (DR) architectures are proposed to cope with this problem by keeping most wires local. In this article, we propose the distributed register-file microarchitecture with inter-island delay (DRFM-IID). Though DRFM-IID is al...
Article
Memory Coalescing and on-chip shared Cache are two effective techniques to alleviate the memory bottleneck in modern GPGPUs. These two techniques are very useful on applications with regular memory accesses. However, they become ineffective on concurrent threads with large numbers of uncoordinated accesses and the potential performance benefit coul...
Conference Paper
While the technology node continually and aggressively scales, the resolution of FIB techniques does not scale as fast. Thus, the percentage of nets which can be observed or repaired through FIB probing or circuit editing is significantly decreased for advanced process technologies, which limits the candidates that can be physically examined throug...
Article
Accurate peak power measurement requires detailed switching and delay information for each signal over the entire simulation. Performing a full signal dump and power calculation for a long simulation run usually spends a large amount of time and disk space. In this paper, we propose an Essential-signal-based methodology which only performs power an...
Conference Paper
ó Non-Manhattan structures, such as the X and Y architectures, propose different a vors in reducing the use of physical resources, such as total wirelength and number of vias. However, in order to take full advantage of these structures, we need to develop new tools for these architectures, especially for early stages in physical design. In this pa...
Conference Paper
This paper presents a formal method for equivalence checking between the descriptions before and after scheduling in high-level synthesis (HLS). Both descriptions are represented by finite state machine with datapaths (FSMDs) and are then characterized through finite sets of paths. The main target of our proposed method is to verify scheduling empl...
Conference Paper
Taking full advantage of 3DIC technology requires innovative EDA tools that can optimize a multi-layered complex system. However, the optimization algorithms on the multi-layered 3DIC are usually computationally expensive. Parallel computing has been considered as a solution to manage the exploding computational requirement of future EDA tools. Thi...
Article
Full-text available
Verifying whether a building block conforms to a specific interface protocol is one of the important steps in a platform-based system-on-a-chip design methodology. There are limitations for most of the existing methods for interface protocol compliance verification. Simulation-based methods have the false positive problem while formal property chec...
Conference Paper
Fast Fourier Transform (FFT) cores are extensively used in digital signal processing (DSP) applications like communication systems. Many pipelined FFT architectures optimized for different objectives have been proposed in past few decades. Though a fixed pipelined FFT architecture can generally provide good throughput at reasonable hardware cost, i...
Conference Paper
In deep-submicron era, wire delay is becoming the bottleneck while pursuing high system clock speed. Several distributed register (DR) architectures are proposed to cope with this problem by keeping most wires local. In this paper, we propose the distributed register-file microarchitecture with inter-island delay (DRFM-IID). With such delay conside...
Article
Using voltage island methodology to reduce power consumption for System-on-a-Chip (SoC) designs has become more and more popular recently. Currently this approach has been considered either in system-level architecture or postplacement stage. Since hierarchical design and reusable intellectual property (IP) are widely used, it is necessary to optim...
Article
Full-text available
Verifying if an integrated component is compliant with certain interface protocol is a vital issue in component-based system-on-a-chip (SoC) designs. For simulation-based verification, generating massive constrained simulation stimuli is becoming crucial to achieve a high verification quality. To further improve the quality, stimulus biasing techni...
Conference Paper
Full-text available
In this paper, we propose one hierarchical 2-D mesh network-on-chip (NoC) platform to support applications with the complexity of several hundreds of tasks or with huge amount of transmission data. Moreover, applying the task binding method by considering communication amount, communication data contention and bandwidth penalty to enhance the syste...
Conference Paper
Full-text available
The fault diagnosis has become an increasing portion of todaypsilas IC-design cycle and significantly determines productpsilas time-to-market. However, the failure behaviors from the defective chips may not be fully represented by the single fault model. In this paper, we propose a fault-diagnosis framework targeting multiple stuck-at faults. This...
Article
When hardware description languages (HDLs) are used in describing the behavior of a digital circuit, design errors (or bugs) almost inevitably appear in the HDL code of the circuit. Existing approaches attempt to reduce efforts involved in this debugging process by extracting a reduced set of error candidates. However, the derived set can still con...
Conference Paper
In this paper, a crosstalk-avoidance (CA) bus coding approach is developed to produce area-efficient, low-power codes for global data busses. Proposed codes are codes without memory using shielding boundary strategy. Related code generation algorithm is also developed. The probabilistic distribution of input data could be included to reduce the pow...
Article
Full-text available
Before verifying the functionality of SoCs, designers must ensure the correctness of the pin-accurate interfaces of up to hundreds of integrated IP blocks. This article presents a new connection model and a corresponding error model for pin-accurate port connections, along with an algorithm for generating the minimum pattern set, a methodology for...
Conference Paper
In this paper, a bus encoding approach including related code generation algorithm for global data busses is developed to produce area-efficient crosstalk-avoidance (CA) codes with considering low-power requirements. Proposed codes are codes with memory using overlapping boundary strategy. The probabilistic distribution of input data could be inclu...
Article
Quickly and accurately predicting the performance based on the requirements for IP-based system implementations optimizes the design and reduces the design time and overall cost. This study describes a novel hybrid method for the word-length optimization of pipelined FFT processors that is the arithmetic kernel of OFDM-based systems. This methodolo...
Article
Simulation-based functional validation is still one of the primary approaches for verifying designs described in hardware description languages. Traditional code coverage metrics do not address the observability issue and may overestimate the extent of functional validation. Observability-based code coverage metric (OCCOM) is the first code coverag...
Conference Paper
As technology advances, the global interconnect delay and the power consumption of long wires become crucial issues in nanometer technologies. In particular, both inductive and capacitive coupling effects between wires result in serious problems such as crosstalk delay, coupling noise, and power consumption. In this paper, we propose a new bus enco...
Article
Full-text available
For complex digital circuits, building their power models is a popular approach to estimate their power consumption without detailed circuit information. In the literature, most of power models have to increase their complexity in order to meet the accuracy requirement. In this paper, we propose a tableless power model for complex circuits that use...
Conference Paper
Full-text available
On an SoC bus, contentions occur while different IP cores request the bus access at the same time. Hence an arbiter is mandatory to deal with the contention issue on a shared bus system. In different applications, IPs may have real-time and/or bandwidth requirements. It is very difficult to design an arbitration algorithm to simultaneously meet the...
Article
Diagnosing counterexamples with error traces has acted as one of the most critical steps in functional verification. Unfortunately, error traces are normally very lengthy such that designers need to spend considerable effort to understand them. To alleviate the designers' burden for debugging, we present a SAT-based algorithm for reducing the lengt...
Article
This paper shows that the worst case switching pattern that incurs the longest bus delay while considering the RLC effect is quite different from that while considering the RC effect alone. It implies that the existing encoding schemes based on the RC model may not improve or possibly worsen the delay when the inductance effects become dominant. A...
Conference Paper
Full-text available
In this paper, we propose a novel SoC design methodology referred to as multi-project system-on-a-chip (MP-SoC), which can integrate multiple heterogeneous SoC design projects into a single chip such that the total silicon prototyping cost for these projects can be greatly reduced due to the sharing of a common SoC platform. The design flows for th...
Conference Paper
Full-text available
Interface compliance verification plays a very important role in modern SoC designs. In order to perform a quantitative analysis of simulation completeness, adequate coverage metrics are mandatory. In this paper, we propose a finite state machine (FSM) based transaction-level functional coverage methodology for interface compliance verification. A...
Conference Paper
Full-text available
In shared SoC bus systems, arbiters are usually adopted to solve bus contentions with various kinds of arbitration algorithms. We propose an arbitration algorithm, RT_lottery, which is designed to meet both hard real-time and bandwidth requirements. For fast evaluation and exploration, we use high abstract-level models in our system simulation envi...
Article
As technology advances apace, crosstalk becomes a design metric of comparable importance to area and delay. This article focuses mainly on the crosstalk issue, specifically on the impacts of physical design and process variation on crosstalk. While the feature size shrinks below 0.25μm, the impact of process variation on crosstalk increases rapidly...
Article
In this paper, a novel strategy for designing the heterogeneous tree multiplexer is proposed. The authors build the multiplexer delay model by curve fitting and then formulate the heterogeneous tree multiplexer design problem as a special type of optimization problem called mixed-integer nonlinear programming (MINLP). A new design parameter, the sw...
Conference Paper
With the continuous shrinkage of device sizes, the global interconnect delay becomes a dominant factor of chip performance in deep-submicron technology. Furthermore, as the working frequency of integrated circuits increasing above GHz, the inductive crosstalk will also have very significant influence on the global interconnect delay. However, most...
Conference Paper
Full-text available
Verifying whether a building block conforms to certain interface protocol is one of the important steps while constructing an SoC. However, most existing methods have their own limitations. Simulation-based methods have the false positive problem while formal property checking methods may suffer from memory explosion and excessive runtime. In this...
Conference Paper
Full-text available
Network-on-chip is a new design paradigm for designing core based system-on-chip. It features high degree of reusability and scalability. In this paper, we propose a switch which employs the latency insensitive concepts and applies the round-robin scheduling techniques to achieve high communication resource utilization. Based on the assumptions of...
Conference Paper
Full-text available
Simulation based validation approaches are still the primary workhorse for solving the verification problem of getting the initial HDL description correct, especially for large scaled designs. However, most of existing code coverage metrics do not address observability issue (Fallah et al., 2001). Therefore, we intend to provide additional observab...
Conference Paper
Full-text available
Inductance effects of on-chip interconnects have become more and more significant in today's high-speed digital circuits, especially for global interconnects such as signal buses. However, most existing works consider only RC effects, e.g., the worst-case switching pattern resulting from coupling capacitance, to develop their encoding schemes to re...
Conference Paper
Debugging priority is a helpful technique to assist debugging faulty HDL designs. However, the debugging priority obtained by confidence score sorting is not good enough due to the inaccuracy in estimating the likelihood of correctness for error candidates. Therefore, we developed the refined confidence score for deriving better debugging priority.
Conference Paper
Diagnosing counterexamples with error traces has acted as one of the most critical steps in functional verification. Unfortunately, error traces are normally very lengthy such that designers need to spend considerable effort to understand them. To alleviate designers' burden for debugging, we present a SAT-based algorithm for reducing the lengths o...
Conference Paper
Full-text available
Verifying if an integrated component is compliant with certain interface protocol is a big issue in component-based SOC designs. Massive constrained random simulation stimuli are becoming crucial to achieve a high verification quality. To further improve the quality, the stimulus biasing technique should be used to guide the simulation to hit desig...
Conference Paper
We present an effective algorithm to enhance sequential depth computation. The sequential depth plays the most crucial role to the completeness of bounded model checking. Previous work computes the sequential depth by exhaustively searching the state space, which is unable to keep pace with the exponential growth of design complexity. To improve th...
Conference Paper
Full-text available
In a system-on-a-chip (SOC) design, several to hundreds of design blocks or intellectual properties (IPs) are integrated to form a complex function. Prior to verify the functionality of the integrated IPs, it is very important to ensure the correctness of the port connections among these IPs. This work addresses the problem of verification on port...
Conference Paper
Full-text available
We propose a graph automorphism-based algorithm for computing maximal sets of symmetric inputs of circuits. It can be used to identify nonsymmetric inputs in a circuit and enhance the efficiency of input matching, library binding, as well as logic verification problems. We conduct the experiments on some benchmarks. The experimental results demonst...
Article
For large circuits, vector compaction techniques could provide a faster solution for power estimation with reasonable accuracy. Because traditional sampling approach will incur useless transitions between every sampled pattern pairs after they are concatenated into a single sequence for simulation, we proposed a vector compaction method with groupi...
Conference Paper
Inductance effects on on-chip interconnects have become more and more significant in today's high-speed digital circuits, especially on global interconnects such as signal buses. However, most existing works consider only RC effects, e.g., the worst-case switching pattern resulting from coupling capacitance, to develop their encoding schemes to red...
Conference Paper
Nowadays, finding subcircuits in a larger circuit is primarily solved by using various heuristics based on graph isomorphism. These approaches are addressed on identifying one specific subcircuit at each time and may take numerous runs if many subcircuits need to be extracted. Therefore, they are not quite suitable for converting the whole circuit...
Conference Paper
The growth of modern IC design complexity leads the consistency checks and design verification during every level in design flow to be an important and challenged issue. We propose an efficient approach to rebuild the hierarchial level from low level circuits. Our approach is based on the structure equivalent expansion algorithm to find repeated su...
Article
Full-text available
As technology advances and the number of interconnections among modules rapidly increases, timing closure, and design convergence are the most important concerns. Hence, it is desirable to consider interconnect optimization as early as possible. Previous work for this issue can be classified into two directions: wire planning and buffer-block plann...
Article
Testbenches play one of the most important roles in simulation-based design verification. Given a simulation scenario, a testbench provides specific vectors to simulate the design, then collects responses from the design to monitor whether the simulation has satisfied the scenario. The major bottleneck in writing testbenches is generating valid sim...
Conference Paper
Full-text available
A novel strategy for designing the heterogeneous-tree multiplexer is proposed. We build the multiplexer delay model by curve fitting and then formulate the heterogeneous-tree multiplexer design problem as a special type of optimization problem called mixed-integer nonlinear programming (MINLP). A new design parameter, the switch size in each stage,...
Conference Paper
Full-text available
As the operation frequency reaches gigahertz in deep-submicron designs, the effects of inductance on noise and delay can no longer be neglected. Some of the previous techniques such as net ordering, shield insertion, twisted-bundle layout structure, and interdigitated techniques are either inefficient or incur too much area penalty. We present two...
Conference Paper
Full-text available
We employ a monitor-based approach for on-chip bus (OCB) compliance test. To describe the OCB protocols, we propose a FSM model, which can help to extract the necessary properties systematically and verify the data part of a bus transfer efficiently. To demonstrate our methodology, we illustrate two OCB protocols, WISHBONE and AMBA AHB, as the stud...
Conference Paper
A fast and accurate power estimation of circuits is definitely required when the low power issues become more and more important. For large circuits, vector compaction techniques could provide a fast solution for power estimation with reasonable accuracy. In previous work, we proposed an efficient vector compaction method with grouping and single-s...
Conference Paper
This paper presents an effective multiplier synthesis algorithm for cell-based multipliers. By using a novel tree generation algorithm with timing consideration for each vertical compressor slice(VCS), our synthesizer generates multipliers automatically with very promising results.
Article
Noise, as well as area, delay, and power, is one of the most important concerns in the design of deep sub-micron ICs. Currently existing algorithms can not handle simultaneous switching conditions of signals for noise minimization. In this paper, we model not only physical coupling capacitance, but also simultaneous switching behavior for noise opt...
Article
In this paper, we propose an efficient IP-Level power model with a small lookup table for complex CMOS circuits. The table has only one dimension that maps the zero-delay charging and discharging capacitance (CDC) into the real power consumption of pattern pairs but still has high accuracy. In order to reduce the table size, we collect those patter...
Conference Paper
The growing of the modern design complexity leads the design error diagnosis to be a challenge for designers. In this paper, we propose an efficient approach for design error diagnosis automatically for designs in HDL. This approach can handle multiple errors occurring in a HDL design simultaneously with only one test case by analyzing the simulati...
Article
Full-text available
As the operation frequency reaches gigahertz in deep-submicron designs, the effects of inductance on noise and delay can no longer be neglected. Most of the previous works on inductance extraction are field-solvers, which are intrinsically more accurate but computationally expensive. Others focus on modeling the inductances of special routing topol...
Article
This technique reduces the number of required functional patterns by first defining conditions for hard-to-control (HTC) code in a hardware-description-language design and then using an algorithm to detect such code automatically. A second algorithm eliminates these HTC points by selecting a minimum number of nodes for control point insertion.
Conference Paper
Full-text available
This paper presents an automatic interconnection rectification (AIR) technique to correct the misplaced interconnection occurring in the integration of an SoC design automatically. The experimental results show that the AIR can correct the misplaced interconnection and therefore accelerates the integration verification of an SoC design.
Conference Paper
Full-text available
As technology advances and the number of interconnections among modules rapidly increases, timing closure and design convergence are the most important concerns. Hence, it is desirable to consider interconnect optimization as early as possible. In this paper, we first address simultaneous floorplanning and buffer block planning (i.e., integrating b...
Conference Paper
Full-text available
In this paper, we propose an efficient IP-level power model with a small lookup table for complex CMOS circuits. The table has only one dimension that maps the zero-delay charging and discharging capacitance into the real power consumption of pattern pairs but still has high accuracy. In order to improve the efficiency of the characterization proce...
Article
Full-text available
Embedded cores are being increasingly used in large system-on-a-chip (SoC) designs. The high complexity of SoC designs lead the design verification to be a challenge for system integrators. This paper presents an automatic interconnection rectification (AIR) technique based on the port order fault model to detect, diagnose, and correct the misplace...
Conference Paper
Full-text available
This paper presents an automatic interconnection rectification (AIR) technique to correct the misplaced interconnection occurred in the integration of a SoC design automatically. The experimental results show that the AIR can correct the misplaced interconnection and therefore accelerates the integration verification of a SoC design.
Article
Full-text available
Embedded cores are being increasingly used in the design of large system-on-a-chip (SoC). Because of the high complexity of SoC, the design verification is a challenge for system integrators. To reduce the verification complexity, the port-order fault (POF) model was proposed. It has been used for verifying core-based designs and the corresponding...
Article
Monte Carlo approach for power estimation is based on the assumption that the samples of power are Normally distributed. However, the power distribution of a circuit is not always Normal in the real world. In this paper, the Bootstrap method is adopted to adjust the confidence interval and redeem the deficiency of the conventional Monte Carlo metho...
Article
Full-text available
Embedded cores are being increasingly used in the design of large system-on-a-chip (SoC). Because of the high complexity of SoC, the design verification is a challenge for system integrators. To reduce the verification complexity, the port-order fault (POF) model has been used for verifying core-based designs (Tang and Jou, 1998). In this paper, we...