
José Flich - Polytechnic University of Valencia
201 publications · 24,333 reads · 3,041 citations
Publications (201)
Neural networks are widely used in critical environments such as healthcare, autonomous vehicles, or video surveillance. To ensure the safety of the systems that rely on their functionality, it is essential to validate their correct behaviour in the presence of faults. This paper studies the behaviour of state-of-the-art neural network models with...
The automation of railroad operations is a rapidly growing industry. In 2023, a new European standard for the automated Grade of Automation (GoA) 2 over European Train Control System (ETCS) driving is anticipated. Meanwhile, railway stakeholders are already planning their research initiatives for driverless and unattended autonomous driving systems...
Neural networks (NN) for image processing in embedded systems expose two conflicting requirements: increasing computing power needs as models become more complex and a constrained resource budget. To alleviate these problems, model compression based on quantization and pruning techniques is common. Derived models then need to fit on reconfi...
The evolution of High-Performance Computing (HPC) platforms enables the design and execution of progressively larger and more complex workflow applications in these systems. The complexity comes not only from the number of elements that compose the workflows but also from the type of computations they perform. While traditional HPC workflows target...
At the present time, we are immersed in the convergence between Big Data, High-Performance Computing and Artificial Intelligence. Technological progress in these three areas has accelerated in recent years, forcing different players like software companies and stakeholders to move quickly. The European Union is dedicating a lot of resources to main...
The deadlock-free dynamic network reconfiguration process is usually studied from the perspective of routing algorithm restrictions and resource reservation. The dynamic nature yielded by the transition process from one routing function to another is often managed by restricting resource usage in a static predefined manner, which often limits the supporte...
In recent years, HPC systems have been targeted as suitable platforms for running deep learning workloads. In that respect, a number of machine learning libraries exist targeting different HPC computing platforms. In the context of the European DeepHealth project, the European Distributed Deep Learning library (EDDL) and the European Computing Visio...
Component reliability and performance pose a great challenge for interconnection networks. Future technology scaling such as transistor integration capacity in VLSI design will result in higher device degradation and manufacture variability. As a consequence, changes in the network arise, often rendering irregular topologies. This paper proposes a...
The ever-increasing need for higher performance forces industry to include technology based on multi-processor systems-on-chip (MPSoCs) in their safety-critical embedded systems. MPSoCs include a network-on-chip (NoC) to interconnect the cores with each other, with memory, and with the rest of the shared resources. Unfortunately, the inclusion of NoCs compromises guara...
The need for increasing the performance of critical real-time embedded systems pushes the industry to adopt complex multi-core processor designs with embedded networks-on-chip. In this paper we present hp-DCFNoC, a distributed dynamic scheduler design that by relying on the key properties of a delayed conflict-free NoC (DCFNoC) is able to achieve p...
The adoption of many-cores in safety-critical systems requires real-time capable networks on chip (NoC). In this paper we propose a new time-predictable NoC design paradigm where contention within the network is eliminated. This new paradigm builds on the Channel Dependency Graph (CDG) and guarantees by design the absence of contention. Our delayed...
The transition to Exascale computing is going to be characterised by an increased range of application classes. In addition to traditional massively parallel "number crunching" applications, new classes are emerging such as real-time HPC and data-intensive scalable computing. Furthermore, Exascale computing is characterised by a "democratisation" o...
The Horizon 2020 MANGO project aims at exploring deeply heterogeneous accelerators for use in High-Performance Computing systems running multiple applications with different Quality of Service (QoS) levels. The main goal of the project is to exploit customization to adapt computing resources to reach the desired QoS. For this purpose, it explores d...
Dynamic Voltage and Frequency Scaling (DVFS) can be a very effective power management strategy not only for on-chip processing elements but also for the network-on-chip (NoC). In this paper we propose a new approach to DVFS in NoC, which combines a congestion management strategy with a feedback-loop controller. The controller sets frequency and vol...
As technology advances, applications demand more and more computing power. However, achieving the required performance is no longer the only target, as reducing power consumption has become a key issue. In that sense, power-control mechanisms such as Dynamic Voltage and Frequency Scaling (DVFS) are introduced in order to dynamically adapt freq...
Buffer resource minimization plays an important role in achieving power-efficient NoC designs. At the same time, advanced switching mechanisms like virtual cut-through (VCT) are appealing due to their inherent benefits (less network contention, higher throughput, and simpler broadcast implementations). Moreover, adaptive routing algorithms exploit t...
Interconnection networks are key components in high-performance computing (HPC) systems, and their performance strongly influences that of the overall system. However, at high load, congestion and its negative effects (e.g., head-of-line blocking) threaten the performance of the network, and thus that of the entire system. Congestion control (CC)...
As multi-core systems transition to the many-core realm, the pressure on the interconnection network is substantially elevated. The Network-on-Chip (NoC) is expected to undertake the expanding demands of the ever-increasing numbers of processing elements, while—at the same time—technological and application constraints increase the pressure for inc...
In application-specific SoCs, the irregularity of the topology ends up in a complex and customized implementation of the routing algorithm, usually relying on routing tables implemented with memory structures at source end nodes. As system size increases, the routing tables also increase in size with nonnegligible impact on power, area, and latency...
Combining the benefits of 3D ICs and Network-on-Chip (NoC) schemes provides a significant performance gain in Chip Multiprocessor (CMP) architectures. As multicast communication is commonly used in cache coherence protocols for CMPs and in various parallel applications, the performance of these systems can be significantly improved if multicast...
Most of the network traffic in a Chip Multiprocessor (CMP) is due to messages exchanged by the caches according to the cache coherence protocol. Different types of messages have different requirements as far as latency and bandwidth are concerned, so the Network-on-Chip (NoC) should be tailored to fit the needs of each class of messages to maximize...
As parallel computing systems increase in size, the interconnection network is becoming a critical subsystem. The current trend in network design is to use as few components as possible to interconnect the end nodes, thereby reducing cost and power consumption. However, this increases the probability of congestion appearing in the network. As conge...
In order to cope with an increased level of resource contention and dynamic application behaviour, the runtime reconfiguration of the routing function of an on-chip interconnection network is a desirable feature for multi-core hardware platforms in the embedded computing domain. The most intuitive approach consists of draining the network from ongo...
On-chip networks (NoCs) promise to become an efficient communication infrastructure for multi-core architectures. However, there is still a need for efficient power-performance methodologies, since the interconnect power-envelope is really slim and cannot be neglected. Indeed, new power-aware design explorations in current and future multicore syst...
As InfiniBand clusters grow in size and complexity, the need arises to segment the network into manageable sections. Up until now, InfiniBand routers have not been used extensively and little research has been done to accommodate them. However, the limits imposed on local addressing space, inability to logically segment fabrics, long reconfiguratio...
Many-core chip designs are the current manufacturing trend for high-performance computing. Different challenges lead to different designs, whether general purpose-driven chip multiprocessors (CMPs) or application-specific multiprocessor system-on-chips (MPSoCs) are deployed. An emerging problem is on-chip network congestion, due either to several t...
In this paper, we propose a fast algorithm to reprogram the routing function of an on-chip network (NoC) at runtime. This reconfiguration algorithm comes with the following key novelties. First, it deals with the lack of routing tables, which are poorly scalable and lengthy to reconfigure. Second, it can deal with any number of faults that might be...
Future chip multiprocessors (CMPs) may have hundreds to thousands of threads competing to access shared resources, and will require quality-of-service (QoS) support to improve system utilization. This paper introduces Globally-Synchronized Frames (GSF), ...
Networks-on-Chip (NoC) with low-radix switches forming a simple and planar topology is typically accepted as the right interconnection infrastructure for current Chip Multi Processor and high-end Multi Processor System-on-Chip. This is mainly due to its simplicity in the physical mapping on the chip. However, as the network diameter increases, late...
Embedded devices are becoming more and more present everywhere. Moreover, mobile devices are also becoming more computationally powerful. These embedded architectures present new challenges since they execute several applications that must preserve security, allow information to be shared coherently, be scalable, and provide the required levels...
Networks-on-FPGA consist of a network of switches connected with point-to-point links and can cover sufficiently the communication needs of complex systems implemented on FPGA platforms. The efficient implementation of such networks requires the appropriate tuning of their components to the characteristics of the FPGA's logic and memory resources....
Current and future on-chip networks will feature an enhanced degree of reconfigurability. Power management and virtualization strategies, as well as the need to survive the progressive onset of wear-out faults, are root causes for that. In all these cases, a non-intrusive and efficient reconfiguration method is needed to allow the network to funct...
High-end MPSoC systems with built-in high-radix topologies achieve good performance because of the improved connectivity and the reduced network diameter. In high-end MPSoC systems, fault tolerance support is becoming a compulsory feature. In this work, we propose a combined method to address permanent and transient link and router failures in thos...
Most standard cluster interconnect technologies are flexible with respect to network topology. This has spawned a substantial amount of research on topology agnostic routing algorithms, which make no assumption about the network structure, thus providing the flexibility needed to route on irregular networks. Actually, such an irregularity shoul...
It is expected that Chip Multiprocessor Systems (CMPs) will contain more and more cores in every new generation. However, applications for these systems do not scale at the same pace. In order to obtain a good CMP utilization several applications will need to coexist in the system and in those cases virtualization of the CMP system will become mand...
Current integration scales allow designing chip multiprocessors (CMP), where cores are interconnected by means of a network-on-chip (NoC). Unfortunately, the small feature size of current integration scales causes some unpredictability in manufactured devices because of process variation. In NoCs, variability may affect links and routers causing th...
Chip Multiprocessor systems (CMPs) contain more and more cores in every new generation. However, applications for these systems do not scale at the same pace. Thus, in order to obtain a good utilization several applications will need to coexist in the system and in those cases virtualization of the CMP system will become mandatory. In this paper we...
The NaNoC project is progressing toward an innovative design platform for multicore systems based on future networks-on-chip. This platform enables the design, manufacturing and management of networks-on-chip by tackling new requirements of future systems like virtualization, power, thermal and application management, as well as new challenges in t...
Networks-on-chip need to survive manufacturing faults in order to sustain yield. An effective testing and configuration strategy, however, implies two opposing requirements. On one hand, a fast and scalable built-in self-testing and self-diagnosis procedure has to be carried out concurrently at NoC switches. On the other hand, programming the NoC...
The fat-tree is one of the most common topologies among the interconnection networks of the systems currently used for high-performance parallel computing. Among other advantages, fat-trees allow the use of simple but very efficient routing schemes. One of them is a deterministic routing algorithm that has been recently proposed, offering a similar...
As technology evolves, networks-on-chip will need to survive manufacturing faults in order to sustain yield. An effective configuration strategy implies the design of an efficient routing infrastructure that enables a fast and efficient configuration of the NoC system to go around faulty links and switches. The strategy must minimize the overhe...
As technology advances, the number of cores in Chip MultiProcessor systems and MultiProcessor Systems-on-Chips keeps increasing. The network must provide sustained throughput and ultra-low latencies. In this paper we propose new pipelined switch designs focused on reducing the switch latency. We identify the switch components that limit the switch f...
High-speed interconnection networks are essential elements for different high-performance parallel-computing systems. One of the most common interconnection network topologies is the fat-tree, whose advantages have turned it into the favorite topology of many interconnect designers. One of these advantages is the possibility of using simple but eff...
It is well-known that current Chip Multiprocessor (CMP) and high-end MultiProcessor System-on-Chip (MPSoC) designs are growing in their number of components. Networks-on-Chip (NoC) provide the required connectivity for such CMP and MPSoC designs at reasonable costs. However, as technology advances, links become the critical component in the NoC. Fi...
Existing congestion control mechanisms in interconnects can be divided into two general approaches. One is to throttle traffic injection at the sources that contribute to congestion, and the other is to isolate the congested traffic in specially designated resources. These two approaches have different, but non-overlapping weaknesses. In this paper...
We present the Homogeneous-Parallel-Concentrated-Mesh topology (HPC-Mesh). This NoC topology provides four disjoint homogeneous concentrated mesh networks. The network interface at each core provides connectivity to all these networks by using a novel injection algorithm. Indeed, the topology is dynamically adjusted to the working conditions of the...
At the NoC level, traffic interference can be drastically reduced by using virtualization mechanisms. An effective strategy to virtualize a NoC consists of dividing the network into different partitions, each one serving different applications and traffic flows. In this paper, we propose a NoC reconfiguration mechanism to support NoC virtualization...
We present a novel network on-chip topology, PC-Mesh (Parallel Concentrated Mesh), suitable for tiled CMP systems. The topology is built using four concentrated mesh (C-Mesh) networks and a new network interface able to inject packets through different networks. The goal of the new combined topology is to minimize the power consumption of the netwo...
In this paper, we present a flexible network on-chip topology: NR-Mesh (Nearest neighboR Mesh). The topology gives an end node the choice to inject a message through different neighboring routers, thereby reducing hop count and saving latency. At the receiver side, a message may be delivered to the end node through different routers, thus reducing...
We consider a geographically distributed request processing system composed of various organizations and their servers connected by the Internet. The latency a user observes is a sum of communication delays and the time needed to handle the request on ...
Networks on Chip (NoCs) have been shown as an efficient solution to the complex on-chip communication problems derived from the increasing number of processor cores. One of the key issues in the design of NoCs is the reduction of both area and power dissipation. As a result, two-dimensional meshes have become the preferred topology, since it offers...
Current integration scales make it possible to design chip multiprocessors with a large number of cores interconnected by a NoC. Unfortunately, they also bring process variation, posing a new burden to processor manufacturers. Regarding the NoC, variability means that the delays of links and routers do not match those initially established at design t...
In this demonstration we present an enhanced version of the usual Spidergon STNoC design flow. In addition, we show the automatic generation of a simulation platform that can be used to perform early architecture exploration.
In application-specific SoCs, the irregularity of the topology ends up in a complex implementation of the routing algorithm, usually relying on routing tables implemented with memory structures. As system size increases, the routing table increases in size with non-negligible impact on power, area and latency overheads. In this paper we present a r...
The high-performance computing domain is being enriched by the inclusion of networks-on-chip (NoCs) as a key component of many-core (CMPs or MPSoCs) architectures. NoCs face the communication scalability challenge while meeting tight power, area, and latency constraints. Designers must address new challenges that were not present before. Defective com...
The number of cores on a single silicon chip is rapidly growing and chips containing tens or even hundreds of identical cores are expected in the future. To take advantage of multicore chips, multiple applications will run simultaneously. As a consequence, the traffic interference between applications increases and the performance of individual ap...
Recently, one of the most critical issues in mobile ad hoc networks (MANETs) is providing quality of service (QoS) through routing, access/admission control, resource reservation, and mobility management. However, most existing solutions do not provide ...
Recently, 3D stacking has been proposed to alleviate the memory bandwidth limitation arising in chip multiprocessors (CMPs). As the number of integrated cores in the chip increases the access to external memory becomes the bottleneck, thus demanding larger memory amounts inside the chip. The most accepted solution to implement vertical links betwee...
NoCs have become a critical component in many-core architectures. Usually, the preferred topology is the 2D-Mesh as it enables a tile-based layout significantly reducing the design effort. However, new emerging challenges such as power consumption need to be addressed. Looking at the NoC, routers and links not being used must be switched off, thus...
Going beyond isolated research ideas and design experiences, Designing Network On-Chip Architectures in the Nanoscale Era covers the foundations and design methods of network on-chip (NoC) technology. The contributors draw on their own lessons learned to provide strong practical guidance on various design issues. Exploring the design process of the...
On-chip networks have rapidly emerged as the best interconnection choice for high-core count chip multiprocessors (CMPs) because of the good scalability properties they present. Their fast evolution has been accelerated by the large inheritance from the off-chip network domain. Many of the mechanisms and techniques previously developed in that area...
Interconnection networks are essential elements in current computing systems. For this reason, achieving the best network performance, even in congestion situations, has been a primary goal in recent years. In that sense, there exist several techniques focused on eliminating the main negative effect of congestion: the Head of Line (HOL) blockin...
As technology advances, the number of cores in Chip Multi Processor systems (CMPs) and Multi Processor Systems-on-Chips (MPSoCs) keeps increasing. Current test chips and products reach tens of cores, and they are expected to reach hundreds of cores in the near future. Such complexity demands an efficient network-on-chip (NoC). The common choice to...
Interconnection networks are key elements in current scalable compute and storage systems, such as parallel computers, networks of workstations, clusters, and even on-chip interconnects. In all these systems, common aspects of communication are of high interest, including advances in the design, implementation, and evaluation of interconnection net...
Congestion management is likely to become a critical issue in interconnection networks, as increasing power consumption and cost concerns lead to improvements in the efficiency of network resources. In previous configurations, networks were usually oversized and underutilized. In a smaller network, however, contention is more likely to occur and bl...
Efficient data motion is the cornerstone of both traditional parallel computing and more recent stream processing systems. This special combined meeting of the Communication Architecture for Clusters (CAC) and the Scalable Stream Processing Systems (SSPS) workshops showcases the latest research advances in both forms of data motion: network communi...
Current integration scales allow designing chip multiprocessors (CMP) where cores are interconnected by means of a network-on-chip (NoC). Unfortunately, the small feature size of current integration scales causes some unpredictability in manufactured devices because of process variation. In NoCs, variability may affect links and routers causing that...
The high-performance computing domain is being enriched by the inclusion of Networks-on-chip (NoCs) as a key component of many-core (CMPs or MPSoCs) architectures. NoCs face the communication scalability challenge while meeting tight power, area and latency constraints. Designers must address new challenges that were not present before. Defective comp...
As the number of processing nodes on chip multi-processors (CMPs) keeps increasing, providing efficient communication with the on-chip interconnect becomes increasingly critical. With 32-core CMP designs on the drawing table of engineers, there is a demand for accurate simulation models that capture all the complexities and interactions of the diff...
Network-on-Chip technology is gaining wide popularity for the interconnection of an increasing number of processor cores on the same silicon die. However, growing process variations cause interconnect malfunction or prevent the network from working at the intended frequency, directly impacting yield and manufacturing cost. Topology agnostic routing...
The expected increase in number of cores on a single chip leads to the necessity of high-performance on chip interconnects (NoC). Furthermore, in order to fully utilize the abundance of cores, the chip is expected to support a number of applications running on the chip simultaneously. It is therefore necessary to partition the chip to support numer...
Chip multiprocessors (CMPs) are gaining momentum in the high-performance computing domain. Networks-on-chip (NoCs) are key components of CMP architectures, in that they have to deal with the communication scalability challenge while meeting tight power, area and latency constraints. 2D mesh topologies are usually preferred by designers of general p...
Fault tolerance mechanisms become indispensable as the number of processors increases in large systems. Measuring the effectiveness of such mechanisms before their implementation becomes mandatory. Research toward understanding the effects of different network parameters on the dependability parameters, like mean time to network failure or availabili...