David A. Patterson's research while affiliated with University of California and other places

Publications (45)

Article
Innovations like domain-specific hardware, enhanced security, open instruction sets, and agile chip development will lead the way.
Article
Contents: Fundamentals of computer design; Instruction set principles and examples; Instruction-level parallelism and its dynamic exploitation; Exploiting instruction-level parallelism with software approaches; Memory hierarchy design; Multiprocessors and thread-level parallelism...
Chapter
This chapter presents the construction of the datapath and control unit for two different implementations of the millions of instructions per second (MIPS) instruction set. It reviews the core of the MIPS instruction set, including the memory-reference instructions load word (lw) and store word (sw), the arithmetic-logical instructions add, sub, a...
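The datapath actions the chapter describes can be sketched as a tiny interpreter for the core subset (lw, sw, add, sub). This is an illustrative model only; the register names, memory size, and `execute` function are my own, not the book's notation.

```python
# Minimal sketch of the register-file and memory accesses performed by a
# MIPS-like core subset: lw, sw, add, sub. All names here are illustrative.

regs = {f"$t{i}": 0 for i in range(4)}   # small register file
mem = [0] * 16                            # word-addressed data memory

def execute(op, rd, rs, rt_or_off):
    """Carry out one instruction's datapath actions."""
    if op == "add":
        regs[rd] = regs[rs] + regs[rt_or_off]
    elif op == "sub":
        regs[rd] = regs[rs] - regs[rt_or_off]
    elif op == "lw":                      # load word: rd <- mem[rs + offset]
        regs[rd] = mem[regs[rs] + rt_or_off]
    elif op == "sw":                      # store word: mem[rs + offset] <- rd
        mem[regs[rs] + rt_or_off] = regs[rd]

mem[3] = 42
execute("lw", "$t0", "$t1", 3)    # $t1 holds 0, so this loads mem[3]
execute("add", "$t2", "$t0", "$t0")
print(regs["$t2"])                # 84
```

A real datapath would also fetch and decode encoded instructions; the sketch skips straight to the execute/memory stages the abstract mentions.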
Chapter
This chapter provides an overview of interfacing processors and peripherals. Many of the characteristics of input/output (I/O) systems are driven by technology in processors, for example, the properties of disk drives affect how the disks should be connected to the processor and how the operating system interacts with the disks. I/O systems, howeve...
Article
This chapter discusses memory hierarchy. Programs exhibit both temporal locality, that is, the tendency to re-use recently accessed data items, and spatial locality, that is, the tendency to reference data items that are close to other recently accessed items. Memory hierarchies take advantage of temporal locality by keeping more recently accessed...
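The two kinds of locality the abstract names can be illustrated with a toy traversal (my own example, not from the text): row-major order touches adjacent elements, exploiting spatial locality, while column-major order strides across rows. In CPython the cache effect is muted compared with C, but the access patterns are the same.

```python
# Toy illustration of spatial locality: row-by-row traversal touches
# consecutive elements; column-by-column traversal jumps between rows.
# Array size is arbitrary.

N = 500
a = [[1] * N for _ in range(N)]

def row_major():
    s = 0
    for i in range(N):
        for j in range(N):
            s += a[i][j]          # consecutive elements of one row
    return s

def col_major():
    s = 0
    for j in range(N):
        for i in range(N):
            s += a[i][j]          # a different row on every access
    return s

print(row_major(), col_major())   # same sum, different access pattern
```

In a compiled language with hardware caches, the row-major loop is typically much faster precisely because caches keep recently used blocks, rewarding both temporal and spatial locality.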
Chapter
Computer words are composed of bits and thus, words can be represented as binary numbers. Although the natural numbers 0, 1, 2, and so on can be represented either in decimal or binary form, the question arises regarding the way in which the other numbers that commonly occur are represented. This chapter discusses the representation of numbers, arit...
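The standard answer for signed integers is two's complement, which can be sketched in a few lines (the 8-bit width here is chosen for illustration):

```python
# Sketch of two's-complement representation for signed integers in a
# fixed number of bits.

BITS = 8

def to_twos_complement(x, bits=BITS):
    """Encode a signed integer as an unsigned bit pattern."""
    assert -(1 << (bits - 1)) <= x < (1 << (bits - 1))
    return x & ((1 << bits) - 1)

def from_twos_complement(pattern, bits=BITS):
    """Decode a bit pattern back to a signed integer."""
    if pattern >= (1 << (bits - 1)):      # sign bit set -> negative
        return pattern - (1 << bits)
    return pattern

print(format(to_twos_complement(-5), "08b"))   # 11111011
print(from_twos_complement(0b11111011))        # -5
```

Two's complement is attractive in hardware because the same adder circuit handles signed and unsigned operands.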
Chapter
This chapter focuses on the basic ideas and definitions, the major components of software and hardware, and integrated circuits, the technology that fuels the computer revolution. Both hardware and software designers construct computer systems in hierarchical layers, with each lower layer hiding details from the level above. This principle of abstr...
Chapter
This chapter explores the instruction set of a real computer, both in the form written by humans and in the form read by the machine. Starting from a notation that looks like a restricted programming language, it is refined step-by-step until one sees the real language of a real computer. The chapter presents an instruction set that follows the adv...
Chapter
Pipelining is a technique that exploits parallelism among the instructions in a sequential instruction stream. It has the substantial advantage that unlike some speedup techniques, it can be invisible to the programmer. This chapter reviews the concept of pipelining using the millions of instructions per second (MIPS) instruction subset and a simpl...
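The speedup from overlapping instructions can be estimated with a back-of-the-envelope model (my own sketch, not the book's figures): an n-stage pipeline needs n + (k − 1) cycles for k instructions, versus n × k cycles unpipelined.

```python
# Simple cycle-count model of pipelining, ignoring hazards and stalls.

def unpipelined_cycles(k, stages):
    """Each instruction uses all stages before the next one starts."""
    return stages * k

def pipelined_cycles(k, stages):
    """Fill the pipe once, then retire one instruction per cycle."""
    return stages + (k - 1)

k, stages = 100, 5
print(unpipelined_cycles(k, stages))   # 500
print(pipelined_cycles(k, stages))     # 104
print(unpipelined_cycles(k, stages) / pipelined_cycles(k, stages))
```

For long instruction streams the speedup approaches the number of stages, which is why hazards — the situations that force stalls — dominate the rest of the chapter's discussion.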
Chapter
This chapter focuses on performance and its evaluation. All computer designers must balance performance and cost. There exists a domain of high-performance design, in which performance is the primary goal and cost is secondary. Much of the supercomputer industry designs in this fashion. At the other extreme is low-cost design, where cost takes prec...
Chapter
This chapter provides an overview of parallel processors. It discusses single instruction stream, multiple data streams (SIMD) computers, multiple instruction streams, multiple data streams (MIMD) computers, programming MIMDs, and MIMDs connected by a single bus and a network. The virtues of SIMD are that all the parallel execution units are synchr...
Chapter
Architects design a building, but carpentry determines the quality of its construction. The carpentry of computing is the implementation, which determines two of the three performance components: CPI (clock cycles per instruction) and the clock cycle time.
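The three components combine in the classic CPU performance equation, CPU time = instruction count × CPI × clock cycle time. A quick sketch, with made-up numbers:

```python
# The classic CPU performance equation:
#   CPU time = instruction count * CPI * clock cycle time
# The example numbers below are illustrative only.

def cpu_time(instruction_count, cpi, clock_cycle_time_s):
    return instruction_count * cpi * clock_cycle_time_s

# e.g. 10^9 instructions, CPI of 2.0, a 1 ns (1 GHz) clock:
print(cpu_time(1e9, 2.0, 1e-9))   # 2.0 seconds
```

The instruction count is set by the instruction set architecture and the compiler; the implementation — the "carpentry" — controls the other two factors.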
Chapter
Pipelining is an implementation technique in which multiple instructions are overlapped in execution. Today, pipelining is a key implementation technique for building fast processors.
Chapter
Input/output has always been the stepchild of computer architecture. Long neglected by CPU enthusiasts, the prejudice against I/O was institutionalized in the most widely used performance measure, CPU time (page 35). Whether a computer has the best or worst I/O system cannot be determined from CPU time, because its defin...
Chapter
In this chapter we will examine some specific architectures and carry out detailed architectural measurements. Before we begin, however, let us first discuss what we might measure and why, as well as how to measure it.
Chapter
Why do engineers design different computers? Why are they used? How do buyers distinguish between computers? Is there a rational basis for their decisions? If so, can engineers use it to design better computers? These are some of the questions this chapter addresses.
Chapter
In the first nine chapters we restricted ourselves to ideas that have been proven in the marketplace. Indeed, the foundations of those chapters can already be found in the first publication on stored-program computers. The quotations on the title page, however, suggest that the days of traditional computers are numbered. But they have convincingly demonstrated their viabil...
Chapter
In this and the following chapter we will concentrate on the instruction set architecture — the part of the machine visible to the programmer or compiler writer. This chapter introduces a variety of design alternatives open to the instruction set architect. This chapter is particularl...
Chapter
In the last chapter we examined pipelining in detail and saw that pipeline scheduling, issuing multiple instructions per clock cycle, and pipelining a processor more deeply could roughly double a machine's performance. Yet there are limits to the performance improvement that pipelining can achieve. These limits are set by two primar...
Chapter
Computer technology has made incredible progress over the last half century. Today, for a few thousand dollars you can buy a PC with more performance, main memory, and disk capacity than a machine that cost a million dollars in 1965. This rapid development is due both to the implementation technology f...
Chapter
The computer pioneers correctly predicted that programmers would have an unlimited demand for fast memory. As the 90/10 rule from the first chapter states, most programs do not access all of their code and data uniformly (see Section 1.3, pages 8–12). The 90/10 rule can also be stated as the principle of l...
Article
Translation of: Computer Organization and Design. The Hardware/Software Interface. Contents: The SPIM simulator for the MIPS R2000/3000; Abstractions and computer technology; The role of performance; Instructions: language of the machine; Computer arithmetic; The processor: datapath and control; Improving performance with pipelin...
Article
Translation of: Computer Organization and Design. The Hardware/Software Interface. Contents: Abstractions and computer technology; The role of performance; Instructions: machine language; Computer arithmetic; The processor: datapath and control; Improving performance with pipelining; Memory hierarchy; Interfac...
Article
Translation of: Computer Organization and Design. The Hardware/Software Interface. Includes bibliography and index.
Article
Contents: 1. Abstractions and computer technology; 2. Instructions: language of the machine; 3. Computer arithmetic; 4. The processor; 5. Large and fast: exploiting the memory hierarchy; 6. Storage and other I/O topics; 7. Multicores, multiprocessors, and clusters. Appendices: A. Graphics and computing with processing un...
Article
Contents: 1) Fundamentals of computer design; 2) Instruction-level parallelism and its exploitation; 3) Limits on instruction-level parallelism; 4) Multiprocessors and thread-level parallelism; 5) Memory hierarchy design; 6) Storage systems. Appendices.

Citations

... The long-anticipated end of Moore's Law and Dennard Scaling has dramatically increased commercial and academic interest in computational accelerators [2,7,14,25,28,31,33]. As with any hardware artifact, accelerators require many iterations of design, debugging, testing, and software development. ...
... The ALU is used in many handling and processing devices. With the rapid advancement of technology, faster arithmetic units are required, as well as smaller-area and low-power arithmetic units. Because of the increasing integration complexity of ICs, an optimized ALU may sometimes malfunction, so a testing capability must be provided; this is accomplished by Built-In Self-Test (BIST) for the optimized ALU. The cited text develops an ALU from four hardware building blocks (AND and OR gates, inverters, and multiplexors) and outlines how combinational logic works [20]. ...
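The construction referred to — an ALU built from AND/OR gates, inverters, and multiplexors — can be sketched for a single bit. This is my own illustrative model, following that description; the function names and operation encoding are assumptions.

```python
# Sketch of a 1-bit ALU built from the four building blocks: AND and OR
# gates, an inverter (on the b input), and a multiplexor selecting the
# result. op encoding here: 0 = AND, 1 = OR, 2 = ADD.

def mux(sel, *inputs):
    """Multiplexor: pass through the selected input."""
    return inputs[sel]

def full_adder(a, b, carry_in):
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out

def alu_1bit(a, b, carry_in, op, b_invert=0):
    """Setting b_invert (with carry_in = 1 in the low bit) gives subtraction."""
    b = b ^ b_invert                       # inverter on the b input
    add_out, carry_out = full_adder(a, b, carry_in)
    return mux(op, a & b, a | b, add_out), carry_out

print(alu_1bit(1, 1, 0, 2))   # (0, 1): 1 + 1 = binary 10
```

Chaining 32 of these cells through their carries — and routing the result's sign bit back for comparisons — yields the full-width ALU the textbook assembles.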
... Especially when there are dependencies or resource conflicts between these instructions. Such situations are called hazards, since they can occur unpredictably during program execution [7][8][9]. ...
... Considering the importance of logical operations in living systems, this similarity between biological organisms and computers may not be just a coincidence. The general structure of a computer, according to von Neumann, consists of: The above structure of computers is almost universal and applies to today's general-purpose machines independently of their scale [23]. By adopting the term used in computer design, we call "architecture" the set of fundamental principles governing the structure and the functionality of a living system. ...
... In the nanometer regime, the scaling of the transistor feature size increases both the integration level and the power density. To increase performance in the post-Dennard era, the Multi-Processor System-on-Chip (MPSoC) has emerged as a paradigm shift in processor design [1]. It has been reported in [2] that, for the same integration level, two small cores increase performance by 70–80% compared with a single large processor. ...
... The software (WIRESHARK) decoded the packet contents of the interface for readability. Its output was analyzed to verify whether security on the network had been enhanced through the injection of WIRESHARK (Patterson, 2008). ...
... full protection) can be achieved by waiting until one of the entries in the R-caches is written back to the LLM, so that there is space to replicate the new dirty data. This is feasible because the FSM rarely stays in the WORK_S1_S or WORK_S2_S states, owing to the high hit rate [1]; consequently, the wait will not cost many clock cycles. On the other hand, the original writebacks are performed in the background by the write buffer, which can compensate for the waiting cost. ...
... Data dependencies are another inherent characteristic of an application that impact performance. Data dependencies can flow through registers or memory locations, limiting the number of simultaneous instructions issued to an execution unit (instruction-level parallelism, or ILP), and the number of outstanding memory requests (memory-level parallelism, or MLP) [69]. ...
... The approach may be succinctly described as estimates obtained from models trained out of either architectural or microarchitectural instrumentation data. Note that we distinguish between architecture and microarchitecture using the classical interpretation [110]. Now, the term "architecture" is severely overloaded and its interpretation can easily differ from that which we wish to exploit. ...
... Shade [46], MASE [103], Synchrotrace [157] and TaskSim [150] are representative examples of the trace-driven simulators. Typical execution-driven simulators include SimpleScalar [18], SPIM [138], PTLSim [192], ESESC [17] and Fast [40]. ...