Conference Paper

Microprogramming heritage of RISC design

Abstract

The paper summarizes the current design trends of micro-architectures and analyzes the tradeoffs between the RISC approach and the microprogramming approach, especially with respect to vertical migration among hardware, firmware, compilation, and software. RISC's simplicity is contrasted with the regularity of microprogrammed control. The RISC design incentives are categorized into three perspectives, namely technology-driven, application-driven, and performance-driven. Traditional firmware migration approaches are reviewed and related to the RISC design philosophy as well as to the writable instruction set computer (WISC) concept. Research such as the selection of firmware migration candidates can be applied to RISC instruction set design. Similarly, microcode generation and compaction research can be used to construct smart, optimizing RISC compilers. Horizontal microcoding is interpreted by the very long instruction word (VLIW) architecture.

Article
Just a few years ago, the idea of a new computer architecture based on simplified, streamlined central processors was mainly an academic curiosity. Invented at IBM and shaped at Berkeley and Stanford, the reduced instruction set computer (RISC) principle embodied a heresy: that most commercial microprocessor architectures had bloated far beyond the optimum level. Laboratory experiments have proved that lean chips with reduced instruction sets can run benchmark tests at fantastic speeds, but some system designers remain unconvinced that RISC will be as useful in the real world of complex systems and applications. The state of RISC is presented, its roots are traced, and four implementations are described. Its applications in Artificial Intelligence (AI) are also considered.
Article
The performance of microprogrammable computer system architectures can be improved considerably by microcoding frequently executed and CPU-intensive software functions. This paper reports on quantitative performance enhancements obtained by practical application of vertical migration techniques in a hierarchically layered software/firmware/hardware system, the Burroughs B 1726. Various representative CPU-intensive and runtime-intensive software primitives (operating system primitives as well as application software primitives) have been selected by measurement, microcoded, and migrated to the firmware layer. Quantitative analyses summarized in this presentation demonstrate the range of performance enhancements which can be expected by the application of this technique.
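A rough way to reason about such gains, not taken from the paper itself, is Amdahl's law: if a primitive accounts for a given fraction of total run time and its microcoded version runs some factor faster, the overall speedup is bounded accordingly. The function and parameter names below are hypothetical, and the numbers are illustrative rather than measured B 1726 results.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: whole-program speedup when only part of the work gets faster.

    fraction      -- share of total run time spent in the migrated primitive (0..1)
    local_speedup -- how many times faster the microcoded primitive runs (> 0)
    Both parameters are illustrative assumptions, not values from the paper.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if local_speedup <= 0:
        raise ValueError("local_speedup must be positive")
    # Unaffected time stays as is; migrated time shrinks by local_speedup.
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


if __name__ == "__main__":
    # A primitive using 40% of run time, made 4x faster in firmware.
    print(round(overall_speedup(0.4, 4.0), 3))  # about 1.43
```

The estimate makes plain why candidates are chosen by measurement: a large local speedup matters little unless the primitive also dominates run time.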
Article
Analysing architecture performance is a step of utmost importance and interest in the process of computer architecture design. The paper deals particularly with special architectures of the Reduced Instruction Set Computer Space (RISCS) in order to obtain a preliminary evaluation of the performance of RISC architectures and of the approach to an optimal instruction set for an effective High Level Language Computer (HLLC). The previously proposed modular microcomputer MODHEL, which is built from instruction-group modules, has facilitated the investigation of the performance of a set of computers allocated within the RISCS. A comparative study of the dynamic instruction mix of four benchmark programs is performed with respect to the VAX 11/780, RISC I (Berkeley) and five versions of MODHEL. Preliminary results indicate that there is a strong chance that RISC-type computers could be designed to function as efficient low-cost systems.
Thesis
Very Long Instruction Word architectures are reduced-instruction-set machines with a large number of parallel, pipelined functional units but only a single thread of control. These machines offer the promise of an immediate order-of-magnitude speed-up for general-purpose scientific computing. But unlike previous machines such as the Cray and the FPS-164, it is impossible to program VLIW machines in machine language; only a compiler for a high-level language (Fortran) makes these machines feasible. This thesis demonstrates, via a working compiler, that this symbiosis of new architecture and new compiling technology is practicable. A traditional compiler couldn't find enough parallelism in scientific programs to utilize a VLIW effectively. The Bulldog compiler uses several new compilation techniques: trace scheduling to find more parallelism, memory reference and memory bank disambiguation to increase memory bandwidth, and new code-generation algorithms. Results of preliminary experiments testing both the Bulldog compiler and various aspects of VLIW architectures are included.
Article
The NP-completeness of the selection problem of vertical migration candidates is shown by reducing it to the NP-complete knapsack problem. Based on approximation algorithms for the latter problem, a new algorithm is presented which takes into account the call relations between functions in a complex system. The developed algorithm is then applied to the UNIX operating system as a representative of complex systems.
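The knapsack view of candidate selection can be sketched as follows. This is a minimal illustration of a standard greedy knapsack approximation, not the algorithm from the paper: it ignores the call relations between functions that the authors take into account, and the `Candidate` fields (`saving`, `cost`) and the capacity measure are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    saving: float  # estimated run time saved if migrated (hypothetical units)
    cost: int      # control store words the firmware version needs (must be > 0)


def select_candidates(candidates, capacity):
    """Greedy knapsack approximation over saving/cost ratio.

    Picks candidates in order of decreasing ratio while they fit, then
    falls back to the single most valuable fitting candidate if that
    alone beats the greedy set (the usual safeguard for this heuristic).
    """
    chosen = []
    remaining = capacity
    ranked = sorted(candidates, key=lambda c: c.saving / c.cost, reverse=True)
    for cand in ranked:
        if cand.cost <= remaining:
            chosen.append(cand)
            remaining -= cand.cost

    fitting = [c for c in candidates if c.cost <= capacity]
    best_single = max(fitting, key=lambda c: c.saving, default=None)
    if best_single is not None and best_single.saving > sum(c.saving for c in chosen):
        return [best_single]
    return chosen


if __name__ == "__main__":
    pool = [
        Candidate("A", 60.0, 10),
        Candidate("B", 100.0, 20),
        Candidate("C", 120.0, 30),
    ]
    print([c.name for c in select_candidates(pool, 50)])  # ['A', 'B']
```

In this example the greedy choice saves 160 units while B and C together would save 220, which shows why the result is only an approximation.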
Article
We are defining a computer architecture to support a decentralized operating system that is under development. Since complex functions of this system could appear directly in the architecture's instruction set, we are immersed in the current RISC/CISC debate where we have found many questions and few answers. Hoping to shed light on aspects of this conflict, we have embarked on two studies. The first seeks to find an environment where complex instructions can enhance system performance. The second tries to decouple multiple register set performance from instruction set performance in general-purpose register machines.
Article
This paper examines suitability and possible architectural modifications necessary for GaAs implementation of the UCB-RISC (reduced instruction set computer developed at UC Berkeley). This architecture was designed and optimized for the Silicon environment, and introduced a revolution in the area of VLSI processor architectures. In addition to its already recognized qualities related to Silicon, it incorporates several features which make it a candidate for GaAs. This paper discusses the effects that a GaAs implementation will most likely have on performance of reduced architectures in general, and UCB-RISC in particular. First, the advantages and disadvantages of GaAs, compared to Silicon, are examined. Next, the UCB-RISC architectural features which are affected by these differences are recognized, and some possible solutions for their implementation in GaAs are explored. The introduction of some new features which might be useful for a GaAs implementation is also explored. The use of fault-tolerance to improve GaAs yield is another topic of this paper, followed by a quantitative analysis of the expected improvements in performance. In principle, we believe that the solution for GaAs is in synergism between hardware design and compiler design. However, in this paper we concentrate on the first issue only.
Conference Paper
This paper presents a method for mapping computation algorithms to parallel machine architectures. The approach is based on a fine grain mapping system, FGMS, whose basic rationale is to achieve better matchings between computations and architectures. FGMS consists of four stages: data flow graph generation, vertical mapping into fine grain graphs, horizontal mapping across interconnected processors, and instruction or microcode generation for individual processors.
Article
Reduced instruction set computers aim for both simplicity in hardware and synergy between architectures and compilers. Optimizing compilers are used to compile programming languages down to instructions that are as unencumbered as microinstructions in a large virtual address space, and to make the instruction cycle time as fast as possible.
Article
A processor architecture attempts to compromise between the needs of programs hosted on the architecture and the performance attainable in implementing the architecture. The needs of programs are most accurately reflected by the dynamic use of the instruction set as the target for a high level language compiler. In VLSI, the issue of implementation of an instruction set architecture is significant in determining the features of the architecture. Recent processor architectures have focused on two major trends: large microcoded instruction sets and simplified, or reduced, instruction sets. The attractiveness of these two approaches is affected by the choice of a single-chip implementation. The two different styles require different tradeoffs to attain an implementation in silicon with a reasonable area. The two styles consume the chip area for different purposes, thus achieving performance by different strategies. In a VLSI implementation of an architecture, many problems can arise from the base technology and its limitations. Although circuit design techniques can help alleviate many of these problems, the architects must be aware of these limitations and understand their implications at the instruction set level.
Article
In this study "trace scheduling" is developed as a solution to the global compaction problem. Trace scheduling works on traces (or paths) through microprograms. Compacting is thus done with a broad overview of the program. Important operations are given priority, no matter what their source block was. This is in sharp contrast with earlier methods, which compact one block at a time and then attempt iterative improvement. It is argued that those methods suffer from the lack of an overview and make many undesirable compactions, often preventing desirable ones. Loops are handled using the reducible property of most flow graphs. The loop handling technique permits the operations to move around loops, as well as into loops, where appropriate.
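For orientation, the block-at-a-time compaction that the abstract contrasts with trace scheduling can be sketched as below. This is a simplified first-fit packer for a single basic block, not trace scheduling itself and not code from the study. The operation format (name, resource, registers read, registers written) is a hypothetical model, and every data dependence, including write-after-read, is conservatively forced into a later microinstruction.

```python
def compact_block(ops):
    """Pack micro-operations of one basic block into wide microinstructions.

    ops -- list of (name, resource, reads, writes) tuples in program order,
           where reads and writes are sets of register names.
    Returns a list of microinstructions, each a list of operation names.
    """
    words = []   # operation names placed in each microinstruction
    used = []    # resources already claimed in each microinstruction
    placed = {}  # operation index -> index of the microinstruction holding it

    for i, (name, resource, reads, writes) in enumerate(ops):
        # Find the earliest word that respects all dependences on earlier ops.
        earliest = 0
        for j in range(i):
            _, _, reads_j, writes_j = ops[j]
            if (writes_j & reads) or (writes_j & writes) or (reads_j & writes):
                earliest = max(earliest, placed[j] + 1)

        # First fit: skip words in which the needed resource is already busy.
        word = earliest
        while word < len(words) and resource in used[word]:
            word += 1
        if word == len(words):
            words.append([])
            used.append(set())

        words[word].append(name)
        used[word].add(resource)
        placed[i] = word

    return words


if __name__ == "__main__":
    block = [
        ("load_a", "mem", set(), {"a"}),
        ("load_b", "mem", set(), {"b"}),
        ("add", "alu", {"a", "b"}, {"c"}),
        ("inc_i", "alu", {"i"}, {"i"}),
        ("store_c", "mem", {"c"}, set()),
    ]
    for word in compact_block(block):
        print(word)
```

Here five operations fit into four microinstructions because `inc_i` is independent and can share a word with `load_a`. A packer like this only sees one block, which is the limitation that trace scheduling addresses by compacting along a whole path.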
Article
A description is given of the FRISC-E, a 32-bit fast RISC (reduced instruction set computer) design using advanced differential bipolar logic. FRISC-E was designed to solve the problems caused by partitioning high-speed microprocessors, which is necessary to increase yield and reduce heat flux but requires die-to-die interconnections that cause signal propagation delays. The design and packaging of FRISC-E are described, and its use of pipelining is discussed. System architecture and performance are examined.
Article
Three new RISC (reduced instruction set computer) architectures, the Intel i860, the Motorola 88000, and the Sun Microsystems Sparc, are compared. The relative importance of an architectural comparison, as opposed to a comparison of implementations, is discussed, followed by a high-level overview of each of the architectures. The relative strengths and weaknesses of the three architectures in a number of key areas are analyzed, and their overall relative strengths and weaknesses are summarized.
Article
The design and implementation of the RISC (reduced-instruction-set computer) 88000 system in high-speed, complementary metal-oxide semiconductor (HCMOS) technology is described. The total system consists of the 88100 processor and two 88200 cache memory management units (CMMUs). The various features and components of the 88000 are discussed.
Article
The design principles of reduced-instruction-set computer (RISC) architectures as they apply to VLSI implementation for high-level languages (HLLs) are presented. The nature of general-purpose HLL computations is discussed in terms of static and dynamic program measurements, and the HLL features that need efficient support are identified. CISC (complex-instruction-set computer) and RISC approaches to general-purpose HLL computers are outlined, the effects of instruction-set reduction on both code size and execution time are evaluated, and the delayed-jump concept is introduced. The Berkeley RISC architecture is presented as an example.
Article
A successful technique for improving system performance has been to migrate often used software functions into firmware. An extension of this technique, called vertical migration, has been developed that deals with migrations of functions between any two levels in a multilevel interpretive hierarchy. In both of these approaches the effect of multiple migrations on system performance is not fully understood and has not been formally analyzed. In this paper the effects of multiple migrations on system performance are formally described. Three distinct types of migration are identified and formulas for the execution time savings of each type are derived.
Article
The authors illustrate the application areas into which RISC (reduced-instruction-set computer) philosophy has moved, the wide range of technologies being used for processor implementation, features that seem to be common with all designs, and unique features introduced to solve specific problems. They investigate the CPU design first, emphasizing instruction set, datapath, and memory-system design. Next they investigate the RISC as a system, emphasizing multiple execution units, coprocessor support, multiprocessing, operating-system support, language support, and family requirements. They briefly compare the performance of various current RISC systems.