ABSTRACT: The Uniform Memory Hierarchy (UMH) model introduced in this paper captures performance-relevant aspects of the hierarchical nature of computer memory.
It is used to quantify architectural requirements of several algorithms and to ratify the faster speeds achieved by tuned
implementations that use improved data-movement strategies.
A sequential computer's memory is modeled as a sequence 〈M_0, M_1, ...〉 of increasingly large memory modules. Computation takes place in M_0, so M_0 might model a computer's central processor, while M_1 might be cache memory, M_2 main memory, and so on. For each module M_u, a bus B_u connects it with the next larger module M_{u+1}. All buses may be active simultaneously. Data is transferred along a bus in fixed-size blocks. The size of these blocks,
the time required to transfer a block, and the number of blocks that fit in a module are larger for modules farther from the
processor. The UMH model is parametrized by the rate at which the blocksizes increase and by the ratio of the blockcount to
the blocksize. A third parameter, the transfer-cost (inverse bandwidth) function, determines the time to transfer blocks at
the different levels of the hierarchy.
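
As a minimal sketch of how the three parameters could determine each level, the Python below tabulates block size, block count, and block-transfer time per module. The names `rho`, `alpha`, and `f` are illustrative, and the concrete formulas (block size `rho**u`, block count `alpha` times the block size, transfer time equal to block size times `f(u)`) are one assumed reading of the description above, not a definition quoted from it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class UMHLevel:
    """One memory module M_u together with its bus B_u (illustrative only)."""
    u: int
    block_size: int        # items per block at this level
    block_count: int       # blocks that fit in the module
    transfer_time: float   # time to move one block over bus B_u

    @property
    def capacity(self) -> int:
        # Total items the module can hold.
        return self.block_size * self.block_count


def umh_levels(rho: int, alpha: int, f: Callable[[int], float],
               depth: int) -> List[UMHLevel]:
    """Tabulate the first `depth` levels under the assumed formulas.

    Assumptions (not stated in the abstract):
      - block sizes grow geometrically: block_size(u) = rho ** u
      - block count / block size is the constant alpha
      - f(u) is a per-item cost, so a block costs block_size * f(u)
    """
    levels = []
    for u in range(depth):
        block_size = rho ** u
        block_count = alpha * block_size
        levels.append(UMHLevel(u, block_size, block_count, block_size * f(u)))
    return levels


if __name__ == "__main__":
    for level in umh_levels(rho=2, alpha=3, f=lambda u: 1.0, depth=4):
        print(level, "capacity =", level.capacity)
```

Under these assumptions, block size, block count, and block-transfer time all grow with distance from the processor, which matches the qualitative behavior the abstract describes.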
UMH analysis refines traditional methods of algorithm analysis by including the cost of data movement throughout the memory
hierarchy. The communication efficiency of a program is a ratio measuring the portion of UMH running time during which M_0 is active. An algorithm that can be implemented by a program whose communication efficiency is nonzero in the limit is said
to be communication-efficient. The communication efficiency of a program depends on the parameters of the UMH model, most importantly on the transfer-cost
function. A threshold function separates those transfer-cost functions for which an algorithm is communication-efficient from those that are too
costly. Threshold functions for matrix transpose, standard matrix multiplication, and Fast Fourier Transform algorithms are
established by exhibiting communication-efficient programs at the threshold and showing that more expensive transfer-cost
functions are too costly.
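
The efficiency ratio described above can be illustrated with a short Python sketch. The function names and the two running-time formulas in the demo are hypothetical and chosen only to show the contrast between an efficiency that stays bounded away from zero and one that decays; they are not results from the paper.

```python
import math
from typing import Callable, Iterable, List


def communication_efficiency(active_time: float, total_time: float) -> float:
    """Fraction of the total running time during which M_0 is active."""
    if total_time <= 0:
        raise ValueError("total_time must be positive")
    if not 0 <= active_time <= total_time:
        raise ValueError("active_time must lie in [0, total_time]")
    return active_time / total_time


def efficiency_trend(active: Callable[[int], float],
                     total: Callable[[int], float],
                     sizes: Iterable[int]) -> List[float]:
    """Evaluate the efficiency ratio at several problem sizes."""
    return [communication_efficiency(active(n), total(n)) for n in sizes]


if __name__ == "__main__":
    sizes = [2, 4, 16, 256]
    # Hypothetical program A: data movement adds a constant-factor overhead.
    print(efficiency_trend(lambda n: n ** 3, lambda n: 2 * n ** 3, sizes))
    # Hypothetical program B: overhead grows by a log factor, so the ratio decays.
    print(efficiency_trend(lambda n: n ** 3,
                           lambda n: n ** 3 * math.log2(n), sizes))
```

A numerical trend of this kind only suggests the limiting behavior; establishing that an algorithm is communication-efficient requires an analysis of the running time as the problem size grows without bound.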
A parallel computer can be modeled as a tree of memory modules with computation occurring at the leaves. Threshold functions
are established for multiplication of N×N matrices using up to N^2 processors in a tree with constant branching factor.
Algorithmica 04/1994; 12(2):72-109. · 0.60 Impact Factor
ABSTRACT: The authors describe a conceptual model, the memory hierarchy framework, and a visual language for using the model. The model is more faithful to the structure of computers than the von Neumann and Turing models. It addresses the issues of data movement and exposes and unifies storage mechanisms such as cache, translation lookaside buffers, main memory, and disks. The visual language presents the details of a computer's memory hierarchy in a concise drawing composed of rectangles and connecting segments. Using this framework, the authors improved the performance of a matrix multiplication algorithm by more than an order of magnitude. The framework gives insight into computer architecture and performance bottlenecks by making effective use of human visual abilities.
Visualization, 1990. Visualization '90., Proceedings of the First IEEE Conference on; 11/1990
ABSTRACT: The authors propose a structural classification and vocabulary for
visual languages. The visual grammar that comprises the elements of
such a language is defined. The usual elements are: (1) the
visual alphabet, a set of visual primitives in a visual language; (2)
the visual syntax, compositions of primitives to form visual statements;
(3) interaction, user-to-system communications; and (4) structure, rules
combining sublanguages into a language. The classification of the visual
elements is viewed as a linguistic description of visual language.
Visual Languages, 1988., IEEE Workshop on; 11/1988