Best known in our circles for his key role in the renaissance of low-density parity-check (LDPC) codes, David MacKay has written an ambitious and original textbook. Almost every area within the purview of these TRANSACTIONS can be found in this book: data compression algorithms, error-correcting codes, Shannon theory, statistical inference, constrained codes, classification, and neural networks. The required mathematical level is rather minimal beyond a modicum of familiarity with probability. The author favors exposition by example, there are few formal proofs, and the chapters come in mostly self-contained morsels richly illustrated with all sorts of carefully executed graphics. With its breadth, accessibility, and handsome design, this book should prove to be quite popular.

Highly recommended as a primer for students with no background in coding theory, the set of chapters on error-correcting codes is an excellent brief introduction to the elements of modern sparse-graph codes: LDPC, turbo, repeat-accumulate, and fountain codes are described clearly and succinctly. As a result of the author's research in the field, the nine chapters on neural networks receive the deepest and most cohesive treatment in the book. Under the umbrella title of Probability and Inference, we find a medley of chapters encompassing topics as varied as the Viterbi algorithm and the forward-backward algorithm, Monte Carlo simulation, independent component analysis, clustering, Ising models, the saddle-point approximation, and a sampling of decision theory topics. The chapters on data compression offer good coverage of Huffman and arithmetic codes, and we are rewarded with material not usually encountered in information theory textbooks, such as hash codes and the efficient representation of integers.

The expositions of the memoryless source coding theorem and of the achievability part of the memoryless channel coding theorem stick closely to the standard treatment in [1], with a certain tendency to oversimplify. For example, the source coding theorem is verbalized as: "N i.i.d. random variables each with entropy H(X) can be compressed into more than N H(X) bits with negligible risk of information loss, as N → ∞; conversely if they are compressed into fewer than N H(X) bits it is virtually certain that information will be lost." Although no treatment of rate-distortion theory is offered, the author gives a brief sketch of the achievability of rate C/(1 − H_2(p_b)) with bit-error rate p_b, and the details of the converse proof of that limit are left as an exercise. Neither Fano's inequality nor an operational definition of capacity puts in an appearance.

Perhaps his quest for originality is what accounts for MacKay's proclivity to fail to call a spade a spade. Almost-lossless data compression is called "lossy compression"; a vanilla-flavored binary hypoth-