
Michael Friendly
- Doctor of Philosophy
- Professor (Full) at York University
About
146
Publications
217,926
Reads
11,910
Citations
Introduction
Michael Friendly is Professor of Psychology in the Department of Psychology, York University and a Fellow of the American Statistical Association. Michael does research in data visualization and the history of data visualization. His current project is 'The Origin of Graphical Species (with Howard Wainer)'.
Publications (146)
Successfully negotiating the modern world often requires full and facile comprehension of quantitative phenomena. Experience gathered over more than three centuries has taught us this is best done by transforming the evidence into various kinds of pictures, so that we can enlist the remarkable power of the human visual system to rapidly decode and...
This paper recounts the origins and influences of the movement in statistics and data visualization dubbed Exploratory Data Analysis by John W. Tukey and developed by students who caught the EDA-bug in the period 1960-1990. In 2022, the 52nd anniversary of the launch of the preliminary volumes of EDA, we trace the history of this topic through dics...
André-Michel Guerry was born and raised in Tours in a family whose Touraine roots go back at least to the early 1600s. He can be considered one of the founders of the empirical study of criminology and modern social science. His accomplishments were honored in his lifetime, yet he remains largely unrecognized and under-appreciated today, both in h...
The vegan package provides tools for descriptive community ecology. It has most basic functions of diversity analysis, community ordination and dissimilarity analysis. Most of its multivariate tools can be used for other data types as well.
The functions in the vegan package contain tools for diversity analysis, ordination methods and tools for th...
During the course of his life, Steve Fienberg made important contributions to a remarkably broad range of topics. Because of his desire to communicate effectively and broadly what is sometimes a complex web of facts, he naturally gravitated toward data visualization. In this essay we describe the origins of data visualization and its nineteenth ce...
This PDF contains the table of contents and introductory chapter from our forthcoming book.
Michael Friendly and Howard Wainer recount the tale of Francis Galton and his discovery of weather patterns, which led to modern weather maps – providing a glimpse into the history of visual thinking and graphic communication, the subject of their new book.
Many readers are likely familiar with the stories of the tragic fate of passengers and crew of the RMS Titanic upon her fatal collision with an iceberg and her sinking in the early hours of April 15, 1912, on her maiden voyage to New York City. Little known is the fact that the first graphical summary of the initial survivor data appeared in The Sp...
The sinking of the Titanic has inspired books, movies and documentaries. But it has also motivated data visualisation designers to tell the story of the tragedy in new ways. Michael Friendly, Jürgen Symanzik and Ortac Onder review the first graph of the disaster and some recent history.
This paper explores a variety of topics related to the question of testing the equality of covariance matrices in multivariate linear models, particularly in the MANOVA setting. The main focus is on graphical methods that can be used to address the evaluation of this assumption. We introduce some extensions of data ellipsoids, hypothesis-error (HE)...
This is the supplemental appendix to “Visualizing Tests for Equality of Covariance Matrices,” in press, The American Statistician. It covers topics of interest that were considered too long or not sufficiently essential to include in the paper.
Background: Multiple sclerosis is a polysymptomatic disease. Little is known about relative contributions of the different multiple sclerosis symptoms to self-perception of health. Objectives: To investigate the relationship between symptom severity in 11 domains affected by multiple sclerosis and self-rated health. Methods: Multiple sclerosis patie...
An Applied Treatment of Modern Graphical Methods for Analyzing Categorical Data. Discrete Data Analysis with R: Visualization and Modeling Techniques for Categorical and Count Data presents an applied treatment of modern methods for the analysis of categorical data, both discrete response data and frequency data. It explains how to use graphical me...
This paper reviews our work in the development of visualization methods (implemented in R) for understanding and interpreting the effects of predictors in multivariate linear models (MLMs) of the form Y = XB + U, and some of their recent extensions. We begin with a description of and examples from the Hypothesis-error (HE) plots framework (utilizin...
This note indicates that the generalized pairs plot had an earlier and more general genesis than suggested by Emerson et al. (2013), and describes how this graphic form can be extended further.
The multivariate linear model is $Y_{(n \times m)} = X_{(n \times p)} B_{(p \times m)} + E_{(n \times m)}$. The multivariate linear model can be fit with the lm function in R, where the left-hand side of the model comprises a matrix of response variables, and the right-hand side is specified exactly as for a univariate linear model (i.e., with a single response variable). This paper expl...
Visual insights into a wide variety of statistical methods, for both didactic and data analytic purposes, can often be achieved through geometric diagrams and geometrically based statistical graphs. This paper extols and illustrates the virtues of the ellipse and her higher-dimensional cousins for both these purposes in a variety of contexts, inclu...
A document retrieved from the archives of the Conservatoire National des Arts et Métiers (CNAM) in Paris sheds new light on the invention by André-Michel Guerry of a mechanical device for obtaining statistical summaries and for examining the relationship between different variables, well before general purpose statistical calculators and the idea...
ViSta is a project that focuses on dynamic and interactive graphics for statistics and was initiated by the late Forrest W. Young at the beginning of the 1990s. For approximately 20 years, Forrest and other collaborators, including the authors of this article, have used ViSta for experimenting with these kinds of graphics in different settings...
Since its introduction, APL has frequently been touted as an ideal programming language for statistical applications. Among the attractive features of APL for statistics are its extensibility, the presence of primitives for operations such as sorting, matrix inversion, and arranging data, and powerful facilities for handling matrices and other arra...
In ridge regression and related shrinkage methods, the ridge trace plot, a plot of estimated coefficients against a shrinkage parameter, is a common graphical adjunct to help determine a favorable tradeoff of bias against precision (inverse variance) of the estimates. However, standard unidimensional versions of this plot are ill-suited for this pu...
- Graphics for Statistics and Data Analysis with R (K. J. Keen). Michael Friendly
- Hidden Markov Models for Time Series: An Introduction Using R (W. Zucchini and I. L. MacDonald). Peter Guttorp
- Bayesian Adaptive Methods for Clinical Trials (S. M. Berry, B. P. Carlin, J. J. Lee, and P. Muller). Say Beng Tan
- SAS and R Data Management, Statistical Ana...
Random Numbers; Medical Diagnosis; Fidelity and Marriage
Visual Statistics; Dynamic Interactive Graphics; Three Examples; History of Statistical Graphics; About Software; About Data; About This Book; Visual Statistics and the Graphical User Interface; Visual Statistics and the Scientific Method
Objects; User Interfaces for Seeing Data; Character-Based Statistical Interface Objects; Graphics-Based Statistical Interfaces; Plots; Spreadplots; Environments for Seeing Data; Sessions and Projects; The Next Reality
Data: Medical Diagnosis; Three Families of Multivariate Plots; Parallel-Axes Plots; Orthogonal-Axes Plots; Paired-Axes Plots; Multivariate Visualization; Summary; Conclusion
Introduction; Data: Automobile Efficiency; Bivariate Plots; Multiple Bivariate Plots; Bivariate Visualization Methods; Visual Exploration; Visual Transformation: Box–Cox; Visual Fitting: Simple Regression; Conclusions
Introduction; Data: Automobile Efficiency; Univariate Plots; Visualization for Exploring Univariate Data; What Do We See in MPG?
We consider Gelman's claims about the relative merits of tables versus graphs from a psychological perspective that emphasizes the role of data displays in the communication of quantitative results from authors to readers or viewers. From this perspective, we consider these claims in relation to a cognitive distinction between graph people and tabl...
The statistical community is divided when it comes to graphical methods and models. Graphics researchers tend to disparage models and to focus on direct representations of data, mediated perhaps by research on perceptions but certainly not by probability distributions. From the other side, modelers tend to think of graphics as a cute toy for explor...
Hypothesis error (HE) plots, introduced in Friendly (2007), provide graphical methods to visualize hypothesis tests in multivariate linear models, by displaying hypothesis and error covariation as ellipsoids and providing visual representations of effect size and significance. These methods are implemented in the heplots package for R (Fox, Friendly, and M...
A 1644 diagram by Michael Florent van Langren, showing estimates of the difference in longitude between Toledo and Rome, is sometimes considered to be the first known instance of a graph of statistical data. Some recently discovered documents help to date the genesis of this graphic to before March 1628, and shed some light on why van Langren chose...
Statistical graphics and data visualization have long histories, but their modern forms began only in the early 1800s. Between roughly 1850 and 1900 ($\pm10$), an explosive growth occurred in both the general use of graphic methods and the range of topics to which they were applied. Innovations were prodigious and some of the most exquisite graphic...
The cluster heat map is an ingenious display that simultaneously reveals row and column hierarchical cluster structure in a data matrix. It consists of a rectangular tiling, with each tile shaded on a color scale to represent the value of the corresponding element of the data matrix. The rows (columns) of the tiling are ordered such that similar ro...
In the era of data-centric-science, a large number of visualization tools have been created to help researchers understand increasingly rich business databases. Information visualization is a process of constructing a visual presentation of business quantitative data, especially prepared for managerial use. Interactive information visualization pro...
Collinearity diagnostics are widely used, but the typical tabular output used in almost all software makes it hard to tell what to look for and how to understand the results. We describe a simple improvement to the standard tabular display, a graphic rendition of the salient information as a "tableplot," and graphic displays designed to make the...
In the debate over null hypothesis significance testing, Paul Meehl strongly advocated appraising theories through the generation and evaluation of precise predictions (e.g., Meehl, 1978). The study of personality structure through the five-factor model (FFM; McCrae & John, 1992) is an important area of research where one encounters many precise pr...
It is common to think of statistical graphics and data visualization as relatively modern developments in statistics. In fact, the graphic representation of quantitative information has deep roots. These roots reach into the histories of the earliest map making and visual depiction, and later into thematic cartography, statistics and statistical gra...
Zeleis for various suggestions and contributions. Description: This package accompanies J. Fox, An R and S-PLUS Companion to Applied Regression, Sage, 2002. The package contains mostly functions for applied regression, linear models, and generalized linear models, with an emphasis on regression diagnostics, particularly graphical diagnostic methods....
André-Michel Guerry’s (1833) Essai sur la Statistique Morale de la France was one of the foundation studies of modern social science. Guerry assembled data on crimes, suicides, literacy and other “moral statistics,” and used tables and maps to analyze a variety of social issues in perhaps the first comprehensive study relating such variables. Indee...
Multivariate analysis of variance (MANOVA) extends the ideas and methods of univariate ANOVA in simple and straightforward ways. But the familiar graphical methods typically used for univariate ANOVA are inadequate for showing how measures in a multivariate response vary with each other, and how their means vary with explanatory factors. Similarly...
This paper describes graphical methods for multiple-response data within the framework of the multivariate linear model (MLM), aimed at understanding what is being tested in a multivariate test, and how factor/predictor effects are expressed across multiple response measures. In particular, we describe and illustrate a collection of SAS macro progr...
Data; Frequency Plots; Visual Fitting of Log-Linear Models; Conclusions
Of all the graphic forms used today, the scatterplot is arguably the most versatile, polymorphic, and generally useful invention in the history of statistical graphics. Its use by Galton led to the discovery of correlation and regression, and ultimately to much of present multivariate statistics. So, it is perhaps surprising that there is no one wi...
The Milestones Project is a comprehensive attempt to collect, document, illustrate, and interpret the historical developments leading to modern data visualization and visual thinking. This paper provides an overview and brief tour of the milestones content, with a few illustrations of significant contributions to the history of data visualization....
The modules in the statistical package ViSta related to categorical data analysis are presented. These modules are: visualization of frequency data with mosaic and bar plots, correspondence analysis, multiple correspondence analysis and loglinear analysis. All these methods are implemented in ViSta with a big emphasis on plots and graphical represe...
Correlation and covariance matrices provide the basis for all classical multivariate techniques. Many statistical tools exist for analyzing their structure, but, surprisingly, there are few techniques for exploratory visual display, and for depicting the patterns of relations among variables in such matrices directly, particularly when the number o...
This chapter presents the econometric methods that are used in health economics to model individuals' health care costs. These methods are used for prediction, projection and forecasting, in the context of risk adjustment, resource allocation, technology assessment and policy evaluation. The chapter reviews the literature on the comparative performa...
Charles Joseph Minard is most widely known for a single work—his poignant flow-map depiction of the fate of Napoleon’s Grand Army in the disastrous Russian campaign of 1812. In fact, Minard was a true pioneer in thematic cartography and in statistical graphics; he developed many novel graphic forms to depict data, always with the goal to let the da...
This paper provides an illustrated history of the visual and conceptual ideas leading to the development of mosaic displays. We trace the origins of the use of rectangles and area to depict data quantities and their relations, of early forms of mosaic displays including sub-divided bar-like charts and various cartograms, to the modern forms used in...
The graphic portrayal of quantitative information has deep roots. These roots reach into histories of thematic cartography, statistical graphics, and data visualization, which are intertwined with each other. They also connect with the rise of statistical thinking up through the 19th century, and developments in technology into the 20th century. Fr...
Categorical data (frequency data and discrete data) are most often presented in tables, and analyses using loglinear models and logistic regression are most often presented in terms of parameter estimates. Over the past decade, I and others have developed novel visualization methods for categorical data, designed to provide exploratory and confi...
This note extends the construction of the design matrix used for estimating cell probabilities with ignorable missing data described by Lipsitz, Parzen, and Molenberghs. A reformulation for the general case of an n-way table is described and implemented in a SAS macro program. The macro constructs this design matrix and offset variable, estimates t...