A Tour of Trellis Graphics

Richard A. Becker

William S. Cleveland

Ming-Jen Shyu

Bell Laboratories

Murray Hill, New Jersey 07974

Stephen P. Kaluzny

Statistical Sciences Division, MathSoft

1700 Westlake Avenue North

Seattle, Washington 98109

April 15, 1996

TABLE OF CONTENTS

1. INTRODUCTION

1.1 New Capabilities for S and S-PLUS Graphics

1.2 A Simple Scatterplot: Ethanol Data

1.3 Conditioning on Numeric Variables: Ethanol Data

1.4 Conditioning on Factors: Barley Data

1.5 Other Examples

2. HOW TO USE TRELLIS SOFTWARE

2.1 Display Functions

2.2 Customization for Devices

2.3 Panel Functions

2.4 Formulas

2.5 Trellis Objects

2.6 Layout

2.7 Axes

2.8 Aspect Ratio

2.9 Data Structures

2.10 Labeling (Titles, Strip Labels, Keys)

3. ADVANCED CONCEPTS

3.1 Prepanel Functions

3.2 The subscripts= Argument

3.3 Device Settings

3.4 Finding the Data

4. HIGHER DIMENSIONS

4.1 3-D Plotting

4.2 Contour Plots

4.3 More Than Three Variables

5. A GRAB BAG

1. INTRODUCTION

Trellis displays are plots that contain one or more panels, arranged in a regular grid-like structure (a trellis). Each panel graphs a subset of the data. All panels in a Trellis display contain the same type of graph, but these graphs are general enough to encompass a wide variety of 2-D and 3-D displays: histogram, scatter plot, dot plot, contour plot, wireframe, 3-D point cloud, and more. The data subsets are chosen in a regular manner, conditioning on continuous or discrete variables in the data, thus providing a coordinated series of views of high-dimensional data.

This document leads you through Trellis graphics: it shows the functions in the Trellis library, describes the common arguments that the functions share, and shows how Trellis displays are customized for various graphical devices. Other information is available about Trellis, including a user’s manual and a journal article with data analysis examples. To find these and more, refer to the Trellis web page:

http://netlib.att.com/projects/trellis/

1.1 New Capabilities for S and S-PLUS Graphics

Graphics have always been a strong feature of S and S-PLUS (the commercial version of S, distributed by MathSoft). S graphics provide device independence, high-level plotting functions that produce an entire display, low-level functions to augment existing displays or build new ones, and a collection of graphical parameters that provide a wide range of control over the details of plotting.

Graphical parameters in S provide the ability to produce several plots on a single page. However, producing a coordinated set of plots on a page, with control over aspect ratios and axes, has always taken more knowledge of the graphical functions than even a proficient user is likely to possess. In addition, graphics devices may vary in their capabilities, thus requiring adaptations in order to produce the best plot on each device.

The Trellis library is designed to remedy this situation. Besides providing a straightforward way to produce multiple panels on a single page, it also sets up a unifying framework for doing this. Trellis displays extend S graphics to handle multivariate data situations by using a powerful and general technique, conditioning. In addition, the Trellis software does an excellent job with single panel displays, making it a suitable vehicle for doing most high-level graphics in S.

While improving user control of graphics, the Trellis software also makes graphics functions behave just like any other S functions. The result of executing a Trellis expression is a Trellis object. Unless it is assigned a name or used in a further computation, the Trellis object is displayed.
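The assign-or-display behavior can be tried directly. The following is a minimal sketch, assuming R with the lattice package (an implementation of Trellis graphics for R, not the S-PLUS library this document describes). The vectors x and y and the name p are hypothetical, chosen only for illustration.

```r
# Assumption: R with the lattice package, which implements Trellis graphics for R.
library(lattice)

x <- 1:10            # hypothetical data for illustration
y <- x^2

p <- xyplot(y ~ x)   # assigned to a name: the object is created but nothing is drawn
class(p)             # the class of the returned object ("trellis" in lattice)
print(p)             # printing the object is what draws the display
```

Typing xyplot(y ~ x) at the prompt without assigning it has the same effect as the final print(p), since unassigned results are printed automatically.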

The Trellis library is now distributed as a standard part of S-PLUS. (Hold onto your hats, jargon to follow!) S-PLUS prior to version 3.3 does not come with the Trellis library. In the PC environment, S-PLUS for Windows, Version 3.3 (and presumably anything later) comes with Trellis Version 2.0, as described in this document. Under the Unix operating system, S-PLUS Version 3.3 contains a slightly older version of Trellis. The next release, due in 1996, is scheduled to contain Trellis Version 2.0.

1.2 A Simple Scatterplot: Ethanol Data

Perhaps the easiest way to introduce Trellis displays is by examples. They illustrate the variety of Trellis displays that can be produced and also introduce the way that Trellis displays are controlled. This document gives an entry point to use of the software and explains common features. It does not have the space to explain, except in the most cursory way, the meaning or use of the graphical techniques. To find out much more about how to use Trellis displays to understand data, read Visualizing Data by William S. Cleveland (1993).

Although Trellis graphics functions are capable of producing multiple panel displays, they are also excellent at doing basic single-panel graphs. For this example, we will use data from an experiment involving 88 trials of an engine running an ethanol mixture, contained in the data frame ethanol. There are three variables: NOx, the emissions of oxides of nitrogen; E, the equivalence ratio (a measure of the richness of the fuel/air mixture); and C, the compression ratio, which takes five values.

We can use the scatterplot function xyplot on the ethanol data to produce a simple scatterplot:

xyplot(NOx ~ E, data = ethanol) # Figure 1

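[Figure 1: scatterplot with E on the horizontal axis (ticks at 0.6, 0.8, 1.0, 1.2) and NOx on the vertical axis (ticks at 1, 2, 3, 4).]

Figure 1. A simple scatterplot of the engine data, showing NOx emissions as a function of equivalence ratio.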

The first argument to xyplot, and to most Trellis functions, is a formula; the second tells where the data in the formula can be found. Both kinds of arguments were introduced in the book Statistical Models in S by Chambers & Hastie (1992). We have used this same paradigm for Trellis graphics.

1.3 Conditioning on Numeric Variables: Ethanol Data

A simple modification of the previous call to xyplot produces a multi-panel display:

xyplot(NOx ~ E | C, data = ethanol) # Figure 2

This produces Figure 2, which shows NOx emissions plotted against equivalence ratio, each panel showing data for one of the five values of compression ratio.
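Before conditioning on a variable, it can help to check how many distinct values it takes, since each value gets its own panel here. The following is a minimal sketch, assuming R's lattice package, whose bundled ethanol data frame we take to correspond to the one used in this document:

```r
# Assumption: R's lattice package, which bundles a data frame named ethanol.
library(lattice)

nrow(ethanol)        # number of trials in the experiment
names(ethanol)       # the variables available to a formula
table(ethanol$C)     # number of trials at each distinct compression ratio
```

The number of entries in the table is the number of panels to expect when C is used as the conditioning variable.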

The Trellis display consists of 5 panels, each showing NOx on the vertical axis and E on the horizontal axis. The value of C is shown by the strip label at the top of each panel; in this case, C takes on 5 discrete values as shown by the darkly-shaded region of the strip label atop each panel.

[Figure 2: five panels, each with a strip label C; horizontal axis E (ticks at 0.6, 0.8, 1.0, 1.2), vertical axis NOx (ticks at 1, 2, 3, 4).]

Figure 2. A Trellis display of engine emissions data.

The formula

NOx ~ E | C

is read aloud as “NOx is plotted against E given C”. The variable for the vertical axis comes first in the formula (conventionally, the dependent variable is plotted on the vertical axis), the variable for the horizontal axis follows the ~ operator, and the given, or conditioning, variables come last, after the | operator.
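The three roles in such a formula can also be inspected programmatically. The sketch below assumes R, where a formula is an ordinary language object that can be indexed; S-PLUS may differ in detail, and the name f is ours.

```r
f <- NOx ~ E | C     # building a formula needs no data; the names are not evaluated

all.vars(f)          # variable names in order of appearance: "NOx" "E" "C"
f[[2]]               # left of ~ : the vertical-axis variable, NOx
f[[3]]               # right of ~ : the expression E | C
f[[3]][[2]]          # left of | : the horizontal-axis variable, E
f[[3]][[3]]          # right of | : the conditioning variable, C
```

Because the formula only names the variables, the same f can be reused with any data frame that contains columns called NOx, E, and C.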

The data = ethanol tells xyplot to look first in the data frame ethanol for the objects

NOx, E, and C in the formula. A data frame contains a set of related vectors and can be operated

upon as if it were a matrix; however, data frames can hold data of various types, including