Published in Public Opinion Quarterly, 51 (1987), 233-248
Return Postage in Mail Surveys: A Meta Analysis
J. Scott Armstrong
The Wharton School, University of Pennsylvania
Edward J. Lusk
Social System Sciences, University of Pennsylvania
This paper describes a five-step procedure for meta-analysis. Especially important was the
contacting of authors of prior papers. This was done primarily to improve the accuracy of the
coding; it also helped to identify unpublished research and to supply missing information.
Application of the five-step procedure to the issue of return postage in mail surveys yielded
significantly more papers and produced more definitive conclusions than those derived from
traditional reviews. This meta-analysis indicated that business reply postage is seldom cost-
effective because first class postage yields an additional 9% return. Business reply rates were lower
than for other first class postage in each of the 20 comparisons.
This paper describes a procedure for meta-analysis and applies it to a problem in survey research. The
conclusions from conventional literature reviews are contrasted with those from meta-analysis.
Meta-analysis appears to be a more rigorous method by which to integrate results from prior research,
especially when previous studies have led to conflicting results. The experiment by Cooper and Rosenthal (1980)
suggests that meta-analysis provides larger estimates of effects and better-supported conclusions than a conventional
reviewing process. Meta-analysis differs from the traditional literature review in that it emphasizes formal
procedures for the search, selection, and analysis of prior studies. Detailed discussions of the procedures for
meta-analysis are provided in Glass, McGaw, and Smith (1981), Hunter, Schmidt, and Jackson (1982), and Green
and Hall (1984).
Prior Reviews of Return Postage
Jackson (1980) suggests that a literature review should begin with a review of prior reviews. His study showed
that conventional reviews have seldom done this. Conventional reviews of the evidence on return postage have been
published by at least six researchers.1 These reviews were incomplete – three of the six reviewers missed over half
of the available studies. Conclusions from these studies were inconsistent. McCrohan and Lowe (1981), the most
recent and most extensive review to date, concluded that “the literature supporting [high powered postage] is
Meta-analyses by Yu and Cooper (1983) and Heberlein and Baumgartner (1978) also concluded that variations in
the type of return postage had negligible effects on return rates. These studies differed from ours in that they
employed different search procedures. As a result, their sample sizes of experimental studies on return postage were
small. They also analyzed differences in return rates among studies. We analyzed only those studies that had
experimentally varied the postage.
1 Scott (1961), Kanuk and Berenson (1975), Linsky (1975), Pressley (1976), Duncan (1979), and McCrohan and Lowe (1981).
Hypotheses for Return Postage
We were interested not only in how postage affects response rates but also in why it might do so. Meta-analysis is
well suited for hypothesis testing, so we proposed some crude hypotheses.
To reduce bias in testing, we used the method of multiple hypotheses (Armstrong, 1979). Three hypotheses were
examined; these were based on self-interest, personalization, and dissonance. Below, we examine each hypothesis to
see what predictions are implied for the amount spent on return postage and for the appearance of the postage.2
The self-interest hypothesis suggests that respondents to a mail survey act in their own best interests. They will
reduce their time and costs and increase their benefits. The omission of postage would elicit fewer responses than
the use of business reply because some recipients would not want to pay for stamps. Business reply would be more
effective than stamps because some recipients might remove the stamps for their own use. Standard stamps would
be preferred to commemorative ones because the latter might appeal to stamp collectors.
Linsky's (1975) review found only modest support for personalization of the mailing piece, and Forsythe (1977)
found, in a survey of corporate executives, that it reduced the number of responses. Business reply is the least
personal type of postage. The omission of postage would be more personal than business reply. (It would be unusual
to receive business reply in a personal letter.) Stamps are more personal than metering or franking.3 A
commemorative stamp seems more personal than a standard stamp.
Return postage stamps might cause dissonance because recipients might not want to see money wasted. Business
reply represents no loss because money is spent only when the envelope is returned.4 However, because some
respondents may not be aware of this, we expected more dissonance to result from the use of business reply than
from the omission of postage. The more money spent on return postage, the greater the dissonance, and the higher
the return rate. Similarly, the more apparent it is that money is being spent, the higher the return; commemorative
stamps emphasize that money is being spent.
Predictions from the three hypotheses are summarized in Figure 1. The predictions differ substantially. For
example, only two of the three hypotheses favored first class over business reply.
We attempted to find all studies, published and unpublished, that made experimental comparisons of different
types of postage. To be included, a study had to provide a full report on what was done, and the different treatment
groups had to be equivalent except for the return postage. Eight computer files were searched: ABI/Inform,
Management Contents, Social Science Citation Index, Sociological Abstracts, PAIS International, National Newspaper
Index, Newsearch, and Magazine Index. We used “Mail,” “Survey(s),” and “Response Rates” as key words. The
search produced ten papers relevant to our study. The references in these ten papers yielded 19 additional studies.
Our experience is similar to Yu and Cooper's (1983); most of the studies they found came from the references in the
studies discovered by a computer search.
2 Implied in the issue of return postage is that a self-addressed return envelope is included. No studies have tested its
importance since Ferriss (1951), in a survey sent to work addresses, found a dramatic reduction in the response rate
when the envelope was omitted (90.1% with envelope and postage vs. 25.8% with neither).
3 “Franking” is the first class postage permit used by government agencies and certain government officials.
4 The cost of business reply varies by the type of permit. In the United States in December 1985, the lowest rate was
available for a fixed fee of $210 and 29¢ for each piece of mail returned.
Figure 1. Hypotheses on Return Postage Response: Ranking of Expected Return Rates
Key: > means higher return, >> much higher return, < lower return, << much lower return
A. Amount of Postage
B. Appearance of First-Class Postage (commemorative, standard, metered)
Early versions of our paper were sent to the authors of the papers on return postage. We asked them about our
interpretation of their results and about studies that might have been overlooked. Replies were received from 26 of
the 32 authors. These replies provided five additional studies, three of which were unpublished. They also identified
mistakes in coding the studies, provided references on related studies, led to one correction of previously reported
research, and yielded useful comments. To our knowledge, no previously published paper has discussed or
attempted a survey of all prior authors as a way to improve the validity of the coding. In Jackson's (1980) survey of
traditional reviews, only one of 36 authors reported an attempt to contact those who had published research on their
topic and this single exception was only an effort to gain further information. (Incidentally, Heberlein and
Baumgartner (1978) also had contacted some authors to fill in missing information.)
Our search yielded 34 comparative studies. The largest number obtained by any previous review was ten. Of
course, we had a larger pool of studies. Seven studies are reported here for the first time: Pressley and Tullar (1984)
and the six described in the “Notes.”
The experimental studies varied substantially. For example, different types of outgoing postage were used, and
the number of follow-ups varied. The examination of the differential return rates helped to control for the factors
that differed among studies. However, the variation among the surveys helps to assess whether the type of return
postage has an effect in different situations.
We ignored interaction. Based on prior literature, the likelihood of significant interaction is small. Eisinger et al.
(1974), Kerin and Harvey (1976), Little and Pressley (1980), Peterson (1975), Hawes and Kiser (1981), Pressley
(1976), Pressley and Tullar (1977), and Wiseman (1973) found interaction effects in mail surveys to be insignificant.
Orwin and Corday (1985) and Bullock and Svyantek (1985) showed that conclusions from meta-analysis are
sensitive to coding procedures. Coding the direction and magnitude of effect was a simple matter in our study.
However, to ensure reliability, each of the authors coded each study and a research assistant provided an
independent check. The few differences that arose were recording errors. As noted above, the authors of the original
studies were also asked to verify our ratings. This step proved useful in correcting some of our interpretations.
The results are presented below in terms of the original mail-out sample (with no adjustment for wrong
addresses). If treatment X increases return rates from 40% to 50%, it is referred to as a 10% increase in return rate.
To avoid bias, each study was weighted equally.
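The reporting convention can be illustrated with a short sketch. The Python below computes gains in percentage points of the original mail-out sample and then averages them with each study weighted equally. The function names and the counts are hypothetical; they are not data from the studies reviewed here.

```python
def return_rate(returned, mailed):
    """Return rate as a percentage of the original mail-out sample."""
    return 100.0 * returned / mailed


def gain_in_points(treatment_rate, control_rate):
    """Gain in percentage points, e.g. 40% -> 50% is reported as a gain of 10."""
    return treatment_rate - control_rate


def equal_weight_mean(gains):
    """Mean gain with each study weighted equally, whatever its sample size."""
    return sum(gains) / len(gains)


# Hypothetical studies:
# (returned_treatment, mailed_treatment, returned_control, mailed_control)
studies = [
    (250, 500, 200, 500),  # 50% vs. 40%
    (90, 200, 80, 200),    # 45% vs. 40%
]

gains = [
    gain_in_points(return_rate(rt, mt), return_rate(rc, mc))
    for rt, mt, rc, mc in studies
]
print(gains)
print(equal_weight_mean(gains))
```

Weighting by study rather than by questionnaire keeps one very large survey from dominating the average.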
Amount of Postage
What may be expected if return postage is not included? Unfortunately we found only six studies on this issue,
three of which we had done (Notes 1, 2, and 9). These studies, summarized in Table 1, were not a representative
sample, given the low response rates. The use of first class postage produced 3% more replies than were obtained
with no postage. In addition, it seems a bit presumptuous to expect the respondent to provide postage.
Table 1. First Class (FC) vs. No Postage (NP)
Table 2 summarizes all of the 16 studies we found on business reply (20 comparisons with more than 31,000
questionnaires). It shows return rate differentials of first class over business reply.5 The return rate covered a wide
range, going from 5.6% to 66.3% for business reply. One might expect smaller gains where the business reply draws
a high response, as there is less room for improvement. However, the gains were as large for surveys with high
return rates; the correlation coefficient between business reply rate and the first class gain was small and statistically
insignificant. We also expected a small advantage for first class postage in mailing to work addresses, but the results
were just the opposite. Gains occurred for all 20 comparisons, a statistically significant difference (p < .000001).
This stability of the results across different situations is remarkable. (See Note 8 for additional evidence.) Significant
results (at p < .05) were found for 15 of the 20 comparisons. The average gain was 9.2% (as a percentage of the
original sample) with a 95% confidence interval from 7.0% to 11.4%. The median gain was 8.3%.
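The text does not name the test behind the p-value for 20 gains in 20 comparisons. A one-sided sign test is one test consistent with the reported figure, and the sketch below assumes that test; the function name is hypothetical.

```python
from math import comb


def sign_test_one_sided(n_positive, n_total):
    """P(at least n_positive gains in n_total comparisons) when a gain
    and a loss are equally likely under the null hypothesis."""
    tail = sum(comb(n_total, k) for k in range(n_positive, n_total + 1))
    return tail / 2 ** n_total


p = sign_test_one_sided(20, 20)
print(p)             # roughly 9.5e-07
print(p < 0.000001)
```

Under this assumption the probability is simply 0.5 raised to the 20th power, which falls just below .000001.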
5 The term “first class” is used as shorthand in this paper for metering, franking, and standard stamps. Business reply
is also given first class treatment by the U.S. Post Office, but it differs in method of payment.
Table 2. First Class (FC)* vs. Business Reply (BR)
McCrohan and Lowe (1981), Jones and Linda (1978), and Harris and Guffey (1978) concluded that business
reply is often cost effective. However, these studies assumed no gain in return rates for first class postage and looked
only at postage costs.
Guffey, Harris, and Guffey (1980) concluded that business reply is useful only for a large inexpensive survey
where a low response rate is expected. We initially agreed and had hoped to define more explicitly the conditions
under which the use of business reply was desirable. Our meta-analysis surprised us because, given the magnitude
and stability of these results, we have been unable to imagine a practical survey research design that would call for
business reply when one considers the total cost per returned questionnaire. We have also asked some colleagues to
produce such a design, but with no success. Nevertheless, researchers can use our estimate of return rates in
designing their surveys and reach their own conclusions on this.
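A researcher could make this comparison with a calculation like the following sketch of total cost per returned questionnaire. The business reply fee and per-piece charge are the December 1985 figures from footnote 4, and the 9.2% gain is the average reported above. Everything else is an assumption made for illustration: the sample size, the other costs per piece, the business reply return rate, and a 22¢ first class stamp.

```python
def cost_per_return(n_mailed, return_rate, fixed_cost,
                    cost_per_mailed, cost_per_returned):
    """Total survey cost divided by the number of returned questionnaires."""
    n_returned = n_mailed * return_rate
    total = (fixed_cost
             + n_mailed * cost_per_mailed
             + n_returned * cost_per_returned)
    return total / n_returned


N_MAILED = 1000            # hypothetical sample size
OTHER_COST = 1.00          # hypothetical printing, outgoing postage, handling (USD per piece)
BR_RATE = 0.30             # hypothetical business reply return rate
FC_RATE = BR_RATE + 0.092  # adds the average gain reported in the text
STAMP = 0.22               # assumed price of a first class stamp (USD)

# Business reply: fixed permit fee, postage paid only on returned pieces.
business_reply = cost_per_return(N_MAILED, BR_RATE, 210.0, OTHER_COST, 0.29)

# First class: a stamp on every return envelope, whether or not it comes back.
first_class = cost_per_return(N_MAILED, FC_RATE, 0.0, OTHER_COST + STAMP, 0.0)

print(round(business_reply, 2), round(first_class, 2))
```

Under these assumed inputs first class costs less per returned questionnaire. Other inputs can be substituted to test a proposed design.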
Our emphasis on return rates in the cost-effectiveness analysis seems reasonable. No evidence has been found to
suggest adverse effects due to the type of return postage. The rate of return over time is not affected (Finn, 1983;
Peterson, 1975), item nonresponse is not affected (Jones and Linda, 1978), and there is little danger that the return
postage will produce response bias (Jones and Linda, 1978; McCrohan and Lowe, 1981).
Appearance of Return Postage
Table 3 summarizes evidence from the six studies on types of first class postage. It compares stamps (all types)
versus metered postage (including franking). The studies are ranked by response rates for the metered postage. The
evidence, based on over 9,000 questionnaires, suggests a gain of over 3% for stamps, but this is not statistically
significant (using the addition of p-values from Rosenthal, 1978).
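As we understand it, the method of adding probabilities combines one-tailed p-values from independent studies by referring their sum to the distribution of a sum of uniform variates. When the sum is 1 or less, this reduces to (sum of p)^N / N!. The sketch below uses the general form so that larger sums are also handled. The p-values are hypothetical and are not those behind Table 3.

```python
from math import comb, factorial, floor


def add_probabilities(p_values):
    """Combined p-value: probability that the sum of N independent
    uniform(0, 1) variates is no larger than the observed sum of p-values."""
    n = len(p_values)
    s = sum(p_values)
    total = sum(
        (-1) ** k * comb(n, k) * (s - k) ** n
        for k in range(floor(s) + 1)
    )
    return total / factorial(n)


# Hypothetical one-tailed p-values for six independent studies
p_values = [0.10, 0.30, 0.25, 0.40, 0.15, 0.20]
combined = add_probabilities(p_values)
print(combined)
```

For two studies with p-values of .2 and .3, the sum is .5 and the combined p-value is .5 squared divided by 2, or .125.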
Table 3. Stamps (S)* vs. Metered Postage (M)
Commemorative and standard stamps were compared in eight studies with a total of almost 8,000 questionnaires.
As shown in Table 4, the use of commemorative stamps led to a small gain (1.6%), but this was not statistically
significant. Given the proliferation of standard stamps that look like commemoratives, this strategy is not promising.
Table 4. Commemorative (C) vs. Regular Stamps (R)
The use of a number of small-denomination stamps was compared with that of a single stamp in Longworth
(1953), Watson (1965), and Pressley and Tullar (1984). Multiple stamps increased the return percentages by 2%,
5.8%, and 2.8%, respectively, an average gain of 3.5%. Though this gain is not statistically significant, the direction
is consistent with the hypothesis that appearance matters.
Testing the Hypotheses
The literature offers few guidelines on the use of meta-analysis for testing hypotheses. We suggest that the
evaluation of the hypotheses be done formally. Furthermore, alternative schemes should be used so that the
sensitivity of the conclusions can be assessed. Finally, sufficient detail should be provided so that readers can use
whatever approach they feel to be appropriate.
To obtain an error score, we used a five-point scale for magnitude of effects:
<< < = > >>
much less less no difference more much more
We coded the results by counting the number of positions between actual and predicted (from Figure 1). If the
prediction was > and actual was <<, the prediction error would equal 3. (We used the following intervals: equality
was assigned if the rate differential was 1% or less; > or < was assigned if the rate differential was greater than 1%,
but less than 2%; and >> or << was assigned otherwise.) This score was doubled for statistically significant results
to place more emphasis on them. (Statistical significance was defined as p < .05, given at least five independent
studies.) This scheme yielded a crude index of predictive ability for each hypothesis.
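The scoring scheme can be written out as a short sketch. The Python below codes a rate differential on the five-point scale using the intervals stated above, counts the positions between predicted and actual, and doubles the count for significant results. The function names and the three example comparisons are hypothetical; they are not the predictions in Figure 1.

```python
SCALE = ["<<", "<", "=", ">", ">>"]


def code_differential(diff):
    """Map a return-rate differential (percentage points) to the scale."""
    if abs(diff) <= 1:
        return "="
    symbol = ">" if diff > 0 else "<"
    # Between 1 and 2 points: single symbol; otherwise the doubled symbol.
    return symbol if abs(diff) < 2 else symbol * 2


def position_error(predicted, actual):
    """Number of scale positions between the predicted and actual codes."""
    return abs(SCALE.index(predicted) - SCALE.index(actual))


def error_score(comparisons):
    """Sum of position errors, doubled where the result was significant.
    Each comparison is a (predicted, actual, significant) tuple."""
    return sum(
        position_error(pred, act) * (2 if sig else 1)
        for pred, act, sig in comparisons
    )


# Hypothetical comparisons for one hypothesis
comparisons = [
    (">", code_differential(9.2), True),   # predicted >, actual >>, significant
    ("<", code_differential(1.6), False),  # predicted <, actual >
    ("=", code_differential(0.5), False),  # predicted =, actual =
]
print(error_score(comparisons))
```

A hypothesis that predicted every coded result exactly would score zero under this scheme.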
Perfect predictions by a hypothesis would yield an error score of zero. The worst possible error score, given the
actual results, would be 17. As a benchmark, if all treatments were assumed to be equal, the error score would be 9.
The self-interest hypothesis did poorly with an error score of 15. Dissonance did well, with an error score of 4. The
personalization hypothesis had a perfect score of zero. The ranking of the hypotheses was not sensitive to the
scoring rules that we used.
Meta-analysis was used to test hypotheses about return postage. The procedure involved (1) developing multiple
hypotheses; (2) conducting an extensive and systematic search (contacting the authors of the original research
papers); (3) screening (using clearly defined prespecified criteria); (4) analyzing (coding and using formal statistical
procedures); and (5) testing hypotheses (using formal schemes, testing sensitivity, and providing full disclosure).6
These procedures led to clear-cut conclusions on a topic that has previously been interpreted in various ways.
The basic conclusion from the meta-analysis is that business reply postage is seldom cost effective because first
class postage yields an additional 9% return. This result was remarkably stable across different situations. This
estimate will allow survey researchers to test the value of business reply in a proposed study. (We had difficulty,
however, in creating a realistic survey that would call for business reply.) The results were consistent with the
personalization hypothesis and contrary to the self-interest hypothesis.
For most mail surveys, we recommend the use of either commemorative stamps or a set of small-denomination stamps on
the return envelope. This will reduce the cost per returned questionnaire and it will reduce nonresponse bias.
Note 1. In 1982, Lusk and Armstrong surveyed 1,457 forecasting practitioners drawn from the International Institute
of Forecasters' mailing list. The questionnaire was highly specialized, dealing with technical approaches to
forecasting methods. (An intensive effort to contact 100 nonrespondents revealed that only a small proportion of the
sample felt capable of responding.) All respondents received a return-addressed envelope, some with return postage
and some with none. The response rates were 15.6% for first class postage and 14.9% for no postage, a difference
that was not statistically significant (at p < .05).
Note 2. Kathleen L. Tribull, in a study done under the supervision of Armstrong, sent a questionnaire to a stratified
random sample of 200 people from the Wharton Entrepreneurial Center mailing list in March 1984. A copy of this
study is available from Scott Armstrong.
Note 3. Roger A. Kerin (personal communication dated 19 December 1983) conducted a 1979 study of magazine
readership by 1,000 chief executive officers in firms in the southwestern part of the United States. His address is:
Edwin L. Cox School of Business, Southern Methodist University, Dallas, TX 75275.
6 Details on the sensitivity testing are available from the authors. We also had examined other threats to validity.
None of these proved to be important, so they were omitted to save space.
Note 4. Peterson (1975, p. 203) reported that business reply brought significantly more response than stamps. That
was a misprint. It should have read that first class (stamps or meters) achieved a higher return than business reply.
The results are correctly presented elsewhere in his paper. (Personal communication with Robert Peterson in March
Note 5. Charles S. Goodman provided an unpublished report, “Dental Readership Studies,” dated 10 October 1961.
(Copies are available from Scott Armstrong.)
Note 6. Wolfgang Blass-Wilhelms translated the key portions of his paper into English for us. His address is: Institut
für Soziologie, Universität Hamburg, Allende Platz 1/R 300, 2000 Hamburg 13, West Germany.
Note 7. Arthur C. Wolfe provided his report, “Michigan Public Opinion toward Motor Vehicle Inspection,”
published in September 1979 by the Highway Safety Research Institute, University of Michigan, Ann Arbor, MI
48109. (The response rates in this study were calculated after the removal of the 6% of incorrect addresses.)
Note 8. Consideration was given to postage increases beyond first class: Wallace (1954), Kimball (1961), and Off
and Clinton (1965) found increases of 42%, 7.1%, and 15%, respectively, for domestic air mail return stamps versus
first class. These results add support to the results in Table 2.
Note 9. Armstrong conducted a survey of members of the International Institute of Forecasters in April 1986. The
mailout was arranged alphabetically. Return envelopes were then alternated: one with a stamp and the next with no
postage. A total of 908 questionnaires were sent to the U.S. members (454 with stamps and 454 with no postage).
The low overall rate of return might be due to the fact that the questionnaire was sent in a package that included a
newsletter and three announcements, and that no follow-ups were made.
Armstrong, J. Scott (1979), “Advocacy and objectivity in science,” Management Science, 25, 423-428.
Blass-Wilhelms, Wolfgang (1982), “Der Einfluß der Frankierungsart auf den Rücklauf von Antwortkarten,”
Zeitschrift für Soziologie, 11, 64-68.
Brook, Lindsay L. (1978), “The effect of different postage combinations on response levels and speed of reply,”
Journal of the Market Research Society, 20, 238-244.
Bullock, R. J., and D. J. Svyantek (1985), “Analyzing meta-analysis: Potential problems, an unsuccessful
replication and evaluation criteria,” Journal of Applied Psychology, 70, 108-115.
Cooper, Harris M., and Robert Rosenthal (1980), “Statistical versus traditional procedures for summarizing research
findings,” Psychological Bulletin, 87, 442-449.
Duncan, W. Jack (1979), “Mail questionnaires in survey research: A review of response inducement effects,”
Journal of Management, 5, 39-55.
Eisinger, R. A., W. P. Janicki, R. L. Stevenson and W. L. Thompson (1974), “Increasing returns in international
mail surveys,” Public Opinion Quarterly, 38, 124-130.
Ferriss, Abbott L. (1951), “A note on stimulating response to questionnaires,” American Sociological Review, 16,
Finn, David W. (1983), “Response speeds, functions, and predictability in mail surveys,” Journal of the Academy of
Marketing Science, 11, 61-70.
Forsythe, John B. (1977), “Obtaining cooperation in a survey of business executives,” Journal of Marketing
Research, 14, 370-373.
Glass, Gene V., Barry McGaw, and Mary L. Smith (1981), Meta-Analysis in Social Research. Beverly Hills, CA:
Glisan, George and Jim L. Grimm (1982), “Improving response rate in an industrial setting: Will traditional
variables work?” Southern Marketing Association Proceedings, 265-268.
Green, Bert F. and Judith A. Hall (1984), “Quantitative methods for literature reviews,” Annual Review of
Psychology, 35, 37-53.
Guffey, Hugh J., Jr., J. R. Harris, and M. M. Guffey (1980), “Stamps versus postal permits: A decisional guide for
return postage in mail questionnaires,” Journal of the Academy of Marketing Science, 8, 234-242.
Gullahorn, Jeanne E. and John T. Gullahorn (1963), “An investigation of the effects of three factors on response to
mail questionnaires,” Public Opinion Quarterly, 27, 294-296.
Hammond, E. Cuyler (1959), “Inhalation in relation to type and amount of smoking,” Journal of the American
Statistical Association, 54, 35-49.
Harris, James R. and H. J. Guffey (1978), “Questionnaire returns: Stamps versus business reply envelopes
revisited,” Journal of Marketing Research, 15, 290-293.
Hawes, Jon M. and G. E. Kiser (1981), “Additional findings on the effectiveness of three response-inducement
techniques for a mail survey of a commercial population,” Southern Marketing Association
Heberlein, Thomas A. and Robert Baumgartner (1978), “Factors affecting response rates to mailed questionnaires: A
quantitative analysis of the published literature,” American Sociological Review, 43, 447-462.
Hensley, Wayne E. (1974), “Increasing response rate by choice of postage stamps,” Public Opinion Quarterly, 38,
Hewett, W. C. (1974), “How different combinations of postage on outgoing and return envelopes affect
questionnaire returns,” Journal of the Market Research Society, 16, 49-50.
Hunter, John E., Frank L. Schmidt, and Gregg B. Jackson (1982), Meta-Analysis: Cumulating Research Findings
across Studies. Beverly Hills, CA: Sage.
Jackson, Gregg B. (1980), “Methods for integrative reviews,” Review of Educational Research, 50, 438-460.
Jones, Wesley H. and Gerald Linda (1978), “Multiple criteria effects in a mail survey experiment,” Journal of
Marketing Research, 15, 280-284.
Kanuk, Lesley and Conrad Berenson (1975), “Effects of postage and mailing classes on mail questionnaire response
rates,” Journal of Marketing Research, 12, 440-453.
Kerin, Roger A. and Michael G. Harvey (1976), “Methodological considerations in corporate mail surveys: A
research note,” Journal of Business Research, 4, 277-281.
Kimball, Andrew E. (1961), “Increasing the rate of return in mail surveys,” Journal of Marketing, 25, 63-64.
Labrecque, David P. (1978), “A response rate experiment using mail questionnaires,” Journal of Marketing, 42,
Linsky, Arnold S. (1975), “Stimulating responses to mailed questionnaires: A review,” Public Opinion Quarterly,
Little, Taylor E., Jr., and Milton M. Pressley (1980), “A multifactor experiment on the generalizability of direct mail
advertising response techniques to mail survey design,” Journal of the Academy of Marketing Science, 8,
Longworth, Donald S. (1953), “Use of a mail questionnaire,” American Sociological Review, 18, 310-313.
McCrohan, Kevin F. and Larry S. Lowe (1981), “A cost/benefit approach to postage used on mail questionnaires,”
Journal of Marketing, 45, 130-133.
Martin, J. David and Jon P. McConnell (1970), “Mail questionnaire response induction: The effect of four variables
on the response of a random sample to a difficult questionnaire,” Social Science Quarterly, 51, 409-414.
Off, David B. and A. Neyman Clinton, Jr. (1965), “Considerations, costs and returns on a large-scale follow-up
study,” Journal of Educational Research, 58, 373-378.
Orwin, Robert G. and D. S. Corday (1985), “Effects of deficient reporting on meta-analysis: A conceptual
framework and reanalysis,” Psychological Bulletin, 97, 134-147.
Perry, Norman (1974), “Postage combinations in postal questionnaire surveys: Another view,” Journal of the
Market Research Society, 16, 199-210.
Peterson, Robert A. (1975), “An experimental investigation of mail survey responses,” Journal of Business
Research, 3, 199-210.
Pressley, Milton M. (1976), Mail Survey Response: A Critically Annotated Bibliography. Greensboro, NC: Faber &
Pressley, Milton M. and W. L. Tullar (1977), “A factor interactive investigation of mail survey response rates from
a commercial population,” Journal of Marketing Research, 14, 108-111.
———— (1984), “Postage as a response inducing factor in mail surveys of commercial populations,” Working
paper: College of Business, University of New Orleans.
Price, D. O. (1950), “On the use of stamped return envelopes with mail questionnaires,” American Sociological
Review, 15, 672-673.
Robinson, R. A. and Philip Agisim (1951), “Making mail surveys more reliable,” Journal of Marketing, 15,
Rosenthal, Robert (1978), “Combining results of independent studies,” Psychological Bulletin, 85, 185-193.
Scott, Christopher (1961), “Research on mail surveys,” Journal of the Royal Statistical Society. Series A, Part 2,
Veiga, John F. (1974), “Getting the mail questionnaire returned: Some practical research considerations,” Journal of
Applied Psychology, 59, 217-218.
Venne, Richard V. (1954), Direct Mail Announcement of Agricultural Publications. Bulletin 21. Madison, WI:
Department of Agriculture Journalism, College of Agriculture, University of Wisconsin.
Wallace, David (1954), “A case for-and against-mail questionnaires,” Public Opinion Quarterly, 18, 40-52.
Watson, John J. (1965), “Improving the response rate in mail research,” Journal of Advertising Research, 5, 48-50.
Wiseman, Frederick (1973), “Factor interaction effects in mail survey response rates,” Journal of Marketing
Research, 10, 330-333.
Wolfe, Arthur C. and Beatrice R. Trieman (1979), “Postage types and response rates in mail surveys,” Journal of
Advertising Research, 19, 43-48.
Yu, Julie and Harris Cooper (1983), “A quantitative review of research design effects on response rates to
questionnaires,” Journal of Marketing Research, 20, 36-44.