
Students and Teachers’ Knowledge of Sampling and Inference



Anthony Harradine, Carmen Batanero and Allan Rossman
Potts-Baker Institute, Prince Alfred College, Australia;
University of Granada, Spain;
California Polytechnic State University, United States of America
Abstract: Ideas of statistical inference are being increasingly included at various levels of
complexity in the high school curriculum in many countries and are typically taught by
mathematics teachers. Most of these teachers have not received a specific preparation in
statistics and therefore could share some of the common reasoning biases and
misconceptions about statistical inference that are widespread among both students and
researchers. In this chapter the basic components of statistical inference, appropriate to
school level, are analysed, and research related to these concepts is summarised. Finally,
recommendations are made for teaching and research in this area.
Statistical inference, in the simplest possible terms, is the process of assessing
strength of evidence concerning whether or not a set of observations is consistent with a
particular hypothesised mechanism that could have produced those observations. It is an
essential tool in management, politics and research; however, people’s understanding of
statistical inference is generally flawed. The application and interpretation of standard
inference procedures is often incorrect (see, for example, Harlow, Mulaik, & Steiger, 1997;
Batanero, 2000; Cumming, Williams, & Fidler, 2004).
Because of the relevance and importance of statistical inference, education authorities
in some countries include a basic study of statistical inference in the curriculum of the last
year of high school (17-18 year olds). For example, South Australian and Spanish students
learn about statistical tests and confidence intervals for both means and proportions (Senior
Secondary Board of South Australia, 2002; Ministry of Education and Sciences, 2007).
New Zealand students learn about confidence intervals, resampling and randomisation
(Ministry of Education, 2007).
Published in C. Batanero, G. Burrill, & C. Reading (Eds.), Teaching Statistics in School Mathematics-Challenges for Teaching and Teacher Education: A Joint ICMI/IASE Study (pp. 235-246), DOI 10.1007/978-94-007-1131-0, Springer Science+Business Media B.V. 2011. The original publication is available at
Some of the fundamental elements of basic inference are implicitly or explicitly
included in various middle school curricula, as well. For example, the National Council of
Teachers of Mathematics (NCTM) Standards (2000) suggest that Grades 6–8 students
should use observations about differences between two or more samples to make
conjectures about the populations. NCTM further recommends that grades 9-12 should use
simulations to explore the variability of sample statistics from a known population and to
construct sampling distributions; they also should understand how a sample statistic reflects
the value of a population parameter and use sampling distributions as the basis for informal
inference. More recently, the American Statistical Association’s Guidelines for Assessment
and Instruction in Statistics Education (GAISE; Franklin et al., 2005) highlights the need for
students to look beyond the data when making statistical interpretations in the presence of
variability and urges that students in middle grades recognize the feasibility of conducting
inference and that high school students learn to make inferences both with random
sampling from a population and with random assignment to experimental groups.
This chapter analyses the basic elements of statistical inference and then summarises
part of the wider research that is relevant to teaching this topic (see Vallecillos, 1999;
Batanero, 2000; and Castro-Sotos, Vanhoof, Noortgate, & Onghena, 2007, for an expanded
survey). The chapter finishes with some implications for teaching and research.
Classical statistical inference consists primarily of two types of procedures,
hypothesis testing and confidence intervals. These techniques build on a scheme of
interrelated concepts including probability, random sampling, parameter, distribution of
values of a sample statistic, confidence, null and alternative hypothesis, p-value,
significance level, and the logic of inference (Liu & Thompson, 2009).
Consequently, statistical inference consists of three distinct, but interacting,
fundamental elements: (a) the reasoning process, (b) the concepts and (c) the associated
computations. The computations are often easily learned by students and can be
facilitated by user-friendly software. The main difficulties in understanding statistical
inference lie within the other two elements, so teachers of statistics must teach all three
components and not just the mechanics of inference.
2.1. The reasoning process
Garfield and Gal (1999) suggest that, across the primary, middle and high school
years, teachers must develop students’ statistical reasoning, that is, the processes people
use to reason with statistical ideas and make sense of statistical information. This process is
supported by concepts such as distribution, centre, spread, association, uncertainty,
randomness and sampling, some of which have been analysed in other chapters in this
book. While most students may be able to perform the calculations associated with an
inferential process, many students hold deep misconceptions that prevent them from
making an appropriate interpretation of the result of an inferential process (Vallecillos,
1994; Batanero, 2000; Castro-Sotos, et al., 2007). In addition, Garfield (2002) remarks that
some teachers do not specifically teach students how to use and apply types of reasoning
but rather teach concepts and procedures and hope that the ability to reason will develop as
a result. As a consequence, students reach their first inferential reasoning experience with a
reasoning-free statistical background, giving rise to a mind-set that statistics is solely about
the computation of numerical values. One possible reason for this unfortunate circumstance
is that teachers responsible for teaching statistics at a high school level may have serious
deficiencies in their knowledge that lead to inadequate understandings of inference (Liu &
Thompson, 2009).
2.2. The concepts
Central to learning statistical inference is understanding that the variation of a given
statistic (e.g. the mean) calculated from single random samples is described by a probability
distribution known as the sampling distribution of the statistic. When thinking about
statistical inference it is necessary to be able to clearly differentiate between three
distributions:
(a) The probability distribution that models the values of a variable from the
population/process. This distribution usually depends on some (typically unknown)
parameter values. For example, a normally distributed population is specified by two
parameters, its mean and standard deviation, often denoted by µ and σ.
(b) The data distribution of the values of a variable for a single random sample taken from
the population/process. From this sample, statistics such as the mean and
standard deviation, often denoted by x̄ and s, can be used in the process of estimating
the unknown values of the population parameters.
(c) The probability distribution that models the variability in values of a statistic from ‘all’
potential random samples taken from the population/process, called the sampling
distribution. One example is the sampling distribution of a sample mean, which in many
circumstances has an approximately normal distribution with mean µ and standard
deviation σ/√n, where n represents the sample size. This result provides the basis for
much of classical statistical inference.
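The behaviour of the sampling distribution of a sample mean can be checked with a short simulation. The following Python sketch is illustrative only: the population values (mean 170, standard deviation 10), the sample size and the function name simulate_sample_means are assumptions made for this example and do not come from the studies reviewed here. It draws many random samples from a normal population and compares the spread of the resulting sample means with σ/√n.

```python
import random
import statistics


def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples random samples of size n from a normal population
    and return the list of their sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


# Hypothetical population parameters and sample size (assumed for illustration).
MU, SIGMA, N = 170.0, 10.0, 25
sample_means = simulate_sample_means(MU, SIGMA, N, num_samples=5000)

# The case in this distribution is a whole sample, not an individual.
print("mean of the sample means:", round(statistics.fmean(sample_means), 2))
print("SD of the sample means:  ", round(statistics.stdev(sample_means), 2))
print("sigma / sqrt(n):         ", SIGMA / N ** 0.5)
```

The mean of the simulated sample means should lie close to µ and their standard deviation close to σ/√n, which is noticeably smaller than the population standard deviation σ.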
Sampling distributions are more abstract than the distribution of a population or a
sample and so are typically very challenging for students to understand (see section 3.2).
One reason for this difficulty is that when thinking about both the population distribution
and the single random sample’s distribution, the unit of analysis (case) is an individual
object. This is in stark contrast to the sampling distribution where the case is a single
random sample (Batanero, Godino, Vallecillos, Green, & Holmes, 1994). The object of
interest for each distribution might be the mean, for example, but in each case the
distribution’s mean has a different interpretation and a different behaviour. One strategy for
helping students to understand these distinctions is to engage in activities that involve
repeatedly taking random samples from a population. When working with such activities,
high school students often struggle with moving between the various levels of imagery
(Saldanha & Thompson, 2002). Proper application and interpretation of statistical inference
requires mastery of the knowledge and techniques specific to each distribution and
understanding of the rich links among these distributions.
Research reviewed in this section deals with understanding sampling and the
sampling distribution, hypothesis tests and confidence intervals.
3.1. Understanding sampling
Research on inferential reasoning started with the heuristics and biases programme of
research in psychology (Kahneman, Slovic, & Tversky, 1982), which established that most
people do not follow the normative mathematical rules that guide formal scientific
inference when they make a decision under uncertainty. Instead, people tend to use simple
judgmental heuristics that sometimes cause serious and systematic errors, and such errors
are resistant to change. For example, in the representativeness heuristic, people tend to
estimate the likelihood for an event based on how well it represents some aspects of the
parent population. An associated fallacy that has been termed belief in the Law of Small
Numbers is the belief that even small samples should exactly reflect all the characteristics
in the population distribution.
Most curricula at a high school level include some instruction on random sampling,
which is mostly theoretical and includes descriptions of different methods of random
sampling. The core message of such instruction is that if a sample is chosen in a suitable
random manner and is sufficiently big, it will be representative of the population from
which it has been drawn. Students therefore learn to think about a random sample as a mini-
me of the population and that the purpose of drawing a random sample is to ensure
representativeness in order to gain knowledge about the population from the sample. This
conception constrains students’ thinking to a single random sample only and provides no
avenue to appreciate the range of possible samples that might have been drawn and the
variability across that range.
Understanding the purpose of drawing a single random sample in the context of
hypothesis tests and confidence intervals requires the assimilation of “two apparently
antagonistic ideas: sample representativeness and (sampling) variability” (Batanero et al.,
1994). In these situations the purpose of drawing a single sample is to quantify that
sample’s level-of-unusualness relative to the many other samples that could have been
drawn. Saldanha and Thompson (2002) observed that, without a suitable sense of the
variation across many possible samples, which extends to the notion of the distribution of a
statistic, 11th and 12th grade students tended to judge a sample’s representativeness only in
relation to the population parameter. Hence, when required to decide how rare a sample
was, these students did so based on how different they thought it was to the underlying
population parameter and not “on how it might compare to a clustering of the statistic’s
values” (Saldanha & Thompson, 2002).
3.2. Understanding sampling distributions
Reasoning about sampling distributions requires students to integrate several
statistical concepts and to be able to reason about the hypothetical behaviour of many
samples, an intangible thought process for many students (Chance, delMas, & Garfield,
2004). According to these authors, many students fail to develop a deep understanding of
the sampling distribution concept and as a result can only manage a mechanical knowledge
of statistical inference, leaving such tasks as interpreting a p-value well beyond those
Saldanha and Thompson (2002) studied the understandings of high school students
when engaged in activities that used computer applets to simulate repeated random
sampling from a population. The activity required students to randomly draw a sample from
a population, compute a sample proportion and then repeat this process over and over. They
found that most students had extreme difficulty in conceiving of repeated sampling in terms
of three distinct levels: population, sample, collection of sample statistics. These difficulties
led many students to misinterpret a simulation’s result as a percentage of people rather than
a percentage of sample proportions.
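The three levels involved in such an activity can be made explicit in a short simulation. The Python sketch below is a hypothetical illustration and not the applet used in the study: the population (10,000 people, 60% answering 'yes'), the sample size of 50, the 0.70 cut-off and the function name sample_proportion are all assumptions chosen for the example.

```python
import random


def sample_proportion(population, n, rng):
    """Level 2: draw one random sample of n people and return the
    proportion of 'yes' answers in that sample."""
    sample = rng.sample(population, n)
    return sum(sample) / n


rng = random.Random(1)

# Level 1: a hypothetical population of 10,000 people, 60% of whom say 'yes' (1).
population = [1] * 6000 + [0] * 4000

# Level 3: a collection of sample proportions, one per repeated sample.
proportions = [sample_proportion(population, 50, rng) for _ in range(2000)]

# The case here is a sample, not a person: this is the fraction of
# *samples* whose proportion of 'yes' answers is at least 0.70.
share_of_samples = sum(p >= 0.70 for p in proportions) / len(proportions)
print(f"{share_of_samples:.1%} of samples had a sample proportion of at least 0.70")
```

The printed percentage refers to samples, not to people. Reading it as "the percentage of people who said yes" is the misinterpretation described above.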
Chance et al. (2004) found that while students were able to observe behaviours and
notice patterns in the behaviour (e.g. the larger the sample size, the smaller the variation)
shown by random sampling applets, they did not understand why the behaviour occurred. The
authors noted that, after exposure to applets, students were unable to suggest plausible
distributions of samples for a given sample size and agreed with Saldanha and Thompson that students
did not have a clear distinction between the distribution of one sample of data and the
distribution of means of samples. Simply being exposed to the applets was not sufficient to
render a learning gain. The authors concluded that: (a) students need to become more
familiar with the process of sampling, (b) activities associated with applets need to be both
structured and unstructured, and (c) students need to discuss their observations after an
activity so they can focus on which observations are most important, what
important observations they did not make and how the important observations are
3.3. Understanding the null and alternative hypotheses
Errors and misinterpretations in hypothesis tests can lead to a paradoxical situation,
where, on one hand, a significant result is often required to get a paper published in many
journals and, on the other hand, significant results are misinterpreted in these publications
(Falk & Greenbaum, 1995). There is confusion between the roles of the null and alternative
hypotheses as well as between the statistical alternative hypothesis and the research
hypothesis (Chow, 1996). Vallecillos (1994) reported that many students in her research,
including 6 out of 31 pre-service mathematics teachers, believed that correctly carrying out
a test proved the truth of the null hypothesis, as in the case of a deductive procedure.
Vallecillos (1999) described four different conceptions regarding the type of proof that
hypothesis tests provide: (a) as a decision-making rule, (b) as a procedure for obtaining
empirical support for the hypothesis being researched, (c) as a probabilistic proof of the
hypotheses, and (d) as a mathematical proof of the truth of the hypothesis. While the first
two conceptions are correct, many students in her research, including some pre-service
teachers, held either conception (c) or (d).
Belief that rejecting a null hypothesis means that one has proven it to be wrong was
also found in the research by Liu and Thompson (2009) when interviewing eight high school
statistics teachers, who seemed not to understand the purpose of statistical tests as
mechanisms to carry out statistical inferences.
3.4. Understanding statistical significance and p-values
Two particularly misunderstood concepts are the significance level and the p-value.
The significance level is defined as the probability of rejecting a null hypothesis, given
that it is true. The p-value is defined as the probability of observing the empirical value of
the statistic or a more extreme value, given that the null hypothesis is true. The most common
misinterpretation of these concepts consists of switching the two terms in the conditional
probability: interpreting the level of significance as the probability that the null hypothesis
is true once the decision has been made to reject it or interpreting the p-value as the
probability that the null hypothesis is true, given the observed data. For example, Birnbaum
(1982) reported that his students found the following definition reasonable: "A level of
significance of 5% means that, on average, 5 out of every 100 times we reject the null
hypothesis, we will be wrong". Falk (1986) found that most of her students believed that α
was the probability of being wrong when rejecting the null hypothesis at a significance
level α. Similar results were found by Krauss and Wassner (2002) in university lecturers
involved in the teaching of research methods. More specifically they found that 4 out of
every 5 methodology instructors have misconceptions about the concept of significance,
just like their students. Vallecillos (1994) carried out extensive research on students’
misconceptions related to statistical tests (n=436 students from different backgrounds) that
included 31 pre-service mathematics teachers (students graduating in mathematics), 13 of
whom interpreted the level of significance as the probability that the null hypothesis is true,
once the decision to reject it has been made.
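The difference between the two conditional probabilities can be made concrete with a simulation. The Python sketch below rests entirely on assumed values chosen for illustration: a z-test with known standard deviation, a true null hypothesis in 80% of the simulated studies, and a true mean of 0.5 otherwise. None of these figures comes from the studies cited above.

```python
import random
from statistics import NormalDist, fmean

ALPHA = 0.05
N = 25                  # sample size per simulated study
TRUE_NULL_SHARE = 0.8   # assumed share of studies in which H0 (mu = 0) is true
EFFECT = 0.5            # assumed true mean when H0 is false (sigma = 1)


def p_value(sample):
    """Two-sided p-value of a z-test of H0: mu = 0 with known sigma = 1."""
    z = fmean(sample) * len(sample) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


rng = random.Random(2)
rejections_when_true = true_nulls = rejections = 0
for _ in range(20000):
    null_is_true = rng.random() < TRUE_NULL_SHARE
    mu = 0.0 if null_is_true else EFFECT
    sample = [rng.gauss(mu, 1.0) for _ in range(N)]
    rejected = p_value(sample) < ALPHA
    true_nulls += null_is_true
    rejections += rejected
    rejections_when_true += rejected and null_is_true

# P(reject H0 | H0 true): the quantity the significance level controls.
type_i_rate = rejections_when_true / true_nulls
# P(H0 true | H0 rejected): the quantity the misreading attributes to alpha.
wrong_given_rejected = rejections_when_true / rejections

print("P(reject | H0 true)   ~", round(type_i_rate, 3))
print("P(H0 true | rejected) ~", round(wrong_given_rejected, 3))
```

Under these assumptions the first proportion stays close to the significance level, while the second is several times larger. The second quantity also changes with the assumed share of true null hypotheses and with the power of the test, which is why it cannot be read off from α.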
Liu and Thompson (2009) remark that the ideas of probability and unusualness are
central to the logic of hypothesis testing, where one rejects a null hypothesis when a sample
from this population is judged to be sufficiently unusual in light of the null hypothesis.
However, they found that teachers’ “conceptions of probability (or unusualness) were not
grounded in a conception of distribution and thus did not support thinking about
distributions of sample statistics and the fraction of the time that a statistic’s value is in a
particular range” (p. 16). While a single random sample is a critical part of statistical
inference, probably more important is an appreciation of the "could-have-been": all the
other random samples that could have been drawn but were not. “Sampling has not been
characterized in the literature as a scheme of interrelated ideas entailing repeated random
selection, variability, and distribution.” (Saldanha & Thompson, 2002, p. 258).
3.5. Understanding confidence intervals
Fidler and Cumming (2005) asked a sample of 55 undergraduate and postgraduate
science students to interpret statistically non-significant results and gave the results in two
different ways (first as p-values and then as confidence intervals or vice versa). Students
were asked to indicate whether the results provided support for the null hypothesis
(considered as a misconception), provided support against the null hypothesis, or neither.
The authors found that students misinterpreted p-values twice as often as they
misinterpreted confidence intervals. There was also evidence that students who were given the
confidence interval results first gave the correct answer on the p-value presentation more
often than students who were given the p-value results first. The authors concluded there are
benefits of teaching inference via confidence intervals rather than hypothesis tests.
Cumming et al. (2004) reported an internet study in which researchers were given
results from an experiment (simulated in an applet) and were asked to show where they
thought the 10 means from 10 ‘new’ samples could plausibly fall. The results suggested
that a majority of the researchers held a misconception that an r% confidence interval will,
on average, capture r% of the means of the ‘new’ samples.
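A simulation shows why this belief is a misconception. The Python sketch below is a simplified illustration and not the applet used in the study: it assumes a normal population with known standard deviation, so that a z-interval can be used, and the population values and sample size are hypothetical. In this simplified setting the difference between two independent sample means has standard deviation √2·σ/√n, so a 95% interval around the first mean captures a replication mean with probability P(|Z| ≤ 1.96/√2), which is about 0.83 rather than 0.95.

```python
import random
from statistics import NormalDist, fmean

MU, SIGMA, N = 50.0, 10.0, 20     # hypothetical population and sample size
Z = NormalDist().inv_cdf(0.975)   # critical value for a 95% interval
HALF_WIDTH = Z * SIGMA / N ** 0.5
TRIALS = 20000


def sample_mean(rng):
    """Mean of one random sample of size N from the assumed population."""
    return fmean(rng.gauss(MU, SIGMA) for _ in range(N))


rng = random.Random(3)
covers_mu = covers_replication = 0
for _ in range(TRIALS):
    original = sample_mean(rng)      # centre of the confidence interval
    replication = sample_mean(rng)   # mean of one 'new' sample
    covers_mu += abs(original - MU) <= HALF_WIDTH
    covers_replication += abs(original - replication) <= HALF_WIDTH

print("share of intervals capturing mu:                ", covers_mu / TRIALS)
print("share of intervals capturing a replication mean:", covers_replication / TRIALS)
```

The confidence level describes how often the interval captures the population mean. The capture rate for the mean of a new sample is lower because the new mean carries its own sampling variability.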
Castro-Sotos (2009) reported slightly lower percentages of students with certain
misconceptions related to hypothesis testing when compared to similar studies from years
before. The author suggests that innovation in statistics education in the last decade may be
resulting in some level of improved understanding of statistical inference. While this is
merely conjecture, it highlights the idea that students must develop an understanding of
many challenging probabilistic and statistical concepts and the relationships between them
before meeting statistical inference. Given the difficulty learners have integrating the
concepts involved in statistical inference, it makes sense that the underpinning ideas need to
be developed over years, not weeks.
4.1. Inference-friendly views of a sample
Statistical inference is applied to a wide variety of situations. However, understanding
why it can be validly applied to one situation does not mean learners will understand why it
can (or cannot) be validly applied to another, e.g. a situation involving the mean of a finite
population compared to a situation involving measurement error (where a population does
not exist, but a true value of the measurement does). Students need to hold multiple views
of a sample, appreciating the source(s) of the variability that give rise to the sample’s
characteristics, to deeply understand statistical inference and its many applications. Context
is clearly critical in supporting a student to develop different views of a sample. Konold and
Lehrer (2008) discuss three contexts from which samples are produced: measurement error,
manufacturing processes and natural variation.
A critical view of a sample is as the result of a target-error process, which aims to
consistently produce a single value but fails due to the unavoidable variation in the process
(e.g. the machine process that aims to cut fruit bars to be exactly 7 cm long). This can be
referred to as the target-error-view of sample. Opportunities to develop this view are rarely,
if ever, provided at a school level. Natural variation contexts (e.g. the weight of all female
quokkas on Rottnest Island) are the most common contexts students meet at school but do
not help in developing this critical view of a sample.
Students also need opportunities, over a period of years, to develop a view of a
sample as a single instantiation of the random sampling process from a population and to
develop the appreciation that each possible random sample carries with it an associated
level of unusualness (the probability of being drawn). This is referred to as the
population-view of a sample. While this is the most common view, and current school curricula
attempt to develop this using contexts associated with natural variation, it is possible that the
target-error-view of a sample should be developed prior to the population-view. Konold,
Harradine, and Kazak (2007) describe activities in which middle school students build data
factories with the aim of assisting in the development of the target-error-view. Their
approach also develops the notion that data result from chance-based processes and as such
makes explicit the relationship between data and chance, a relationship critical to
understanding statistical inference and that has been lost (or was never present) in many
current school curricula (Konold & Kazak, 2008). Without such views of a sample, it is
difficult to develop a deep understanding of, and validly apply, statistical inference.
4.2. Developing an understanding of the population-view of a sample
Many interactive applets are now available that provide dynamic, visual
environments within which students can engage in the construction of sampling
distributions. Chance et al. (2004) reported on a series of studies that investigated the
impact that interacting with such applets had on students’ understanding when learning
about sampling distributions. In the first studies, students tended to look for rules when
answering test items and did not understand the underlying relationships that caused the
visible patterns they noticed as a result of using the applets. In later studies, the authors
asked the students to make predictions about sampling distributions of means before using
the applets to validate their predictions. This strategy proved to be useful in improving the
students' reasoning about sampling distributions.
4.3. Alternative ways to introduce statistical inference
Most students’ first introduction to statistical inference is via a first course in classical
statistical inference. In recent years the literature has included thinking about what is
termed informal inference. While informal inference, as a concept, is not yet universally
agreed upon, a consistent feature of informal inference is that suggested activities engage
students in the reasoning process of statistical inference without relying on probability
distributions and formulas.
Some see informal inference as the collection of the fundamental ideas that underpin
the understanding of classical statistical inference. These fundamentals include
discriminating between signal and noise in aggregates, understanding sources of variability,
recognizing the effect of sample size, and being able to identify tendencies and sources of
bias (Rubin, Hammerman, & Konold, 2006). Other views of informal inference include
(Zieffler, Garfield, Delmas, & Reading, 2008): (a) reasoning about possible characteristics
of a population from a sample of data, (b) reasoning about possible differences between
two populations from observed differences between two samples of data and, (c) reasoning
about whether or not a particular sample statistic is likely or unlikely given a particular
expectation about the population.
Cobb (2007) proposes teaching the logic of inference with randomisation tests rather
than using normal distributions as approximate models for sampling distributions, noting
that such an approach is what Ronald Aylmer Fisher advocated, but which was not realistic
in his day due to the absence of computers. Rossman (2008) claims that teachers could use
randomisation tests to connect the randomness that students perceive in the process of
collecting data to the inference to be drawn. He provides examples of how such a
randomization-based approach might be implemented, while Scheaffer and Tabor (2008)
propose such an approach for the secondary curriculum and provide relevant examples.
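The logic of a randomisation test can be sketched in a few lines of Python. The sketch below is a minimal illustration under stated assumptions and is not taken from the cited authors: the function name randomisation_test and the two sets of scores are hypothetical, and the test statistic is assumed to be the absolute difference in group means.

```python
import random
from statistics import fmean


def randomisation_test(group_a, group_b, num_shuffles=10000, seed=0):
    """Approximate two-sided p-value for the difference in group means,
    obtained by repeatedly re-randomising the group labels."""
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(num_shuffles):
        # Mimic the random assignment under the null hypothesis that
        # the group labels make no difference to the responses.
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        # Small tolerance so that floating-point ties count as 'as extreme'.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / num_shuffles


# Hypothetical scores from a randomised experiment (illustrative data only).
treatment = [27, 31, 29, 35, 33, 30, 36, 32]
control = [25, 28, 24, 30, 26, 27, 29, 23]
p = randomisation_test(treatment, control)
print("approximate p-value:", p)
```

The re-randomisation step repeats the same chance process that assigned units to groups, so the p-value is simply the share of re-randomisations that produce a difference at least as large as the one observed. No normal approximation to a sampling distribution is required.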
4.4. Teacher knowledge
Research results summarised in this chapter primarily concern students’
misconceptions and difficulties in learning about statistical inference. The little research
available about teachers’ understanding of statistical inference (Vallecillos, 1994; 1999;
Krauss & Wassner, 2002; Liu & Thompson, 2009) indicates it is possible that some
teachers share the same misconceptions as the students. In addition, teachers who have not
studied statistical inference prior to having to teach it are likely to have the same difficulties
in learning the concepts as students do. If this is the case and the situation is not addressed,
then it is unlikely that widespread improvement in student understanding will be seen any
time soon.
4.5. Some research priorities
The valid application of statistical inference is of critical importance in a broad range
of human endeavours. Areas in which research attention is needed include:
(a) The creation and critical evaluation of a curriculum that systematically develops the
key ideas that underpin statistical inference across a number of years in the middle and
high school years, so a proper foundation is laid for the formal instruction of statistical
inference.
(b) The study of the current level of understanding and professional knowledge, both at a
school and university level, of those teachers charged with teaching statistical
inference.
(c) The critical evaluation of the use of alternative methods (e.g. randomisation tests)
when first introducing statistical inference. Great care should be taken in this area
given the widespread and long-term use of classical statistical inference.
References
Batanero, C. (2000). Controversies around significance tests. Mathematical Thinking and
Learning, 2(1-2), 75-98.
Batanero, C., Godino, J. D., Vallecillos, A., Green, D. R., & Holmes, P. (1994). Errors and
difficulties in understanding elementary statistical concepts. International Journal of
Mathematical Education in Science and Technology, 25 (4), 527–547.
Birnbaum, I. (1982). Interpreting statistical significance. Teaching Statistics, 4, 24–27.
Castro-Sotos, A. E. (2009). How confident are students in their misconceptions about
hypothesis tests? Journal of Statistics Education 17 (2). Online:
Castro-Sotos, A. E., Vanhoof, S., Noortgate, W. & Onghena, P. (2007). Students’
misconceptions of statistical inference: A review of the empirical evidence from
research on statistics education. Educational Research Review, 2, 98–113.
Chance, B., delMas, R. C., & Garfield, J. (2004). Reasoning about sampling distributions.
In D. Ben-Zvi & J. Garfield (Eds.), The challenge of developing statistical literacy,
reasoning and thinking (pp. 295-323). Amsterdam: Kluwer.
Chow, S. L. (1996). Statistical significance: Rationale, validity and utility. London: Sage.
Cobb, G. (2007). The introductory statistics course: A Ptolemaic curriculum? Technology
Innovations in Statistics Education, 1(1). Online:
Cumming, G., Williams, J., & Fidler, F. (2004). Replication, and researchers’
understanding of confidence intervals and standard error bars. Understanding
Statistics, 3, 299-311.
Falk, R. (1986). Misconceptions of statistical significance. Journal of Structural Learning,
9, 83-96.
Falk, R., & Greenbaum, C. W. (1995). Significance tests die hard: The amazing persistence
of a probabilistic misconception. Theory and Psychology, 5 (1), 75-98.
Fidler, F., & Cumming, G. (2005). Teaching confidence intervals: Problems and potential
solutions. Proceedings of the International Statistical Institute 55th Session. Sydney,
Australia: International Statistical Institute. Online:
Franklin, C., Kader, G., Mewborn, D., Moreno, J., Peck, R., Perry, M., & Scheaffer, R.
(2005). Guidelines for assessment and instruction in statistics education (GAISE)
report: a preK-12 curriculum framework. Alexandria, VA: American Statistical
Association. Online:
Garfield, J. B. (2002). The challenge of developing statistical reasoning. Journal of
Statistics Education, 10 (3). Online:
Garfield, J., & Gal, I. (1999). Teaching and assessing statistical reasoning. In L. Stiff (Ed.),
Developing mathematical reasoning in grades K-12 (pp. 207-219). Reston, VA:
National Council of Teachers of Mathematics.
Harlow, L. L., Mulaik, S. A., & Steiger, J. H. (1997). What if there were no significance
tests? Mahwah, NJ: Lawrence Erlbaum Associates.
Kahneman, D., Slovic, P., & Tversky, A. (1982). Judgment under uncertainty: Heuristics
and biases. New York: Cambridge University Press.
Konold, C., Harradine, A., & Kazak, S. (2007). Understanding distributions by modeling
them. International Journal of Computers for Mathematical Learning, 12 (3), 217-
Konold, C., & Lehrer, R. (2008). Technology and mathematics education: An essay in
honor of Jim Kaput. In L. D. English (Ed.), Handbook of international research in
mathematics education (2nd ed.) (pp. 49–71). New York: Routledge.
Konold, C., & Kazak, S. (2008). Reconnecting data and chance. Technology Innovations in
Statistics Education, 2(1). Online:
Krauss, S., & Wassner, C. (2002). How significance tests should be presented to avoid the
typical misinterpretations. In B. Phillips (Ed.), Proceedings of the Sixth International
Conference on Teaching Statistics. Cape Town: International Statistical Institute and
International Association for Statistical Education. Online:
Liu, Y., & Thompson, P. W. (2009). Mathematics teachers' understandings of proto-
hypothesis testing. Pedagogies, 4 (2), 126-138.
Ministry of Education and Sciences (2007). Real Decreto 1467/2007, de 2 de noviembre,
por el que se establece la estructura del bachillerato y se fijan sus enseñanzas
mínimas (Royal Decree establishing the structure of high school curriculum).
Ministry of Education (2007). The New Zealand Curriculum. Wellington, New Zealand:
Learning Media Limited.
National Council of Teachers of Mathematics. (2000). Principles and standards for school
mathematics. Reston, VA: Author.
Rossman, A. (2008). Reasoning about informal statistical inference: One statistician’s view.
Statistics Education Research Journal, 7 (2), 5-19. Online:
Rubin, A., Hammerman, J. K. L., & Konold, C. (2006). Exploring informal inference with
interactive visualization software. In B. Phillips (Ed.), Proceedings of the Sixth
International Conference on Teaching Statistics. Cape Town, South Africa:
International Association for Statistics Education. Online:
Saldanha, L., & Thompson, P. (2002). Conceptions of sample and their relationship to
statistical inference. Educational Studies in Mathematics, 51, 257-270.
Saldanha, L., & Thompson, P. (2007). Exploring connections between sampling
distributions and statistical inference: an analysis of students’ engagement and
thinking in the context of instruction involving repeated sampling. International
Electronic Journal of Mathematics Education, 3, 270-297.
Scheaffer, R., & Tabor, J. (2008). Statistics in the high school mathematics curriculum:
Building sound reasoning under uncertainty. Mathematics Teacher, 102 (1), 56-61.
Senior Secondary Assessment Board of South Australia (SSABSA) (2002). Mathematical studies
curriculum statement. Adelaide, Australia: SSABSA.
Vallecillos, A. (1994). Estudio teórico-experimental de errores y concepciones sobre el
contraste estadístico de hipótesis en estudiantes universitarios (Theoretical and
experimental study on errors and conceptions about hypothesis testing in university
students). Unpublished Ph.D. thesis. University of Granada, Spain.
Vallecillos, A. (1999). Some empirical evidence on learning difficulties about testing
hypotheses. Proceedings of the International Statistical Institute 52nd Session.
Helsinki: International Statistical Institute. Online:
Zieffler, A., Garfield, J. B., delMas, R., & Reading, C. (2008). A framework to support
research on informal inferential reasoning. Statistics Education Research Journal,
7(2), 40-58. Online: