Article

Statistics as Principled Argument

Taylor & Francis on behalf of the American Statistical Association
The American Statistician

... A strategy of "exploring small" as Sakaluk (2016) recommends, while "confirming big," calls for the use of varying techniques whose strengths are best suited to the problem's constraints. Data analysis and inference require experience and judgment (Abelson, 1995; Krantz, 1999). An eclectic and prudent perspective highlights the need for shared ethical standards. ...
... We submit that the use of significance testing in experimental work with small to medium-sized samples may remain beneficial, especially in cases involving new questions, and assuming that researchers will consider a variety of options from the statistical toolbox. This conclusion resembles Fisher's original advice (see also Cohen, 1990; Abelson, 1995; Wilkinson and The Task Force on Statistical Inference, 1999; Nuzzo, 2014; Sakaluk, 2016). In contrast, the eminent Bayesian Lindley (1975, p. 112) asserted that "all those methods that violate the likelihood principle" should be left to die. ...
Book
Full-text available
This Research Topic focuses on the questions “behind” empirical research in the social sciences, especially in psychology, sociology and education, and presents various ideas about the nature of empirical knowledge and the values knowledge is or should be based on. The questions raised in the contributions are central for empirical research, especially with respect to disciplinary and epistemological diversity among researchers. This diversity is also mirrored by the variety of article types collected in this issue, “Hypotheses & Theory,” “Methods,” “Conceptual Analyses,” “Review,” “Opinion,” “Commentary,” and “Book Review.” [Excerpt from the editorial]
... The NHST procedure is what Abelson (1995) called "ritualized devil's advocacy." It is a recipe, a formula, that can be applied to almost any null hypothesis about a data set. ...
... We should note that the probability of H0 (or H1) given the data is not a p-value, although many students interpret the p-value in this way. Bayesian approaches, therefore, are more intuitive than frequentist approaches, and answer questions in ways that solve Abelson's (1995) concerns about the counter-intuitive nature of NHST. ...
Article
Statistical thinking is essential to understanding the nature of scientific results as a consumer. Statistical thinking also facilitates thinking like a scientist. Instead of emphasizing a “correct” procedure for data analysis and its outcome, statistical thinking focuses on the process of data analysis. This article reviews frequentist and Bayesian approaches such that teachers can promote less well-known statistical perspectives to encourage statistical thinking. Within the frequentist and Bayesian approaches, we highlight important distinctions between statistical evaluation versus estimation using an example on the facial feedback hypothesis (Strack, Martin, & Stepper, 1998). We first introduce statistical concepts, which are then illustrated with simulated data. Finally, we demonstrate how these approaches are applied to empirical data obtained from a Registered Replication Report (Wagenmakers et al. 2016). Data and R code for the example are provided as supplementary teaching material. We conclude with a discussion of key learning outcomes centered on promoting statistical thinking.
... The P-value does not explicitly refer to the probability of the null hypothesis being true, but it does provide a "measure of the strength of evidence against H0 (the null hypothesis)" (Dorey 2010, p. 2297). Abelson (1995) referred to "discrediting the null hypothesis" (p. 10) based on the P-value from a statistical test. ...
... This article highlights three issues relevant to improving the evidential quality within lighting research: determination and justification of sample sizes, assessment of test assumptions, and reporting of statistical results, particularly effect sizes. Further treatment of these issues can also be found in a number of other texts (e.g., Abelson 1995; Cohen 2013; Field et al. 2012; Haslam and McGarty 2018). ...
Article
The reporting of accurate and appropriate conclusions is an essential aspect of scientific research, and failure in this endeavor can threaten the progress of cumulative knowledge. This is highlighted by the current reproducibility crisis, and this crisis disproportionately affects fields that use behavioral research methods, as in much lighting research. A sample of general and topic-specific lighting research papers was reviewed for information about sample sizes and statistical reporting. This highlighted that lighting research is generally underpowered and, given median sample sizes, is unlikely to be able to reveal small effects. Lighting research most commonly uses parametric statistical tests, but assessment of test assumptions is rarely carried out. This risks the inappropriate use of statistical tests, potentially leading to type I and type II errors. Lighting research papers also rarely report measures of effect size, and this can hamper cumulative science and power analyses required to determine appropriate sample sizes for future research studies. Addressing the issues raised in this article related to sample sizes, statistical test assumptions, and reporting of effect sizes can improve the evidential value of lighting research.
... As a result of the omnipresence of variation in statistical investigations, there is no certainty in the solutions. The end product of a statistical investigation is better thought of as a well principled argument (Abelson 1995). Mathematics on the other hand is generally treated in a very deterministic way, logically deducing a single solution to a problem using theorems, axioms, and definitions from the community of mathematics (Gattuso and Ottaviani 2011). ...
... Statistics' value in K-12 education with the goal of preparing students to become citizens in today's information based societies comes from the core practices that make up the statistical process: to pose questions, collect relevant data, analyze the data in the context of a problem (Franklin et al. 2007), and then verbalize the story that the data tell about an issue to others in a precise well principled argument (Abelson 1995). These practices situated in mathematics classrooms can begin to provide students with experiences in critically investigating and critiquing their own context in society, while developing the statistical concepts and practices that will enable them to make sense of their context. ...
Chapter
This chapter discusses how ideas from critical mathematics education and statistics education intersect and could be used to transform the types of experiences that students have with both mathematics and statistics in the school mathematics curriculum. Key ideas from the critical mathematics literature are described to provide a background from which to discuss what a critical statistics education could be. The chapter ends with a discussion of some of the major barriers that need to be considered to make such a vision a reality and possible future directions for moving towards making a critical statistics education a reality.
... 108 But ritualized or unthinking uses of significance tests are not helpful, especially when the ritual displaces reasoned judgment about the meaning of a hypothesis test. Others have offered many instructive illustrations of the traps and failures that follow from that displacement (e.g., Abelson, 1995; Clift, 1989; Cohen, 1994). 109 To cite two quick illustrations from Abelson: ...
Preprint
Full-text available
A short primer on inference (observational, mensural, causal, and statistical), requiring very little mathematics.
... In several cases, the estimated odds ratio is small enough to become zero when rounded. This is because one of the odds being compared is close to zero, which indicates a strong negative association (Abelson, 1995). ...
Article
Full-text available
The travel mode preference exists in both culture and the environment. The wide scale of people's mobility makes our cities more polluted and congested, eventually affecting urban assets. Understanding people’s mode choice is important to develop urban transportation planning policies effectively. This study aims to model and predict the commuter’s mode choice behaviour in Lahore, Pakistan. A survey was conducted, and the data was used for model validation. The comparative study was further done among multinomial logit model (MNL), Random Forest (RF), and K-Nearest Neighbor (KNN) classification approaches. It’s common in existing studies that vehicle ownership is ranked as the most important among all features impacting commuters’ travel mode choice. Since many commuters in Lahore own no vehicle, it’s unclear what the rank of factors impacting non-vehicle owners is. Other than the comparison of predicting the performance of the methods, our contribution is to do more analysis of the rank of factors impacting the different types of commuters. It was observed that occupation is ranked as the most important among all features for non-vehicle owners.
... In domains that are clinically unexplored such as the role of VR in the present study, compelling but principled arguments that aim to explain rather than quantify observations are part of the natural history of clinically relevant, statistically significant findings. [65][66][67][68][69] By building upon recent preliminary findings, 66-69 a limited technology investigating evidence-based, yet clinically non-validated, neurovascular biomarkers with a proposed standardized methodology has been able to provide an exploratory answer to an interesting clinical question. Inherent statistical limitations in magnitude of effect, articulation of precision, and generalizability of results were partially mitigated by applied importance and provisional credibility based on theoretical coherence, methodological soundness, and results of limited statistical and yet to be determined clinical significance. ...
Article
Purpose: The purpose of this study was to measure sensitivity of virtual reality (VR) in detecting biomarkers of neurovascular remodeling suboptimally evaluated in digital subtraction angiography (DSA) of treated unruptured, intracranial aneurysms. Methods: The sensitivity of virtual reality and digital subtraction angiography in detection of neurovascular biomarkers in aneurysms treated with flow diversion, coiling, and clipping were evaluated. Validated grading scales were integrated into a standardized rating platform. The respective novel and conceptual measures of minimal imaging important difference (MIID) and number needed to image (NNI) were calculated for each biomarker. Results: In flow diversion, coiling, and clipping, minimal imaging important difference and number needed to image were associated with virtual reality in detection of abnormal biomarkers, with the exception of stasis phase associated with digital subtraction angiography. Number needed to image was associated with flow diversion stent stenosis (RR: 7.00, 95% CI 0.37 to 131.97; OR: 7.46, 95% CI 0.38 to 148.49). Minimal imaging important difference was greatest in residual aneurysm filling (25%±66, 95% CI) in flow diversion and Meyer score in coiling (42.5%±17.69, 95% CI) and clipping (22.2%±13.58, 95% CI). Regression models demonstrated minimal imaging important difference and number needed to image shared a significant correlation (R² = 0.99, 95% CI, p<0.001). Conclusion: Virtual reality adds value to digital subtraction angiography in evaluation of aneurysms treated with flow diversion, coiling, and clipping. Larger, prospective studies are warranted to increase statistical power and validate clinical significance.
... Stouffer's Z-score (Stouffer et al., 1949; Mosteller and Bush, 1954) method was designed for independent hypotheses. Abelson (2012) found that Stouffer's test is more sensitive to consistent departures from the null hypothesis than Fisher's method for independent tests. ...
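To make the comparison in the excerpt above concrete, here is a minimal Python sketch of both combination rules, assuming independent tests and one-sided p-values strictly between 0 and 1. The function names `stouffer_p` and `fisher_p` are hypothetical, chosen for illustration only, and are not taken from any of the cited works.

```python
import math
from statistics import NormalDist


def stouffer_p(p_values):
    """Combine one-sided p-values with Stouffer's Z-score method."""
    nd = NormalDist()
    # Convert each p-value to a z-score, then average on the sqrt(k) scale.
    z_scores = [nd.inv_cdf(1.0 - p) for p in p_values]
    z_combined = sum(z_scores) / math.sqrt(len(z_scores))
    return 1.0 - nd.cdf(z_combined)


def fisher_p(p_values):
    """Combine p-values with Fisher's method (statistic X = -2 * sum(ln p))."""
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    # Under the null, X follows a chi-square distribution with 2k degrees of
    # freedom. For even degrees of freedom the upper tail has a closed form:
    # exp(-X/2) * sum_{j=0}^{k-1} (X/2)^j / j!
    tail_sum = sum(half_stat ** j / math.factorial(j) for j in range(k))
    return math.exp(-half_stat) * tail_sum


if __name__ == "__main__":
    consistent = [0.10] * 5  # five weak results pointing the same way
    print("Stouffer:", stouffer_p(consistent))
    print("Fisher:  ", fisher_p(consistent))
```

With five tests that each give p = 0.10, a consistent but individually weak departure from the null, this sketch returns a smaller combined p-value for Stouffer's method than for Fisher's, in line with the sensitivity claim quoted above.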
Preprint
Full-text available
This paper proposes a stable combination test, which is a natural extension of Cauchy combination tests by Liu and Xie (2020). Similarly to the Cauchy combination test, our stable combination test is simple to compute, enjoys good sizes, and has asymptotically optimal powers even when the individual tests are not independent. This finding is supported both in theory and in finite samples.
... They maintain that the p-value from each group gives only one way of interpreting effectiveness of treatment. Abelson (1995) further supports the use of effect size by stating "A major difficulty with simply using the significance level is that the p value depends not only on the degree of departure from the null hypothesis, but also on the sample size" (p. 40). ...
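The dependence on sample size that Abelson describes can be sketched numerically: hold the standardized mean difference fixed and change only the group size. The example below assumes a two-sample z-test with equal group sizes and known unit variance; the function name `two_sided_p` and the effect size of 0.2 are illustrative choices, not values from the cited sources.

```python
import math
from statistics import NormalDist


def two_sided_p(effect_size, n_per_group):
    """Two-sided p-value of a two-sample z-test for a standardized mean difference."""
    # With unit variance the standard error of the mean difference is
    # sqrt(2 / n), so z = effect_size / sqrt(2 / n).
    z = effect_size * math.sqrt(n_per_group / 2.0)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Same departure from the null, three different sample sizes.
    for n in (20, 200, 2000):
        print(n, two_sided_p(0.2, n))
```

Under these assumptions the same effect is not significant at the .05 level with 20 participants per group but is with 200 or more, which is why an effect size is reported alongside the p-value.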
... Using Example 1, we highlight the advantages and disadvantages of each of the three analogous presentation devices to emphasize different aspects of results. In general, because results are employed to corroborate substantive theory with reasoning (Abelson, 1995), results should be presented in a form that maximizes readers' ease in encoding information from the presentation of results. ...
Chapter
Full-text available
We examine the fundamental role of visualizations and offer strategies and recommendations on how to use graphics to facilitate the use of SEM to fit models to data. We begin by reviewing aspects of model specification using the popular LISREL matrix notation highlighting the isomorphism between the algebraic representation of models and the path diagram. We emphasize advantages and caveats associated with the use of graphics in model specification. Next, we introduce several univariate and multivariate graphics that are useful for modeling data with SEM. Finally, we extend the use of graphics to the presentation of SEM results. In each of these sections, we illustrate the use of graphics with an empirical example examining the effects of sensation seeking and self-regulation on problem behavior among adolescents. The first example introduces the basic covariance structure model without mean structure and the second example extends the model to include mean structure. We conclude with a discussion of strategies to consider when making use of graphics with SEM.
... The soundness of the data analysis process and how the findings are articulated help to derive the right conclusions from experiments [30]. However, capturing these aspects is well beyond the scope of the taxonomy, and we refer readers to available guidelines (e.g., [1,3]). ...
Preprint
Crowdsourcing is being increasingly adopted as a platform to run studies with human subjects. Running a crowdsourcing experiment involves several choices and strategies to successfully port an experimental design into an otherwise uncontrolled research environment, e.g., sampling crowd workers, mapping experimental conditions to micro-tasks, or ensuring quality contributions. While several guidelines inform researchers in these choices, guidance of how and what to report from crowdsourcing experiments has been largely overlooked. If under-reported, implementation choices constitute variability sources that can affect the experiment's reproducibility and prevent a fair assessment of research outcomes. In this paper, we examine the current state of reporting of crowdsourcing experiments and offer guidance to address associated reporting issues. We start by identifying sensible implementation choices, relying on existing literature and interviews with experts, to then extensively analyze the reporting of 171 crowdsourcing experiments. Informed by this process, we propose a checklist for reporting crowdsourcing experiments.
... According to Makar and Confrey (2004), comparing groups has the potential to give students authentic contexts with suitable data for answering meaningful questions, thereby showing the power of data to provide important information for decision-making. For Abelson (1995), the idea of comparison is crucial because observed differences lead to questions that trigger the search for explanatory factors for certain phenomena or experiments. Thus, when a difference is expected and not found, the question arises: why is there no difference? ...
Thesis
Full-text available
When reviewing the current Plans and Programs (from 1999 and 2018) for the training of telesecundaria teachers, it was observed that the didactic activities that are proposed for statistical education are scarce and outdated, both in the 1999 Plan and in the 2018 Plan. Likewise, it was found that there is a great disarticulation between the Study Plans that guide the training of future Telesecundaria teachers and the Study Plans of secondary education students. These observations motivated the present investigation. In order to contribute to the generation of knowledge in the area of training telesecundaria teachers, particularly in the field of statistical education, a didactic proposal based on working with statistical projects was designed to be implemented within the teacher training process of this undergraduate program. Thus, the objective of this research is to analyze the implementation of statistical projects as a learning strategy for future teachers. Specifically, it is analyzed whether working with statistical projects underpins the development of elements of three learning approaches: culture, reasoning, and statistical thinking. Besides, the extent to which working with statistical projects can be used as a strategy for training teachers is explored. The results show that it is possible that working with statistical projects allows future teachers to develop elements of the three learning approaches and raise awareness of the use of statistics as a useful tool for solving real-world problems related to their profession.
... While we still rely on statistical significance and non-significance in interpreting our results, we will use the terms proposed by Tukey (1991; as cited in Abelson, 1995) to refer to different degrees of uncertainty regarding the absence of an effect. Rather than concluding that there was no significant effect, we will say that differences between means, or the effect of a predictor on an outcome, lean towards a particular direction when p > .05 ...
Thesis
How can we reduce the social gradient in obesity if we do not know what causes it in the first place? This PhD thesis explores underlying explanations of the association between socioeconomic status and eating behaviors. Taking a social psychological approach, this thesis presents the results from a series of empirical studies that test how relative socioeconomic status affects decision-making. In particular, it examines how perceptions of one’s relative status affect impulsivity, and how someone else’s relative status influences beliefs about that person’s impulsivity. Together, these findings reveal both the existence and accuracy of impulsivity stereotypes. The findings suggest that (adherence to) these stereotypical behaviors are malleable and can be used in health interventions aimed at reducing health gradients.
... incongruent Necker cube stimuli. According to Kan et al. [13] and the referenced supporting literature [32,34,36], the passive observation of the bistable Necker cube induces several alterations of the cube's perceived direction. These alterations are experiences of a visual conflict only and the participants are not instructed to solve this conflict, only to indicate every alteration of the cube's mental direction. ...
Article
Full-text available
Exploring the mechanisms of cognitive control is central to understanding how we control our behaviour. These mechanisms can be studied in conflict paradigms, which require the inhibition of irrelevant responses to perform the task. It has been suggested that in these tasks, the detection of conflict enhances cognitive control resulting in improved conflict resolution of subsequent trials. If this is the case, then this so-called congruency sequence effect can be expected to occur in cross-domain tasks. Previous research on the domain-generality of the effect presented inconsistent results. In this study, we provide a multi-site replication of three previous experiments of Kan et al. (Kan IP, Teubner-Rhodes S, Drummey AB, Nutile L, Krupa L, Novick JM 2013 Cognition 129, 637–651) which test congruency sequence effect between very different domains: from a syntactic to a non-syntactic domain (Experiment 1), and from a perceptual to a verbal domain (Experiments 2 and 3). Despite all our efforts, we found only partial support for the claims of the original study. With a single exception, we could not replicate the original findings; the data remained inconclusive or went against the theoretical hypothesis. We discuss the compatibility of the results with alternative theoretical frameworks.
... The relationship between the responses to the Likert scale and the demographic variables is also examined in the study. For this reason, the answers given by the participants to the 25 questions on our 5-point Likert scale expressing exposure to mobbing were collected, and a new variable was obtained by summation (Abelson, 2012). ...
Article
This study was conducted to see how perceptions of mobbing victimization among employees of Çukurova University vary by gender. Although mobbing is a very old concept, people's awareness of the issue is quite low, especially in our country. The survey conducted reveals this clearly. Indeed, a considerable number of people stated that they had not been subjected to mobbing even though, in the Likert-type questions, they had marked responses indicating the presence of psychological pressures they were exposed to. Individual interviews also revealed answers given unconsciously, feelings concealed for the sake of promotion, and untrue answers given out of fear. In this three-part survey study, 308 people were interviewed: 33 professors, 15 associate professors, 23 assistant professors, 14 lecturers, 52 research assistants, 9 instructors, and 162 administrative staff. For this sample of 146 women and 162 men, the analyses consist of descriptive statistics and chi-square tests. Although the statistical analyses showed no significant difference between exposure to mobbing and gender, the data nevertheless reflected that female academics are more disadvantaged than men. Because an academic career is seen as a 'suitable' profession for women in Turkey, the number of female academics in Turkey is above the world average (YÖK, 2020). It is all the more surprising that women are found to be disadvantaged even in the findings of this study, carried out at Çukurova University, one of the universities that raise this average.
... Information units in one study are not necessarily equivalent to information units in another study, so a common scale is necessary. However, a major drawback of this approach is that standardized effects lack an obvious practical or intuitive meaning (Abelson, 1995; Baguley, 2009, 2010; Bond et al., 2003). Invoking benchmarks, such as Cohen's (1988) well-known suggestions, offers no solution, since such benchmarks are abstractions that are not grounded in the contexts relevant to the phenomena in question. ...
Article
Full-text available
Hanns Scharff, an interrogator during the Second World War, was known for his remarkable effectiveness at collecting intelligence from prisoners of war using a friendly, conversational approach in which he led the prisoners to unknowingly reveal the information he wanted. In the last decade, psychologists have produced a body of experimental studies testing the effectiveness of Scharff’s interrogation technique. Here, I provide a meta‐analytic review of that experimental research. The existing data supports the conclusions that the present conceptualization of Scharff’s technique is effective at eliciting more new information, leading people to perceive the interviewer as more knowledgeable, and inducing people to underestimate how much information they have revealed. However, numerous unanswered questions and challenges for this program of research remain. For example, future research may benefit from examining unaddressed elements of the methods Scharff used in the field. Research would also benefit from the development of measures that more clearly correspond to practical outcomes.
... Pragmatically, the analysis was designed to address the likely utility of the SOS-10-E in Latin America and the transferability of the findings from the development study to this sample and location. Contextually, the analyses are in the traditions of Abelson's "statistics as principled argument" (Abelson, 1995) and the PPDAC (Problem, Plan, Data, Analysis, Conclusions and communication) approach to statistics (Oldford & MacKay, 2000; Spiegelhalter, 2019). The PPDAC Problem was to determine if the SOS-10-E was acceptable, and has appropriate psychometric qualities of internal reliability, structure and convergent validity. ...
Article
Full-text available
The Schwartz Outcome Scale-10 is a 10-item measure that has proven utility for assessing well-being and mental health and measuring change over time. Although there is a Spanish translation of the measure created in the United States for the Latino population, its acceptability and psychometric properties have not been studied in unilingual Spanish speakers, nor outside the United States. The aim of the present study is to explore these properties in larger samples, clinical and non-clinical, from Latin America adding convergent validity checking and exploration of effects of gender and age on scores. A qualitative study was conducted with 11 participants to test for dialect/language issues, then a psychometric exploration of data from 886 participants in a non-clinical sample and 172 in a clinical sample. The results showed good psychometric characteristics and suggest that the SOS-10-E can be used in Latin America. A cutoff of 42.51 differentiates clinical scores from non-clinical. Future studies are needed to explore sensitivity to change and check replication in other Spanish speaking populations.
... First and foremost, while we find a good fit of our models to the data, this neither implies that all models are correct nor proves that the theory of ecologically unequal exchange is true in reality. Second, and related to the first point, the models used in our analysis are to be understood as an approximation of reality (Abelson, 2012; Grace, 2006), and we have not considered other approximations, i.e., model candidates. This confirmatory approach is appropriate for the purpose of testing a set of hypotheses from a particular theory as we have aimed for here, but it has to be kept in mind that a candidate modeling approach and multimodel inference would likely be more useful for predictive purposes (Burnham and Anderson, 2004; Grueber et al., 2011). ...
Article
Full-text available
Ecologically unequal exchange theory posits asymmetric net flows of biophysical resources from poorer to richer countries. To date, empirical evidence to support this theoretical notion as a systemic aspect of the global economy is largely lacking. Through environmentally-extended multi-regional input-output modelling, we provide empirical evidence for ecologically unequal exchange as a persistent feature of the global economy from 1990 to 2015. We identify the regions of origin and final consumption for four resource groups: materials, energy, land, and labor. By comparing the monetary exchange value of resources embodied in trade, we find significant international disparities in how resource provision is compensated. Value added per ton of raw material embodied in exports is 11 times higher in high-income countries than in those with the lowest income, and 28 times higher per unit of embodied labor. With the exception of embodied land for China and India, all other world regions serve as net exporters of all types of embodied resources to high-income countries across the 1990-2015 time period. On aggregate, ecologically unequal exchange allows high-income countries to simultaneously appropriate resources and to generate a monetary surplus through international trade. This has far-reaching implications for global sustainability and for the economic growth prospects of nations.
... As the researcher typically wishes to test a regionally specified hypothesis, it is necessary to perform the appropriate univariate statistical tests at each and every voxel. At the standard rejection rate of p<0.05, we would expect by chance to reject the null hypothesis (H0) of no experimental effects in a twentieth of our sample, i.e. 5% of 200,000 = 10,000 voxels! One can therefore obtain a perfectly reasonable number of activated voxels in the imaging volume by simply testing one's hypothesis enough times, and reporting only the instances in which H0 is rejected while ignoring the times it is not (called the 'file drawer' problem; Abelson, 1989). ...
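The arithmetic in this excerpt can be checked with a short sketch. It assumes every one of the 200,000 voxel-wise null hypotheses is true. The Bonferroni threshold is shown only as one standard way to control the family-wise error rate; it is an illustrative addition, not a method attributed to the thesis, and both function names are hypothetical.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of rejections when all n_tests null hypotheses are true."""
    return alpha * n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    alpha, n_voxels = 0.05, 200_000
    print("expected false positives:", expected_false_positives(alpha, n_voxels))
    print("Bonferroni per-voxel threshold:", bonferroni_threshold(alpha, n_voxels))
```

At an uncorrected threshold of .05 the expected count of chance rejections is about 10,000 voxels, matching the figure in the excerpt, whereas a Bonferroni correction would require p below roughly 2.5e-7 at each voxel.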
Thesis
All organisms must possess the ability to detect environmental stimuli and transform them into a form of information that can be utilised to guide behaviour. As the primate sensory systems consist of multiple interconnected cortical areas, it is important to know where areas processing different aspects of a sensory stimulus are located, and also which dimensions of the stimuli are being processed in each area. The use of functional neuroimaging allows one to address both of these problems. Although much progress has been made regarding the functional and anatomical organisation of higher order visual areas such as IT (e.g. Milner and Goodale, 1996), there has been comparatively little headway in understanding the functional organisation of somatosensory processing in humans. One problem in particular, the delivery of robust somatosensory stimulation in the neuroimaging environment, is not a trivial one. In summary, the field of somatosensory neuroimaging has not received as much interest as other sensory modalities. In this thesis, I will present the results of my studies, which can be divided into three sections. I) The design and implementation of stimulation within the scanning environment; II) examinations of the topography of digit representations within primary and tertiary somatosensory areas using fMRI, and; III) examinations of sensorimotor transformations and somatoform illusions. My results are discussed with reference to similar studies in other sensory systems, and are placed in the context of investigations using other non-invasive scanning technologies.
... significance level (see Table 4). 7 As Abelson (1995) has emphasized in relation to effect size generally, however, the important question is whether there is a pattern of significance that has a substantive meaning. In this case, such a pattern exists for two variables we foregrounded in our theoretical discussion. ...
Article
Full-text available
An overlooked context for citizen deliberation occurs when voters discuss their ballots with others while completing them at home. Voting by mail (or “absentee voting”) creates an opportunity for informal deliberation in the midst of exercising a basic form of citizen power. We examined this understudied context by blending prior theory with qualitative observations of dyadic and small-group absentee voter discussions to identify common features of such talk, which range from cynical joking and speculation on election outcomes to observing norms of politeness and engaging in heated argument. The hypothesized antecedents and consequences of those behaviors were examined in a survey of 295 Washington and Oregon voters’ recollections of their ballot discussions. Results showed that pro-deliberative features of discussion were reported most often by voters with more formal education and political knowledge. Contrary to hypotheses, the strength of voters’ partisan identities bore no relation to deliberative behavior. Finally, the presence of key discussion features had many of the expected effects on voters’ confidence in ballot choices and their respect for the electoral process, particularly for those voters with less political knowledge.
... Interestingly, the moderation effect was not significant for body shame, anxiety and stress. This may be due to a lack of power in this study because other scholars have found positive effects of self-compassion on body shame (Liss & Erchull, 2015), and the beta reported in the present study is trending toward significance (in Tukey's terminology: Abelson, 1995). Future research with a larger and perhaps more diverse ...
Preprint
According to objectification theory (Fredrickson & Roberts, 1997), being treated as an object leads women to engage in self‐objectification, which in turn increases body surveillance and body shame as well as impairs mental health. However, very little is known about what factors could act as buffers against the detrimental consequences of self‐objectification. This paper seeks to understand the role of self‐compassion (the ability to kindly accept oneself or show self‐directed kindness while suffering) in the perception that women have of their own bodies. Results indicate that self‐compassion moderated the effect of body surveillance on depression and happiness separately among women. More specifically, for women low in self‐compassion, body surveillance was negatively associated with happiness, which was explained by increased depression. In sum, our results indicate that self‐compassion protects against the detrimental consequences of body surveillance.
... Thus, while sought as a legitimate source of unbiased information and understood as part of a paradigm which assumes the existence of facts that are independent of biased subjective methods and personal views, crime statistics set agendas and greatly influence the articulation of news and the shaping of public policies. Indeed, journalism plays a vital role as an intermediary of senses and meanings, but there is a gap that the truth-seeking and truth-telling profession of journalism must fill if it wants to help people make informed decisions (Abelson 1995). Traditionally, journalists have been responsible for seeking, investigating and transmitting information to the lay public. ...
... Objective local structure is introduced by design constraints like a fixed number of targets or a fixed stimulus set size, while subjectively people are highly sensitive to local sample structure in sequences. For example, unconstrained sampling from a uniform distribution to generate sequences may lead to frequent local repetitions of stimuli (i.e., "lumpiness"; Abelson, 1995). In the n-back task, such local patterns could encourage people to identify targets solely based on stimulus familiarity rather than to use their working memory, as this strategy may in this case lead to high performance at low cognitive cost. ...
Conference Paper
Full-text available
Numerous cognitive tasks, like the n-back, employ sequences of stimuli to target particular cognitive functions. These sequences are generated to satisfy specific criteria, but the generation process typically induces unintentional statistical structure in the sequences, which may not only affect performance but also alter the strategies participants use to complete the task. Here we propose that the generation of stimulus sequences can be conceptualized as a soft constraint satisfaction problem and offer experimental evidence demonstrating the impact of local sequence features on human behavior. Our approach to sequence generation provides a means to better control and assess sequence structures, which in turn could help clarify the cognitive and neural processes involved in cognitive tasks.
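The "lumpiness" of unconstrained uniform sampling mentioned in the excerpt above can be made concrete with a minimal sketch. It assumes independent uniform draws from a stimulus set, for which the chance that an item repeats its predecessor is 1 / set size. The function names, the set size of 8, and the sequence length are illustrative assumptions and are not taken from the cited paper.

```python
import random


def count_immediate_repeats(seq):
    """Count positions where an item equals the item just before it."""
    return sum(1 for prev, cur in zip(seq, seq[1:]) if prev == cur)


def expected_repeat_rate(set_size):
    """Under independent uniform sampling, P(item == previous item) = 1 / set_size."""
    return 1.0 / set_size


def sample_sequence(length, set_size, rng):
    """Draw an unconstrained sequence of stimulus indices (hypothetical generator)."""
    return [rng.randrange(set_size) for _ in range(length)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    seq = sample_sequence(length=1000, set_size=8, rng=rng)
    observed = count_immediate_repeats(seq) / (len(seq) - 1)
    print(f"observed repeat rate: {observed:.3f}")
    print(f"expected repeat rate: {expected_repeat_rate(8):.3f}")
```

With a small stimulus set, immediate repetitions are common even though no structure was intended, which is the kind of local pattern a constrained generator would need to control.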
... In order to determine whether emergent writing and emergent reading were interrelated at T2 (i.e., at the end of the interventions), a Pearson correlation was conducted. Researchers in statistics (Abelson, 1995; Rasmussen, 1989) have reported that the same analyses applied to parametric measures can also be applied to ordinal scale data with more than 5 ranks, without compromising the power of the test. ...
... Kraemer, Mintz, Noda, Tinklenberg, & Yesavage, 2006). There are a range of potential effect sizes that could be used to evaluate students' risk of exclusionary discipline, including risk ratios and risk differences, as well as test statistics, such as the t value (Abelson, 1995). Ratios of and differences between proportions, however, are not robust to changes in the underlying variation in the variable of interest. ...
Article
There are substantial racial disparities in school discipline but little agreement on how best to measure them. The choice of metric can influence conclusions about the magnitude of racial discipline disproportionality and intervention effectiveness. This article describes 2 common (risk ratio, risk difference) and 3 relatively novel (standardized effect size, raw differential representation, discipline rate) approaches to evaluating racial disproportionality, with illustrations of their strengths and weaknesses. It concludes with a discussion of the metrics and a recommendation that researchers and policymakers consider the raw number of students of color differentially disciplined, as among the easiest to understand, the most stable, and capturing the widest range of information. Even so, no metric captures all relevant aspects of disproportionality. Accordingly, researchers and policymakers should be deliberate in their specific aims in measuring discipline disproportionality and select a combination of metrics that provides information most responsive to their goals.
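The two common metrics named in this abstract, the risk ratio and the risk difference, can be sketched as follows. This is only an illustration using the standard textbook definitions; the function names and the student counts are hypothetical and do not come from the article.

```python
def risk(disciplined, enrolled):
    """Proportion of enrolled students in a group who were disciplined."""
    return disciplined / enrolled


def risk_ratio(disc_a, enrolled_a, disc_b, enrolled_b):
    """Risk in group A divided by risk in comparison group B."""
    return risk(disc_a, enrolled_a) / risk(disc_b, enrolled_b)


def risk_difference(disc_a, enrolled_a, disc_b, enrolled_b):
    """Risk in group A minus risk in comparison group B."""
    return risk(disc_a, enrolled_a) - risk(disc_b, enrolled_b)


if __name__ == "__main__":
    # Hypothetical counts: two schools with the same risk difference
    # (5 percentage points) but very different risk ratios.
    low_base = (10, 100, 5, 100)    # risks of 10% vs 5%
    high_base = (55, 100, 50, 100)  # risks of 55% vs 50%
    for label, counts in (("low base rate", low_base), ("high base rate", high_base)):
        print(label, round(risk_ratio(*counts), 2), round(risk_difference(*counts), 2))
```

The two hypothetical schools show why the choice of metric matters: the same absolute gap yields a ratio of 2.0 at a low base rate and about 1.1 at a high one.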
Article
The Lumosity games and subsequent “memory wars” illustrate the rhetorical power of statistics in public discourse. Defenders of Lumosity build upon discursive traces based in societal fears and arguments based in “science” supported through statistics and experimentation. Detractors of Lumosity argue that their experiments are faulty. A close rhetorical reading reveals that certain commonalities exist across defenders and detractors alike. Looking at the inventional strategies of the statistical analyst as rhetor demonstrates how statistical tools are granted agency to determine research outcomes. Displacement of rhetorical agency has ramifications for understanding popular scientific discourse and making decisions as a society.
Article
Full-text available
This study examines the practice of ethnic communities in Bosnia and Herzegovina flying a state, entity, religious, or foreign flag at wedding ceremonies in public spaces. The wedding custom is analyzed through the lens of Hannah Arendt’s discussion of the way nationalism in the modern era links family and state. After a tragic war, flag power in this context appears to exacerbate nationalism and ethnic tensions in a polyethnic society trapped in a dysfunctional state structure created by the Dayton Accords. The empirical study finds that flag power does not, in fact, privilege ethnic solidarity over national solidarity to the degree that social and political theory would have us imagine. The national identity of being Bosnian is more likely to be exemplified. A clustered, stratified, random sample of 2,500 subjects over the age of eighteen was drawn from the country’s population, including the two entities, Federation of Bosnia and Herzegovina and Republika Srpska and Brčko District. Survey questions involving face-to-face structured questions asked participants whether flags were flown at their weddings, which flags were flown, and attitudes toward the wedding custom. Variations by age, religiosity, education, ethnicity, type of flag flown, and political party affiliation are reported and interpreted.
Book
Full-text available
The book is intended for a fairly broad audience of teachers, university and college students, researchers, statisticians, and users of statistics in the behavioral and social sciences. It is aimed primarily at the wide range of readers who study statistical disciplines at universities and colleges; however, it may also be useful to readers studying statistics on their own. The inclusion of a number of non-traditional topics makes the book a useful guide for anyone whose professional work involves the statistical analysis of data of varying nature and character. A new feature of this course is the many examples of statistical procedures implemented in the well-known Maple package. The source code of some of these procedures is included in the book, so that readers can use them directly in Maple with their own data. These and other procedures are collected in the user library UserLib6789, which supports simple statistical analysis of data of various kinds within Maple. The material of this book is based on three of our Russian-language books, whose print runs sold out completely, and on one English-language book. Those books grew out of a series of lecture courses on the General Theory of Statistics, Probability Theory, and Mathematical Statistics for students at universities in Belarus and the Baltic states specializing in the economic and social sciences (economics, international law, jurisprudence, political science, sociology, psychology, banking, accounting, etc.). The book contains an extensive bibliography of both Russian and English literature on various aspects of statistics. It also provides a useful list of statistical organizations, statistical periodicals, and the like.
Chapter
We introduce the statistical design of experiments and put the topic into the larger context of scientific experimentation. We give a non-technical discussion of some key ideas of experimental design, including the role of randomization, replication, and the basic idea of blocking for increasing precision and power. We also take a more high-level view and consider the construct, internal and external validities of an experiment, and the corresponding tools that experimental design offers to achieve them.
Article
Background: Undocumented immigration is often accompanied by multiple and complex stressors, which over time may increase the risk for chronic pain. Objective: This study aimed to identify the prevalence of chronic pain and its association with psychological distress among undocumented Latinx immigrants in the USA. Design/Participants: We used respondent-driven sampling to collect and analyze data from clinical interviews with 254 undocumented Latinx immigrants, enabling inference to a population of 22,000. Main Measures: Chronic pain was assessed using the World Health Organization Composite International Diagnostic Interview (CIDI) Chronic Conditions Module. For all analyses, inferential statistics accounted for design effects and sample weights to produce weighted estimates. We conducted logistic regression analyses to assess the association between chronic pain and psychological distress after controlling for age, years in the USA, and history of trauma. Results: A total of 28% of undocumented Latinx immigrants reported having chronic pain, and 20% of those had clinically significant psychological distress. Significant differences in the prevalence of chronic pain were reported across age groups, years in the USA, and trauma history. After controlling for relevant covariates, chronic pain was significantly associated with psychological distress (OR = 1.06, 95% CI [1.02, 1.09]), age (OR = 1.05, 95% CI [1.02, 1.09]), and history of trauma (OR = 1.10 per additional traumatic event, 95% CI [1.02, 1.19]; C-statistic = 0.79). Conclusion: Among undocumented Latinx immigrants, chronic pain is significantly associated with psychological distress, older age, and trauma history. Given that undocumented immigrants have restricted access to healthcare and are at high risk for chronic pain, alternatives that facilitate access to chronic pain interventions and risk-reduction prevention are needed.
Article
Knowledge of current findings from educational science is a central aspect of teachers' professional competence. Conveying educational-science evidence to student teachers is therefore a key challenge for teacher education. Adapted scientific articles are one way of meeting this challenge. The present exploratory study analyzes the potential of adapted scientific articles for conveying evidence from the educational sciences by comparing the reading processes of student teachers (n = 5) and teacher educators (n = 5) using eye tracking. The aim is to examine whether students differ from teacher educators in their reading process, in order to derive implications for university teaching that improve understanding of the reception process. The results suggest that the reading process proceeds similarly in both groups but with different emphases. Teacher educators tend to read in a scientific manner, whereas the students' reading process can be described as more practice-oriented (praxeological); however, these first exploratory findings require further empirical support. Future studies should also examine which reading process leads to more effective knowledge building and which university teaching methods could foster more efficient reading processes among student teachers.
Article
As part of this special issue on validity, this essay addresses practices for research in both measurement and methodology through the lens of the philosophy of science. The essay has three primary objectives. First, it seeks to outline seminal characteristics of science, identifying practices that scientists must adopt in research for their work to be more scientific. Second, the essay notes best practices for conducting science in the field of communication. Finally, the article concludes with a discussion of the future of communication science.
Article
Full-text available
Psychologists are often interested in whether an independent variable has a different effect in condition A than in condition B. To test such a question, one needs to directly compare the effect of that variable in the two conditions (i.e., test the interaction). Yet many researchers tend to stop when they find a significant test in one condition and a nonsignificant test in the other condition, deeming this as sufficient evidence for a difference between the two conditions. In this Tutorial, we aim to raise awareness of this inferential mistake when Bayes factors are used with conventional cutoffs to draw conclusions. For instance, some researchers might falsely conclude that there must be good-enough evidence for the interaction if they find good-enough Bayesian evidence for the alternative hypothesis, H1, in condition A and good-enough Bayesian evidence for the null hypothesis, H0, in condition B. The case study we introduce highlights that ignoring the test of the interaction can lead to unjustified conclusions and demonstrates that the principle that any assertion about the existence of an interaction necessitates the direct comparison of the conditions is as true for Bayesian as it is for frequentist statistics. We provide an R script of the analyses of the case study and a Shiny app that can be used with a 2 × 2 design to develop intuitions on this issue, and we introduce a rule of thumb with which one can estimate the sample size one might need to have a well-powered design.
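The inferential mistake described in this abstract can be illustrated with a minimal frequentist sketch. This is not the authors' R script or their Bayes-factor analysis; it assumes two independent effect estimates with known standard errors and a normal approximation, and the numbers (25 and 10, each with a standard error of 10) are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard-normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def z_test(estimate, se):
    """Return (z, p) for testing whether an estimate differs from zero."""
    z = estimate / se
    return z, two_sided_p(z)


def interaction_test(est_a, se_a, est_b, se_b):
    """Test the difference between two independent effect estimates directly."""
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return z_test(diff, se_diff)


if __name__ == "__main__":
    # Hypothetical effects: condition A looks significant, condition B does not.
    _, p_a = z_test(25.0, 10.0)
    _, p_b = z_test(10.0, 10.0)
    _, p_int = interaction_test(25.0, 10.0, 10.0, 10.0)
    print(f"condition A: p = {p_a:.3f}")
    print(f"condition B: p = {p_b:.3f}")
    print(f"A minus B:   p = {p_int:.3f}")
```

In this hypothetical case the effect passes the 0.05 threshold in one condition and fails it in the other, yet the direct comparison of the two conditions does not pass it, which is why the interaction has to be tested itself.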
Article
Full-text available
The role of self-disclosure has gone understudied in risk and crisis communication, despite its demonstrated relevance in other literature. The current quasi-experimental study examined the impact of self-disclosure on perceptions of source credibility, motivation to seek information, and behavioral intentions. Such variables are essential for the protection of people's physical health before a risk event. The results fail to reveal main effects for self-disclosure, but suggest indirect effects whereby disclosure may drive elaboration, which in turn impacts the variables of interest. The results are discussed in terms of their implications for risk communicators and policymakers, and in directions for future research.
Article
The role of p‐values for null hypothesis testing is under debate, with suggestions to lower the threshold for new discoveries from p<0.05 to p<0.005, to justify alpha based on the context, or to retire significance altogether. We aim to explore the impact of the significance threshold on estimates for the strengths of associations (‘effects’), and the implications for different types of epidemiological research. We consider situations with normal distribution of a true effect, while varying the effect size. We confirm the occurrence of ‘testimation bias’: estimating effect size only if the test was statistically significant leads to exaggerated results. The absolute bias is largest for true effects around 0.7 times the size of the standard error: +220% bias if effects are selected after testing with p<0.05, and + 335% if tested with p<0.005. Less bias was found for testing with p<0.20 (+130%) and larger true effect sizes. We conclude that a lower p‐value threshold for declaring statistical significance implies more exaggeration in an estimated effect. This implies that if a low threshold is used, effect size estimation should not be attempted, e.g. in the context of selecting promising discoveries that need further validation. Confirmatory studies, such as randomized controlled trials, might stick to the 0.05 threshold if adequately powered, while prediction modeling studies should use an even higher threshold, such as 0.2, to avoid strongly biased effect estimates.
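The "testimation bias" described above can be sketched analytically under simplifying assumptions: the estimate is normally distributed around the true effect with a standard error of 1, selection is on a two-sided test, and the mean of the estimate is taken over all selected results regardless of sign. These assumptions and the function names are illustrative and may not match the paper's exact setup, so the printed values are meant only for comparison with the percentages quoted in the abstract.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def conditional_mean_given_significant(true_effect, alpha):
    """Mean of an estimate ~ N(true_effect, 1), given |estimate| > critical value."""
    c = STD_NORMAL.inv_cdf(1 - alpha / 2)  # two-sided critical value
    # Probability that the estimate lands in either rejection region.
    p_select = (1 - STD_NORMAL.cdf(c - true_effect)) + STD_NORMAL.cdf(-c - true_effect)
    # Truncated-normal shift of the mean caused by keeping only significant results.
    shift = (STD_NORMAL.pdf(c - true_effect) - STD_NORMAL.pdf(c + true_effect)) / p_select
    return true_effect + shift


def relative_bias(true_effect, alpha):
    """Bias of the selected estimates as a fraction of the true effect."""
    selected_mean = conditional_mean_given_significant(true_effect, alpha)
    return (selected_mean - true_effect) / true_effect


if __name__ == "__main__":
    true_effect = 0.7  # in units of the standard error, as in the abstract
    for alpha in (0.20, 0.05, 0.005):
        print(f"alpha = {alpha}: relative bias = {relative_bias(true_effect, alpha):+.0%}")
```

Under these assumptions the bias grows as the threshold becomes stricter, in line with the abstract's conclusion that a lower p-value threshold implies more exaggeration in the estimated effect.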
Article
The spaghetti problem arises in graphics when multiple time series or other functional traces show mostly a tangled mess. Devices to improve on graphical defaults include transformed scales (especially logarithmic scales); trying to increase the graph area showing the data (especially by losing the legend whenever possible); different colors sometimes; subdividing data into a few groups; subtraction to focus on residuals or smoothing to reduce noise; selection or sampling of what is shown or emphasized; and stacking series vertically.
Article
Full-text available
Methods to increase Campbell's (1957) internal and external validity as well as Cook and Campbell's (1979) construct and conclusion validity are reviewed. For internal validity or valid causal inference, designs and methods to draw causal conclusions from nonrandomized studies are considered. Greater collaboration between the causal inference and structural equation modeling traditions would benefit both. For external validity, generalizing results, treating partners and studies as well as participants as random is strongly encouraged. For construct validity, particularly the psychological meaning of measures, multivariate models that treat measures from both over-time and dyadic data as being a combination of multiple constructs are discussed. For conclusion validity or valid statistical inference, the problem of low power when generalizability is high and the assumption of independence are discussed. Finding the truth in psychological research is a challenge, and seemingly insurmountable difficulties are often encountered. Nonetheless, persistent and diligent efforts using strategies developed by generations of methodologists should lead to scientific advancement.
Article
Early experience sampling research sought to map the ecology of adolescents’ lives. Its contributions include discovery of similar patterns in psychological states across diverse samples: positive emotions with friends, more negative states alone, high challenge but low motivation during schoolwork, and wider variability in teens’ than adults’ emotions, including more frequent extreme positive states. Recent ambulatory assessment research has expanded this mission and methods in valuable ways. Yet it still demands problem‐solving (e.g., engaging participants, formulating analyses that represent teens’ complex lives). A promising innovation is use of micro‐longitudinal analyses to examine sequential processes (e.g., linkages between stress–coping–emotions; relationship episodes). Qualitative data can add “zones” for development of empirically‐based theory about daily processes, such as adolescents’ meaning‐making and learning self‐regulation.
Article
A large body of work shows that reasoning motivated by partisan cues and prior attitudes leads to unreflective decisions and disparities in empirical beliefs across groups. Surprisingly little research, however, has tested the limits of motivated reasoning. We argue that the publicly circulated findings of deliberative minipublics can spark a more reflective motivation in voters when these bodies provide policy‐relevant factual information. To test that proposition, we conducted a survey experiment using information generated by one such minipublic during an election. Results showed that exposure to the minipublic's findings improved the accuracy of voters' empirical beliefs regarding a ballot proposition on the regulation of genetically modified seeds. This treatment effect transcended voters' partisan identities and prior environmental attitudes. In some instances, the respondents showing the greatest knowledge gains were those who a directional motivated‐reasoning account would have expected to resist the treatment most effectively, owing to party identity or prior attitudes.
Article
Full-text available
On the one hand, statistics can be used as an effective rhetorical tool in political discourse. On the other, so-called resistant audiences can easily dismiss them as unconvincing. In this article I point to the link between treating statistics as Aristotelian "inartistic" proofs (Gr. pisteis atechnoi, Lat. probationes inartificiales) and this type of audience's perception of them as a source of lies. Drawing on the concept of framing from media studies, I examine when the language used to describe statistics can serve audiences more effectively. I argue that this happens when the audience understands how statistics are produced as part of a social process. Statistics are not proofs in themselves; rather, they are used by the rhetor to strengthen an argument. If interpretation is the essence of understanding statistics, authors of messages should encourage audiences to engage in that thought process rather than impose their own interpretations. In my argument I draw on examples of press articles that use statistics on immigration in the United States to discuss two kinds of framing: one meant to confirm ready-made interpretations and one that invites the reader to interpret for themselves.