Cheat Sheet for Quantitative Research

Abstract

Cheat Sheet for Quantitative Research (in Linguistics, Psycholinguistics). If your study has at most two orange cells and no red cell in the table on the right, then proceed with caution. If your study has more than two orange cells or one red cell, go back and reconsider your design and analysis. Available online at https://www.hugoquene.nl/qm/CheatSheetQuantRes.pdf
QUANTITATIVE RESEARCH | CHEAT SHEET
PREREQUISITES
Write clear research questions, order them by priority and
importance, and write them out in full.
DESIGN AND ANALYSIS
Design is more important than analysis.
Before collecting data, ensure that your analysis matches
your design, and vice versa.
Obtaining more data is always better, no matter what.
Check whether your proposed study is ORANGE or GREEN
in each row of the table on the right. For explanation, see
notes.
If your study has at most TWO orange cells and no RED cell
in the table on the right, then proceed with caution. If your
study has more than two orange cells or one red cell, go
back and reconsider your design and analysis.
Beware of order effects (priming, learning, emerging
strategies, fatigue, boredom, etc.) within a participant’s
session and across multiple sessions for the same
participant. Test for these effects in your analyses.
Check ALL assumptions of a statistical test or model BEFORE
conducting that test or fitting that model.
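The orange/red decision rule stated above can be sketched as a small check. This is only an illustration, not part of the cheat sheet itself: the function name `verdict`, the colour strings, and the example self-assessment are assumptions made here.

```python
from collections import Counter

VALID_COLOURS = {"green", "orange", "red"}


def verdict(cells):
    """Apply the cheat sheet's rule to the 'MY STUDY IS...' cells of a study.

    `cells` holds one colour name per table row (hypothetical encoding).
    """
    counts = Counter(cell.strip().lower() for cell in cells)
    unknown = set(counts) - VALID_COLOURS
    if unknown:
        raise ValueError(f"unknown cell colours: {sorted(unknown)}")
    # More than two orange cells, or any red cell: go back to the design.
    if counts["red"] >= 1 or counts["orange"] > 2:
        return "reconsider design and analysis"
    # At most two orange cells and no red cell.
    return "proceed with caution"


# Hypothetical self-assessment: eight green rows, two orange rows.
my_study = ["green"] * 8 + ["orange"] * 2
print(verdict(my_study))  # prints "proceed with caution"
```

Note that even an all-green study returns "proceed with caution", because that is the most favourable outcome the rule defines.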
# | LAX, PERMISSIVE, LIBERAL | STRICT, RESTRICTIVE, CONSERVATIVE | NOTES | MY STUDY IS...
1 | No prior evidence against H0 (significant outcome may be false positive) | Strong prior evidence against H0 (significant outcome may be true positive) | If most of the H0s (!) being tested are true, a priori, then most significant outcomes are false positives (Ioannidis, 2005). See point 4. | ORANGE / GREEN
2 | Key factors vary between participants | Key factors vary within participants | See tables below, and see Quené (2010) | ORANGE / GREEN
3 | Large variation between participants (items) | Small variation between participants (items) | Larger variation requires larger numbers of participants (items), see point 5. Consider (i.e. balance) both internal and external validity. | ORANGE / GREEN
4 | Exploratory research, developing tentative ideas | Experimental research, testing pre-existing hypothesis | | ORANGE / GREEN
5 | Few participants OR few items | Many participants AND many items | See 3. Should be GREEN for GLMM or LMM, for participants AND items. NB “few” means 12 or fewer, “many” means 30 or more | ORANGE / GREEN
6 | Low power | High power | NB “low” means .8 or less, “high” means .9 or more | ORANGE / GREEN
7 | Dependent variable (response) measured on categorical scale | Dependent variable (response) measured on continuous scale | Related to point 5. “categorical” or qualitative response: e.g. correct~incorrect response, scale with 5 or fewer options; “continuous” or numerical response: e.g. response time in ms, scale with 7 or more options, most phonetic measurements. | ORANGE / GREEN
8 | Predicted effect is small in size: small difference, large variation | Predicted effect is large in size: large difference, small variation | Obtain estimates of variation from previous studies, or from pilot work (see Quené, 2010). Background: Quené & Van den Bergh (2020), §13.8. | ORANGE / GREEN
9 | Many factors or predictors: risk of overfitting | Few factors or predictors: “less is more”, “keep it simple”, robust | with k number of continuous predictors, m number of levels of categorical factors, and N number of observations: N > 20(k+m), or, (k+m) < N/20 (Cohen, 1990; Quené, 2010) | ORANGE / GREEN
10 | Some concepts mentioned in this table are not familiar to me | I have learned about and I fully understand all concepts mentioned in this table | H0, variation, variance, effect size, power, significance, predictor, levels, response, n and N, model, test, inference, sample, participants, stimuli, groups, treatment, ... | RED / GREEN
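Two of the notes in the table reduce to simple arithmetic, which the sketch below works through. For row 1 it uses the positive predictive value from Ioannidis (2005) in its bias-free form, PPV = (1 − β)R / ((1 − β)R + α), where R is the prior odds that a tested relationship is real; for row 9 it checks the rule of thumb N > 20(k+m). The function names, the default values alpha = .05 and power = .8, and the example numbers are assumptions made for illustration, not values taken from the cheat sheet.

```python
def ppv(prior_odds, power=0.8, alpha=0.05):
    """Share of significant outcomes that are true positives (row 1).

    prior_odds: R, the ratio of true to null relationships among those
    tested. Bias-free form of the Ioannidis (2005) formula.
    """
    if prior_odds <= 0:
        raise ValueError("prior_odds must be positive")
    true_positives = power * prior_odds  # (1 - beta) * R
    false_positives = alpha              # per null relationship tested
    return true_positives / (true_positives + false_positives)


def enough_observations(n_obs, k_continuous, m_levels):
    """Rule of thumb from row 9: N > 20 * (k + m)."""
    return n_obs > 20 * (k_continuous + m_levels)


# Hypothetical cases: mostly-true H0s (R = 0.05) versus even odds (R = 1).
print(round(ppv(0.05), 2))  # prints 0.44
print(round(ppv(1.0), 2))   # prints 0.94

# Hypothetical model: 1 continuous predictor, one factor with 2 levels.
print(enough_observations(96, 1, 2))  # prints True  (96 > 60)
print(enough_observations(96, 3, 4))  # prints False (96 < 140)
```

With these assumed defaults, fewer than half of the significant outcomes are true positives once R drops below α / power = .0625, which is the situation the note in row 1 warns about.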
Hugo Quené (h.quene@uu.nl) 2021.01.28 version 0.20 | license: CC: BY-SA (https://creativecommons.org/licenses/by-sa/4.0/)
REFERENCES
Cohen, J. (1990). Things I have learned (so far). American Psychologist, 45(12), 1304–1312.
Gries, S. Th. (2015). Quantitative Linguistics. In International Encyclopedia of the Social & Behavioral Sciences (2nd ed., Vol. 19, pp. 725–732). Oxford: Elsevier. https://www.academia.edu/23085895/Quantitative_methods_in_linguistics
Ioannidis, J. (2005). Why Most Published Research Findings Are False. PLoS Medicine, 2(8), e124. https://doi.org/10.1371/journal.pmed.0020124
Quené, H. (2010). How to design and analyze language acquisition studies. In E. Blom & S. Unsworth (Eds.), Experimental Methods in Language Acquisition Research (pp. 269–287). Amsterdam: Benjamins. https://ebookcentral.proquest.com/lib/uunl/reader.action?docID=623350&ppg=277
Quené, H. & Van den Bergh, H. (2020). Quantitative Methods and Statistics. Retrieved 27 January 2021 from https://hugoquene.github.io/QMS-EN/
Winter, B. (2019). Statistics for Linguists: An Introduction Using R. Routledge. https://doi.org/10.4324/9781315165547
Zuur, A. F., Ieno, E. N., & Elphick, C. S. (2010). A protocol for data exploration to avoid common statistical problems. Methods in Ecology and Evolution, 1(1), 3–14. https://doi.org/10.1111/j.2041-210X.2009.00001.x
ACKNOWLEDGEMENTS
Thanks to Maaike Schoorlemmer, Kirsten Schutter and Piet van Tuijl for helpful comments and suggestions.

The following two tables illustrate row 2 of the table above.
(power >.8, sd=.5 for fixed effects, sd=1.0 for random effects)

TREATMENT VARIES WITHIN PARTICIPANTS
                          Treat.A   Treat.B
group 1 (n=48)            1.A       1.B
group 2 (n=48)            2.A       2.B
total N=96 participants

TREATMENT VARIES BETWEEN PARTICIPANTS
                          Treat.A   Treat.B
groups 1+2 (each n=32)    1.A       2.B
groups 3+4 (each n=32)    3.A       4.B
total N=128 participants