Article

Linear models and linear mixed effects models in R with linguistic applications

Abstract

This text is a conceptual introduction to mixed effects modeling with linguistic applications, using the R programming environment. The reader is introduced to linear modeling and assumptions, as well as to mixed effects/multilevel modeling, including a discussion of random intercepts, random slopes and likelihood ratio tests. The example used throughout the text focuses on the phonetic analysis of voice pitch data.
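The likelihood ratio test mentioned in the abstract can be illustrated with a minimal sketch. It assumes two nested models have already been fitted by maximum likelihood and that only their log-likelihoods are at hand. The function names `chi2_sf` and `likelihood_ratio_test` and the numeric log-likelihoods are hypothetical choices made for this illustration; they are not taken from the tutorial, which works in R.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    z = x / 2.0
    # Closed-form starting points: df = 1 (odd case) or df = 2 (even case).
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1),
    # applied until a reaches df / 2.
    while 2 * a < df:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_null, loglik_full, df_diff):
    """Return (chi-square statistic, p-value) for two nested models."""
    # The statistic is twice the gain in log-likelihood; clamp at zero in case
    # numerical noise makes the full model look marginally worse.
    stat = max(0.0, 2.0 * (loglik_full - loglik_null))
    return stat, chi2_sf(stat, df_diff)


# Hypothetical log-likelihoods: the full model adds one fixed effect (df_diff = 1).
stat, p = likelihood_ratio_test(loglik_null=-212.0, loglik_full=-205.0, df_diff=1)
print(f"chi2(1) = {stat:.2f}, p = {p:.5f}")
```

The degrees of freedom equal the number of parameters the full model adds to the null model, so a test of a single fixed effect uses one degree of freedom.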

... Late-infantile GM1 gangliosidosis patients were assigned a value of 0 and juvenile patients were assigned a value of 1 to evaluate the effect of GM1 gangliosidosis subtype. Linear mixed effects modeling was created using the LME4 package [38,39] and were used to evaluate the relationship between RCGI-C scores with GM1 subtype, participant age, and the elapsed time since the baseline evaluation. A subject-level random intercept was used to account for repeated measures [39,40]. ...
... Linear mixed effects modeling was created using the LME4 package [38,39] and were used to evaluate the relationship between RCGI-C scores with GM1 subtype, participant age, and the elapsed time since the baseline evaluation. A subject-level random intercept was used to account for repeated measures [39,40]. Pearson correlations were also calculated in R between cross-sectional volumetric MRI data (gray matter, white matter, and ventricle volume) and baseline RCGI-S scores (considered as continuous measures). ...
... Linear mixed effects modeling was also used to evaluate the percent change in volumetric MRI data (fixed effect) with RCGI-C scores. A subject-level random intercept was used to account for repeated measures [38][39][40]. Of the 10 patients with Vineland data, there were 16 longitudinal time points with matching RCGI-C and corresponding Vineland-3 or Vineland-II scores. ...
Article
Full-text available
Background: Clinical trials for rare diseases pose unique challenges warranting alternative approaches in demonstrating treatment efficacy. Such trials face challenges including small patient populations, variable onset of symptoms and rate of disease progression, and ethical considerations, particularly in neurodegenerative diseases. In this study, we present the retrospective clinical global impression (RCGI) severity and change (RCGI-S/C) scale on 27 patients with GM1 gangliosidosis, a post hoc clinician-rated outcome measure to evaluate natural history study participants as historical controls for comparisons with treated patients in a clinical trial. Methods: We conducted a systematic chart review of 27 GM1 gangliosidosis natural history participants across 95 total visits. RCGI-S was assessed at the first visit and rated 1 (normal) to 7 (among the most extremely ill). Each subsequent follow-up was rated on the RCGI-C scale from 1 (very much improved) to 7 (very much worse). We demonstrate scoring guidelines of both scales with examples and justifications for this pilot in GM1 gangliosidosis natural history participants. The convergent validity of the RCGI scales was explored through correlations with magnetic resonance imaging (MRI) and the Vineland Adaptive Behavioral Scales. Results: We found strong association between the RCGI-S scores with gray matter volume (r(14) = −0.81; 95% CI [−0.93, −0.51], p < 0.001), and RCGI-C scores significantly correlated with increases in ventricular volume (χ²(1) = 18.6, p < 0.001). Baseline RCGI-S scores also strongly correlated with Vineland adaptive behavioral composite scores taken at the same visit (r(14) = −0.72; 95% CI [−0.93, −0.17], p = 0.02). Conclusion: RCGI-S/C scales, which use the clinical evaluation to assess the severity of disease of each patient visit over time, were consolidated into a single quantitative metric in this study. Longitudinal RCGI-C scores allowed us to quantify disease progression in our late-infantile and juvenile GM1 patients. We suggest that the retrospective CGI may be an important tool in evaluating historical data for comparison with changes in disease progression/mitigation following therapeutic interventions.
... Due to an excess of zeroes, we modified the model to account for zero inflation using the command 'ziformula' (Brooks et al., 2017; Zuur et al., 2012). To find out if the time spent in the zones (odour or control) differed between the presented odours, a likelihood ratio test of the full model (including the explanatory variable odour) against the null model (excluding the odour) was performed (Winter, 2013). In case of significance, we then conducted a Tukey post hoc test using the package multcomp on the full model to check for significant differences between the odours (Hothorn et al., 2008). ...
... The IDs of the tested frogs, as well as the side where the odour was placed, were entered as random effects into the model. We then conducted a likelihood ratio test of the full model with the effect in question (the different odours) against the null model without the effect in question (without the different odours; Winter, 2013) in order to find if the activity patterns of the frogs varied between the different odour treatments. ...
... analysis of variance was conducted using the full model with the different odours against the null model without the different odours (Winter, 2013). ...
Article
Olfaction is the oldest sense in the animal kingdom. It is used during a multitude of behaviours, such as the encounter of food, the detection of predators, the recognition of habitat‐related cues or the communication with conspecifics. While the use of olfaction and chemical communication has been studied widely in some animals, it is barely known in others. Anurans (frogs and toads), for example, are well known to use acoustic and visual senses, but their chemical sense is still largely understudied. Studies concerning the chemical sense in anurans have been mostly based on the use of semiochemicals in juvenile stages, while the information on adult anurans remains limited. In this study, we analysed the behavioural response of the Neotropical poison frog Ranitomeya sirensis (Sira poison frog, Dendrobatidae) when presented with the odours of prey, novel/prey‐luring fruit, habitat, conspecific faeces and heterospecifics. For this, we offered each of the odours by placing them into one of two testing tubes fixed in an arena, with the other tube left empty as a control. We then measured the time the frogs spent in the vicinity of the odour versus the control tube and calculated a response index. While the frogs did not show a significant avoidance or attraction towards most of the tested odours, they showed a strong response towards the heterospecific odour, which was significantly avoided. This is the first evidence of a poison dart frog responding towards the odours of adult heterospecific frogs. We consider potential reasons for this strong negative reaction, such as the interspecific competition avoidance hypothesis, and discuss our results in the context of other animal species being deterred or attracted by heterospecific chemical cues.
... Statistical models help to represent and determine key influences within environments and among participants. Statistical models are important because they can help to account for different factors in the environment, support research claims, explain important phenomena, summarize data trends, and apply STEM (Science, Technology, Engineering, and Mathematics) fields to real-world problems (Gordon, 2019; Winter, 2013). Statistical model construction is not comfortable for everyone, but it can be crucial in demonstrating actual treatment effects and outcomes. ...
... After general models were chosen that fit the data, specific procedures for model fitting and analysis could start. The modeling procedures in R were recommended by Christensen ( (2007), and Winter (2013). There were six essential assumptions that were tested within the data variables: linearity, collinearity, independence, influential weights, equal variances, and normality. ...
... for time). Through principal component analysis with the raw data as recommended by Hayden (2018) and Winter (2013), it was determined that sample size accounted for 72.2% of the variation and effect size variance accounted for 27.8% of the variation. ...
Research
Full-text available
I have just been published! Check out my work at https://lnkd.in/edtef76b and https://lnkd.in/es2u8GTV This is open access.
... For 95% confidence interval analysis, we assume that the paired differences will be normally distributed over the selected range of simulated gold standard RR values. Then, the new model is compared with the model that contains all the effects, through a likelihood ratio test, as described in [44]. In addition, we compute the Pearson correlation coefficient between the RR estimates and the gold standard RR values for all 28 experiments for both RR estimation methods. ...
... Table IV shows the likelihood ratio test results for the model excluding motion effect specifically for both RR methods. The mean bias due to motion has been reduced from 5.38 to 0.56 bpm and the effect of motion no longer induces a statistically significant effect on RR estimation error (p-value increases from 0 to 0.075) [44]. The Pearson correlation coefficient is 0.53 for the first RR estimation method and 0.99 for the second modified RR estimation method. ...
Preprint
We develop and evaluate a respiratory rate estimation algorithm that utilizes data from pressure-sensitive mat (PSM) technology for continuous patient monitoring in neonatal intensive care units (NICU). An analysis of the random effect of drift and systematic effect of creep in the PSM data is presented, showing that these are essentially dependent on the applied load and contact surface. Uncertainty measurements are pivotal when estimating physiologic parameters. The standard uncertainty in the PSM data is here represented by the percent drift. Next, we evaluate the applicability of PSM technology to estimate RR in neonatal patient simulator trials under five mixed effects including internally and externally induced motion, mattress type, grunting, laying position, and different breathing rates. We analyze the limits of agreement on the mixed effects model to derive the uncertainty in the estimated RR obtained through two estimation techniques. In comparison with the gold standard RR values, we achieved a mean bias of 0.56 breaths per minute (bpm) with an error bounded by a 95% confidence interval of [-2.26, 3.37] bpm. These results meet the clinical accuracy requirements of RR within +/-5 bpm.
... A model comparison approach using a Likelihood Ratio Test was then used to assess the goodness of each model fit. These evaluations returned p-values which compared the full linear effects model to a partial model (see Winter, 2013). This statistical approach has the advantage of analyzing all available data while adjusting fixed effects, random effects, and likelihood ratio test estimates for missing data. ...
Technical Report
Full-text available
As more active driving systems are integrated into vehicles, the role and responsibilities of drivers using these technologies will change fundamentally from those in conventional vehicles. Vehicle automation now offers extended control of a vehicle's speed, headway and lane position, but calls upon drivers to continuously monitor the road and traffic environment. This report summarizes an on-road study of drivers' workload, arousal and attentiveness when driving vehicles equipped with Level 2 automation. Results summarized in this report should help researchers, the automobile industry and government entities better understand driver performance, behavior and interactions in vehicles with advanced technologies.
... We applied the stepwise approach described by Winter [86] and others [56] to integrate various predictor variables and their interaction effects. For model selection and significance testing in mixed models, we followed the standard approach by using the Likelihood Ratio Test (LRT) which tests the difference between two nested models using the Chi-square test. ...
... This process involved comparing a baseline model without predictors to one with predictors and maintaining those that showed statistical significance. Participant and headline IDs were included as random effects to account for repeated measures for varying items [15,86,89]. We then conducted post hoc analysis for different subgroup comparisons. ...
Article
For many people, social media is an important way to consume news on important topics like health. Unfortunately, some influential health news is misinformation because it is based on retracted scientific work. Ours is the first work to explore how people can understand this form of misinformation and how an augmented social media interface can enable them to make use of information about retraction. We report a between-subjects think-aloud study with 44 participants, where the experimental group used our augmented interface. Our results indicate that this helped them consider retraction when judging the credibility of news. Our key contributions are foundational insights for tackling the problem, revealing the interplay between people's understanding of scientific retraction, their prior beliefs about a topic and the way they use a social media interface that provides access to retraction information.
... The LME analysis was informed by a published tutorial (Winter, 2013). The FramebyFrame R package performs the above LME analysis when generating parameter plots with ggParameterGrid(…) or writing a report of LME statistics using LMEreport(…). ...
Article
Full-text available
By exposing genes associated with disease, genomic studies provide hundreds of starting points that should lead to druggable processes. However, our ability to systematically translate these genomic findings into biological pathways remains limited. Here, we combine rapid loss-of-function mutagenesis of Alzheimer’s risk genes and behavioural pharmacology in zebrafish to predict disrupted processes and candidate therapeutics. FramebyFrame, our expanded package for the analysis of larval behaviours, revealed that decreased night-time sleep was common to F0 knockouts of all four late-onset Alzheimer’s risk genes tested. We developed an online tool, ZOLTAR, which compares any behavioural fingerprint to a library of fingerprints from larvae treated with 3677 compounds. ZOLTAR successfully predicted that sorl1 mutants have disrupted serotonin signalling and identified betamethasone as a drug which normalises the excessive day-time sleep of presenilin-2 knockout larvae with minimal side effects. Predictive behavioural pharmacology offers a general framework to rapidly link disease-associated genes to druggable pathways.
... To study changes in TSPO expression over time in lesions of different ages, of which a subset of lesions were found in the same animal, we performed a changepoint analysis using a linear mixed-effects model, with time point as a categorical independent variable, to account for the nonindependence of these data points [41,42]. The fraction of Iba1+ cells expressing TSPO in each lesion was modeled as a function of lesion age at time of sacrifice. ...
Article
Full-text available
Multiple sclerosis (MS) is an inflammatory demyelinating disease of the central nervous system (CNS) and is a leading non-traumatic cause of disability in young adults. The 18 kDa Translocator Protein (TSPO) is a mitochondrial protein and positron emission tomography (PET)-imaging target that is highly expressed in MS brain lesions. It is used as an inflammatory biomarker and has been proposed as a therapeutic target. However, its specific pathological significance in humans is not well understood. Experimental autoimmune encephalomyelitis (EAE) in the common marmoset is a well-established primate model of MS. Studying TSPO expression in this model will enhance our understanding of its expression in MS. This study therefore characterizes patterns of TSPO expression in fixed CNS tissues from one non-EAE control marmoset and 8 EAE marmosets using multiplex immunofluorescence. In control CNS tissue, we find that TSPO is expressed in the leptomeninges, ependyma, and over two-thirds of Iba1+ microglia, but not astrocytes or neurons. In Iba1+ cells in both control and acute EAE tissue, we find that TSPO is co-expressed with markers of antigen presentation (CD74), early activation (MRP14), phagocytosis (CD163) and anti-inflammatory phenotype (Arg1); a high level of TSPO expression is not restricted to a particular microglial phenotype. While TSPO is expressed in over 88% of activated Iba1+ cells in acute lesions in marmoset EAE, it also is sometimes observed in subsets of astrocytes and neurons. Additionally, we find the percentage of Iba1+ cells expressing TSPO declines significantly in lesions > 5 months old and may be as low as 13% in chronic lesions. However, we also find increased astrocytic TSPO expression in chronic-appearing lesions with astrogliosis. Finally, we find expression of TSPO in a subset of neurons, most frequently GLS2+ glutamatergic neurons. The shift in TSPO expression from Iba+ microglia/macrophages to astrocytes over time is similar to patterns suggested by earlier neuropathology studies in MS. Thus, marmoset EAE appears to be a clinically relevant model for the study of TSPO in immune dysregulation in human disease.
... Since the data were not normally distributed, we rank-transformed all grip force measurements before making any statistical comparisons [54]. We constructed the linear mixed-effects models based on Winter [55] and Bates et al. [56], using the R packages 'lmerTest' [57] and 'lme4' [56]. In each model (one with body weight-adjusted maximal grip force as %BW and the other with residuals), we included body mass, autopodial shape, locomotor ecology (i.e. ...
Article
Full-text available
Powerful digital grasping is essential for primates navigating arboreal environments and is often regarded as a defining characteristic of the order. However, in vivo data on primate grip strength are limited. In this study, we collected grasping data from the hands and feet of eleven strepsirrhine species to assess how ecomorphological variables—such as autopodial shape, laterality, body mass and locomotor mode—influence grasping performance. Additionally, we derived anatomical estimates of grip force from cadaveric material to determine whether in vivo and ex vivo grip strength measurements follow similar scaling relationships and how they correlate. Results show that both in vivo and anatomical grip strength scale positively with body mass, though anatomical measures may overestimate in vivo performance. Species with wider autopodia tend to exhibit higher grip forces, and forelimb grip forces exceed those of the hindlimbs. No lateralization in grip strength was observed. While strepsirrhine grip forces relative to their body weight are comparable to those of other primates and slightly exceed those of humans, they are not exceptional compared to other arboreal mammals or birds, suggesting that claims of extraordinary primate grasping abilities require further investigation.
... For further discussion on why mixed effects are a powerful approach in psycholinguistic studies, please see Winter [98] and Cunnings [99]. ...
Article
Full-text available
Sentence stimuli pervade psycholinguistics research. Yet, limited attention has been paid to the automatic construction of sentence stimuli. Given their linguistic capabilities, this study investigated the efficacy of ChatGPT in generating sentence stimuli and AI tools in producing auditory sentence stimuli. In three psycholinguistic experiments, this study examined the acceptability and validity of AI-formulated auditory sentences and written sentences in one of the two languages: English and Arabic. In Experiment 1 and 3, participants gave English AI-generated stimuli similar to or higher acceptability ratings than human-composed stimuli. In Experiment 2, Arabic AI-generated stimuli received lower acceptability ratings than their human-composed counterparts. The validity of AI-developed stimuli relied on the study design, with only Experiment 2 demonstrating the target psycholinguistic effect. These results highlight the promising role of AI as a stimuli developer, which could facilitate psycholinguistic research and increase its diversity. Implications for psycholinguistic research were discussed.
... In the Bilingual group, outliers represented 2.47% of all RTs (namely 22-day, 24-month, 9-hour and 16-year calculations). For RT analyses, we used linear mixed-effects models following Winter (2013). The main fixed-effect factors were Calculation (Day and Month) and Group (English, Chinese, Bilingual). ...
Article
Full-text available
Calendar calculations, the process of computing the target day or month, exhibit peculiar differences across languages. In systems like English, calendar labels are largely opaque (Tuesday, August), which invites calculations to rely more heavily on verbal listing. In transparent systems, like Chinese, habitual labeling of calendar terms numerically (Tuesday = Day 2, August = Month 8) facilitates fast numerical operations instead of verbal listing. This study examines the effects that different levels of transparency of the calendar naming system may have on calculations in the speakers’ first and second language. Chinese–English bilinguals were tested alongside English and Chinese controls. Forced-choice calendar calculations (day, month, hour and year) and self-reported strategies were used as tasks to tap into participants’ calculation speed, accuracy and temporal reasoning. In the calculation questions, we manipulated Distance (short/long), Direction (forward/backward), Input (linguistic/numerical) and Boundary (within/across). More complex Month calculations significantly differed across groups while easier day calculations did not. The English group reported reliance on verbal listing while the Chinese and the Bilingual groups preferred numerical reasoning. These findings bring new evidence for linguistic relativity in the form of modulations of calendar processing speed changing as a function of linguistic transparency, input type and task demand.
... Our study employs a linear mixed-effects model, which is well-suited for data that include both fixed effects and random effects, allowing for the analysis of data with multiple sources of variability [25]. In this analysis, emotion (categorised as excited, angry, happy, sad, neutral) and gender (male/female) were treated as fixed effects. ...
Conference Paper
Full-text available
This study explores the effects of Parkinson's disease (PD) on emotional speech production among native New Zealand English speakers, utilising a newly developed simulated emotional speech corpus. We engaged twelve participants diagnosed with PD to generate speech samples expressing five distinct emotions, yielding 1800 sentences. Our analysis focuses on key acoustic parameters, namely fundamental frequency and intensity, to evaluate their efficacy in depicting emotional states in PD-affected speech. Results indicate these parameters effectively differentiate between emotional expressions and correlate with emotional arousal levels. This provides insights into the complex interplay between physiological alterations due to PD and emotional vocal expression.
... The pooled data did not have a normal distribution. Thus, to assess whether the amount of cleaning varied within challenges, we applied the generalized linear mixed-effects model analysis (GLMM [42]; glmmTMB package [43]). To select the best model, we applied the Akaike information criterion (AIC) to compare models with distinct factors and the following family distributions: Poisson, Conway-Maxwell Poisson, Negative Binomial 1 and Negative Binomial 2, with the default link function. ...
Article
Full-text available
The immune system is crucial for organisms to defend against pathogens. Likewise, analogous immune features evolved against similar pressures at the superorganism scale. Upregulating hygiene to the same fungus pathogen is one assumption for convergent immune mechanisms in social insects, although more evidence of immune memory features remains to be confirmed. Here, we assess immune memory traits at the colony level in the leaf-cutting ant Atta sexdens. We exposed their fungus cultivar to both homologous and heterologous challenges with the entomopathogenic fungi Metarhizium anisopliae and Beauveria bassiana, as well as the mycoantagonistic fungi Fusarium oxysporum and Trichoderma spirale. By measuring ants’ behaviours, we evaluated the capacity of A. sexdens: (i) to enhance their collective hygiene, (ii) speed their hygiene in further infections, (iii) how long this capacity lasts in the colonies and (iv) the degree of specificity to increase hygienic responses. Fungus grooming behaviour was enhanced mostly against entomopathogenic fungi, with a trend of faster reactions during homologous challenges. In general, the capacity to elicit such upregulated actions lasted for up to 30 days, but no longer than 60 days. Overall, colonies exhibited a degree of immune specificity, enhancing hygiene only in response to homologous exposures but decreasing it when infected secondarily with a different fungus, indicating flexible social immunity of A. sexdens after immune challenges.
... The models were compared using the MuMIn package (Bartoń, 2020) and those with lower AIC values were treated as better fit. To test for fixed effects, the ANOVA function was used for model comparisons via likelihood ratio tests to obtain p values (Winter, 2013). Specifically, we compared a full model with a certain fixed effect against a reduced model without that effect (we test for main effects and interactions separately). ...
Article
Connectives are crucial in constructing coherent discourse by signaling relationships across sentences and triggering presuppositions. This event-related potential (ERP) study investigated how the type and presence of connectives influence sentence processing. Participants read sentences with because, although, or no connective, where critical words were either congruent or incongruent with the clausal relation specified by connectives. Incongruent words elicited a larger N400 than congruent words, regardless of connective type or presence. In later processing stages, incongruency induced an extended N400 effect in concessive contexts but a P600 effect in causal and no-connective contexts. Additionally, a larger P600 was observed in concessive-congruent versus causal-congruent conditions, indicating greater effort to reconcile the concessive-triggered presupposition with the proposition. These findings suggest that connective-triggered presuppositions constrain both the initial evaluation of language input against broad contextual knowledge and the subsequent integration into discourse representation, supporting an Evaluation and Integration model of presupposition processing.
... Statistically significant differences between the integrals of ¹³C-labeled NMR signals of metabolites in the two temperature groups were determined using linear models. The linear models (function lm) were applied to the temperature groups using the statistical programming environment R (R Core Team, 2022), according to the tutorial by Winter (2013). ...
Article
Full-text available
Current climate change, particularly ocean warming, will induce shifts in marine species distribution and composition, affecting the marine food web and, thus, trophic interactions. Analyses of the stable isotopes ¹³C and ¹⁵N are commonly used to detect trophic markers for food web analyses. With the current standard methods used in food web ecology, it is still challenging to identify potential changes in the uptake and utilization of trophic markers. In this work, we present a ¹³C-enrichment analysis by NMR spectroscopy to track the uptake and utilization of dietary carbon in a simple laboratory experiment of a primary producer and its consumer (algae and bivalve). In particular, we tested the hypothesis of a temperature-dependent use of dietary carbon by tracing the incorporation of ¹³C-atoms. Unicellular phytoplankton, Phaeodactilum tricornutum, was reared in a medium containing ¹³C-labeled bicarbonate. The accompanying ¹³C-NMR spectra of labeled P. tricornutum showed a specific profile of ¹³C-labeled compounds, including typical trophic markers such as the polyunsaturated omega-3 fatty acid eicosapentaenoic acid (EPA). Afterwards, ¹³C-labeled P. tricornutum was fed to King scallops, Pecten maximus, kept at two different temperatures (15°C and 20°C). Tissue-specific NMR spectra of P. maximus revealed elevated ¹³C-NMR signals, particularly of the fatty acid EPA in the digestive gland, which was not evident in muscle tissue. The comparison between the two temperatures indicated a change in trophic markers. At the higher temperature, less unsaturated fatty acids were detected in the digested gland, but increased ¹³C-labels in sugars were detected in the adductor muscle. This might indicate a change in the uptake and utilization of the trophic marker EPA in P. maximus due to a shift in energy conversion from favored beta-oxidation at colder temperatures to conversion from carbohydrates in the warmth. 
Our approach indicates that besides the accumulation of trophic markers, their incorporation and conversion are additional important factors for the reliable interpretation of trophic linkages under climate change.
... The best model has the lowest AIC and the significance of full models versus the null models was tested using likelihood ratio tests. Linear mixed-effects models were constructed and analyzed in R using 'lme4' (Bates et al., 2015) following Winter (2013) preprint. ...
Article
It is thought that the magnitude of center of mass (COM) oscillations can affect stability and locomotor costs in arboreal animals. Previous studies have suggested that minimizing collisional losses and maximizing pendular energy exchange are effective mechanisms to reduce muscular input and energy expenditure during terrestrial locomotion. However, few studies have explored whether these mechanisms are used in an arboreal context, where stability and efficiency often act as tradeoffs. This study explores three-dimensional center of mass mechanics in an arboreal primate—the squirrel monkey (Saimiri sciureus)—moving quadrupedally at various speeds on instrumented arboreal and terrestrial supports. Using kinetic data, values of energy recovery, center of mass mechanical work and power, potential and kinetic energy congruity, and collision angle and fraction were calculated for each stride. Saimiri differed from many other mammals by having lower energy recovery. Although few differences were observed in center of mass mechanics between substrates at low or moderate speeds, as speed increased, center of mass work was done at a much greater range of rates on the pole. Collision angles were higher, while collision fractions and energy recovery values were lower on the pole, indicating less moderation of collisional losses during arboreal versus terrestrial locomotion. These data support the idea that the energetic demands of arboreal and terrestrial locomotion differ, suggesting that arboreal primates likely employ different locomotor strategies compared to their terrestrial counterparts—an important factor in the evolution of arboreal locomotion.
... As a relatively new advance in statistical analysis in the language sciences, the use of mixed-effects models is supported by many scholars in the field (Winter 2013). Analysis was conducted using R (version 4.3.1) ...
Article
The present study on intentional retrieval practice compared the benefits of presenting words in either informative or uninformative sentence contexts. Participants first studied a list of English words with their translations. Then, they were all exposed to half of the words with informative sentences containing meaning clues in the Context Inference (CI) condition and half with uninformative sentences devoid of such clues in the Memory Retrieval (MR) condition as part of retrieval-based practising. Participants were required to type the L1 translation for each word presented using a mobile application. Data were collected by both form-recall and meaning-recall tests immediately afterwards and then a week later. In addition, this study focused on the relationship between working memory capacity (WMC) and word retention in these two conditions to explore the suggestion that individuals may benefit differently from retrieval practices. Although the results showed that both conditions contributed to word retention, the MR condition was significantly more effective than the CI condition for the participants’ long-term retention. Further, the results revealed an overall positive effect of WMC on word retention in both conditions, with high-WMC individuals achieving higher retention scores than low-WMC individuals. However, this effect was not modulated by the type of context condition.
... We used analysis of variance (ANOVA) to test for significant differences among these models, and AIC and BIC estimates to select the best model fit. This approach allowed us to select the most parsimonious, best fitting model (Winter 2013). We interpreted a significant interaction that included the depth term to mean that the depth distribution of δ¹³C–CO2 varied with the other term(s) in the interaction. ...
Article
Replacing long-lived, rarely disturbed vegetation with short-lived, frequently disturbed vegetation is a widespread phenomenon in the Anthropocene that can influence ecosystem functioning and soil development by reducing the abundance of deep roots. We explore how sources and fate of soil CO2 vary with organic substrate source, abundance of respiring biota (i.e., roots and soil microbes), season, and soil depth. We quantified multiple isotopic signatures of CO2 (δ¹³C, Δ¹⁴C, δ¹⁸O) as well as concentrations and δ¹⁸O of free O2 in the upper 5 m of soil at sites where root abundances and soil organic C have been previously quantified: in late-successional forests, cultivated fields, and ~ 80 y old regenerating pine forests growing on previously cultivated land. We hypothesized that soil CO2 sources would vary across soil depth and land cover, reflecting varying abundances of organic substrates, and seasonally as the dominance of root vs. microbial CO2 production changes through the year. δ¹³C–CO2 revealed respiration of C4-derived substrates in cultivated fields particularly during the growing season. This effect was not evident in soils of regenerating pine or older hardwood forests, suggesting that ~ 80 y of pine inputs to reforested soils have been sufficient to dominate microbial substrate selection over any remnant, historic agricultural C4 inputs. Δ¹⁴C–CO2 diverged by land use at 3 and 5 m, indicating that more recently-produced photosynthate is available for mineralization in forests compared to cultivated plots, and in late-successional forests compared to regenerating pine forests. At 1.5, 3, and 5 m in forested plots we observed evidence of respiratory demands on soil pore space O2. In these soils, we observed declines in [O2] compared to other depths and to the agricultural plots and concurrent increases in δ¹⁸O of free O2, consistent with the idea that roots and heterotrophic soil microbes are more active where photosynthate is more available. 
The δ¹⁸O–CO2 values, a likely proxy for δ¹⁸O of soil porewater, exhibited ¹⁸O enrichment during the winter, when many sampling wells were flooded, compared to growing season values. These data suggest an isotopically-distinct and laterally-flowing source of CO2-laden porewater during winter months. Combined, these datasets document how ~ 80 y of forest regeneration can provide sufficient C inputs to mask any microbial mineralization of decades-old organic inputs, but belowground C inputs still lag those of late successional forests. We also infer that lateral and vertical flows of water can serve as a sink for biotically-generated CO2, and that where deep soil [CO2] is lower due to lower root and microbial activities, production of carbonic acid is also diminished. Where reaction rates are weathering limited, a paucity of deep roots imposed by anthropogenic land cover change thus may limit the production of this agent of soil development and the C sink represented by the silicate weathering it can promote. The data suggest deep and persistent effects of the loss of deeply rooted long-lived vegetation on deep soil C storage and transformations that promote acid-dissolution weathering reactions that help form soil itself.
... The models included learner-by-time and word-by-time random slopes, as random slopes help with handling by-learner and by-word heteroskedasticity and prevent overconfident results (Barr et al., 2013; Cunnings & Finlayson, 2015). Although linear mixed-effects models are robust against violation of normality (Winter, 2013), normality of duration, intensity, and pitch data was checked visually using q-q plots. All data were normally distributed. ...
Article
The interactionist approach to second language acquisition has yielded a plethora of studies confirming the positive impact of interaction and corrective feedback on second language (L2) development. Nevertheless, only a few studies have attempted to investigate the development of L2 prosody using the interactionist approach. The current study contributes to this line of research by investigating the relationship between recasts and the production of primary stress in L2 English. Following a pretest-posttest design, 68 L1 Arabic speakers were randomly assigned to control and intervention groups. The pre- and posttest comprised sentence-completion and information-exchange tasks, whereas the intervention was a role-play task that dyads carried out with the researcher. The intervention group received a recast upon producing target words with misplaced primary stress, whereas the control group did not receive any corrective feedback. The results of acoustic analyses, which focused on syllable duration, intensity, and pitch, indicated a positive relationship between recasts and development of primary stress placement. The results were also supported by expert listener judgments. The findings suggest that interaction and implicit corrective feedback play a positive role in the development of lexical stress.
... One semantic similarity model (a) was specified for analyzing lexical-semantic associations within the subordinate clause. Based on a model-selection procedure with the likelihood ratio test [38], the best-fitting model for analyzing the semantic similarity value was the full model, which included the fixed effects of the temporal indicator and antecedent polarity and their interactions. A second semantic similarity model (b) was specified for analyzing the lexical-semantic associations across the main and subordinate clauses. ...
Article
Previous theories have established the mental model activation of processing different types of conditionals, stating that counterfactual conditionals expressing events that contradict known facts (e.g., “If it had rained, then they would not go to the park.”) are considered to trigger two mental models: (1) a hypothetical but factually wrong model (e.g., “rain” and “did not go to the park”) and (2) a corresponding real-world model (e.g., “did not rain” and “went to the park”). This study aimed to investigate whether pragmatic factors differentially influence readers’ comprehension and distinction between counterfactual and hypothetical conditional sentences in Mandarin Chinese. Participants were required to read and judge the comprehensibility of Chinese hypothetical and counterfactual conditionals, which were different in temporal indicators (past vs. future temporal indicators) in the antecedent. Different polarities (with vs. without negators) and different moving directions (different directional verbs: lai2 [come] vs. qu4 [go]) in the consequent were also manipulated. Linear mixed-effects models (LMEM) revealed that hypothetical conditionals (with future temporal indicators) were more comprehensible than counterfactual conditionals (with past temporal indicators). The semantic similarities within the subordinate clause revealed future temporal indicators had higher lexical–semantic co-occurrence than past indicators, suggesting that temporal indicators impact comprehension partly through lexical semantics in the premise, and hypothetical conditionals are more easily processed. However, the semantic similarity analysis of the main and the subordinate clauses showed no effect of temporal indicators, suggesting that lexical–semantic co-occurrence across clauses may not substantially contribute to the distinction between hypothetical conditionals and counterfactual conditionals. 
In conclusion, this study offers insights into the comprehension of Chinese conditional sentences by shedding light on the pragmatic factors influencing the activation of different mental models.
... We performed logistic regression using generalized linear models [8,9]. We also used artificial neural networks called autoencoders to integrate data from different sources, giving a holistic picture of the mental health, physical health, and social factors contributing to mortality in SMI [2]. ...
Article
We present an explainable artificial intelligence methodology for predicting mortality in patients. We combine clinical data from an electronic patient healthcare record system with factors relevant for severe mental illness and then apply machine learning. The machine learning model is used to predict mortality in patients with severe mental illness. Our methodology uses class-contrastive reasoning. We show how machine learning scientists can use class-contrastive reasoning to generate complex explanations that explain machine model predictions and data. An example of a complex class-contrastive explanation is the following: “The patient is predicted to have a low probability of death because the patient has self-harmed before, and was at some point on medications such as first-generation and second-generation antipsychotics. There are 11 other patients with these characteristics. If the patient did not have these characteristics, the prediction would be different”. This can be used to generate new hypotheses, which can be tested in follow-up studies. Diuretics seemed to be associated with a lower probability of mortality (as predicted by the machine learning model) in a group of patients with cardiovascular disease. The combination of delirium and dementia in Alzheimer’s disease may also predispose some patients towards a higher probability of predicted mortality. Our technique can be employed to create intricate explanations from healthcare data and possibly other areas where explainability is important. We hope this will be a step towards explainable AI in personalized medicine.
... Linear mixed effects models are constructed by adding one or several random terms to the equation with the aim of systematising the random error inherent in linear regression. This avoids two common problems in biological data: lack of independence of the measurements, and hierarchical structures within the data (Calama & Montero, 2004; Winter, 2013). Thus, linear mixed effects models can be used to determine the relationship between shrinkage and density in terms of goodness of fit, as has been found for some softwoods (e.g. ...
Article
Aim of study: The properties of wood of laurel (Laurus nobilis L.) have not yet been adequately described. For example, information on variables related to dimensional stability during drying (shrinkage) is lacking, even though this is a key factor determining the suitability of the material for industrial uses with high added value. The aim of this study was to construct models for estimating shrinkage variables by using wood density as the predictor variable. Area of study: Seventeen laurel trees were felled in an inland area of Galicia (north-western Spain) in order to obtain the material for testing and modelling. Material and methods: The experimental tests were performed on 958 small standardised, defect-free wood specimens. Main results: The wood under study was moderately heavy and volumetrically unstable. Density varied only slightly, but volumetric shrinkage varied statistically significantly within and between trees. A linear mixed effects model was constructed to predict the variation in volumetric shrinkage from the oven-dry density, including the factors tree and height in the stem, with random slopes and intercepts. Research highlights: The model proved valid for all sampled individuals up to a height of two metres in the stem, thus enabling estimation of the volumetric shrinkage in commercial basal logs.
... To test our hypotheses, we conducted a linear mixed effect regression with the use of R language (R Core Team, 2022) and "lme4" R package (Bates et al., 2015). We followed the guidelines by Winter (2013). To interpret results, we exploited the "report" R package (Makowski et al., 2023) which computes confidence intervals and p-values using a Wald t-distribution approximation. ...
Article
The phenomenon of "hearing voices" can be found not only in psychotic disorders, but also in the general population, with individuals across cultures reporting auditory perceptions of supernatural beings. In our preregistered study, we investigated a possible mechanism of such experiences, grounded in the predictive processing model of agency detection. We predicted that in a signal detection task, expecting less or more voices than actually present would drive the response bias toward a more conservative and liberal response strategy, respectively. Moreover, we hypothesized that including sensory noise would enhance these expectancy effects. In line with our predictions, the findings show that detection of voices relies on expectations and that this effect is especially pronounced in the case of unreliable sensory data. As such, the study contributes to our understanding of the predictive processes in hearing and the building blocks of voice hearing experiences.
... Likelihood ratio tests (LRT) were used to analyze fixed effects. Here, a chi-square (χ²) approach of model comparison was implemented where interaction and main effect terms were verified by comparing models with and without relevant terms (Winter, 2013). The size and confidence of significant fixed effects were described using exponentiated (to account for previous logarithm transformation) model slope coefficients (β) and confidence intervals (CI). ...
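The back-transformation described in this excerpt can be sketched as follows. This is a minimal illustration in Python, not the cited study's code: it assumes the response was natural-log transformed and that a Wald-type 95% interval is acceptable. The function name, slope, and standard error are hypothetical values chosen for the example.

```python
import math


def back_transform(beta, se, z=1.96):
    """Exponentiate a slope from a log-response model and its Wald 95% CI.

    beta and se are on the log scale; the returned values are
    multiplicative effects on the original scale.
    """
    lower = beta - z * se
    upper = beta + z * se
    return math.exp(beta), math.exp(lower), math.exp(upper)


# Hypothetical slope and standard error (not taken from any cited study).
ratio, ci_low, ci_high = back_transform(beta=0.10, se=0.03)

# A ratio above 1 means the response grows by that factor per unit of the predictor.
percent_change = (ratio - 1.0) * 100.0
print(f"ratio = {ratio:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}], "
      f"change = {percent_change:.1f}%")
```

Exponentiating the interval bounds, rather than computing an interval on the original scale, keeps the interval consistent with the model that was actually fitted.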
Article
Commercial pilots endure multiple stressors in their daily and occupational lives which are detrimental to psychological well-being and cognitive functioning. The Quick coherence technique (QCT) is an effective intervention tool to improve stress resilience and psychophysiological balance based on a five-minute paced breathing exercise with heart rate variability (HRV) biofeedback. The current research reports on the application of QCT training within an international airline to improve commercial pilots’ psychological health and support cognitive functions. Forty-four commercial pilots volunteered in a one-month training programme to practise self-regulated QCT in day-to-day life and flight operations. Pilots’ stress index, HRV time-domain and frequency-domain parameters were collected to examine the influence of QCT practice on the stress resilience process. The results demonstrated that the QCT improved psychophysiological indicators associated with stress resilience and cognitive functions, in both day-to-day life and flight operation settings. HRV fluctuations, as measured through changes in RMSSD and LF/HF, revealed that the resilience processes were primarily controlled by the sympathetic nervous system activities that are important in promoting pilots’ energy mobilization and cognitive functions, thus QCT has huge potential in facilitating flight performance and aviation safety. These findings provide scientific evidence for implementing QCT as an effective mental support programme and controlled rest strategy to improve pilots’ psychological health, stress management, and operational performance.
... The GLMER analysis allows the variation in performance to be attributed to relevant empirically controlled factors such as ball arrival positions and vertical ball speeds (fixed effects), but also in part to the individual variation that came with each participant who repeated all the sessions (random effects). It thus enables dissociating the contributions of these separate components which are interdependent (for further details on mixed models, see: Tagliamonte and Baayen, 2012; Winter, 2013; Winter and Wieling, 2016). ...
Article
In this study, we aimed to characterize the affordance of interceptability for oneself using a manual lateral interception paradigm. We asked a two-fold research question: (1) What makes a virtual ball interceptable or not? (2) How reliably can individuals perceive this affordance for oneself? We hypothesized that a spatiotemporal boundary would determine the interceptability of a ball, and that individuals would be able to perceive this boundary and make accurate perceptual judgments regarding their own interceptability. To test our hypotheses, we administered a manual lateral interception task to 15 subjects. They were first trained on the task, which was followed by two experimental sessions: action and judging. In the former, participants were instructed to intercept as many virtual balls as possible using a hand-held slider to control an on-screen paddle. In the latter session, while making interceptions, participants were instructed to call “no” as soon as they perceived a ball to be uninterceptable. Using generalized linear modeling on the data, we found a handful of factors that best characterized the affordance of interceptability. As hypothesized, distance to be covered and ball flight time shaped the boundary between interceptable and uninterceptable balls. Surprisingly, the angle of approach of the ball also co-determined interceptability. Altogether, these variables characterized the actualized interceptability. Secondly, participants accurately perceived their own ability to intercept balls on over 75% of trials, thus supporting our hypothesis on perceived interceptability. Analyses revealed that participants considered this action boundary while making their perceptual judgments. Our results imply that the perceiving and actualizing of interceptability are characterized by a combination of the same set of variables.
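... In Korean, by contrast, pitch is the primary cue and VOT the secondary cue for distinguishing aspirated from lenis stops (Silva, 2006; Kim et al., 2002 (Bates et al., 2015). For a more detailed analysis, Tukey post hoc tests were conducted (Winter, 2013). 10) In the present study, only L3 stops were analyzed acoustically. ...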
... As random effects, we considered the intercepts for participants. We obtained p-values by calculating likelihood-ratio tests [70]. All parameters were estimated by maximum likelihood estimation [55]. ...
Article
Achieving temporal synchrony between sensory modalities is crucial for natural perception of object interaction in virtual reality. While subjective questionnaires are currently used to evaluate users’ VR experiences, leveraging behavior and psychophysiological responses can provide additional insights. We investigated motion and ocular behavior as discriminators between realistic and unrealistic object interactions. Participants grasped and placed a virtual object while experiencing sensory feedback that either matched their expectations or occurred too early. We also explored visual-only feedback vs. combined visual and haptic feedback. Due to technological limitations, a condition with delayed feedback was added post-hoc. Gaze-based metrics revealed discrimination between high and low feedback realism. Increased interaction uncertainty was associated with longer fixations on the avatar hand and temporal shifts in the gaze-action relationship. Our findings enable real-time evaluation of users’ perception of realism in interactions. They facilitate the optimization of interaction realism in virtual environments and beyond.
... Interactions between fixed factors were also explored. Likelihood ratio tests were used to assess the significance of all fixed factors (Winter, 2013). These tests produce a chi-squared statistic that compares the model wherein the fixed factor of interest is included with the model where the same fixed factor is removed. ...
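The comparison described in this excerpt can be sketched numerically. The following is a minimal Python illustration under stated assumptions, not the cited authors' procedure: both models are assumed to be nested and fitted by maximum likelihood, the log-likelihood values are hypothetical, and the function name is invented for the example. To stay within the standard library, only 1 or 2 degrees of freedom are supported, where the chi-squared tail probability has a closed form.

```python
import math


def likelihood_ratio_test(loglik_reduced, loglik_full, df=1):
    """Return (chi-squared statistic, p-value) for two nested models.

    loglik_reduced: log-likelihood of the model without the factor of interest.
    loglik_full:    log-likelihood of the model that includes it.
    df:             difference in the number of estimated parameters (1 or 2).
    """
    # The statistic is twice the gain in log-likelihood; clamp tiny negatives.
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    if df == 1:
        # Chi-squared(1) upper tail equals the two-sided normal tail at sqrt(stat).
        p_value = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        # Chi-squared(2) is an exponential distribution with mean 2.
        p_value = math.exp(-stat / 2.0)
    else:
        raise ValueError("this sketch only supports df = 1 or df = 2")
    return stat, p_value


# Hypothetical log-likelihoods for a model without and with one fixed factor.
stat, p_value = likelihood_ratio_test(loglik_reduced=-105.0, loglik_full=-102.0, df=1)
print(f"chi-squared(1) = {stat:.2f}, p = {p_value:.4f}")
```

A small p-value indicates that removing the factor costs more likelihood than chance alone would explain, which is the sense in which the factor is called significant.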
... The statistical analysis was performed in RStudio by the linear mixed models (LMM) approach using the lmer function from the lme4 package [43]. We devised LMMs to test the fixed effects of the 1. ...
Preprint
Touch-mediated affect has largely been studied using natural textures and brushing techniques, which present challenges in control and deployment across haptic applications. A common alternative is vibrotactile stimulation (VBT) since it is easily deployable and accessible across devices. However, sparse literature exists on the VBT-induced affect modulation and its cortical correlates. Addressing this gap, we developed a novel paradigm that examined the behavioral and electrophysiological correlates of affect induced by VBT. We used electroencephalography (EEG) to record the cortical responses to VBT across six different locations and measured the concurrent affect ratings. Mixed effects modelling, an unsupervised modelling technique, was used to decipher the relationships between stimulation conditions, affect ratings and cortical modulations. Our study revealed that altering the duration of the vibrotactile stimuli can elicit distinct affect responses. The location of the VBT stimuli did not play a part in the perceptual aspects of affect but was involved in the cortical encoding of affect. Furthermore, early cortical processing in the somatosensory cortex (SCx) primarily encoded Arousal and not Valence. Our study lays out phenomenological aspects of cortical VBT processing and lays the groundwork for future research.
... For multisyllabic phee analysis, syllables were nested within monkeys' random effect. Models with and without lesions were used to test the effect of ACC lesion and lesion effect is considered significant at an α of 0.05 [70,71]. Model assumptions were tested using the check_model() function available in the performance package. ...
Preprint
The social dynamics of vocal behavior has major implications for social development in humans. We asked whether early life damage to the anterior cingulate cortex (ACC), which is closely associated with socioemotional regulation more broadly, impacts the normal development of vocal expression. The common marmoset provides a unique opportunity to study the developmental trajectory of vocal behavior, and to track the consequences of early brain damage on aspects of social vocalizations. We created ACC lesions in neonatal marmosets and compared their pattern of vocalization to that of age-matched controls throughout the first 6 weeks of life. We found that while early life ACC lesions had little influence on the production of vocal calls, developmental changes to the quality of social contact calls and their associated syntactical and acoustic characteristics were compromised. These animals made fewer social contact calls, and when they did, they were short, loud and monotonic. We further determined that damage to ACC in infancy results in a permanent alteration in downstream brain areas known to be involved in social vocalizations, such as the amygdala and periaqueductal gray. Namely, in the adult, these structures exhibited diminished GABA-immunoreactivity relative to control animals, likely reflecting disruption of the normal inhibitory balance following ACC deafferentation. Together, these data indicate that the normal development of social vocal behavior depends on the ACC and its interaction with other areas in the vocal network during early life.
... The smallest values of Akaike's Information Criterion (AIC), Schwarz's Bayesian Criterion (BIC), and -2 Log Likelihood (-2LL) identify the model of best fit (62). Examination of residual plots confirmed that linear mixed model assumptions were met (62,63). ...
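How these criteria rank candidate models can be sketched with the standard formulas AIC = 2k - 2 log L and BIC = k ln(n) - 2 log L, where k is the number of estimated parameters and n the number of observations. The Python snippet below is a minimal illustration only: the model names, log-likelihoods, parameter counts, and sample size are hypothetical and are not drawn from any cited study.

```python
import math


def aic(loglik, k):
    """Akaike's Information Criterion: 2k - 2 log L."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Schwarz's Bayesian Criterion: k ln(n) - 2 log L."""
    return k * math.log(n) - 2 * loglik


# Hypothetical candidates: name -> (log-likelihood, number of estimated parameters).
candidates = {
    "intercept_only": (-120.0, 3),
    "with_predictor": (-112.0, 4),
}
n_obs = 80  # hypothetical number of observations

scores = {
    name: {"aic": aic(ll, k), "bic": bic(ll, k, n_obs), "minus2ll": -2 * ll}
    for name, (ll, k) in candidates.items()
}

# The model with the smallest value of a criterion is preferred under it.
best_by_aic = min(scores, key=lambda name: scores[name]["aic"])
best_by_bic = min(scores, key=lambda name: scores[name]["bic"])

for name, s in scores.items():
    print(f"{name}: AIC = {s['aic']:.1f}, BIC = {s['bic']:.1f}, -2LL = {s['minus2ll']:.1f}")
print("preferred by AIC:", best_by_aic, "| preferred by BIC:", best_by_bic)
```

Unlike -2LL, which can only decrease as parameters are added, AIC and BIC penalize extra parameters, so the three criteria need not agree on the same model.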
Article
Introduction High variability in response and retention rates for posttraumatic stress disorder (PTSD) treatment highlights the need to identify "personalized" or "precision" medicine factors that can inform optimal intervention selection before an individual commences treatment. In secondary analyses from a non-inferiority randomized controlled trial, behavioral and physiological emotion regulation were examined as non-specific predictors (that identify which individuals are more likely to respond to treatment, regardless of treatment type) and treatment moderators (that identify which treatment works best for whom) of PTSD outcome. Methods There were 85 US Veterans with clinically significant PTSD symptoms randomized to 6 weeks of either cognitive processing therapy (CPT; n = 44) or a breathing-based yoga practice (Sudarshan kriya yoga; SKY; n = 41). Baseline self-reported emotion regulation (Difficulties in Emotion Regulation Scale) and heart rate variability (HRV) were assessed prior to treatment, and self-reported PTSD symptoms were assessed at baseline, end-of-treatment, 1-month follow-up, and 1-year follow-up. Results Greater baseline deficit in self-reported emotional awareness (similar to alexithymia) predicted better overall PTSD improvement in both the short- and long-term, following either CPT or SKY. High self-reported levels of emotional response non-acceptance were associated with better PTSD treatment response with CPT than with SKY. However, all significant HRV indices were stronger moderators than all self-reported emotion regulation scales, both in the short- and long-term. Veterans with lower baseline HRV had better PTSD treatment response with SKY, whereas Veterans with higher or average-to-high baseline HRV had better PTSD treatment response with CPT. 
Conclusions To our knowledge, this is the first study to examine both self-reported emotion regulation and HRV, within the same study, as both non-specific predictors and moderators of PTSD treatment outcome. Veterans with poorer autonomic regulation prior to treatment had better PTSD outcome with a yoga-based intervention, whereas those with better autonomic regulation did better with a trauma-focused psychological therapy. Findings show potential for the use of HRV in clinical practice to personalize PTSD treatment. Clinical trial registration ClinicalTrials.gov identifier, NCT02366403
... Linear mixed models (Winter, 2013) were implemented in R (Bates et al., 2014) to investigate differences in the features across each cohort (controls and participants with HD) for each speech task separately. For all features, age, biological sex, and language were included as fixed effects, and the intercept for participants was included as a random effect. ...
Article
Purpose Changes in voice and speech are characteristic symptoms of Huntington's disease (HD). Objective methods for quantifying speech impairment that can be used across languages could facilitate assessment of disease progression and intervention strategies. The aim of this study was to analyze acoustic features to identify language-independent features that could be used to quantify speech dysfunction in English-, Spanish-, and Polish-speaking participants with HD. Method Ninety participants with HD and 83 control participants performed sustained vowel, syllable repetition, and reading passage tasks recorded with previously validated methods using mobile devices. Language-independent features that differed between HD and controls were identified. Principal component analysis (PCA) and unsupervised clustering were applied to the language-independent features of the HD data set to identify subgroups within the HD data. Results Forty-six language-independent acoustic features that were significantly different between control participants and participants with HD were identified. Following dimensionality reduction using PCA, four speech clusters were identified in the HD data set. Unified Huntington's Disease Rating Scale (UHDRS) total motor score, total functional capacity, and composite UHDRS were significantly different for pairwise comparisons of subgroups. The percentage of HD participants with higher dysarthria score and disease stage also increased across clusters. Conclusion The results support the application of acoustic features to objectively quantify speech impairment and disease severity in HD in multilanguage studies. Supplemental Material https://doi.org/10.23641/asha.25447171
Article
The switch between automated and manual driving modes is currently an inevitable topic for automated vehicles. Understanding how long it takes drivers to stabilize physically and cognitively after the driving mode switch is important to maintain driving safety. Given that little attention has been paid to drivers’ stabilization time after the driving mode switch, this study focuses on drivers’ cognitive load and visual attention and aims to investigate drivers’ stabilization time after the driving mode switch. Twenty-eight participants drove in a high-fidelity driving simulator where they experienced mode switching from manual to automated and from automated to manual. Reaction time to the detection response task and on-road fixation durations were measured throughout the experiment to assess drivers’ cognitive load and visual attention. Results revealed that it took drivers 10 to 15 s to stabilize their cognitive load after taking over the manual control of the simulated vehicle, and 5 to 10 s to stabilize after relinquishing manual driving to the automated system. These findings indicate that drivers’ cognitive load and visual attention will fluctuate after driving mode switches and a buffer time should be provided to ensure driving safety. By exploring drivers’ cognitive load and visual attention after driving mode switches, this study offers valuable insights into the design of automated driving systems and helps to improve road safety. In developing automated driving systems, efforts should be made to identify an appropriate time window for drivers to perform stable driving performance and improve their in-vehicle experience.
Preprint
GM1 gangliosidosis is an inherited, progressive, and fatal neurodegenerative lysosomal storage disorder with no approved treatment. We calculated predicted brain ages and Brain Structures Age Gap Estimation (BSAGE) for 81 MRI scans from 41 Type II GM1 gangliosidosis patients and 897 MRI scans from 556 neurotypical controls (NC) utilizing BrainStructuresAges, a machine learning MRI analysis pipeline. NC showed whole brain aging at a rate of 0.83 per chronological year compared with 1.57 in juvenile GM1 patients and 12.25 in late-infantile GM1 patients, accurately reflecting the clinical trajectories of the two disease subtypes. Accelerated and distinct brain aging was also observed throughout midbrain structures including the thalamus and caudate nucleus, hindbrain structures including the cerebellum and brainstem, and the ventricles in juvenile and late-infantile GM1 patients compared to NC. Predicted brain age and BSAGE both correlated with cross-sectional and longitudinal clinical assessments, indicating their importance as surrogate neuroimaging outcome measures for clinical trials in GM1 gangliosidosis.
Conference Paper
Users often begin exploratory visual analysis (EVA) without clear analysis goals but iteratively refine them as they learn more about their data. As an essential step in data science, researchers want to aid EVA by developing responsive and personalized visualization tools. For this, accurate models of users’ exploration behavior are becoming increasingly vital. However, many computational models assume that the human exploration behavior is static, which goes against the dynamic nature of EVA. In this benchmark study, we investigate how users dynamically shift their data focus in EVA and seek to find the best online learning methods for modeling users’ data focus shifts. Through empirical analyses, we find reinforcement learning algorithms are better in this regard than existing approaches from visualization research. Furthermore, we discuss our findings and their impact on the future of user modeling for visualization system design.
Presentation
Graduate students in translation and interpreting face the challenging task of drafting theses that have practical implications without compromising scientific rigor. Against this backdrop, this talk is primarily based on the latest Basic Requirements for Master's Theses of Translation Professional Degree (Trial Version). Using practical research cases, the talk addresses common data collection methods (e.g., content analysis, retrospective interviews, surveys, and corpus analysis) and statistical methods (e.g., t-tests, chi-square tests, correlation analysis, reliability and validity testing, and regression analysis) involved in research on translation textbooks, translation market surveys, and the relationship between translators' cognitive traits and translation/interpreting performance. The talk aims to provide valuable insights for graduate students in Translation and Interpreting as they work on their theses.
Article
Pupil size is a non‐invasive index for autonomic arousal mediated by the locus coeruleus–norepinephrine (LC‐NE) system. While pupil size and its derivative (velocity) are increasingly used as indicators of arousal, limited research has investigated the relationships between pupil size and other well‐known autonomic responses. Here, we simultaneously recorded pupillometry, heart rate, skin conductance, pulse wave amplitude, and respiration signals during an emotional face–word Stroop task, in which task‐evoked (phasic) pupil dilation correlates with LC‐NE responsivity. We hypothesized that emotional conflict and valence would affect pupil and other autonomic responses, and trial‐by‐trial correlations between pupil and other autonomic responses would be observed during both tonic and phasic epochs. Larger pupil dilations, higher pupil size derivative, and lower heart rates were observed in the incongruent condition compared to the congruent condition. Additionally, following incongruent trials, the congruency effect was reduced, and arousal levels indexed by previous‐trial pupil dilation were correlated with subsequent reaction times. Furthermore, linear mixed models revealed that larger pupil dilations correlated with higher heart rates, higher skin conductance responses, higher respiration amplitudes, and lower pulse wave amplitudes on a trial‐by‐trial basis. Similar effects were seen between positive and negative valence conditions. Moreover, tonic pupil size before stimulus presentation significantly correlated with all other tonic autonomic responses, whereas tonic pupil size derivative correlated with heart rates and skin conductance responses. These results demonstrate a trial‐by‐trial relationship between pupil dynamics and other autonomic responses, highlighting pupil size as an effective real‐time index for autonomic arousal during emotional conflict and valence processing.
Article
What type of conceptual information about an object do we get at a brief glance? In two experiments, we investigated the nature of conceptual tokening—the moment at which conceptual information about an object is accessed. Using a masked picture‐word congruency task with dichoptic presentations at “brief” (50−60 ms) and “long” (190−200 ms) durations, participants judged the relation between a picture (e.g., a banana) and a word representing one of four property types about the object: superordinate (fruit), basic level (banana), a high‐salient (yellow), or low‐salient feature (peel). In Experiment 1, stimuli were presented in black‐and‐white; in Experiment 2, they were presented in red and blue, with participants wearing red‐blue anaglyph glasses. This manipulation allowed for the independent projection of stimuli to the left‐ and right‐hemisphere visual areas, aiming to probe the early effects of these projections in conceptual tokening. Results showed that superordinate and basic‐level properties elicited faster and more accurate responses than high‐ and low‐salient features at both presentation times. This advantage persisted even when the objects were divided into categories (e.g., animals, vegetables, vehicles, tools), and when objects contained high‐salient visual features. However, contrasts between categories show that animals, fruits, and vegetables tend to be categorized at the superordinate level, while vehicles tend to be categorized at the basic level. Also, for a restricted class of objects, high‐salient features representing diagnostic color information (yellow for the picture of a banana) facilitated congruency judgments to the same extent as that of superordinate and basic‐level labels. We suggest that early access to object concepts yields superordinate and basic‐level information, with features only yielding effects at a later stage of processing, unless they represent diagnostic color information. We discuss these results advancing a unified theory of conceptual representation, integrating key postulates of atomism and feature‐based theories.
Article
Rendering realistic tactile sensations of virtual objects remains a challenge in VR. While haptic interfaces have advanced, particularly with phased arrays, their ability to create realistic object properties like state and temperature remains unclear. This study investigates the potential of Ultrasound Mid-air Haptics (UMH) for enhancing the perceived congruency of virtual objects. In a user study with 30 participants, we assessed how UMH impacts the perceived material state and temperature of virtual objects. We also analyzed EEG data to understand how participants integrate UMH information physiologically. Our results reveal that UMH significantly enhances the perceived congruency of virtual objects, particularly for solid objects, reducing the feeling of mismatch between visual and tactile feedback. Additionally, UMH consistently increases the perceived temperature of virtual objects. These findings offer valuable insights for haptic designers, demonstrating UMH's potential for creating more immersive tactile experiences in VR by addressing key limitations in current haptic technologies.
Article
Summary: This study addresses an innovative structure of Brazilian Portuguese (BP), relative clauses with stranded (orphan) prepositions, comparing its acceptability and production by BP speakers with little or no knowledge of English and by Brazilian undergraduate students of Letters (English language track) at a public university in Rio de Janeiro, by means of two tasks: an acceptability judgment with a Likert scale and an elicited production task. It is argued that this construction is characteristic of English, which could lead bilinguals to higher levels of acceptability and production of this structure in BP. However, prepositions are divided into two groups, those more and those less prone to appear in stranded position, following the studies of Marcelino (2007) and Salles (2003). The results indicate greater acceptability of this construction by bilinguals, whose production is quite high but practically restricted to the prepositions licensed in this position in BP. The dual input seems to make bilinguals more tolerant of this structure, but only in the acceptability judgment task. The results also allow us to state that relative clauses with stranded prepositions, subject to these restrictions, are structures already well accepted and naturally produced by BP speakers. By investigating bilingual speakers, the study suggests that the presence of similar structures in the additional language may favor the acceptability of innovative linguistic variants in the mother tongue. Keywords: relative clauses; preposition stranding; bilingualism; Likert scale; elicited production.
Abstract: This study investigates an innovative structure in Brazilian Portuguese (BP): relative clauses with preposition stranding. BP speakers and Brazilian undergraduate Portuguese-English bilingual students participated in an acceptability judgement task with a 5-point Likert scale and an elicited production task. Since relative clauses with preposition stranding are a very frequent construction in English, bilinguals were expected to show higher levels of acceptability and production of this kind of structure in BP. However, this phenomenon seems to be lexically restricted in BP: there are prepositions which may be stranded, but there are others which may not, according to Marcelino (2007) and Salles (2003). Our results show that bilinguals do indeed accept preposition stranding at higher levels than BP monolinguals, but their production is similar to that of BP speakers, conditioned by type of preposition; that is, they are more tolerant of relative clauses with preposition stranding in BP only in acceptability tasks. These results also show that these constructions are well-accepted and naturally produced by native BP speakers. The comparison between BP speakers and undergraduate Portuguese-English bilingual students suggests that the presence of similar structures in the additional language may increase acceptability rates of innovative variants in the mother tongue. Keywords: relative clauses; preposition stranding; bilingualism; Likert Scale; elicited production.
Article
Humans can estimate the number of visually presented items without counting. In most studies on numerosity perception, items are uniformly distributed across displays, with identical distributions in central and eccentric parts. However, the neural and perceptual representation of the human visual field differs between the fovea and the periphery. For example, in peripheral vision, there are strong asymmetries with regard to perceptual interferences between visual items. In particular, items arranged radially usually interfere more strongly with each other than items arranged tangentially (the radial–tangential anisotropy). This has been shown for crowding (the deleterious effect of clutter on target identification) and redundancy masking (the reduction of the number of perceived items in repeating patterns). In the present study, we tested how the radial–tangential anisotropy of peripheral vision impacts numerosity perception. In four experiments, we presented displays with varying numbers of discs that were predominantly arranged radially or tangentially, forming strong and weak interference conditions, respectively. Participants were asked to report the number of discs. We found that radial displays were reported as less numerous than tangential displays for all radial and tangential manipulations: weak (Experiment 1), strong (Experiment 2), and when using displays with mixed contrast polarity discs (Experiments 3 and 4). We propose that numerosity perception exhibits a significant radial–tangential anisotropy, resulting from local spatial interactions between items.
Preprint
Shared flow can be conceptualised as a collective state of flow that emerges within a group. It has been recently suggested that shared flow involves a spectrum of self-other overlap, joint attention, and social interaction, further facilitated by context and experience. To empirically test this, four gamelan groups - a musical ensemble originating from Indonesia - took part in a study (N=36), whereby aspects of the theorised spectrum were operationalised via (i) a self-report measure of self-other overlap, (ii) a measure of consensus of time distortion, and (iii) physiological synchrony. Using linear mixed-effects models, we tested whether associations between shared flow and these measures are modulated by different performance conditions and musical training. Lastly, we tested whether shared flow could be best predicted by all measures combined. While the relationship between self-other overlap and shared flow was not reliant on condition and expertise, it was for synchrony of skin conductance and consensus of time distortion. Furthermore, we found that models predicting shared flow encompassed combinations of all the above measures. The findings reveal the potential of physiological measures and a novel measure of consensus of time distortion as a supplement to self-reports in understanding the underlying social dynamics of shared flow.
Preprint
Objective: Altered mechanical loading is a known risk factor for osteoarthritis. Destabilization of the medial meniscus (DMM) is a preclinical gold standard model for post-traumatic osteoarthritis and is thought to induce instability and locally increased loading. However, the joint- and tissue-level mechanical environment underlying cartilage degeneration remains poorly documented. Design: Using a custom multiscale modeling approach, we assessed joint and tissue biomechanics in rats undergoing sham surgery and DMM. High-fidelity experimental gait data were collected in a setup combining biplanar fluoroscopy and a ground reaction force plate. Knee poses and joint-level loading were estimated through musculoskeletal modeling, using bony landmarks, semi-automatically tracked via deep learning on fluoroscopic images, and ground reaction forces. A musculoskeletal model of the rat hindlimb was adapted to represent knee flexion-extension, valgus-varus, and internal-external rotation. The tissue-level cartilage mechanical environment was then spatially estimated, using the musculoskeletal modeling parameters as inputs into a dedicated finite element (FE) model of the rat knee, comprising cartilage and meniscal tissues. Experimental gait data and modeling workflows, including musculoskeletal models and FE meshes, are openly shared through a data repository. Results: In rats with DMM, the frontal plane knee pose was altered, yet there was no indication of joint-level overloading. Tissue-level mechanical cues typically linked with cartilage degeneration were not increased in the medial tibial cartilage, despite evidence of tissue structural changes. Conclusion: DMM did not increase joint and tissue mechanical responses in the knee medial compartment, suggesting that mechanical loading alone does not explain the observed osteoarthritis-like structural changes.
Preprint
By exposing genes associated with disease, genomic studies provide hundreds of starting points that should lead to druggable processes. However, our ability to systematically translate these genomic findings into biological pathways remains limited. Here, we combine rapid loss-of-function mutagenesis of Alzheimer’s risk genes and behavioural pharmacology in zebrafish to predict disrupted processes and candidate therapeutics. FramebyFrame, our expanded package for the analysis of larval behaviours, revealed that decreased night-time sleep was common to F0 knockouts of all four late-onset Alzheimer’s risk genes tested. We developed an online tool, ZOLTAR, which compares any behavioural fingerprint to a library of fingerprints from larvae treated with 3,674 compounds. ZOLTAR successfully predicted that sorl1 mutants have disrupted serotonin signalling and identified betamethasone as a drug which normalises the excessive day-time sleep of presenilin-2 knockout larvae with minimal side effects. Predictive behavioural pharmacology offers a general framework to rapidly link disease-associated genes to druggable pathways.
Article
The notion of sound symbolism receives increasing interest in psycholinguistics. Recent research - including empirical effects of affective phonological iconicity on language processing (Adelman et al., 2018; Conrad et al., 2022) - suggested language codes affective meaning at a basic phonological level using specific phonemes as sublexical markers of emotion. Here, in a series of 8 rating-experiments, we investigate the sensitivity of language users to assumed affectively-iconic systematic distribution patterns of phonemes across the German vocabulary: After computing sublexical-affective-values (SAV) concerning valence and arousal for the entire German phoneme inventory according to occurrences of syllabic onsets, nuclei and codas in a large-scale affective normative lexical database, we constructed pseudoword material differing in SAV to test for subjective affective impressions. Results support affective iconicity as affective ratings mirrored sound-to-meaning correspondences in the lexical database. Varying SAV of otherwise semantically meaningless pseudowords altered affective impressions: Higher arousal was consistently assigned to pseudowords made of syllabic constituents more often used in high-arousal words - contrasted by less straightforward effects of valence SAV. Further disentangling specific differential effects of the two highly-related affective dimensions valence and arousal, our data clearly suggest arousal, rather than valence, as the relevant dimension driving affective iconicity effects.
Article
Synopsis: Center of mass (COM) mechanics, often used as an energetic proxy during locomotion, has primarily focused on level movement and hardly explores climbing scenarios. This study examines three-dimensional COM movements across five phylogenetically distinct species to test theoretical expectations of climbing costs, explore how interspecific variation (i.e., different limb numbers, adhesion mechanisms, body masses [0.008–84 kg], and limb postures) affects COM mechanics, and determine the impact of out-of-plane COM movements on climbing costs. A parallel experiment with rosy-faced lovebirds explores how inclination angle affects COM mechanical energy and how these empirical data align with theoretical expectations. Results indicate that, irrespective of anatomical differences, total mechanical costs of climbing are primarily driven by potential energy, outweighing contributions from kinetic energy. Despite species exhibiting significant out-of-plane kinematics, these movements have minimal impact on overall locomotor costs. Inclination angle changes have minimal effects, as potential energy accumulation dominates quickly as steepness increases, suggesting climbing occurs even on acutely angled substrates from a COM perspective. The study challenges prior assumptions about factors influencing climbing costs, such as body mass, speed, or posture, indicating a lack of evident anatomical or behavioral adaptations for climbing efficiency across species. The research sheds light on the universal challenges posed by the mechanical demands of scaling vertical substrates, offering valuable insights for functional morphologists studying climbing behaviors in extant and fossilized species.
Article
Linear mixed-effects models (LMEMs) have become increasingly prominent in psycholinguistics and related areas. However, many researchers do not seem to appreciate how random effects structures affect the generalizability of an analysis. Here, we argue that researchers using LMEMs for confirmatory hypothesis testing should minimally adhere to the standards that have been in place for many decades. Through theoretical arguments and Monte Carlo simulation, we show that LMEMs generalize best when they include the maximal random effects structure justified by the design. The generalization performance of LMEMs including data-driven random effects structures strongly depends upon modeling criteria and sample size, yielding reasonable results on moderately-sized samples when conservative criteria are used, but with little or no power advantage over maximal models. Finally, random-intercepts-only LMEMs used on within-subjects and/or within-items data from populations where subjects and/or items vary in their sensitivity to experimental manipulations always generalize worse than separate F1 and F2 tests, and in many cases, even worse than F1 alone. Maximal LMEMs should be the ‘gold standard’ for confirmatory hypothesis testing in psycholinguistics and beyond.
Article
Influence.ME provides tools for detecting influential data in mixed effects models. The application of these models has become common practice, but the development of diagnostic tools has lagged behind. influence.ME calculates standardized measures of influential data for the point estimates of generalized mixed effects models, such as DFBETAS, Cook's distance, as well as percentile change and a test for changing levels of significance. influence.ME calculates these measures of influence while accounting for the nesting structure of the data. The package and measures of influential data are introduced, a practical example is given, and strategies for dealing with influential data are suggested.
Article
A recent research report (Brandt, 2011) suggests that sexist ideologies predict increases in gender inequality at the country level. Using advanced multilevel modeling methods, the author combined aggregated individual-level data on sexism (N > 80,000) from the World Values Survey (World Values Survey Association, 2009) with country-level data on gender inequality (N < 60) from United Nations Human Development Reports. We were curious and plotted the sexism effect based on the data printed in the article. Figure 1 reveals that the effect was driven by only a few influential cases. When we replicated Brandt’s (2011) analysis in Mplus 6.1 (Muthén & Muthén, 2010), the effect dropped to nonsignificance after we excluded Switzerland, and was reduced to virtually zero after we excluded three further outliers. Our reanalysis casts serious doubts on Brandt’s (2011) conclusion. We are not suggesting that it would have been better for these influential cases to have been deleted. However, they should have been discussed so that readers could assess the robustness of the effect. Furthermore, unusual cases often tell their own important stories, which may critically inform future research (McClelland, 2002).
Article
How should ecologists and evolutionary biologists analyze nonnormal data that involve random effects? Nonnormal data such as counts or proportions often defy classical statistical procedures. Generalized linear mixed models (GLMMs) provide a more flexible approach for analyzing nonnormal data when random effects are present. The explosion of research on GLMMs in the last decade has generated considerable uncertainty for practitioners in ecology and evolution. Despite the availability of accurate techniques for estimating GLMM parameters in simple cases, complex GLMMs are challenging to fit and statistical inference such as hypothesis testing remains difficult. We review the use (and misuse) of GLMMs in ecology and evolution, discuss estimation and inference and summarize 'best-practice' data analysis procedures for scientists facing this challenge.
Article
Pseudoreplication occurs when observations are not statistically independent, but treated as if they are. This can occur when there are multiple observations on the same subjects, when samples are nested or hierarchically organised, or when measurements are correlated in time or space. Analysis of such data without taking these dependencies into account can lead to meaningless results, and examples can easily be found in the neuroscience literature. A single issue of Nature Neuroscience provided a number of examples and is used as a case study to highlight how pseudoreplication arises in neuroscientific studies, why the analyses in these papers are incorrect, and appropriate analytical methods are provided. 12% of papers had pseudoreplication and a further 36% were suspected of having pseudoreplication, but it was not possible to determine for certain because insufficient information was provided. Pseudoreplication can undermine the conclusions of a statistical analysis, and it would be easier to detect if the sample size, degrees of freedom, the test statistic, and precise p-values are reported. This information should be a requirement for all publications.
Article
Hurlbert (1984) laid out a rationale for why experimental designs need to consider statistical problems related to pseudoreplication of data. This paper has been a cornerstone for more proper experimental design in many fields. Nonetheless, Schank and Koehnle (2009) argue that pseudoreplication is no longer a relevant issue. We disagree with this conclusion. We list a number of distortions of the original Hurlbert paper that are found in Schank and Koehnle. In addition, we note that some of the original suggestions in Hurlbert are no longer valid because statistical techniques have inevitably evolved to provide stronger tests than were available 25 years ago.
Article
Mixed-effect models are frequently used to control for the nonindependence of data points, for example, when repeated measures from the same individuals are available. The aim of these models is often to estimate fixed effects and to test their significance. This is usually done by including random intercepts, that is, intercepts that are allowed to vary between individuals. The widespread belief is that this controls for all types of pseudoreplication within individuals. Here we show that this is not the case, if the aim is to estimate effects that vary within individuals and individuals differ in their response to these effects. In these cases, random intercept models give overconfident estimates leading to conclusions that are not supported by the data. By allowing individuals to differ in the slopes of their responses, it is possible to account for the nonindependence of data points that pseudoreplicate slope information. Such random slope models give appropriate standard errors and are easily implemented in standard statistical software. Because random slope models are not always used where they are essential, we suspect that many published findings have too narrow confidence intervals and a substantially inflated type I error rate. Besides reducing type I errors, random slope models have the potential to reduce residual variance by accounting for between-individual variation in slopes, which makes it easier to detect treatment effects that are applied between individuals, hence reducing type II errors as well.
Article
The proper analysis of experiments using language materials has been a source of controversy and debate among researchers. We summarize the main issues and discuss the solutions that have been presented. Even though the major issues have been dealt with extensively in the literature, there still exists quite a bit of confusion about how to analyze the data from such experiments. We discuss a number of the most frequently voiced objections. In particular, we discuss the issue of what happens if in a counterbalanced design only some of the items show the treatment effect. Finally, a possible solution is discussed for the case where only partial matching of items between conditions is possible.
Article
The use of multilevel modeling is presented as an alternative to separate item and subject ANOVAs (F1 × F2) in psycholinguistic research. Multilevel modeling is commonly utilized to model variability arising from the nesting of lower level observations within higher level units (e.g., students within schools, repeated measures within individuals). However, multilevel models can also be used when two random factors are crossed at the same level, rather than nested. The current work illustrates the use of the multilevel model for crossed random effects within the context of a psycholinguistic experimental study, in which both subjects and items are modeled as random effects within the same analysis, thus avoiding some of the problems plaguing current approaches.
Article
Linear mixed-effects models (LMEMs) have become increasingly prominent in psycholinguistics and related areas. However, many researchers do not seem to appreciate how random effects structures affect the generalizability of an analysis. Here, we argue that researchers using LMEMs for confirmatory hypothesis testing should minimally adhere to the standards that have been in place for many decades. Through theoretical arguments and Monte Carlo simulation, we show that LMEMs generalize best when they include the maximal random effects structure justified by the design. The generalization performance of LMEMs including data-driven random effects structures strongly depends upon modeling criteria and sample size, yielding reasonable results on moderately-sized samples when conservative criteria are used, but with little or no power advantage over maximal models. Finally, random-intercepts-only LMEMs used on within-subjects and/or within-items data from populations where subjects and/or items vary in their sensitivity to experimental manipulations always generalize worse than separate F1 and F2 tests, and in many cases, even worse than F1 alone. Maximal LMEMs should be the 'gold standard' for confirmatory hypothesis testing in psycholinguistics and beyond.
Article
This methodological paper attempts to bring the problem of pseudoreplication to the attention of the phonetic community. Pseudoreplication refers to the treatment of dependent observations as independent data points, which causes an overabundance of erroneously significant results. The relevance of this problem is demonstrated by analyses of phonetic data, and it is shown that the problem occurs frequently in the phonetic literature. Finally, simple solutions to combat pseudoreplication in the design and analysis of phonetic experiments are proposed.
Code
Statistical analysis is a useful skill for linguists and psycholinguists, allowing them to understand the quantitative structure of their data. This textbook provides a straightforward introduction to the statistical analysis of language. Designed for linguists with a non-mathematical background, it clearly introduces the basic principles and methods of statistical analysis, using ’R’, the leading computational statistics programme. The reader is guided step-by-step through a range of real data sets, allowing them to analyse acoustic data, construct grammatical trees for a variety of languages, quantify register variation in corpus linguistics, and measure experimental data using state-of-the-art models. The visualization of data plays a key role, both in the initial stages of data exploration and later on when the reader is encouraged to criticize various models. Containing over 40 exercises with model answers, this book will be welcomed by all linguists wishing to learn more about working with and presenting quantitative data.
Article
Although Clark's (1973) critique of statistical procedures in language and memory studies (the "language-as-fixed-effect fallacy") has had a profound effect on the way such analyses have been carried out in the past 20 years, it seems that the exact nature of the problem and the proposed solution have not been understood very well. Many investigators seem to assume that generalization to both the subject population and the language as a whole is automatically ensured if separate subject (F1) and item (F2) analyses are performed and that the null hypothesis may safely be rejected if these F values are both significant. Such a procedure is, however, unfounded and not in accordance with the recommendations of Clark (1973). More importantly and contrary to current practice, in many cases there is no need to perform separate subject and item analyses since the traditional F1 is the correct test statistic. In particular this is the case when item variability is experimentally controlled by matching or by counterbalancing.
Article
Current investigators of words, sentences, and other language materials almost never provide statistical evidence that their findings generalize beyond the specific sample of language materials they have chosen. Nevertheless, these same investigators do not hesitate to conclude that their findings are true for language in general. In so doing, it is argued, they are committing the language-as-fixed-effect fallacy, which can lead to serious error. The problem is illustrated for one well-known series of studies in semantic memory. With the appropriate statistics these studies are shown to provide no reliable evidence for most of the main conclusions drawn from them. A review of other experiments in semantic memory shows that many of them are likewise suspect. It is demonstrated how this fallacy can be avoided by doing the right statistics, selecting the appropriate design, and sampling by systematic procedures, or, alternatively, by proceeding according to the so-called method of single cases.
Article
In this exploratory sociophonetic study, we investigated the properties of formal and informal speech registers in Korean. We found that in formal speech, Korean male and female speakers lowered their average fundamental frequency and pitch range. The acoustic signal furthermore exhibited overall less variability, as evidenced by decreased fundamental frequency and intensity standard deviations, and decreased period and amplitude perturbations. Differences in speech registers affected the harmonics-to-noise ratio and the difference between the first and second harmonic as well, suggesting breathiness-related changes, and the speech was slower and included more non-lexical fillers such as "ah" and "oh". Unexpectedly, formality also affected breathing patterns, leading to a noticeable increase in the amount of loud "hissing" breath intakes in formal speech. We thus show that a variety of different means of vocal expression play a role in signaling formality in Korean. Further, we outline the implications of this study for phonetic theory and discuss our results with respect to the Frequency Code and research on clear speech.
Book
This is an R package (a piece of software) for fitting mixed-effects models and performing inference on them. The package is free software (hence open source); the package and much of its documentation are freely available from CRAN at https://cran.r-project.org/package=lme4
Article
Clark's arguments for treating language materials as random rather than fixed effects are examined, and the problems with random effects designs and approximate statistical tests (quasi-F ratios) are reviewed. In view of the difficulties with Clark's recommended procedures and the present lack of knowledge regarding approximate tests, it is suggested that researchers use fixed factors, which are better understood statistically, and seek nonstatistical generality by means of replication.
Article
This paper provides an introduction to mixed-effects models for the analysis of repeated measurement data with subjects and items as crossed random effects. A worked-out example of how to use recent software for mixed-effects modeling is provided. Simulation studies illustrate the advantages offered by mixed-effects analyses compared to traditional analyses based on quasi-F tests, by-subjects analyses, combined by-subjects and by-items analyses, and random regression. Applications and possibilities across a range of domains of inquiry are discussed.
Article
The properties of four different tests of the treatment effect in experiments using linguistic materials are examined using Monte Carlo procedures for estimating Type I error rates. It is shown that: (a) in extreme cases, the Type I error rates for F1 and F2 can exceed the desired rate by a factor of at least 10; (b) minF′ tends to be a very close estimate of F′; (c) both minF′ and F′ are very conservative tests when between item variance or subject-by-treatment variance is low; (d) requiring both F1 and F2 to be significant before H0 is rejected does not prevent the nominal Type I error rate from being exceeded; (e) most of these problems can be minimized by using multistage decision rules which select the most appropriate test on the basis of preliminary tests of item variance and subject-by-treatment variance.
Article
Pseudoreplication is defined as the use of inferential statistics to test for treatment effects with data from experiments where either treatments are not replicated (though samples may be) or replicates are not statistically independent. In ANOVA terminology, it is the testing for treatment effects with an error term inappropriate to the hypothesis being considered. Scrutiny of 176 experimental studies published between 1960 and the present revealed that pseudoreplication occurred in 27% of them, or 48% of all such studies that applied inferential statistics. The incidence of pseudoreplication is especially high in studies of marine benthos and small mammals. The critical features of controlled experimentation are reviewed. Nondemonic intrusion is defined as the impingement of chance events on an experiment in progress. As a safeguard against both it and preexisting gradients, interspersion of treatments is argued to be an obligatory feature of good design. Especially in small experiments, adequate interspersion can sometimes be assured only by dispensing with strict randomization procedures. Comprehension of this conflict between interspersion and randomization is aided by distinguishing pre-layout (or conventional) and layout-specific alpha (probability of type I error). Suggestions are offered to statisticians and editors of ecological journals as to how ecologists' understanding of experimental design and statistics might be improved.
Book
Linear Mixed-Effects: Theory and Computational Methods for LME Models; Structure of Grouped Data; Fitting LME Models; Extending the Basic LME Model. Nonlinear Mixed-Effects: Theory and Computational Methods for NLME Models; Fitting NLME Models.
Chapter
Data Analysis Using Regression and Multilevel/Hierarchical Models, first published in 2007, is a comprehensive manual for the applied researcher who wants to perform data analysis using linear and nonlinear regression and multilevel models. The book introduces a wide variety of models, whilst at the same time instructing the reader in how to fit these models using available software packages. The book illustrates the concepts by working through scores of real data examples that have arisen from the authors' own applied research, with programming codes provided for each one. Topics covered include causal inference, including regression, poststratification, matching, regression discontinuity, and instrumental variables, as well as multilevel logistic regression and missing-data imputation. Practical tips regarding building, fitting, and understanding are provided throughout.
Bates, D.M., Maechler, M., & Bolker, B. (2012). lme4: Linear mixed-effects models using S4 classes. R package version 0.999999-0.
Baayen, R.H. (2008). Analyzing Linguistic Data: A Practical Introduction to Statistics Using R. Cambridge: Cambridge University Press.