Russell D. Gray’s research while affiliated with University of Auckland and other places


Publications (27)


Author Correction: The evolutionary dynamics of how languages signal who does what to whom
  • Article

July 2024 · 80 Reads · Damián E. Blasi · [...]

World science and Indigenous knowledge

July 2024 · 193 Reads · 3 Citations

Science

Indigenous knowledge—although it contains empirical and cultural knowledge of great value—should be taught as a distinct subject or as aspects of other subjects, not “alongside” science in science classes. Placing science and Indigenous knowledge alongside each other does disservice to the coherence and understanding of both.


Locations of languages used in the phylogenetic analysis.
50% Majority-rule consensus tree of the phylogenetic analysis of Philippine languages. Formosan outgroup not shown on figure.
of relationships between the major strongly supported Philippine subgroups from the posterior sample of trees, in descending order of support. The number above each tree indicates posterior support. The 20 tree topologies shown together represent a 73% credible set.
Cognate sets supporting either an early branching position for Gorontalo-Mongondow (top, left_1 to egg_17) or a Central Philippine (bottom, wing_2 to blood_16) position. 50% highest posterior density intervals of the likelihood of cognates in analyses with one of the two topologies constrained. Likelihoods are relative to the median in the early branching analysis. Only cognate sets with non-overlapping HPD intervals between the two analyses are shown.
Language switching to Austronesian leaves no noticeable impression on the core vocabulary lexicon of the languages of ‘Negrito’ groups. (A) Phylogenetic tree (Maximum Clade Credibility tree) with branches on which language switching is inferred to have occurred highlighted in red. (B) Branch rates taken from the MCC tree (dot plot with random jitter). (C) Violin plot of the posterior distribution of the weighted mean rate of lexical evolution estimated on branches within Austronesian populations, within ‘Negrito’ populations and at the transition.


Bayesian phylogenetic analysis of Philippine languages supports a rapid migration of Malayo-Polynesian languages

June 2024 · 211 Reads · 2 Citations

The Philippines are central to understanding the expansion of the Austronesian language family from its homeland in Taiwan. It remains unknown to what extent the distribution of Malayo-Polynesian languages has been shaped by back migrations and language leveling events following the initial Out-of-Taiwan expansion. Other aspects of language history, including the effect of language switching from non-Austronesian languages, also remain poorly understood. Here we apply Bayesian phylogenetic methods to a core-vocabulary dataset of Philippine languages. Our analysis strongly supports a sister group relationship between the Sangiric and Minahasan groups of northern Sulawesi on one hand, and the rest of the Philippine languages on the other, which is incompatible with a simple North-to-South dispersal from Taiwan. We find a pervasive geographical signal in our results, suggesting a dominant role for cultural diffusion in the evolution of Philippine languages. However, we do find some support for a later migration of Gorontalo-Mongondow languages to northern Sulawesi from the Philippines. Subsequent diffusion processes between languages in Sulawesi appear to have led to conflicting data and a highly unstable phylogenetic position for Gorontalo-Mongondow. In the Philippines, language switching to Austronesian in ‘Negrito’ groups appears to have occurred at different time-points throughout the Philippines, and based on our analysis, there is no discernible effect of language switching on the basic vocabulary.


Twelve competing causal models representing the potential relationships between case and two word order features, verb-final word order and flexible word order. Model i represents the diachronic scenarios inferred from the processing hypothesis, the noisy-channel hypothesis, and the word order universal. Model j stems from a theory of licensing and structural case and the results of empirical corpus studies. Other models represent the inverted directions of these models (k and l) and the possible combinations of causal paths between three or two features (a-h) that additionally test for the potential indirect effect of one of the word order features on case.
The averaged standardized regression coefficients (ranging from 0.25 to 1.58) of the best-fitting models with their Confidence Intervals. The coefficient values range from 0.1 to 0.41 for Flexible → Case, from 1.34 to 1.83 for Case → Flexible, from 0.58 to 1.16 for Verb-final → Case, and from 0.24 to 0.59 for Case → Verb-final, where values indicate how strongly the features are correlated. All identified causal links are positive and robust.
Maps showing the distribution of case and word order patterns in the languages of the world (represented by colored dots); yellow: case without the word order type indicated in the map (11% of languages without verb-final and 17% of languages without flexible word order); blue: word order type indicated in the map without case (15% of verb-final languages and 22% of flexible word order languages); green: presence of case and word order type indicated in the map (22% of languages have case and verb-final word order and 16% of languages have case and flexible word order); gray: absence of case and word order type indicated in the map (in 52% of languages verb-final and case are absent, and in 46% flexible word order and case are absent). The top map depicts the combination of case with verb-final word order, while the bottom map illustrates case with flexible word order. The Indian subcontinent and the Caucasus region contain languages that almost exclusively combine case and verb-final word order. Many languages that combine case and flexible word order are located in South America, Eurasia, the Indian subcontinent, and Australia.
Phylogeny (global tree on the left) and the distribution of case and word order patterns (colored blocks on the right); yellow: case without the word order type indicated in the column header; blue: word order type indicated in the column header without case; green: presence of case and word order type indicated in the column header; gray: absence of case and word order type indicated in the column header. Verb-final word order is a more stable feature than flexible word order, which is visible from larger blocks of color sequences across languages representing its presence (blue and green) or absence (yellow and gray). By contrast, flexible word order can be present or absent in groups of closely related languages.
The evolutionary dynamics of how languages signal who does what to whom

March 2024 · 214 Reads · 2 Citations

Languages vary in how they signal “who does what to whom”. Three main strategies to indicate the participant roles of “who” and “whom” are case, verbal indexing, and rigid word order. Languages that disambiguate these roles with case tend to have either verb-final or flexible word order. Most previous studies that found these patterns used limited language samples and overlooked the causal mechanisms that could jointly explain the association between all three features. Here we analyze grammatical data from a Grambank sample of 1705 languages with phylogenetic causal graph methods. Our results corroborate the claims that verb-final word order generally gives rise to case and, strikingly, establish that case tends to lead to the development of flexible word order. The combination of novel statistical methods and the Grambank database provides a model for the rigorous testing of causal claims about the factors that shape patterns of linguistic diversity.


Climate, Climate Change and the Global Diversity of Human Houses

March 2024 · 148 Reads

Evolutionary Human Sciences

Globally, human house types are diverse, varying in shape, size, roof type, building materials, arrangement, decoration, and many other features. Here we offer the first rigorous, global evaluation of the factors that influence the construction of traditional (vernacular) houses. We apply macroecological approaches to analyze data describing house features from 1900 to 1950 across 1000 societies. Geographic, social and linguistic descriptors for each society were used to test the extent to which key architectural features may be explained by the biophysical environment, social traits, house features of neighbouring societies, or cultural history. We find strong evidence that some aspects of the climate shape house architecture, including floor height, wall material, and roof shape. Other features, particularly ground plan, appear to also be influenced by social attributes of societies, such as whether a society is nomadic, polygynous, or politically complex. Additional variation in all house features was predicted both by the practices of neighboring societies and by a society's language family. Collectively, the findings from our analyses suggest those conditions under which traditional houses offer solutions to architects seeking to reimagine houses in light of warmer, wetter or more variable climates.


Figure 2. Example of (a) topographic, (b) temperature, and (c) aridity least-cost path between the locations of two societies (starting point is marked by a star). Dark colours on the cost maps indicate cells of high cost, i.e., high elevation (a), or high differences in temperature harshness (b) and aridity index (c) as compared to the starting point. Environmental heterogeneity along the least-cost path connecting two societies in a pair is estimated as the log value of accumulated cost and log value of path length.
Figure 3. Estimated effects of principal components representing environmental and travel costs to cultural similarity. Each point denotes an independent model. The points' values represent the estimated effect of the principal component associated with the environmental barriers listed on the y-axis label; arrows denote the size of standard errors. Models are binned into broad cultural trait categories (1-8) according to their corresponding response variables. If environmental dissimilarities are associated with a lowered potential for cultural similarity, we expect to see negative values for estimated effects. Non-significant relationships are depicted with grey, whereas significant effects are shown in red (positive) and blue (negative). P-values are adjusted for false discovery rates.
Geography is not destiny: A quantitative test of Diamond's axis of orientation hypothesis

January 2024 · 259 Reads · 1 Citation

Evolutionary Human Sciences

Jared Diamond suggested that the unique East-West orientation of Eurasia facilitated the spread of cultural innovations and gave it substantial political, technological, and military advantages over other continental regions. This controversial hypothesis assumes that innovations can spread more easily across similar habitats, and that environments tend to be more homogeneous at similar latitudes. The resulting prediction is that Eurasia is home to environmentally homogenous corridors that enable fast cultural transmission. Despite indirect evidence supporting Diamond's influential hypothesis, quantitative tests of its underlying assumptions are currently lacking. Here we address this critical gap by leveraging ecological, cultural, and linguistic datasets at a global scale. Our analyses show that although societies that share similar ecologies are more likely to share cultural traits, the Eurasian continent is not significantly more ecologically homogeneous than other continental regions. Our findings highlight the perils of single factor explanations and remind us that even the most compelling ideas must be thoroughly tested to gain a solid understanding of the complex history of our species.


Fig. 1. The global distribution of fusion and informativity scores. The scores with a minimum of 0 (absence of all metric features) and a maximum of 1 (presence of all metric features) have been standardized to a mean of 0 and a variance of 1. The hotspots of low fusion are located in West Africa and Southeast Asia. Many Austronesian languages also rank low on fusion. The geographic patterns of informativity scores are less clear compared to fusion. Among lower-scoring languages are those spoken in West Africa, Southeast Asia, many Uralic languages, and languages spoken in India (Indo-Aryan and Dravidian).
Fig. 3. The scores of fusion and informativity on the global tree. The scores with a minimum of 0 (absence of all metric features) and a maximum of 1 (presence of all metric features) have been standardized to a mean of 0 and a variance of 1. We detect many patterns of closely related languages scoring similarly, which might indicate the faithful transmission of grammatical complexity from ancestor languages to their descendants rather than large-scale adaptations of grammatical complexity to changes in sociodemographic factors. Similar to geographic distribution, we see that fusion scores follow a more defined pattern of phylogenetic clustering compared to informativity scores.
Societies of strangers do not speak less complex languages

August 2023 · 352 Reads · 17 Citations

Science Advances

Many recent proposals claim that languages adapt to their environments. The linguistic niche hypothesis claims that languages with numerous native speakers and substantial proportions of nonnative speakers (societies of strangers) tend to lose grammatical distinctions. In contrast, languages in small, isolated communities should maintain or expand their grammatical markers. Here, we test these claims using a global dataset of grammatical structures, Grambank. We model the impact of the number of native speakers, the proportion of nonnative speakers, the number of linguistic neighbors, and the status of a language on grammatical complexity while controlling for spatial and phylogenetic autocorrelation. We deconstruct "grammatical complexity" into two separate dimensions: how much morphology a language has ("fusion") and the amount of information obligatorily encoded in the grammar ("informativity"). We find several instances of weak positive associations but no inverse correlations between grammatical complexity and sociodemographic factors. Our findings cast doubt on the widespread claim that grammatical complexity is shaped by the sociolinguistic environment.


Paul Heggarty, Cormac Anderson, Denise Kühnert, Russell D. Gray [18 co-authors, Matilde Serangeli, 9 co-authors]. Language trees with sampled ancestors support an early origin of the Indo-European languages.

July 2023 · 8,215 Reads · 36 Citations

Science

To download for free, follow the info at https://iecor.clld.org. The origins of the Indo-European language family are hotly disputed. Bayesian phylogenetic analyses of core vocabulary have produced conflicting results, with some supporting a farming expansion out of Anatolia ~9000 years before present (yr B.P.), while others support a spread with horse-based pastoralism out of the Pontic-Caspian Steppe ~6000 yr B.P. Here we present an extensive database of Indo-European core vocabulary that eliminates past inconsistencies in cognate coding. Ancestry-enabled phylogenetic analysis of this dataset indicates that few ancient languages are direct ancestors of modern clades and produces a root age of ~8120 yr B.P. for the family. Although this date is not consistent with the Steppe hypothesis, it does not rule out an initial homeland south of the Caucasus, with a subsequent branch northward onto the steppe and then across Europe. We reconcile this hybrid hypothesis with recently published ancient DNA evidence from the steppe and the northern Fertile Crescent.


Fig. 1. Variance explained by phylogeny and geography. Each point is a Grambank feature. The panels represent different domains of grammar that the features are associated with: (A) clausal, (B) nominal domain, (C) pronominal domain, and (D) verbal domain. A high value indicates that a large part of the variance is explained by either space (y axis) or phylogeny (x axis). The ellipses represent the standard deviation of the joint posterior, tilted for the covariance.
Fig. 2. Grammatical similarity in the Grambank sample of languages. The color coding represents the distribution of languages according to the first three principal components (PCs) mapped onto RGB color space (PC1, red; PC2, green; PC3, blue). Similarity in color indicates similarity in grammatical structure on the first three dimensions. See fig. S15 for loading of Grambank features on the first two components and fig. S16 for correlation with theoretical metrics.
Fig. 3. Distribution of the 12 largest families in our dataset in Grambank design space. The x axis represents the first principal component (PC1), and the y axis represents the second principal component (PC2). All languages are plotted, and for each facet, one family is highlighted in a different color. Austronesian languages, which are known for lacking gender and having little morphology, are found on the far left.
Grambank reveals the importance of genealogical constraints on linguistic diversity and highlights the impact of language loss

April 2023 · 1,071 Reads · 62 Citations

Science Advances

While global patterns of human genetic diversity are increasingly well characterized, the diversity of human languages remains less systematically described. Here, we outline the Grambank database. With over 400,000 data points and 2400 languages, Grambank is the largest comparative grammatical database available. The comprehensiveness of Grambank allows us to quantify the relative effects of genealogical inheritance and geographic proximity on the structural diversity of the world's languages, evaluate constraints on linguistic diversity, and identify the world's most unusual languages. An analysis of the consequences of language loss reveals that the reduction in diversity will be strikingly uneven across the major linguistic regions of the world. Without sustained efforts to document and revitalize endangered languages, our linguistic window into human history, cognition, and culture will be seriously fragmented.


Fig. 1. 50% Majority-rule consensus tree of the phylogenetic analysis of Philippine languages. Formosan outgroup not shown on figure.
Fig. 5. Scenario for the accumulation of the Gorontalo-Mongondow lexicon in three phases. Lexicon exemplified by the forms from Bolaang Mongondow. The bottom panels show examples of phylogenetic trees (one example for each panel, selected from Fig. 2) that each of the three levels of lexical data would support. A) First divergence of Philippine languages; languages in the Philippines proper acquire cognate sets not found in Sangiric/Minahasan. B) Philippine languages split into a Northern and a South-Central group ("Greater Central Philippines"). C) Gorontalo-Mongondow migrates to Northern Sulawesi, and shared cognate sets with Sangiric/Minahasan spread by diffusion. Dates based on median branching time of A) Sangiric/Minahasan from other languages in trees where this is the first split (pp=0.91), B) Splitting of "GCP" from other languages in trees where GCP is recovered (pp=0.43) and C) splitting of Gorontalo-Mongondow and its closest relative, in trees where GCP is recovered.
Bayesian phylogenetic analysis of Philippine languages supports a rapid migration of Malayo-Polynesian languages.

March 2023 · 538 Reads · 1 Citation

The Philippines are central to understanding the expansion of the Austronesian language family from its homeland in Taiwan. It remains unknown to what extent the distribution of Malayo-Polynesian languages has been shaped by back migrations and language leveling events following the initial Out-of-Taiwan expansion. Other aspects of language history, including the effect of language switching from non-Austronesian languages, also remain poorly understood. Here we apply Bayesian phylogenetic methods to a core-vocabulary dataset of Philippine languages. Our analysis strongly supports a sister group relationship between the Sangiric and Minahasan groups of Northern Sulawesi on one hand, and the rest of the Philippine languages on the other, which is incompatible with a simple North-to-South dispersal from Taiwan. We find a pervasive geographical signal in our results, suggesting a dominant role for cultural diffusion in the evolution of Philippine languages. However, we do find some support for a later migration of Gorontalo-Mongondow languages to Northern Sulawesi from the Philippines. Subsequent diffusion processes between languages in Sulawesi appear to have led to conflicting data and a highly unstable phylogenetic position for Gorontalo-Mongondow. In the Philippines, language switching to Austronesian in ‘Negrito’ groups appears to have occurred at different time-points throughout the Philippines, and based on our analysis, there is no discernible effect of language switching on the basic vocabulary.


Citations (17)


... On one hand, graduate students are encouraged to engage in collaborative practices by changes to the Social Sciences and Humanities Research Council funding priorities which emphasize collaborative work with Indigenous peoples [13] and the growth of Indigenous scholarship on Indigenous and collaborative research methodologies [3,5]. On the other hand, University programs can present road blocks such as continued emphasis on fast project completion rates and the continued lack of attention to creating funding models that include funds for research dissemination to community partners, despite this being pointed out as problematic since the early 1980s [26,41]. Sandra Harding, in an address to the University of Toronto in 2015, suggested that "the real sticking point here is [that] researchers have to give up control of the research project . . . the design and management of the research project has to be negotiated with the people who are the major stakeholders in the questions being asked . . . ...

Reference:

Aligning Intentions with Community: Graduate Students Reflect on Collaborative Methodologies with Indigenous Research Partners
World science and Indigenous knowledge

Science

... The Austronesian expansion refers to the movement of Austronesian language speakers from Taiwan into the Philippines and the rest of the Asia Pacific around 4 kya [34][35][36][37][38]. Linguistic support for this expansion includes similarities in extant languages [39], with Filipino groups sharing varying genetic affinities with Austronesian-speaking groups in Taiwan and in the Asia Pacific [31,35,36]. The period of the initial peopling of the Philippines, estimated to be about 50 kya [40] and subsequent interaction with other groups, local and foreign, significantly influenced the selection of their so-called 'ancestral land', which is an integral part of the life history of ICC/IPs [41]. ...

Bayesian phylogenetic analysis of Philippine languages supports a rapid migration of Malayo-Polynesian languages

... The unevenness of this underlying landscape makes some mutations more probable and frequent than others, leading to a reliance on the reuse of old forms to serve new functions. Emergentist accounts in this area have emphasized the ways in which language, society and cognition have undergone co-evolution (MacWhinney 2002) based on the linking of dynamic systems. To trigger this co-evolutionary advantage, changes in linguistic abilities must arise in parallel with advances in cognitive or social abilities. ...

Cultural Evolution of Language
  • Citing Chapter
  • November 2013

... At the same time, new forms of spirituality, such as New Age beliefs and secular humanism, have emerged in response to the complexities of modern life. The evolution of religion in the contemporary world reflects a diverse tapestry of beliefs, practices, and worldviews, highlighting the ongoing quest for meaning, purpose, and transcendence in an ever-changing society (Bulbulia, J., (2013). [4]). ...

The Cultural Evolution of Religion
  • Citing Chapter
  • November 2013

... Additionally, languages originating from the same regions often tend to be influenced by common factors, further complicating the analysis [49][50][51]. While we have included language family, macro-area and country as factors to account for the genealogical and geographic relatedness of languages in our prior paper, this approach ignores variation within language families and geographical units as pointed out in several recent studies [49][50][51][52][53]. To address this issue, we develop two quantitative approaches: (i) a semiparametric machine learning estimation method capable of simultaneously controlling for document- and language-specific characteristics while directly modelling potential effects due to phylogenetic relatedness and geographic proximity; (ii) a multi-model multilevel inference approach designed to test whether cross-linguistic outcomes are statistically associated with sociodemographic factors, while accounting for phylogenetic and spatial autocorrelation via the inclusion of random effects and slopes. ...

Societies of strangers do not speak less complex languages

Science Advances

... 6,500 years BP (Gimbutas 1970; Mallory, 1989; Anthony, 2007). The alternative "Anatolian" hypothesis proposes that the expansion originated from Anatolia region and occurred much earlier, around 9,500-8,500 years BP (Renfrew, 1987; Bellwood, 2005). In addition to that, one of the recent studies proposed a hybrid model: here, the primary homeland is seen south of the Caucasus (as in the "Anatolian" hypothesis) while the later secondary homeland is located in the steppe region (in line with the "Steppe" hypothesis) (Heggarty et al., 2023). As for development of the languages that are of primary importance for our study, it is widely accepted that the Baltic languages grew out of the Balto-Slavic branch of the Proto-Indo-European, while the Indo-European languages of India are surely known to have evolved from the Indo-Iranian branch. ...

Paul Heggarty, Cormac Anderson, Denise Kühnert, Russell D. Gray [18 co-authors, Matilde Serangeli, 9 co-authors]. Language trees with sampled ancestors support an early origin of the Indo-European languages.

Science

... We present the two sources per language in Table 2. In order to make GATA maximally compatible with other cross-linguistic databases, we adopt the Cross-Linguistic Data Formats (CLDF) [29,30]. This framework supports sharing, re-use and comparison of data in a cross-linguistic framework. ...

Managing Historical Linguistic Data for Computational Phylogenetics and Computer-Assisted Language Comparison

... The languages are evaluated as more similar if their phoneme distributions are alike. In the typology-based similarity assessment, we examine how similar the typological features are using the Grambank dataset [20], which numerically records the typological characteristics of languages. In this study, we primarily utilize corpus-based similarity assessment, while typology-based similarity evaluation is employed as a supplementary method to examine how well it aligns with the similarity evaluation of data within the same language family. ...

Grambank reveals the importance of genealogical constraints on linguistic diversity and highlights the impact of language loss

Science Advances

... For decades, researchers debated the curious juxtaposition of the Tepiman and Opata-Cahitan branches of the larger southern Uto-Aztecan family that intermingle in this region. Most now accept Tepiman speakers as later arrivals, but their presence is sufficiently ancient that we should assume both groups were regionally resident for the entirety of the concerned period (Greenhill et al. 2023), ca. A.D. 1000-1500. ...

A recent northern origin for the Uto-Aztecan family

Language

... Also other domains of language, such as phonology (Blaxter, 2017), phonotactics (Baumann & Matzinger, 2021; Napoleão de Souza & Sinnemäki, 2022) and syntax (Benítez-Burraco, S. Chen, Gil, Gaponov, et al., 2024) have been shown to be influenced by societal factors, such as language contact. At the same time, there is still discussion if and how language contact causes morphological simplification, from both experimental (Cuskley et al., 2015; De Smet, Rosseel, & Van De Velde, 2022) and quantitative cross-linguistic studies (Kauhanen, Einhaus, & Walkden, 2023; Koplenig, 2019; Lupyan & Raviv, 2024; Shcherbakova et al., 2023). In any case, instead of just looking at correlations between proportions of L2 speakers and morphological complexity, it is worthwhile to study which specific language-internal and sociodemographic factors mediate the relationship between social and language structure (Sinnemäki, 2020; Sinnemäki & Di Garbo, 2018). ...

Societies of strangers do not speak grammatically simpler languages