Preprint

Spatial evolution of human dialects

Abstract

The geographical pattern of human dialects is a result of history. Here, we formulate a simple spatial model of language change which shows that the final result of this historical evolution may, to some extent, be predictable. The model shows that the boundaries of language dialect regions are controlled by a length minimizing effect analogous to surface tension, mediated by variations in population density which can induce curvature, and by the shape of coastline or similar borders. The predictability of dialect regions arises because these effects will drive many complex, randomized early states toward one of a smaller number of stable final configurations. The model is able to reproduce observations and predictions of dialectologists. These include dialect continua, isogloss bundling, fanning, the wave-like spread of dialect features from cities, and the impact of human movement on the number of dialects that an area can support. The model also provides an analytical form for Séguy's Curve giving the relationship between geographical and linguistic distance, and a generalisation of the curve to account for the presence of a population centre. A simple modification allows us to analytically characterize the variation of language use by age in an area undergoing linguistic change.

Article
Full-text available
The use of apparent time differences to study language change in progress has been a basic analytical construct in quantitative sociolinguistics for over 30 years. The basic assumption underlying the construct is that, unless there is evidence to the contrary, differences among generations of similar adults mirror actual diachronic developments in a language: the speech of each generation is assumed to reflect the language more or less as it existed at the time when that generation learned the language. In providing a mirror of real time change, apparent time forms the basis of a conceptual framework for exploring language change in progress. However, the basic assumptions that underlie apparent time have never been fully tested. This article tests those assumptions by comparing apparent time data from two recent random sample telephone surveys of Texas speech with real time data from the Linguistic Atlas of the Gulf States, which was conducted some 15 years before the telephone surveys. The real time differences between the linguistic atlas data and the data from the telephone surveys provide strong support for the apparent time construct. Whenever apparent time data in the telephone surveys clearly suggest change in progress, the atlas data show substantially fewer innovative forms. Whenever the apparent time data suggest stable variation, the atlas data are virtually identical to that from the more recent surveys. Whenever the relationships between real and apparent time data are unclear, sorting out mitigating factors, such as nativity and subregional residence, clarifies and confirms the relationships. The results of our test of the apparent time construct suggest that it is unquestionably a valid and useful analytical tool.
Article
Full-text available
Historically, infectious diseases caused considerable damage to human societies, and they continue to do so today. To help reduce their impact, mathematical models of disease transmission have been studied to help understand disease dynamics and inform prevention strategies. Vaccination - one of the most important preventive measures of modern times - is of great interest both theoretically and empirically. And in contrast to traditional approaches, recent research increasingly explores the pivotal implications of individual behavior and heterogeneous contact patterns in populations. Our report reviews the developmental arc of theoretical epidemiology with emphasis on vaccination, as it led from classical models assuming homogeneously mixing (mean-field) populations and ignoring human behavior, to recent models that account for behavioral feedback and/or population spatial/social structure. Many of the methods used originated in statistical physics, such as lattice and network models, and their associated analytical frameworks. Similarly, the feedback loop between vaccinating behavior and disease propagation forms a coupled nonlinear system with analogs in physics. We also review the new paradigm of digital epidemiology, wherein sources of digital data such as online social media are mined for high-resolution information on epidemiologically relevant individual behavior. Armed with the tools and concepts of statistical physics, and further assisted by new sources of digital data, models that capture nonlinear interactions between behavior and disease dynamics offer a novel way of modeling real-world phenomena, and can help improve health outcomes. We conclude the review by discussing open problems in the field and promising directions for future research.
Article
Full-text available
Online social media has greatly affected the way in which we communicate with each other. However, little is known about what fundamental mechanisms drive dynamical information flow in online social systems. Here, we introduce a generative model for online sharing behavior that is analytically tractable and that can reproduce several characteristics of empirical micro-blogging data on hashtag usage, such as (time-dependent) heavy-tailed distributions of meme popularity. The presented framework constitutes a null model for social spreading phenomena that, in contrast to purely empirical studies or simulation-based models, clearly distinguishes the roles of two distinct factors affecting meme popularity: the memory time of users and the connectivity structure of the social network.
Article
Full-text available
Dialectometry applies computational and statistical analyses within dialectology, making work more easily replicable and understandable. This survey article first reviews the field briefly in order to focus on developments in the past five years. Dialectometry no longer focuses exclusively on aggregate analyses, but rather deploys various techniques to identify representative and distinctive features with respect to areal classifications. Analyses proceeding explicitly from geostatistical techniques have just begun. The exclusive focus on geography as explanation for variation has given way to analyses combining geographical, linguistic, and social factors underlying language variation. Dialectometry has likewise ventured into diachronic studies and has also contributed theoretically to comparative dialectology and the study of dialect diffusion. Although the bulk of research involves lexis and phonology, morphosyntax is receiving increasing attention. Finally, new data sources and new (online) analytical software are expanding dialectometry's remit and its accessibility.
Article
Full-text available
In our earlier work, an approach to defining dialect areas using multidimensional scaling (MDS) of the total collection of available raw data (from a region of Romania) has produced results that showed 'some' but 'not all' of the dialect distinctions that were anticipated. To investigate this situation, we have extended our approach in two ways, one methodological and one technical. Methodologically, we have switched from looking at raw data to examining interpretive maps based on recognized dialect distinctions. Further, we have categorized these interpretations as phonetic (regular and irregular), morphophonemic, morphological, and lexical, examining each category separately. The result is a much clearer set of dialect distinctions, as seen in the MDS pictures. However, the dialect distinctions vary by category, leading us to make suggestions about the role of each category in defining the notion of dialect. Our technical extension is the creation and use of a 3D viewer for looking at the MDS pictures. We project the linguistic-distance space into three, instead of two, dimensions, and manipulate the resulting structure interactively, thus uncovering and eliminating any accidental 'closeness', as sometimes happens in the 2D case. Strikingly, the resulting 3D objects seem to be very flat, which strongly suggests that there are only two relevant dimensions for distinguishing these dialects, although the two dimensions do not correspond exclusively to geographic dimensions. The result of these extensions is that the multidimensional approach becomes even more viable as a way of selecting dialect and dialect-transition areas, and perhaps more accessible for use with languages and dialects beyond our own study area.
Article
Full-text available
Containing the spreading of crime in urban societies remains a major challenge. Empirical evidence suggests that, left unchecked, crimes may be recurrent and proliferate. On the other hand, eradicating a culture of crime may be difficult, especially under extreme social circumstances that impair the creation of a shared sense of social responsibility. Although our understanding of the mechanisms that drive the emergence and diffusion of crime is still incomplete, recent research highlights applied mathematics and methods of statistical physics as valuable theoretical resources that may help us better understand criminal activity. We review different approaches aimed at modeling and improving our understanding of crime, focusing on the nucleation of crime hotspots using partial differential equations, self-exciting point process and agent-based modeling, adversarial evolutionary games, and the network science behind the formation of gangs and large-scale organized crime. We emphasize that statistical physics of crime can relevantly inform the design of successful crime prevention strategies, as well as improve the accuracy of expectations about how different policing interventions should impact malicious human activity deviating from social norms. We also outline possible directions for future research, related to the effects of social and coevolving networks and to the hierarchical growth of criminal structures due to self-organization.
Article
Full-text available
In this paper we combine statistical analysis of large text databases and simple stochastic models to explain the appearance of scaling laws in the statistics of word frequencies. Besides the sublinear scaling of the vocabulary size with database size (Heaps' law), here we report a new scaling of the fluctuations around this average (fluctuation scaling analysis). We explain both scaling laws by modeling the usage of words by simple stochastic processes in which the overall distribution of word-frequencies is fat tailed (Zipf's law) and the frequency of a single word is subject to fluctuations across documents (as in topic models). In this framework, the mean and the variance of the vocabulary size can be expressed as quenched averages, implying that: i) the inhomogeneous dissemination of words cause a reduction of the average vocabulary size in comparison to the homogeneous case, and ii) correlations in the co-occurrence of words lead to an increase in the variance and the vocabulary size becomes a non-self-averaging quantity. We address the implications of these observations to the measurement of lexical richness. We test our results in three large text databases (Google-ngram, Enlgish Wikipedia, and a collection of scientific articles).
Article
Full-text available
The nineteenth century Russian author Leo Tolstoy based his egalitarian views on sociology and history on mathematical and probabilistic views, and he also proposed a mathematical theory of waging war.
Article
Full-text available
The processes leading to change in languages are manifold. In order to reduce ambiguity in the transmission of information, agreement on a set of conventions for recurring problems is favored. In addition to that, speakers tend to use particular linguistic variants associated with the social groups they identify with. The influence of other groups propagating across the speech community as new variant forms sustains the competition between linguistic variants. With the utterance selection model, an evolutionary description of language change, Baxter et al. [Phys. Rev. E 73, 046118 (2006)] have provided a mathematical formulation of the interactions inside a group of speakers, exploring the mechanisms that lead to or inhibit the fixation of linguistic variants. In this paper, we take the utterance selection model one step further by describing a speech community consisting of multiple interacting groups. Tuning the interaction strength between groups allows us to gain deeper understanding about the way in which linguistic variants propagate and how their distribution depends on the group partitioning. Both for the group size and the number of groups we find scaling behaviors with two asymptotic regimes. If groups are strongly connected, the dynamics is that of the standard utterance selection model, whereas if their coupling is weak, the magnitude of the latter along with the system size governs the way consensus is reached. Furthermore, we find that a high influence of the interlocutor on a speaker's utterances can act as a counterweight to group segregation.
Article
Full-text available
The range of dialectometric methods suggests the need for validation work. We propose a gold standard, based on the consensual classification of a well-studied area. Fidelity to the gold standard is assessed via matrix overlap measures (Rand and Fowlkes/Mallows). Word-based techniques in which varieties are compared to each other directly emerge as superior.
Article
Full-text available
This article illustrates the utility of a variety of quantitative techniques by applying them to phonetic data from the traditional English dialects. The techniques yield measures of variation in phonetic usage among English localities, identify dialect regions as clusters of localities with relatively similar patterns of usage, distinguish regions of relative uniformity from transitional zones with substantially greater variation, and identify regionally coherent groups of features that can be used to distinguish some dialect regions. Complementing each other, the techniques provide a reasonably objective method for classifying at least some traditional English dialect regions on the basis of characteristic features. The results largely corroborate standard presentations in the literature but differ in the placement of regional boundaries and identification of regional features, as well as in placing those systemic elements in a broader context of largely continuous and often random variation.
Article
Full-text available
We analyze the occurrence frequencies of over 15 million words recorded in millions of books published during the past two centuries in seven different languages. For all languages and chronological subsets of the data we confirm that two scaling regimes characterize the word frequency distributions, with only the more common words obeying the classic Zipf law. Using corpora of unprecedented size, we test the allometric scaling relation between the corpus size and the vocabulary size of growing languages to demonstrate a decreasing marginal need for new words, a feature that is likely related to the underlying correlations between words. We calculate the annual growth fluctuations of word use which has a decreasing trend as the corpus size increases, indicating a slowdown in linguistic evolution following language expansion. This "cooling pattern" forms the basis of a third statistical regularity, which unlike the Zipf and the Heaps law, is dynamical in nature.
Article
Full-text available
We propose a stochastic model for the number of different words in a given database which incorporates the dependence on the database size and historical changes. The main feature of our model is the existence of two different classes of words: (i) a finite number of core-words which have higher frequency and do not affect the probability of a new word to be used; and (ii) the remaining virtually infinite number of noncore-words which have lower frequency and once used reduce the probability of a new word to be used in the future. Our model relies on a careful analysis of the google-ngram database of books published in the last centuries and its main consequence is the generalization of Zipf's and Heaps' law to two scaling regimes. We confirm that these generalizations yield the best simple description of the data among generic descriptive models and that the two free parameters depend only on the language but not on the database. From the point of view of our model the main change on historical time scales is the composition of the specific words included in the finite list of core-words, which we observe to decay exponentially in time with a rate of approximately 30 words per year for English.
Article
Full-text available
We study the ferromagnetic Ising model with long-range interactions in two dimensions. We first present results of a Monte Carlo study which shows that the long-range interactions dominate over the short-range ones in the intermediate regime of interaction range. Based on a renormalization group analysis, we propose a way of computing the influence of the long-range interactions as a dimensional change.
Article
Full-text available
A meta-analysis of conformity studies using an Asch-type line judgment task (1952, 1956) was conducted to investigate whether the level of conformity has changed over time and whether it is related cross-culturally to individualism–collectivism. The literature search produced 133 studies drawn from 17 countries. An analysis of US studies found that conformity has declined since the 1950s. Results from 3 surveys were used to assess a country's individualism–collectivism, and for each survey the measures were found to be significantly related to conformity. Collectivist countries tended to show higher levels of conformity than individualist countries. Conformity research must attend more to cultural variables and to their role in the processes involved in social influence. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
Although variationists have explored the diffusion of linguistic changes from one social group to another and from one linguistic environment to another in some detail, they have done much less work on the spatial diffusion of changes. In fact, Trudgill's (1974, 1975) use of Hagerstrand's gravity model to explain in hierarchical diffusion of innovations in East Anglia and Norway was the only systematic account of spatial diffusion in the literature. This article uses data from the random sample telephone survey portion of a Survey of Oklahoma Dialects (SOD) to explore the spatial diffusion of linguistic innovations in Oklahoma. It analyzes that data using a variety of techniques of computer cartography and the General Linear Models procedure in SAS. The data clearly show that, whereas some linguistic innovations diffuse hierarchically (as linguists have long contended), others diffuse contrahierarchically, while still others diffuse in complex patterns that show characteristics of both contagious and hierarchical diffusion. An analysis of the barriers to and amplifiers of diffusion suggests that these different types of diffusion are a consequence of the different social meanings that linguistic forms carry.
Article
Full-text available
Language dynamics is a rapidly growing field that focuses on all processes related to the emergence, evolution, change and extinction of languages. Recently, the study of self-organization and evolution of language and meaning has led to the idea that a community of language users can be seen as a complex dynamical system, which collectively solves the problem of developing a shared communication framework through the back-and-forth signaling between individuals. We shall review some of the progress made in the past few years and highlight potential future directions of research in this area. In particular, the emergence of a common lexicon and of a shared set of linguistic categories will be discussed, as examples corresponding to the early stages of a language. The extent to which synthetic modeling is nowadays contributing to the ongoing debate in cognitive science will be pointed out. In addition, the burst of growth of the web is providing new experimental frameworks. It makes available a huge amount of resources, both as novel tools and data to be analyzed, allowing quantitative and large-scale analysis of the processes underlying the emergence of a collective information and language dynamics.
Article
Full-text available
By determining the most common English words and phrases since the beginning of the sixteenth century, we obtain a unique large-scale view of the evolution of written text. We find that the most common words and phrases in any given year had a much shorter popularity lifespan in the sixteenth century than they had in the twentieth century. By measuring how their usage propagated across the years, we show that for the past two centuries, the process has been governed by linear preferential attachment. Along with the steady growth of the English lexicon, this provides an empirical explanation for the ubiquity of Zipf's law in language statistics and confirms that writing, although undoubtedly an expression of art and skill, is not immune to the same influences of self-organization that are known to regulate processes as diverse as the making of new friends and World Wide Web growth.
Article
Full-text available
Psychologists have debated the form of the forgetting curve for over a century. We focus on resolving three problems that have blocked a clear answer on this issue. First, we analyzed data from a longitudinal experiment measuring cued recall and stem completion from 1 min to 28 days after study, with more observations per interval per participant than in previous studies. Second, we analyzed the data using hierarchical models, avoiding distortions due to averaging over participants. Third, we implemented the models in a Bayesian framework, enabling our analysis to account for the ability of candidate forgetting functions to imitate each other. An exponential function provided the best fit to individual participant data collected under both explicit and implicit retrieval instructions, but Bayesian model selection favored a power function. All analysis supported above chance asymptotic retention, suggesting that, despite quite brief study, storage of some memories was effectively permanent.
Article
Full-text available
The Linguistic Atlas of the Middle and South Atlantic States(LAMSAS) is admirably accessible for reanalysis (seehttp://hyde.park.uga.edu/lamsas/,Kretzschmar, 1994). The present paper applies alexical distance measure to assess the lexical relatedness of LAMSAS'ssites, a popular focus of investigation in the past(Kurath, 1949; Carver, 1989; McDavid, 1994). Several conclusions arenoteworthy: First, and least controversially, we note that LAMSAS isdialectometrically challenging at least due to the range of fieldworkers and questionnaires employed. Second, on the issue of whichareas ought to be recognized, we note that our investigations tend tosupport a three-wayNorth/South/Midlands division rather than a two-wayNorth/South division, i.e. they tend to support Kurath and McDavidrather than Carver, but this tendency is not conclusive. Third, weextend dialectometric technique in suggesting means of dealing withalternate forms and multiple responses.
Book
One of the first accounts of social variation in language, this groundbreaking study founded the discipline of sociolinguistics, providing the model on which thousands of studies have been based. In this second edition, Labov looks back on forty years of sociolinguistic research, bringing the reader up to date on its methods, findings and achievements. In over thirty pages of new material, he explores the unforeseen implications of his earlier work, addresses the political issues involved, and evaluates the success of newer approaches to sociolinguistic investigation. In doing so, he reveals the outstanding accomplishments of sociolinguistics since his original study, which laid the foundations for studying language variation, introduced the crucial concept of the linguistic variable, and showed how variation across age groups is an indicator of language change. Bringing Labov's pioneering study into the 21st century, this classic volume will remain the benchmark in the field for years to come.
Book
If life could write, it would write like Tolstoy.’ Isaac Babel Tolstoy’s epic masterpiece intertwines the lives of private and public individuals during the time of the Napoleonic wars and the French invasion of Russia. The fortunes of the Rostovs and the Bolkonskys, of Pierre, Natasha, and Andrei, are intimately connected with the national history that is played out in parallel with their lives. Balls and soirées alternate with councils of war and the machinations of statesmen and generals, scenes of violent battles with everyday human passions in a work whose extraordinary imaginative power has never been surpassed. The prodigious cast of characters, both great and small, seem to act and move as if connected by threads of destiny as the novel relentlessly questions ideas of free will, fate, and providence. Yet Tolstoy’s portrayal of marital relations and scenes of domesticity is as truthful and poignant as the grand themes that underlie them. In this revised and updated version of the definitive and highly acclaimed Maude translation, Tolstoy’s genius and the power of his prose are made newly available to the contemporary reader.
Article
Dialect variation brings together language synchrony and diachrony in a unique way. Language change is typically initiated by a group of speakers in a particular locale at a given point in time, spreading from that locus outward in successive stages that reflect an apparent time depth in the spatial dispersion of forms. Thus, there is a time dimension that is implied in the layered boundaries, or isoglosses , that represent linguistic diffusion from a known point of origin. Insofar as the synchronic dispersion patterns are reflexes of diachronic change, the examination of synchronic points in a spatial continuum also may open an important observational window into language change in progress.
Article
We use a mathematical model to examine three phenomena involving language change across the lifespan: the apparent time construct, the adolescent peak, and two different patterns of individual change. The apparent time construct is attributed to a decline in flexibility toward language change over one's lifetime; this explanation is borne out in our model. The adolescent peak has been explained by social networks: children interact more with caregivers a generation older until later childhood and adolescence. We find that the peak also occurs with many other network structures, so the peak is not specifically due to caregiver interaction. The two patterns of individual change are one in which most individuals change gradually, following the mean of community change, and another in which most individuals have more categorical behavior and change rapidly if they change at all. Our model suggests that they represent different balances between the differential weighting of competing variants and degree of accommodation to other speakers.
Article
The songs and calls of many bird species, like human speech, form distinct regional dialects. We suggest that the process of dialect formation is analogous to the physical process of magnetic domain formation. We take the coastal breeding grounds of the Puget Sound white crowned sparrow as an example. Previous field studies suggest that birds of this species learn multiple songs early in life, and when establishing a territory for the first time, retain one of these dialects in order to match the majority of their neighbours. We introduce a simple lattice model of the process, showing that this matching behaviour can produce single dialect domains provided the death rate of adult birds is sufficiently low. We relate death rate to thermodynamic temperature in magnetic materials, and calculate the critical death rate by analogy with the Ising model. Using parameters consistent with the known behavior of these birds we show that coastal dialect domain shapes may be explained by viewing them as low temperature "stripe states".
Chapter
This book brings together a variety of approaches to English corpus linguistics and shows how corpus methodologies can contribute to the linking of diachronic and synchronic studies. The articles in this volume investigate historical changes in the English language as well as specific aspects of Middle and Modern English and, moreover, of English dialects. The contributions also discuss the development of English corpus linguistics generally and its potential in the future. Special focus is given to the continuity between Middle and Modern English – much in line with the linking in previous studies of Middle English and Old English under the generic term “medievalism”. This volume highlights the continual development of English from the medieval to modern period.
Article
Porod’s law and Tomita’s sum rule are two universal features expected for form factors of systems undergoing phase ordering processes, but have never been conclusively observed in numerical experiments. We demonstrate the drastic effect of finite thickness of interfaces on these asymptotic laws. Our results strongly suggest that the form factor obtained by Ohta, Jasnow and Kawasaki is asymptotically accurate, if not exact.
Article
In this study, we present the first agent-based simulation of vowel chain shifts across large communities, providing a parsimonious reinterpretation of Labov's (2007) notions of transmission, diffusion, and incrementation. Labov determined that parent-to-child transmission faithfully reproduces structural patterns such as the Northern Cities Shift (NCS), but adult-to-adult diffusion does not. NCS is transmitted faithfully to new generations of U.S. Inland North children. But St. Louis speakers, depending only on adult-adult contact, only attain an incomplete, unsystematic version. Labov (2007) attributed the difference to children's superior language-learning ability; transmission and diffusion are categorically different processes in that approach. By contrast, our multiagent simulation suggests that such transmission/diffusion effects can be derived by simple density of interactions and simple exemplar learning; we also find that incrementation is a natural outcome of this model. Unlike Labov (2007), this model does not require a dichotomy between transmission and diffusion. While dichotomous assumptions about child versus adult learning may be necessary in other contexts, our results suggest that the NCS effects in Labov (2007) may be explained economically in terms of simple density of interactions between speakers. Our results also provide an agent-based perspective supporting and explicating the notion of speech community.
Article
The paper illustrates the results of a correlation study focusing on linguistic variation in an Italian region, Tuscany. By exploiting a multilevel representation scheme of dialectal data, the study analyses attested patterns of phonetic and morpho-lexical variation with the aim of testing the degree of correlation between a) phonetic and morpho-lexical variation, and b) linguistic variation and geographic distance. The correlation analysis was performed by combining two complementary approaches proposed in dialectometric literature, namely by computing both global and place-specific correlation measures and by inspecting their spatial distribution. Achieved results demonstrate that phonetic and morpho-lexical variations in Tuscany seem to follow a different pattern than encountered in previous studies. © Edinburgh University Press and the Association for History and Computing 2009.
Article
Assuming that numerical scores are available for the performance of each of n persons on each of n jobs, the "assignment problem" is the quest for an assignment of persons to jobs so that the sum of the n scores so obtained is as large as possible. It is shown that ideas latent in the work of two Hungarian mathematicians may be exploited to yield a new method of solving this problem.
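The problem statement in this abstract can be illustrated with a minimal sketch. It is not the method the abstract proposes: it simply tries every permutation, which is only practical for very small n. The function name `best_assignment` and the score matrix are hypothetical, chosen for illustration.

```python
from itertools import permutations


def best_assignment(scores):
    """Brute-force the assignment problem for a small n x n score matrix.

    scores[i][j] is the score of person i on job j. Returns the maximum
    total score and, for each person i, the index of the job assigned.
    This is exhaustive search (n! candidates), not the Hungarian method.
    """
    n = len(scores)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        total = sum(scores[i][perm[i]] for i in range(n))
        if best_total is None or total > best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


# Hypothetical scores for 3 persons on 3 jobs.
scores = [
    [9, 2, 7],
    [6, 4, 3],
    [5, 8, 1],
]
total, jobs = best_assignment(scores)
print(total, jobs)
```

For realistic sizes a polynomial-time solver would be used instead of exhaustive search.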
Article
The dynamics of interfaces where the normal component of an interface velocity is proportional to the curvature is studied. The dynamic structure function due to the motion of random interfaces is shown to satisfy a scaling law. The results are compared with Monte Carlo simulations of the kinetics of the order-disorder transition in a quenched system.
Book
How do children learn that the word "dog" refers not to all four-legged animals, and not just to Ralph, but to all members of a particular species? How do they learn the meanings of verbs like "think," adjectives like "good," and words for abstract entities such as "mortgage" and "story"? The acquisition of word meaning is one of the fundamental issues in the study of mind. According to Paul Bloom, children learn words through sophisticated cognitive abilities that exist for other purposes. These include the ability to infer others' intentions, the ability to acquire concepts, an appreciation of syntactic structure, and certain general learning and memory abilities. Although other researchers have associated word learning with some of these capacities, Bloom is the first to show how a complete explanation requires all of them. The acquisition of even simple nouns requires rich conceptual, social, and linguistic capacities interacting in complex ways. This book requires no background in psychology or linguistics and is written in a clear, engaging style. Topics include the effects of language on spatial reasoning, the origin of essentialist beliefs, and the young child's understanding of representational art. The book should appeal to general readers interested in language and cognition as well as to researchers in the field. Bradford Books imprint
Article
Wright's model of a random process in genetics is modified by supposing that births and deaths occur individually at random so that the generations are no longer simultaneous. An exact solution is then obtained for the distribution of the number of a-genes in a haploid organism when mutation is occurring in both directions. When there is no mutation, the rate of approach to homozygosity is found in more complicated models with two sexes and diploid individuals. This rate is twice that occurring in Wright's models.
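A rough sketch of the kind of process described here, restricted to the no-mutation haploid case, is given below. It assumes that at each step one individual is chosen uniformly at random to reproduce and one is chosen uniformly at random to die; the function name `moran_fixation` and the parameter values are hypothetical, and the sketch does not reproduce the exact solution from the abstract.

```python
import random


def moran_fixation(n, k, rng):
    """Simulate overlapping-generation birth/death events without mutation.

    n   : fixed population size (haploid)
    k   : initial number of a-genes, with 0 < k < n
    rng : a random.Random instance

    Each step, the parent and the individual that dies are drawn
    independently and uniformly (an assumption of this sketch).
    Returns the final count of a-genes (0 or n) and the number of steps.
    """
    steps = 0
    while 0 < k < n:
        birth_is_a = rng.random() < k / n
        death_is_a = rng.random() < k / n
        k += int(birth_is_a) - int(death_is_a)
        steps += 1
    return k, steps


final_count, steps = moran_fixation(20, 10, random.Random(0))
print(final_count, steps)
```

Without mutation the population always ends up homozygous, so the returned count is either 0 or n.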
Article
Over the past ten years the study of language in its social context has become a mature field with a substantial body of method and empirical results. As a result of this work we are arriving at new insights into such classical problems as the origin and diffusion of linguistic change, the nature of stylistic variation in language use, and the effect of class structure on linguistic variation within a speech community. Advances in sociolinguistics have been most evident in the study of co-variation between social context and the sound pattern of speech. The results reported in numerous monographs have laid the basis for substantial theoretical progress in our understanding of the factors that govern dialect variation in stratified communities, at least in its phonological aspect. The formulation of theories of the causes of phonological variation that go beyond guesswork and vague generalities appears at last to be possible. Therefore, we offer the following discussion, based on the material that is now available, as a contribution to the development of an explanatory theory of the mechanisms underlying social dialect variation. Although we shall state our views strongly, we know that they are far from definitive. We present them, not as positions to be defended at all costs, but as stimuli to further theoretical reflection in a field that has been, thus far, descriptively oriented.
Article
In this paper we measure the degrees of association among aggregate pronunciational, lexical and syntactic differences in 70 Dutch dialect varieties. First, we show that pronunciation is marginally more strongly associated with syntax than it is with lexis and that syntax and lexis are only weakly associated. Then, we check for the influence of geography as an underlying factor because geography is known to strongly correlate with each of the linguistic levels under investigation. We find that pronunciation and syntax are more strongly associated with geography than lexis is. Finally, we refine the results by accounting for the influence of geography as an underlying factor and show that the association between pronunciation and syntax turns out to be largely based on geography. Some influence between pronunciation and syntax remains but the association between pronunciation and lexis is stronger. There is virtually no association between syntax and lexis.
Article
In this study we use bipartite spectral graph partitioning to simultaneously cluster varieties and identify their most distinctive linguistic features in Dutch dialect data. While clustering geographical varieties with respect to their features, e.g. pronunciation, is not new, the simultaneous identification of the features which give rise to the geographical clustering presents novel opportunities in dialectometry. Earlier methods aggregated sound differences and clustered on the basis of aggregate differences. The determination of the significant features which co-vary with cluster membership was carried out on a post hoc basis. Bipartite spectral graph clustering simultaneously seeks groups of individual features which are strongly associated, even while seeking groups of sites which share subsets of these same features. We show that the application of this method results in clear and sensible geographical groupings and discuss and analyze the importance of the concomitant features.
Article
A microscopic diffusional theory for the motion of a curved antiphase boundary is presented. The interfacial velocity is found to be linearly proportional to the mean curvature of the boundary, but unlike earlier theories the constant of proportionality does not include the specific surface free energy, yet the diffusional dissipation of free energy is shown to be equal to the reduction in total boundary free energy. The theory is incorporated into a model for antiphase domain coarsening. Experimental measurements of domain coarsening kinetics in Fe-Al alloys were made over a temperature range where the specific surface free energy was varied by more than two orders of magnitude. The results are consistent with the theory; in particular, the domain coarsening kinetics do not have the temperature dependence of the specific surface free energy.
Article
Many intuitively appealing methods have been suggested for clustering data; however, interpretation of their results has been hindered by the lack of objective criteria. This article proposes several criteria which isolate specific aspects of the performance of a method, such as its retrieval of inherent structure, its sensitivity to resampling and the stability of its results in the light of new data. These criteria depend on a measure of similarity between two different clusterings of the same set of data; the measure essentially considers how each pair of data points is assigned in each clustering.
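The pair-counting idea in the last sentence can be sketched as follows. The sketch assumes the measure is the fraction of point pairs on which the two clusterings agree (both place the pair together, or both place it apart), which is how this kind of measure is usually presented as the Rand index; the function name `rand_index` and the example labels are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs treated the same way by two clusterings.

    labels_a, labels_b : cluster labels for the same points, in the same order.
    A pair counts as an agreement when both clusterings put the two points
    in the same cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same points")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two points")
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Same partition under different label names: perfect agreement.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
# Crossed partitions: only the pairs split by both clusterings agree.
print(rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

Because only pair membership is compared, the value does not depend on how the clusters are named.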
Article
The aim of the present investigation was to get an impression of the geographic influences on the dialectal variation in a country. In previous investigations, the correlations between linguistic distances and geographic distances using dialect data from the Netherlands and Norway were calculated (Gooskens and Heeringa 2004, Nerbonne et al. 1996). The results showed a high correlation in the case of Dutch data while the correlation was considerably lower in the case of Norwegian data. This seems to reflect the fact that especially for Norway the direct distance between two settlements does not reflect the difficulty of travel and therefore social contact, which is expected to play a role in keeping linguistic distance within limits. Holland is a country with a flat, regularly populated landscape with few natural obstacles such as mountains and rivers. This is in great contrast with Norway with its high mountains and many fjords which made it quite difficult to travel between places, especially in the past. These differences in geographical situations are clearly reflected in the correlations between the linguistic and geographical distances between the dialects of the two countries. The present investigation is searching for more successful ways of predicting linguistic distances by means of geographic distances in Norway. To this end, old and new traveling data were used providing information about traveling times by road, train, and boat between the places where the different dialects are spoken. The results show that a large part of the linguistic variation can be accounted for by geography in Norway, just as in the Netherlands. However, in the case of a geographically more compli-