Variable Constraints on Past Reference in Dialects of
Chad Howe and Scott A. Schwenter
University of Georgia and The Ohio State University
1. Introduction
The extensive literature on the meaning of perfects cross-linguistically has focused primarily on
describing typological distinctions via sets of purportedly discernable types or uses (Comrie 1976).
Examples from Spanish of these distinct types include the following:
(1) Perfect of Result
Integración es una... rama del derecho que ha surgido (PP), bueno, es antiguo ¿no? pero ha
surgido recientemente acá en el Perú (Caravedo 1989:39)
(2) Experiential (or Existential) Perfect
Bueno he estado (PP)...en Arica que es chileno, y...por el norte en Guaquía. (Caravedo
(3) Continuative (or Universal) Perfect or Perfect of Persistent Situation
Bueno, ella es muy, mira muy…ha tenido (PP) siempre un enorme interés
cosa artística sobre todo plástica. (Caravedo 1989:101)
(4) Perfect of Recent Past
Usted ha dicho (PP) dos cosas muy interesantes. (Caravedo 1989:280)
It has long been noted that the distribution of the Spanish Present Perfect (henceforth PP) vis-à-vis the
simple (perfective) past, or Preterit, varies across dialects (Kany 1945, Alonso and Henríquez Ureña
1951, Westmoreland 1988, López Morales 1996, Penny 2000). In a quantitative analysis of PP and
Preterit usage found in the corpora of the Habla Culta project, DeMello (1994) observes that the
highest overall frequencies of the “pretérito compuesto para indicar acción con límite en el pasado” (or
PCALP) are found in the La Paz, Lima, and Madrid samples, though there is no discussion of whether
or not these results are indicative of similarities in function. Howe and Schwenter (2003) argue that
comparison of overall frequencies can obscure relevant distinctions and that functional parameters
must be taken into account in determining the cross-dialectal distribution of PPs. The current analysis
follows the proposal of Schwenter and Torres Cacoullos (2008) that a multivariate analysis of PP and
Preterit distributions is required to tease apart the dialect-specific factors that contribute to variation.
Using a sample taken from the Lima Habla Culta corpus (Caravedo 1989), our analysis demonstrates
that the internal linguistic constraints governing the distribution of the PP and the Preterit in the Lima
data, which, in terms of overall frequency, exhibits increased PP usage similar to that of the Peninsular
data, are distinct from those noted previously for other dialects.
As has been observed by a number of researchers, the PP in Peninsular Spanish (with the notable
exception of areas of Northwestern Spain and the Canary Islands) has become increasingly compatible
with uses generally attributed only to the Preterit, specifically those related to perfectivity (see e.g.
Schwenter 1994, Serrano 1994, Kempas 2006). In this development, Peninsular Spanish is moving
along the same path of grammaticalization already traversed by nearby varieties such as French or
Northern Italian (Harris 1982 and Fleischman 1983). Squartini and Bertinetto (2000) describe the
overall trend in Romance for perfects to develop into perfective pasts as an “aoristic drift” (404),
whose outcome is a periphrastic perfective past, e.g., the French passé composé, which in speech at
least has completely supplanted the simple past form (passé simple). Two features commonly
associated with this development are illustrated in the following examples, both of which are normally
found only with simple past, not perfect, forms.
(5) Compatibility of PP with temporally specific past adverbials (e.g., ayer)
Vale. Bueno, pues, eh…hoy me he levantado (PP) por la mañana…y ayer me quedé (Preterit)
a dormir en casa de mis padres… (Madrid, Howe 2006:93)
(6) Compatibility of PP with narrative sequencing
Me he levantado (PP) a las…a las nueve de la mañana. He desayunado (PP) en casa.
Me (he)1 hecho (PP) la comida. He ido (PP) a la casa de mis padres a…para hacer unas
burocracias, y luego he venido (PP) a la universidad… (Madrid, Howe 2006:96)
In addition to the well-known diachronic process still occurring in Peninsular Spanish, the PP in at
least some varieties of South American Spanish has been described by a number of scholars as
undergoing a similar process of change (namely, perfect > perfective). Alonso & Henríquez Ureña, for
instance, note that:
...modernamente existe la tendencia a fundir los usos [del pretérito perfecto y pretérito simple]:
mientras en Madrid se prefiere el pretérito perfecto y se emplea para significaciones que antes
correspondían al pretérito simple ... en gran parte de América se hace lo contrario ... En nuestras
provincias andinas [de Argentina], el uso coincide con el de Madrid, y no con el porteño.
This claim is representative of the view that the greater overall frequency of PPs found in some South
American varieties is evidence of a similar “drift” from perfect to perfective meaning. Other issues
commonly cited in reference to this claim are (i) that there are a number of perfective-like uses of PPs
in Andean varieties of Spanish and (ii) that the increased usage of the perfect is influenced by contact
with indigenous languages, specifically Quechua (see e.g. Escobar 1997, Klee and Ocampo 1995).
Concerning the latter claim, we will not address the type or extent of influence that might be attributed
to contact between Spanish and Quechua. With respect to perfective ‘uses’ of perfects, some typical
examples are given below:
(8) Bueno, yo he vivido (PP) y he nacido (PP) en Lima, pero ya, estoy en Cusco hace siete años.
(Cusco, Howe 2006:125)
(9) Yo no he estado (PP) en aula ayer. (Cusco, Howe 2006:124)
(10) Bueno, desde ahí, esto ha sido (PP) en el setenta y dos, hasta la fecha sigo en esto y espero
terminar este año. (Lima, Caravedo 1989:114)
Howe and Schwenter (2003) provide a preliminary qualitative analysis of corpus data from Peru
and Bolivia, noting that “quantitative similarity in [Perfect usage] is not necessarily reflective of
functional similarity in how the [Perfect] has been extended into the domain of the Preterite”
(2003:67). In a EUROTYP-style sentence judgment task (see Dahl 2000) conducted with speakers
from Madrid and Valencia, Spain and Cusco, Peru, Howe (2006) corroborates this claim,
demonstrating that PP/Preterit preference is variable across dialects in a range of different contexts
(e.g. co-occurrence with definite past adverbials, negation, sequenced narratives, etc.). A summary of
these contexts for each dialect is presented in Table 1 below.
Given this background on the dialectal distribution of the Preterit and PP across Spanish varieties,
we posed the following research questions in the present study:
1. Is there any evidence, other than increased overall frequency, to suggest that the PP in
Peruvian (specifically Lima) Spanish is developing perfective uses similar to those in
Peninsular Spanish?
2. What are the constraints that determine the distribution of the PP in contrast to the Preterit?
3. What does constraint interaction in this case tell us about the development of the PP across
varieties of Spanish? Are there different trends in the grammaticalization of the PP?

1 Omission/deletion of the auxiliary among speakers of Peninsular Spanish, especially in the 1st singular, is
common (see Howe 2006). Schwenter and Torres Cacoullos (2008) also note this tendency in the COREC data.
Table 1. Overall Distribution of PP/Preterit in three dialects of Spanish (Howe 2006)
Factor Group
Co-occurrence with ‘today’ adverbials
Co-occurrence with Pre-‘today’ adverbials
ya ‘already’
‘Today’ Narratives
2. Methodology
A sample of tokens of the PP and the Preterit was collected from a corpus of spoken Spanish (oral
interviews from El español de Lima. Materiales para el estudio del habla culta (Caravedo 1989)). The
envelope of variation was determined by verb form. That is, all morphologically Preterit forms and PP
forms (AUXPres + Participle) were included in the statistical analysis. A sample of 2106 tokens was
extracted for this study. Out of this total, there were 110 exclusions. Some sample exclusions are listed below:
(11) Progressive
Un tiempo estuvo (PRET) enseñando en el consejo. (Caravedo 1989:36)
(12) Ambiguity with Present Tense
a veces nos olvidamos un poco de lo que está a nuestro alrededor. (Caravedo
(13) Repetition/False Start
He ido varias veces sí, he ido (PP) varias veces. (Caravedo 1989:64)
(14) Missing Auxiliary (i.e. possible ambiguity with other compound tenses)
Hemos hecho guardias nocturnas habido días en que... Individualmente todos tuvimos que
correr con aprestos de la cocina, (Caravedo 1989:193)
The tokens extracted from the data were subsequently coded for the following factor groups and
corresponding individual factors:
1. Temporal Reference (See note below)
2. Presence of Temporal Adverbials and Types of Adverbials: What types of adverbials co-
occur with PPs and Preterits? (e.g. durative, definite past, hodiernal, frequency, etc.)
3. Presence of ya (see Schwenter & Torres Cacoullos 2008)
4. Polarity: Affirmative vs. Negative
5. Clause Type: What type of clause is the PP or Preterit located in? (e.g. temporal, matrix,
relative, interrogative, etc.)
6. Transitivity: What type of object (if transitive) is present? (e.g. lexical animate/inanimate,
pronominal animate/inanimate, etc.)
7. Plurality: Is the direct object (if present) singular or plural?
8. Lexical Aspect (Aktionsart): What is the lexical aspect of the predicate? (e.g. durative,
punctual, telic, stative, etc.)
2.1 A Note on Temporal Reference
In their analysis of data from Mexican and Peninsular Spanish, Schwenter and Torres Cacoullos
(2008) make use of the factor group Temporal Reference in order to provide a quantifiable means of
determining the ‘types’ of temporal situations generally instantiated by these forms. The individual
factors are listed below, along with representative examples.
(15) Hodiernal (today)
Te he oído (PP) mencionar algo relativo al dictado de clases, me contaste (PRET) que habías
dictado clases en inglés, ¿qué puedes decirnos de tu experiencia docente? (Caravedo
(16) Hesternal (yesterday)
*There were no attested cases of hesternal past reference in the Lima corpus.
y ayer fuimos (PRET) Maripi y yo (Marcos Marín 1992:BCON014B)
(17) Prehesternal (before yesterday)
Bueno, yo ingresé (PRET) a la... universidad el año... sesenta y...ocho exactamente.
(Caravedo 1989:30)
(18) Irrelevant temporal reference (cannot ask ‘when?’ to disambiguate)
a. Siempre has vivido (PP) ahí, o... ¿a dónde? (Caravedo 1989:73)
b. E, derecho, pero jamás llegó (PRET) a ejercer el derecho (Caravedo 1989:35)
(19) Indeterminate temporal reference (analyst, possibly interlocutor, cannot resolve temporal
distance of past situation; can ask ‘when?’ to disambiguate)
a. ¿Cuál ha sido (PP) tu experiencia con ellos? (Caravedo 1989:31)
b. Mira, vida de San Eugenio tuvo (PRET) una una peculiaridad interesante,
porque...por alguna razón...un poco exótica…(Caravedo 1989:91)
The distribution of hodiernal and hesternal tokens is quite limited in this sample of the Lima data,
with only 29 tokens of hodiernal (today) reference and no tokens of hesternal (yesterday) reference.
Of these 29 hodiernal tokens, 11 were PPs and 18 were Preterits. Interestingly, many of the hodiernal
cases were uttered by the interviewer as mentions of something previously said by the interviewee. As
a result, the verbs used in hodiernal contexts were by and large verba dicendi (e.g. decir, contar, etc.).
Schwenter and Torres Cacoullos’ main claim is that the PP in Peninsular Spanish has become the
default form of perfective past reference (without necessarily discarding the prototypical PP uses),
whereas the Mexican Spanish PP is restricted to the hallmark uses of a prototypical PP (e.g.
continuative and experiential). As evidence for this claim, they point to the increased use of the PP in
Peninsular Spanish in contexts of indeterminate temporal reference (such as in (19)). Crucially, these
are contexts in which the temporal location is potentially unidentifiable by the interlocutor, allowing
for any number of possible temporal locations in the past. For the Lima data, it is hypothesized that the
PP will represent an intermediate case between the Mexican and the Peninsular varieties, suggesting
another possible pathway of semantic change for the Spanish PP as it takes on more past perfective
3. Results
Table 2 presents the overall frequency counts for the use of the Preterit and the PP in three
different dialects: Madrid (as reflected in the COREC corpus [Marcos Marín 1992] utilized by
Schwenter and Torres Cacoullos 2008), Lima (our own counts from Caravedo 1989), and Mexico City
(Habla Culta [Lope Blanch 1971] and Habla Popular [Lope Blanch 1976]).
Table 2. Overall Distribution of Preterit/PP in three dialects of Spanish
            Lima HC         Mexico HC/HPb
Preterit    73.6% (1470)    85.2% (1903)
PP          26.4% (526)     14.8% (331)
χ2 = 729.32, p < .00, df = 2; (a = Marcos Marín (1992), b = Lope Blanch (1971 & 1976))
As shown in Table 2, there is a significant difference across the three dialects in terms of the
overall distributions of the PP and Preterit. The highest frequency of the PP, and also the only dialect
where the PP predominates over the Preterit, is Madrid, followed by Lima, and lastly Mexico City,
where the rate of the PP is nearly exactly one-half that found in Lima (14.8% versus 26.4%). Separate
pairwise χ2 tests (i.e. Lima HC/COREC and Lima HC/Mexico HC/HP) also indicate that the
PP/Preterit distribution in Lima is distinct from both the Peninsular and Mexican sample.
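The Lima/Mexico City comparison can be retraced from the token counts in Table 2. The following is a minimal sketch, assuming a plain Pearson χ2 test on a 2×2 table without continuity correction; the text does not state which variant of the test was used, and the function and variable names are hypothetical. Only the Lima/Mexico City pair is shown.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    # Expected count for each cell = (row total * column total) / grand total
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Token counts from Table 2, given as (Preterit, PP)
lima = (1470, 526)
mexico = (1903, 331)

lima_pp_rate = lima[1] / sum(lima)
mexico_pp_rate = mexico[1] / sum(mexico)
chi2 = pearson_chi2_2x2(*lima, *mexico)

# Critical value of the chi-square distribution for df = 1 at p = .05
CRITICAL_DF1_P05 = 3.841

print(f"Lima PP rate: {lima_pp_rate:.1%}")
print(f"Mexico City PP rate: {mexico_pp_rate:.1%}")
print(f"chi2 = {chi2:.2f}; exceeds critical value: {chi2 > CRITICAL_DF1_P05}")
```

The same function could be applied to the Lima/COREC pair once the Madrid token counts are supplied.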
Multivariate analysis of the Lima data was carried out using GoldVarb X (Sankoff, Tagliamonte,
& Smith 2005), resulting in the configuration of significant factor groups presented in Table 3 below.
Out of the eight factor groups coded, only two had a statistically significant effect on the distribution
of the PP/Preterit in Lima: Temporal Reference and Plurality of the Direct Object2.
Table 3. Multivariate analysis of the factors contributing to the choice of the PP over the Preterit
in Lima (Habla Culta Corpus)
Total N = 1802, p = 0.007, Corrected mean: .16 (26% PP)
Log likelihood: -630.974
                      % PP    % Data
Temporal Reference
  Before Today
Plurality of DO
  None (Intrans.)

2 None of the remaining factors (i.e. Temporal Adverb, Clause type, Aktionsart, Presence of ya, and
Transitivity) were selected as significant in the multivariate analysis. See below for an explanation regarding the
effects of Polarity.
4. Analysis & Discussion
The first point to note is that the two significant factor groups listed in Table 3 are among the three
top-ranked factor groups in the analysis of Schwenter and Torres Cacoullos (2008) for the Mexican and
Peninsular samples. Temporal Reference, with a range of 68, shows the greatest magnitude of effect,
followed by Plurality of the Direct Object (25). This suggests clear overlap among the grammars of
these varieties for determining the choice between the PP and the Preterit. While this result is not
necessarily surprising, it does not give a complete picture of the distribution of the Lima data in
comparison to the Mexican and Peninsular cases. It should be noted that in the original Varbrul run,
the factor group Polarity also displayed a significant effect. Cross-tabulation with Temporal Reference,
however, reveals a considerable amount of interaction. Given that irrelevant contexts are much more
likely to be negated than other types of temporal reference and, moreover, that irrelevant contexts are
also frequently coded with the PP, this interaction was not surprising. In the analysis presented in
Table 3 above, the factor group Polarity has therefore been omitted.
4.1 Temporal Reference
As expected, the results in the Temporal Reference factor group in Table 3 above indicate that the
PP/Preterit distinction in the Lima data is intermediate between the Mexico City and Madrid samples.
On the one hand, as in these other dialects, in Lima the PP is favored by both irrelevant and
indeterminate temporal reference. But the PP is heavily disfavored (.20) in before today contexts
(recall, again, that there were very few tokens of hodiernal and hesternal past contexts).
Following Schwenter & Torres Cacoullos (2008), we performed an additional one-level binomial
analysis of the data, in order to control for BOTH factor weight and input probability. This kind of
analysis is called for when there are widely disparate overall frequencies between different samples,
such as that found for the PP rates across different Spanish dialects (see Table 2 above). The results of
this analysis, furthermore, allow for comparison of factor probabilities across independent runs
(Poplack & Tagliamonte, 1996:84).
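How an input probability and a factor weight jointly yield a predicted rate can be illustrated with a short sketch. It assumes the standard logistic combination used in variable rule analysis, in which the log-odds of the input and of the factor weight are added. The function names are hypothetical, and the example plugs in the corrected mean (.16) and the before-today weight (.20) reported for Lima purely as an illustration, not as a reproduction of the values in Table 4.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def inv_logit(x):
    """Inverse of logit: maps log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))


def combine(input_prob, factor_weight):
    """Combine an input probability with one factor weight on the log-odds scale."""
    return inv_logit(logit(input_prob) + logit(factor_weight))


# Illustrative values taken from the Lima run (assumed here to carry over
# to the one-level analysis, which the text does not confirm)
lima_input = 0.16
before_today_weight = 0.20

predicted = combine(lima_input, before_today_weight)
print(f"Predicted PP probability in before-today contexts: {predicted:.3f}")
```

Under this model a weight of .50 leaves the input unchanged, which is why weights above .50 are read as favoring the PP and weights below .50 as disfavoring it. Because the input enters the sum, the same weight can correspond to very different predicted rates in samples whose overall PP frequencies differ.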
The most relevant, and most revealing, result of the one-level analysis can be seen in Table 4,
which offers a comparative view of the Temporal Reference FG across the three dialects:
Table 4. Combined effect of corrected mean plus factor weight for Temporal Reference FG in
three Spanish dialects (Mexico City and Madrid data from Schwenter & Torres
Cacoullos 2008)
Mexico City
Input + Weight
Input + Weight
Input + Weight
As noted by Schwenter & Torres Cacoullos, once input probability is taken into account, the PP in
Mexico City is not favored in ANY context, while in Madrid, the PP is favored in EVERY context
except for prehodiernal temporal reference.
The Mexico City results include all tokens with specific past temporal reference, since there was very little
difference between distinct degrees of remoteness for rates of PP vs. Pret. There were no tokens of the PP in
prehodiernal contexts (versus 16% PP in prehodiernal contexts in the Madrid data), and only 10% PP in hodiernal
contexts (versus 96% PP in hodiernal contexts in Madrid).

How is Lima PP use similar to or different from that of these other dialects? As in both Mexico City and
Madrid, in Lima the PP is heavily disfavored in prehodiernal contexts (there are insufficient hodiernal
tokens for clear comparison). In Irrelevant and Indeterminate contexts, on the other hand, Lima favors
the PP, like Madrid but unlike Mexico City. However, the favoring effect in Lima is not as strong as in
Madrid, where Irrelevant and Hodiernal temporal reference contexts are both virtually categorical
environments for the PP.
One possible explanation for this similarity between the Lima and Madrid data could be the
increased epistemic (i.e. evidential) uses of the PP noticed in some Andean varieties of Spanish,
namely those in contact with Quechua (see Escobar 1997, among others). In the example given in (20),
the speaker describes a (discrete) past event using the PP. While this type of contact effect is attributed
largely to the speech of bilingual speakers, there are analogous cases in the Lima corpus—which
consists solely of monolingual Spanish speakers—that are suggestive of similar patterns of usage (see
example (10) above and (20) and (21) below).
(20) yo he venido (PP) de allá el año 72 / o sea ya estoy un poquito tiempos acá [más de 15 años] /
… / después que he venido (PP) m’ (he) ido (PP) de entre [después de] ocho años / siete años
/ habré ido por allí / y así estuve allá / de allí todavía hasta ahora no voy (Escobar 1997:863)
(21) Trece hermanos hemos nacido (PP) en esa clínica. (Caravedo 1989:61)
Observe also that reference to hodiernal and prehodiernal contexts with the PP in the Lima corpus
is virtually non-existent, as is also the case with the sample from Mexico City. If the lack of these
‘specialized’ types of past reference can be taken to indicate the persistence of certain
functions typical of prototypical PPs, as is argued by Schwenter and Torres Cacoullos, then we can
also describe the results from the Lima corpus in these terms. That is, if the PP in the Lima sample
were following the same path of grammaticalization as that found for Peninsular Spanish, then we
would not necessarily expect to find the full range of perfect ‘types’. This, however, is not the case—
see again examples (1)-(4) above.
4.2 Plurality of DO
The results for the Plurality of DO factor group in the Lima corpus (see Table 3) are precisely
what would be expected of a PP that conserves its prototypical perfect functions. Plural direct objects
are certainly compatible with perfect interpretations, specifically when expressing durative and
iterative meanings. In (22) below the combination of the stative predicate tener with the plural direct
object todos los cambios gives rise to the interpretation that the speaker either continues to experience
changes or has gone through them iteratively.
(22) E... yo he tenido (PP)... o sea... como persona... todos los cambios de... estado (Caravedo
In general, plural objects have an atelicizing effect in such cases, producing continuative
interpretations. Squartini and Bertinetto (2000:412) note a similar effect for negation (see also Smith
1991). These results are very similar to what Schwenter and Torres Cacoullos found in their analysis
of the Madrid and Mexico City data, where Temporal Adverb and Plurality of DO (what they called
“Noun number”) were also the second- and third-most favored factor groups.
4.3 The (non)effects of Temporal Adverbials
In analyzing the tokens for the effects of different types of temporal adverbials, we used a three-
level factor group comprised of atelic adverbials (including adverbs of frequency and duration),
adverbials denoting a specific time, and absence of an adverbial. We originally hypothesized that
durative (see example (23) below) as opposed to specific temporal adverbs would favor the PP, while
the absence of any adverb would neither favor nor disfavor the PP.
(23) y también he vivido (PP) poco tiempo en cada uno de ellos ¿no? (Caravedo 1989:141)
The results of the multivariate analysis, however, revealed considerable skewing caused by widely
different overall frequencies. Almost 75% (N = 1330) of all the tokens occurred without an explicit
adverbial. Also, a separate analysis was run in which adverbials such as siempre and nunca were coded
independently from adverbs of frequency and duration, thereby creating an additional factor within the
Temporal Adverbial Factor Group, albeit one that only comprised 5% of the data overall. Though this
particular Factor Group was determined to be significant in this configuration of the multivariate
analysis, considerable crossover effects in the atelic tokens (i.e. those with siempre, nunca, etc.)
suggested additional skewing of the data. Despite the fact that the PP was disfavored with atelic
adverbials in this Varbrul run (Factor Weight = .40), over half of the 82 atelic tokens (55%) co-
occurred with the PP. With durative/frequency adverbials, only 33% occurred with PPs, resulting in a
factor weight of .68. In the results reported in Table 3, we have factored out the skewing effects caused
by differentiating the atelic adverbs from the durative/frequency adverbials, and as a result this factor
group has not been selected as significant.
5. Conclusions
This paper has addressed the distribution of the PP and the Preterit in a sample of South American
Spanish from Lima, Peru, from a variationist perspective. It has been argued that, despite the increased
overall frequency (in comparison to, for example, Mexican Spanish), the Lima PP has not followed the
same trend of grammaticalization as its Peninsular counterpart. Evidence for this claim has been
presented in the form of comparison of cross-dialectal trends in constraint interaction. A summary of
the conclusions following from this analysis is provided below.
First, an analysis that takes overall frequency as its only quantitative heuristic
(see DeMello 1994) does not provide an adequate picture of the types of factors that contribute to the
PP/Preterit distinction across dialects. The functional locus of variation has been shown to be related to
the type of reference intended—e.g. indeterminate, irrelevant, hodiernal, etc. Thus, these findings are
consonant with those of Schwenter and Torres Cacoullos (2008) who argue that analyzing temporal
reference is crucial for describing the processes of semantic change characteristic of forms of past
reference in Spanish.
Moreover, if the PP in Andean Spanish is indeed developing functions previously attributed only
to the Preterit, then we must conclude, given these results, that (i) there is more than one possible
grammaticalization pathway for perfect > perfective in Spanish and that (ii) it may be possible to
discern a specific set of features that are characteristic of the types of semantic change noted for PPs in
Spanish. Concerning the latter point, Dahl (1985) notes that in the shift from archetypical perfect
meaning to perfective, the French passé composé passed through the same stages of development as
has been noted for the PP in Peninsular Spanish. Thus, the prediction that follows is that “the features
of hodiernal reference and narrative functions exhibited by the Peninsular perfect are necessarily
concomitant with the process of aorist drift in Romance [as defined by Squartini and Bertinetto 2000]”
and that “[t]he corollary to this claim is that cases that appear to be analogously perfective, as with the
Peruvian [PP], are in fact not undergoing the attested Romance development” (Howe 2006:204). This
analysis, then, offers additional evidence for the claim that there are multiple pathways of PP
grammaticalization in Spanish.
This type of multivariate quantitative analysis of natural speech data, coupled with supporting
evidence from qualitative survey data, provides a clear overall picture of general trends in the use of
PPs and Preterits across dialects of Spanish. The observations cited above concerning the effects of
factors such as temporal reference are possible only through quantitative approaches that utilize
corpora of naturally-occurring speech. Thus, reliance on any typology of perfect types as a measure of
either dialect variation or variable semantic change does not make for a particularly consistent or
transparent methodology.
