CHAPTER 7
Pragmatics and language variation
7.1 PRAGMATICS AND LANGUAGE VARIATION
This chapter employs a range of different corpora to examine pragmatic variation within the same language in some detail. However, before we begin the corpus work, it is worth exploring the study of linguistic variation in general. The first point of note is that the study of language variation has traditionally focused on the phonological, lexical and syntactic levels. The systematic study of variation at the pragmatic level is a relatively recent development (see Schneider and Barron, 2008a; Cheshire, 2016). This neglect has been flagged as a serious concern, since any description of language that focuses solely on language as a system and ignores language as a living, emergent, negotiated, co-constructed enterprise is ‘not only incomplete, but inadequate’ (Schneider and Barron, 2008b: 3; see also McCarthy and Clancy, 2019). The second point of note is that, again traditionally, the focus has been on historical variation, or variation over time, and geographical variation, or variation over space. However, as the focus has shifted from phonology, lexis and syntax, so too has the type of variation being studied, and there is now considerably more emphasis placed on variation in social space in addition to the more traditional historical and geographical variation.
Variation in social space can be examined from two different perspectives. We can look at social variation from a macro-social perspective – that is, the influence of factors such as age, gender, ethnicity, social class, etc. – and from a micro-social perspective – that is, variation influenced by more ‘local’ factors such as the degree of social distance between participants (e.g. strangers, friends, family), power (an employee talking to her or his boss) or register (see Chapter 8 for more on register). These types of variation – historical, geographical and social – can be examined on a number of levels (Schneider and Barron, 2008b: 20–21), illustrated in Table 7.1.
Table 7.1 Levels of variation (Schneider and Barron, 2008b: 20–21)
Level of variation | Research focus
Formal level | Linguistic forms such as discourse markers, stance markers, hedges, etc. Analysis of this type can be classified as form-to-function (see Chapter 3)
Actional level | Speech act realisation and modification. Analysis of this type can be classified as function-to-form (see Chapters 3 and 6)
Interactional level | Sequential patterns such as adjacency pairs, interactional exchanges or interactional phases (such as openings or closings)
Topic level | Content-related questions such as ‘What topics are considered taboo?’, ‘What topics are considered suitable for small talk?’, etc.
Organisational level | Turn-taking phenomena such as overlaps, interruptions, backchannels, etc. (see Chapter 8)
15034-2314.indb 14515034-2314.indb 145 9/10/2019 4:32:11 PM9/10/2019 4:32:11 PM
146 PRAGMATICS AND LANGUAGE VARIATION
The levels described in Table 7.1 all have an influence on intralingual pragmatic choice (i.e.
pragmatic variation within the same language), both for native and non-native speakers.
From a non-native perspective, the obvious benefit of highlighting the influence of these
factors lies in the pedagogic realm. Ren and Han (2016: 425) point out, specifically in relation
to speech acts, that it is of ‘paramount importance that teachers and learners are well
informed about the intralingual pragmatic variation found in different varieties of English’.
However, it is also worth stressing here that understanding intralingual variation is of equal
importance to the native speaker. Lippi-Green (1997: 30) points out that ‘[language] variation
isn’t without consequences’. Wolfram and Schilling-Estes (2006: 100–101) argue that
conflict within different social and ethnic groups in modern American society is due to a
failure to understand that ‘different groups have different language-use conventions which
might have nothing to do with the intentions that underlie particular language uses’. This
statement has, of course, ramifications far beyond contemporary American society.
The study of pragmatic variation within the same language has, in the main, concentrated on those languages described as pluricentric – that is, languages that have more than one standard variety because they are often used in a number of different countries. English is thus a pluricentric language, as there are a number of different varieties of English used in a number of different countries – this chapter explores variation within and between Irish, British and American English. Other examples of pluricentric languages include, but are not limited to, Spanish, French, German, Portuguese, Arabic and Swahili. Although the focus of this chapter is on pragmatic variation within the English language, we acknowledge that English is by no means the benchmark for intralingual pragmatic variation. For example, Placencia and García (2007) detail the considerable work that has been conducted on regional pragmatic variation in Spanish. Schneider and Barron’s (2008a) publication, one of the first to address the dearth of studies into pragmatic variation, contains studies that examine five different languages – English, Dutch, German, Spanish and French – twelve different national varieties (e.g. Plevoets et al. (2008) explore variation between Netherlandic Dutch and Belgian Dutch) and two sub-national varieties of Ecuadorian Spanish (Placencia, 2008). Barron and Schneider’s (2005) publication The Pragmatics of Irish English was one of the first publications of its kind to focus on intralanguage pragmatic variation, in this case within the national variety of Irish English, and this has paved the way for many of the more recent publications (see e.g. Aijmer, 2013; Amador-Moreno et al., 2015).
With the exception of Aijmer (2013) and selected chapters from Barron and Schneider (2005), Schneider and Barron (2008a) and Amador-Moreno et al. (2015), variational research has for the most part not been conducted using a corpus linguistic methodology. There is, however, as this book demonstrates at a broader level, a fast-emerging corpus pragmatic field, with selected research focused on pragmatic variation between languages from a historical, geographical or social viewpoint and at a number of levels of pragmatic analysis. A selection of this research includes O’Keeffe and Adolphs’ (2008) study of response tokens in CANCODE and LCIE; Rühlemann’s (2007) and Bednarek’s (2008) division of the BNC into different registers in order to explore a variety of pragmatic phenomena such as co-construction, deixis and humour (Rühlemann, 2007), and language and emotion (Bednarek, 2008); Clancy’s (2016) examination of the sub-corpus of intimate spoken language from LCIE and his focus on both macro- and micro-social variation; and Vaughan et al.’s (2017) comparison
of the frequency of occurrence of vague category markers (see Section 8.4) in LCIE and
CANCODE. This is merely the tip of the iceberg in relation to intralanguage pragmatic
variation research that represents an ongoing and, indeed, burgeoning trend in the study of
pragmatics in general.
As is the case for corpus linguistics in general, any analysis of intralingual variation
is improved through a comparative process. Our introduction to the analysis of pragmatic
variation involves the comparative use of a number of language corpora (for descriptions of
individual corpora see Appendix). The corpora used encompass three varieties of English –
American, British and Irish – and include a range of different spoken and written context
types allowing a sufficient base for a comparative analysis of variation within a language, in
this instance English. Our focus in this chapter is on pragmatic items with a high frequency
across varieties of English. Our examination of response tokens, pragmatic markers, vague
language and speech acts begins with a deliberately broad focus across regional varieties;
however, we then seek to narrow the analysis down in order to account for both similarities
and differences in geographical, historical and social variation in English.
7.2 RESPONSE TOKENS AND VARIATION
Our first foray into pragmatic variation within the same language is in relation to response tokens (see also Chapter 9). We use the term response token to refer to a range of linguistic items that occupy turn-initial position and that, in occupying this position, have a particular pragmatic function: signalling engaged listenership (see e.g. McCarthy, 2002). Response tokens can be categorised as either minimal or non-minimal. Minimal response tokens, often referred to as backchannels, are monosyllabic or monomorphemic forms such as mm or yeah that provide inter-turn feedback during an extended speaker turn (Peters and Wong, 2015), as is evident in extract 7.1.
(7.1)
[Context: Speakers are numbered according to the order in which they occur in the extract.]
<$1> Well of course it’s you know it’s it one of the last few things in the world you’d ever
want to do you know unless it’s just you know really you know for and for their uh
you know for their own good
<$2> Yes yeah
<$1> I’d be very very careful and uh you know checking them out uh our had to place my
mother in a nursing home she had a rather massive stroke about uh
<$2> Um-hum
<$1> Uh six eight months ago I guess and uh we were I was fortunate in that I was
personally acquainted with the uh people who uh ran the nursing home in our little
hometown
<$2> Yes
<$1> So I was very comfortable you know in doing it when it got to the point that we
had to do it but there’s well I had an occasion for my uh mother-in-law who had fell
and needed to be you know could not take care of herself anymore was confined
to a nursing home for a while that was really not a very good experience uh it had
to be done in a hurry I mean we didn’t have you know like six months to check all
of these places out and it was really not not very good uh deal we were not really
happy with the
<$2> Yeah
<$1> Nursing home that we finally had fortunately she only had to stay a few weeks and
she was able to to return to her apartment again but it’s really a big uh big decision
as to you know when to do it
<$2> Yeah
(OANC Switchboard: File sw2005-ms98-a-trans)
In this extract we can see that the tokens yes, yeah and um-hum, marked in bold, are functioning as ‘continuers’ (Schegloff, 2000) in that, although they are interpersonally important in demonstrating listenership, the listener does not attempt to take the floor from the speaker. Instead, the listener seeks to encourage the speaker to continue with his or her turn. This leads to an extended narrative about the placing of older people in nursing homes. These response tokens are often described as having a floor-yielding function. In this way, according to Tottie (1991: 255), backchannels ‘grease the wheels of the conversation’. As the extract demonstrates, the tokens can occur singly, as with um-hum, or in clusters, as with yes yeah. Response tokens are also prone to repetition and this, coupled with their tendency to cluster, makes them very frequent in spoken language. Non-minimal response tokens, on the other hand, are frequently made up of adjectives, adverbs, short phrases or clauses such as lovely, wow, absolutely, that’s great, not at all or what a pity.
In extract 7.2, in addition to minimal tokens such as uh-huh and mm (all marked in bold), we can see the use of the non-minimal response token that’s + adjective in the form of that’s good. Again, the tokens cluster – Oh that’s good and Oh, OK. The extract demonstrates the use of both minimal and non-minimal response tokens in what appears to be a radio or television talk show setting. The interviewer, <$1>, uses a response token at the beginning of each of their speaker turns to acknowledge the interviewee’s answer before proceeding to ask another question. In these instances, <$1> is offering positive feedback and social support; for example, the response Oh that’s good on hearing that they are interviewing a person whose surname is Polk and who lives in Polkton. Therefore, we can see that response tokens can occupy an entire speaker turn as in extract 7.1, but can also be associated with the taking of the speaker turn as in extract 7.2. Response tokens such as these are often described as having a floor-grabbing function. In this way, response tokens reflect the importance of the turn-initial slot in the architecture of conversational turn-taking (see note 1).
(7.2)
<$1> Welcome back to our show! OK. This is Maria, and I don’t know your last name.
<$2> Polk.
<$1> Oh that’s good. From Polkton, Maria Polk. OK. And where did you grow up? Were
you born in this area here?
<$2> I was born in Cottonville. Right outside of Norwood in Stanley County.
<$1> Oh, OK. When you were little, did your mom read you books, or did somebody read
you books in your house?
<$2> My mama read to me and my sister.
<$1> Uh-huh. What kind of books were your favourites? Do you remember any that
were maybe a real favorite of yours?
<$2> Mm, I kind of liked them all. Um, I didn’t really have a favorite.
(OANC face-to-face: File PolkMaria)
To conduct research into response tokens, there are various methodological options. We could take the forms already identified in the literature and conduct a form-to-function analysis by searching for these items. Alternatively, we could take a function-to-form approach by identifying all turn-opening forms and then sifting through these to identify response token items (see Chapter 3 for a description of form-to-function and function-to-form approaches in corpus pragmatics). In this case we opted for the former approach. Word frequency lists for the spoken OANC face-to-face data and the OANC switchboard data were generated, and the items that have been identified in the literature as response tokens, both minimal and non-minimal, were extracted from the frequency lists and are presented in Table 7.2. These results reflect the overall frequency of each item, irrespective of its position in the turn (we expand further on this point in relation to our more detailed treatment of um and well). Therefore Table 7.2 represents a list of response token candidates.
Table 7.2 Frequency results for response token candidates in OANC (face-to-face) versus OANC (switchboard) corpora (normalised per million words)
Rank | OANC (face-to-face) item | Frequency | OANC (switchboard) item | Frequency
1 | um | 9790 | uh | 19,895
2 | uh | 8690 | yeah | 13,773
3 | well | 3375 | well | 6600
4 | really | 3275 | um | 6060
5 | yeah | 2375 | oh | 5400
6 | no | 1915 | right | 5099
7 | oh | 1880 | uh-huh | 4685
8 | right | 1820 | um-hum | 4579
9 | ok | 1490 | really | 4312
10 | uh-huh | 1270 | good | 2362
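The form-to-function procedure behind a frequency list of this kind can be sketched in a few lines of Python. This is a minimal illustration only and not the procedure actually used to produce Table 7.2: the toy transcript, the tokenising pattern and the function name per_million are all assumptions introduced for the example, and a real study would read the corpus files and use a tokeniser suited to their transcription conventions.

```python
from collections import Counter
import re


def per_million(tokens, candidates):
    """Return {item: frequency per million tokens} for each candidate item."""
    counts = Counter(tokens)
    total = len(tokens)
    return {item: counts[item] * 1_000_000 / total for item in candidates}


# Toy transcript (invented for illustration, not corpus data).
text = "um well yeah um uh-huh I think um well you know yeah"

# Assumed tokenisation: lower-case words, keeping hyphenated forms such as uh-huh whole.
tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())

# Candidate forms taken from the literature (form-to-function approach).
candidates = ["um", "well", "yeah", "uh-huh"]

freqs = per_million(tokens, candidates)
for item, freq in sorted(freqs.items(), key=lambda pair: -pair[1]):
    print(f"{item}\t{freq:,.0f}")
```

As in the table, a list produced this way says nothing about turn position; it only identifies candidates for closer inspection in context.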
In the OANC face-to-face corpus there are seven minimal response tokens – um, uh, yeah, no, oh, ok and uh-huh – whereas in the switchboard corpus there are six – uh, yeah, um, oh, uh-huh and um-hum. The other items in the table include high frequency discourse markers such as well, really and right. Good occurs in tenth position in the switchboard corpus data but does not occur in the top ten items in the face-to-face data. Adjectives such as good have been reported in the previous literature as having higher interpersonal import due to their predominant use in positive and supportive engagement (see e.g. McCarthy, 2003, 2015). In general, as Table 7.2 demonstrates, the frequencies are higher in the switchboard corpus than in the face-to-face corpus due to the nature of the speech
event. Face-to-face conversation allows us the possibility of paralinguistic or kinaesthetic
response, whereas response has to be verbal on the telephone. Furthermore, frequent
clustering of response tokens occurs in telephone calls where there are more pre-closing
and closing routines (see e.g. Antaki, 2002). In these cases, the response tokens function
to signal that the conversation is entering this phase while simultaneously functioning to
maintain interpersonal relationships (see also Schegloff and Sacks, 1973; Jefferson, 1973;
Button, 1987; Hartford and Bardovi-Harlig, 1992; Carter and McCarthy, 2006).
As Table 7.2 demonstrates, in the face-to-face corpus there is a notable drop in frequency after the first and second response token items, um (9790 occurrences) and uh (8690) respectively, to well (3375). Similarly, in the switchboard corpus there is a notable decrease in frequency from the most frequent item uh (19,895) to yeah (13,773) to well (6600). This drop-off in frequency has been noted by other studies in different varieties of English. For example, McCarthy (2015) notes a similar pattern in a comparison of response tokens between Irish (LCIE) and British (CANCODE) Englishes. For the purposes of further comparison, we decided to compare two of the most frequent minimal and non-minimal response tokens, um and well respectively, from the face-to-face data to the occurrences of these tokens in the switchboard data. The combined frequency of both items is approximately the same in both datasets – they account for 13,165 occurrences in the face-to-face data and 12,660 in the switchboard data (see Table 7.2). However, the frequency difference between the two tokens is much greater in the face-to-face data: um has 6415 more occurrences than well, whereas in the switchboard data well has 540 more occurrences than um. Tottie (2015) treats uh and um as variants of one variable UHM, but in Table 7.2 they are listed separately and are not included in the counts for uh-huh and um-hum, which are treated as distinct items. Tottie (2015: 381) also labels uh and um as planners, ‘which is indicative of their use to give speakers time for online planning of their contributions to the conversation without necessarily implying uncertainty or dysfluency’. Well, on the other hand, beyond its lexical use as, for example, an adverb in I know him well, is often associated with a range of linguistic items used to preface disagreement, such as I mean or I don’t know. These items have been shown to typically mitigate or soften disagreement by positioning it further back in the turn (see Kotthoff, 1993; Holtgraves, 1997). Schiffrin (1987: 126) highlights the importance of well to the maintenance of discourse coherence due to the fact that it often occurs at a point where ‘upcoming coherence is not guaranteed’, such as the refusal of an offer.
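The arithmetic behind the comparison of um and well can be checked directly from the normalised frequencies reported in Table 7.2. The short Python sketch below simply recomputes the combined frequency and the gap between the two items in each sub-corpus; the dictionary layout and variable names are illustrative assumptions rather than part of the original analysis.

```python
# Normalised frequencies (per million words) as reported in Table 7.2.
freq = {
    "face-to-face": {"um": 9790, "well": 3375},
    "switchboard": {"um": 6060, "well": 6600},
}

# Combined frequency of um and well in each sub-corpus.
combined = {corpus: f["um"] + f["well"] for corpus, f in freq.items()}

# Gap between the items: positive when um is more frequent, negative when well is.
gap = {corpus: f["um"] - f["well"] for corpus, f in freq.items()}

for corpus in freq:
    print(corpus, combined[corpus], gap[corpus])
```

The sign of the gap makes the reversal visible: um leads in the face-to-face data, whereas well leads, by a much smaller margin, in the switchboard data.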
In order to get an impression of the pragmatic patterning of um and well, we used a sampling approach (see Chapter 3, Section 3.3). We randomly generated a sample of 100 occurrences of each item in each sub-corpus. For each item, we therefore examined 200 occurrences: 100 in the face-to-face data and 100 in the switchboard data. The high frequency of these items means that an examination of every individual occurrence is beyond the scope of the chapter, but this sample might point towards some interesting future research directions. The occurrences of um and well in the downsample were first generated as concordance lines and then examined individually in context. Through this iterative sifting process, we were able to identify those items that occurred in turn-initial position (in general, coding for turn position in the turn-taking process is not a straightforward matter; see Tottie (2015) for more on this issue).
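A downsampling step of this kind might be sketched as follows. The sketch is hedged in several ways: the three turns are invented for illustration, the `<$N>` speaker-tag pattern and the helper names tokenise and sample_hits are assumptions, and treating the first token of a turn as ‘turn-initial’ is a crude automatic heuristic, whereas the analysis described above relied on reading each concordance line in context.

```python
import random
import re


def tokenise(utterance):
    """Lower-case an utterance and split it into word tokens (assumed scheme)."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", utterance.lower())


def sample_hits(turns, item, k, seed=0):
    """Randomly pick up to k occurrences of item as (turn_index, token_index) pairs."""
    hits = [
        (t, i)
        for t, turn in enumerate(turns)
        for i, tok in enumerate(turn)
        if tok == item
    ]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(hits, min(k, len(hits)))


# Invented turns in an assumed "<$speaker> text" format (not corpus data).
raw = [
    "<$1> Well it sounds like you have a bright future ahead of you",
    "<$2> Not too bad and um it's going to be a new adventure",
    "<$1> Um I've heard that's a beautiful area as well",
]
turns = [tokenise(re.sub(r"^<\$\d+>\s*", "", line)) for line in raw]

sample = sample_hits(turns, "well", k=100)
turn_initial = [hit for hit in sample if hit[1] == 0]  # first token of its turn
print(len(sample), len(turn_initial))
```

With real data, the sampled hits would be exported as concordance lines and coded by hand, since turn position cannot always be decided from token order alone.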
Table 7.3 provides the frequency for each item in turn-initial position, subdivided into
a comparison between their occurrences in the face-to-face data and their occurrences in
the switchboard data.
Interestingly, the 13 occurrences of um in initial position in the face-to-face data, which correspond to 13 per cent of occurrences in the downsample, are very similar to the results from other corpus studies. Kjellmer (2003), using the COBUILD corpus, also records a 13 per cent figure for sentence-initial position; Rühlemann (2007), using Spoken BNC1994 data, records a 14 per cent figure; and Tottie (2015) documents turn-initial UHM at 15 per cent in the SBCSAE. In the OANC switchboard data, um has a higher frequency, 24, at turn-initial position, perhaps due to the fact that more questions are asked in general in the switchboard data as opposed to the face-to-face data, where questions often appear to have been designed to elicit longer responses from participants. In relation to well, the figures are a little different. Well occurs in turn-initial position on 16 occasions in the face-to-face data and on 34 in the switchboard data. In order to examine these results more closely, both um and well were further examined in relation to their function in turn-taking.
Table 7.3 Occurrences of um and well at turn-initial position in the OANC sub-corpora based on 100-item downsample for each
Position | um (face-to-face) | um (switchboard) | well (face-to-face) | well (switchboard)
Turn-initial position | 13 | 24 | 16 | 34
TASK 7.1 UM AND TURN POSITION
Figure 7.1 illustrates 20 randomly generated concordance lines for the occurrence of
um in the Spoken BNC2014.
1) Based on these concordance lines, what initial hypotheses might be generated in
relation to the position of um in turns in contemporary spoken British English?
2) How do these initial, tentative results compare to those in Table 7.3?
Figure 7.1 20 random concordance lines for um in the Spoken BNC2014
To take the analysis a step further, Table 7.4 classifies the turn-initial instances of both um and well according to whether they had a floor-yielding role (i.e. as a backchannel rather than being used by the speaker in an attempt to take over the floor), or a floor-grabbing function (i.e. acknowledging the previous speaker turn but then followed by a contribution which ‘takes’ the conversational floor) (see extracts 7.1 and 7.2).
Table 7.4 The turn-taking function of um and well in the OANC spoken sub-corpora based on 100-item downsample for each
Function | um (face-to-face) | um (switchboard) | well (face-to-face) | well (switchboard)
Floor-yielding | 2 | 5 | 2 | 3
Floor-grabbing | 11 | 19 | 14 | 31
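The counts in Table 7.4 can be converted into proportions to make the four columns easier to compare, and checked against the turn-initial totals in Table 7.3. The Python sketch below does both; the data structure and the function name grabbing_share are assumptions made for illustration, and with samples this small the resulting percentages should be read as indicative only.

```python
# (floor-yielding, floor-grabbing) counts from Table 7.4.
table_7_4 = {
    ("um", "face-to-face"): (2, 11),
    ("um", "switchboard"): (5, 19),
    ("well", "face-to-face"): (2, 14),
    ("well", "switchboard"): (3, 31),
}

# Turn-initial totals from Table 7.3, used as a consistency check.
table_7_3 = {
    ("um", "face-to-face"): 13,
    ("um", "switchboard"): 24,
    ("well", "face-to-face"): 16,
    ("well", "switchboard"): 34,
}


def grabbing_share(yielding, grabbing):
    """Proportion of turn-initial instances coded as floor-grabbing."""
    return grabbing / (yielding + grabbing)


shares = {key: grabbing_share(*counts) for key, counts in table_7_4.items()}

for key, share in shares.items():
    # Each column of Table 7.4 should sum to the matching cell of Table 7.3.
    assert sum(table_7_4[key]) == table_7_3[key]
    print(key, f"{share:.1%}")
```

In every column, more than three-quarters of the turn-initial instances are coded as floor-grabbing.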
Although we are admittedly working from a small sample, it appears that both um and well, when in turn-initial position, function predominantly in both sub-corpora to signal interlocutor engagement with the previous turn and that this engagement is followed by a contribution that necessitates the taking of the floor. In other words, both tokens seem to have a ‘turn-grabbing’ function. This function is evident in the use of well (in bold) in extract 7.3.
(7.3)
<$1> And, uh, so I guess you’re a big Yankee fan.
<$2> Big time Yankee fan. It’s been a good couple of years for us Yankee fans here
lately. And, uh, we were down on Fifty-Third Street, which is five blocks away from
Central Park, and, um, my fiancé had just passed her boards and it was our
anniversary, so I decided to surprise her with walking down to Central Park off of
Fifty-Eighth. And, um, it kind of just worked out perfect. I proposed to her. I was lucky
enough for her to say, “Yes.” And, um, so, yeah, she’s been up north a couple of
times and in the PA area as well the Poconos where you hear the honeymooners
always go.
<$1> Uh-huh.
<$2> And, um, so
<$1> Well it sounds like you have a pretty bright future ahead of you.
<$2> Not too bad. It’s getting brighter and brighter, it seems, every day. And, um, it’s going
to be a new adventure for us going out to Denver, CO, with me being a
pharmaceutical sales representative and my fiancé being in the medical profession as well,
being a nurse. And, um
<$1> I’ve heard that’s a beautiful area.
(OANC face-to-face: File QuelerAdam)
In this extract, <$1> uses well in turn-initial position as part of an extended turn that functions to signal to <$2> to continue with their answer and provide some additional information. This use of well comes after a turn where, arguably, um has been involved in yielding the turn to <$1> (um in this instance serves to illustrate the difficulties in coding for turn position in that, in <$2>’s turn And, um, so, it is moot whether um is in turn-initial, medial or final position). Overall, response tokens such as well and um, and really, yeah, right, etc., have a role to play in determining speaker intention. When a speaker uses a response token such as well, it is subject to interpretation by the other conversational participants as a signal of intention (see Tottie, 2015). In the case of um and well, this intention may be that, while acknowledging what has been said in the previous turn, the speaker now wishes to hold the conversational floor for a period of time before they yield it again, perhaps ironically, through the use of a token such as um.
TASK 7.2 WELL, TURN POSITION AND FUNCTION
Figure 7.2 illustrates 20 randomly generated concordance lines for the occurrence of well in the Spoken BNC2014.
1) Based on these concordance lines, what hypotheses might be generated in relation to both the turn position and function of well in contemporary spoken British English?
2) How do these results compare to those in Table 7.4?
3) Return to Task 7.1: what are the functions of um? Can any connection be made between turn position and function based on the instances of well and um in these tasks?
Figure 7.2 20 random concordance lines for well in the Spoken BNC2014
Reflecting on the methodology here, we used existing research on the forms that are associated with turn openings to get a broad view of the process of engaged listenership across our two sub-corpora of American English. Then, we used a downsampling approach (of 100 items from each dataset) to look very closely at turn-openers. This micro-analysis gives us much more insight and ultimately provides indicative results that can be further tested on a number of levels. For example, the concordance analysis could be extended to every occurrence of um and well in both sub-corpora in order to build a complete picture of the patterning of these items. Alternatively, this approach could be extended to include other items in Table 7.2, such as yeah or really. Finally, these results could be compared to other corpora of spoken English across different varieties in order to investigate the patterning of this fundamental feature of the turn-taking system which fulfils a very particular, but nonetheless vital, pragmatic function.
7.3 PRAGMATIC MARKERS AND VARIATION
The term pragmatic marker (PM) is used as an umbrella term for a large number of linguistic items that operate outside of the structural limits of the clause. Since they are clause independent, one of their defining characteristics is their optionality, which makes them ideal candidates for the study of the interpersonal realm, as they are part of language as discourse rather than language as system (see McCarthy and Clancy, 2019). Although there is much debate on issues such as the terminology used to refer to PMs, their definition and their polysemous nature, it is generally accepted that, through a process of grammaticalisation, they have acquired functions that are both textual, in that they organise discourse (often referred to as discourse marking), and interpersonal, in that they encode aspects such as speaker attitude or involvement in some way depending on the context (often referred to as pragmatic marking) (see e.g. Schiffrin, 1987; Östman, 1995; Brinton, 1996; Fraser, 1996; Aijmer, 2002, 2013). Aijmer (2015) describes PMs as a category in constant flux with new items being added constantly. This is reflected in Amador-Moreno et al.’s (2015) further broadening of the categorisation of PMs to include, among other linguistic items, tag questions and vocatives. We use the term pragmatic marker here to encompass both their textual and interpersonal spheres of meaning (other studies use the terms discourse marker and pragmatic marker interchangeably or refer to these items as discourse-pragmatic markers or D-PMs). There is a considerable body of PM research working at the interface of corpus linguistics and pragmatics which highlights the influence of context on the meaning and function of PMs. Existing research has investigated PMs in relation to their use in a number of language varieties (see e.g. Holmes’ (1986) study of PMs in New Zealand English, Andersen’s (2001) analysis of London teenage English, and Aijmer’s (2013) work on the ICE suite of corpora which includes Australian, Canadian and Singapore English among others). PMs have also been examined cross-culturally (see e.g. Müller, 2005; Fung and Carter, 2007) and from a diachronic perspective (see Brinton, 1996; McCafferty and Amador-Moreno, 2012; Andersen, 2016). As discussed in Chapter 3, PMs are often the focus of corpus pragmatic research, as they can be recalled relatively easily through form-to-function processes (see also Chapter 9 where we explore PMs pedagogically).
In order to illustrate the variability of PMs within the English language, in this section we
compare their use in the written and spoken domains. We have chosen to focus on the British
Academic Written English (BAWE) corpus in order to represent the written sphere, whereas
spoken language is represented by the Limerick Corpus of Irish English (LCIE). Here we
take a form-to-function approach, based on multi-word units. The majority of the literature on
PMs has been devoted to the occurrences of single items such as like, just or actually, or two-
word items such as you know, I mean or I think. In order to expand upon this, we have utilised
WordSmith Tools Version 7.0 (Scott, 2017) to generate the most frequent two-, three- and
four-word clusters 2 in both corpora, and the results are shown in Tables 7.5 and 7.6. These
results are raw frequency counts for each item; they have not been normalised. The default
minimum frequency setting for chunks in WordSmith Tools is five occurrences, which we
adopted as our cut-off point, although this proved to be unnecessary given
the frequencies of the top five two-, three- and four-word clusters in each corpus.
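A minimal sketch of how cluster lists of this kind can be generated is given below. It is not a re-implementation of WordSmith Tools: the tokeniser, the function names tokenize and top_clusters, and the sample string are all assumptions made for illustration, and clusters are counted across the whole token stream without regard to sentence or turn boundaries.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return word tokens, keeping forms such as don't whole."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())


def top_clusters(tokens, n, min_freq=5, top=5):
    """Return up to `top` n-word clusters that occur at least `min_freq` times."""
    grams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(" ".join(gram) for gram in grams)
    return [(g, c) for g, c in counts.most_common() if c >= min_freq][:top]


# Hypothetical sample text; a real study would read the corpus files instead.
sample = "you know I don't know what I mean do you know what I mean you know"
tokens = tokenize(sample)

for n in (2, 3, 4):
    # min_freq is lowered to 2 only because the sample is tiny.
    print(n, top_clusters(tokens, n, min_freq=2))
```

The min_freq parameter plays the role of the minimum frequency cut-off described above; with corpora of the size used here it makes no difference to the top five items.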
First, let us consider written British English as represented by BAWE. If we apply our
definition of a PM to the results shown in Table 7.5, the primary criteria being their
optionality, their textual and interpersonal functions, and the fact that they have little or
no semantic meaning, then the four-word cluster on the other hand (highlighted) is the only
item within the top five most frequent items to meet the criteria for a PM. All the other items
listed in Table 7.5 are syntactic fragments frequently used in the construction of a phrase,
clause or sentence.
Table 7.5 Most frequent two-, three- and four-word clusters in BAWE
| BAWE | Two-word | Frequency | Three-word | Frequency | Four-word | Frequency |
|---|---|---|---|---|---|---|
| 1 | of the | 63,624 | in order to | 3899 | on the other hand | 836 |
| 2 | in the | 36,694 | as well as | 2364 | as a result of | 724 |
| 3 | to the | 24,102 | due to the | 2335 | in the case of | 608 |
| 4 | it is | 16,473 | one of the | 2010 | the end of the | 582 |
| 5 | and the | 16,111 | the use of | 1868 | it is important to | 568 |

Extract 7.4 illustrates the use of on the other hand in an undergraduate linguistics
essay. As we can see, it is used as a linking adverbial to ‘in some way mark incompatibility
between information in different discourse units’ (Biber et al., 1999: 878), in this case
two sentences where ideas about the use of indefinite pronouns among male and female
speakers are contrasted. In this way the adverbial acts both as a structural device, signalling
that a contrast/concession unit is forthcoming, and in an interpersonal sense, in that
the writer is signalling to the reader that, based on their reading, they have evaluated the
use of indefinite pronouns as a difference between the speech of men and women.
(7.4)
In this assignment it is clear that females use a far greater quantity of indefinite pro-
nouns and in the majority of cases these are seen to be used as a device to distance
themselves from making direct claims. They are mainly used as a result of a lack of
confidence and awareness that they do not want to be seen as overly self-assured and
therefore females are often presumed to be unconfident, hesitant speakers. Males on
the other hand , are seen to use less indefinite pronouns, indicating the confidence
in their speech in that they are more able to talk directly about matters. However, as
women become more socially equal to men, their patterns of lexical use are likely to
change and therefore it is debateable that these features will be used in such great
quantity in the future.
(BAWE: File 6120b)
In contrast, the results for the most frequent two-, three- and four-word units in spoken
Irish English, represented by LCIE, reveal three items that meet our criteria for categorisa-
tion as PMs: the two-word you know and the three-word I don’t know and do you know . You
know has been shown to be the third most frequent two-word unit in the Spoken BNC1994
and I don’t know the most frequent three-word cluster (Adolphs and Carter, 2013). Here, it
should be acknowledged that these three items can also carry semantic meaning, as when
know functions as a lexical verb (for example, Do you know who saw him?), and that they can
occur in fixed phrases such as Better the devil you know. Similarly, I don’t know and Do you
know? have literal, semantic meaning connected to factuality and (un)certainty. Items with
semantic meaning have not been eliminated from the counts in Table 7.6.
Table 7.6 The most frequent two-, three- and four-word clusters in LCIE
| LCIE | Two-word | Frequency | Three-word | Frequency | Four-word | Frequency |
|---|---|---|---|---|---|---|
| 1 | you know | 4406 | I don’t know | 1212 | you know what I | 230 |
| 2 | in the | 3435 | do you know | 769 | know what I mean | 215 |
| 3 | of the | 2354 | a lot of | 522 | do you know what | 208 |
| 4 | do you | 2332 | you know what | 379 | I don’t know what | 134 |
| 5 | I don’t | 2200 | do you want | 373 | do you want to | 121 |
The instances of multi-word units shown in Table 7.6 that have purely semantic, as opposed
to pragmatic, meaning were not eliminated from the counts, as to attribute these frequency
results purely to the lexical system would be to miss the interpersonal element that is
associated with them and, by extension, their pragmatic role in the structure of discourse and
the establishment and maintenance of relationships. First, the three linguistic items
highlighted in Table 7.6 are associated with either you or I, demonstrating the interactive
nature of many of the most frequent units in spoken language. Second, you know and I don’t
know have been associated with the realm of linguistic politeness. Extract 7.5 demonstrates
the use of you know in an informal interview between two young female participants.
(7.5)
[ Context: informal interview between two female students - <$1> = third-level student;
<$2> = second-level student.]
<$2> Are there many tourists around?
<$1> Many tours? Oh tourists. Oh yeah Killarney’s like the capital of tourists. I suppose
like em French German like you know American like especially. You get every-
thing here like you know .
<$2> Do you ever talk to them?
<$1> Em some all right like you know but like I suppose I’d be really interested in the
money you know .
<$2> Do you get lots of tips?
<$1> What kind of tips they offer like. They’re all right like you know . Good craic.
<$2> Who give the best tips?
<$1> Em to be honest with you I’d say em the English do really. You see em the American’s
like are usually like they’re fairly old like you know . The way it is the old ones like you
know they don’t give you nothing like. So that’s about it like you know .
<$2> What do you want to do when you leave school?
<$1> Hopefully I’d say I’ll go to college anyway. You see most of my brothers and sisters
went to U L so I’d like to go there as well like you know cos they’ll be up there
maybe and there’s some good craic like you know .
(LCIE)
In this extract <$1> uses you know on ten occasions over relatively few utterances, and all
ten uses are optional. On each occasion you know is not used in a lexical sense, nor is it used
to convey uncertainty. Instead, it serves a different pragmatic function. Of the five <$1>
utterances, you know occurs in turn final position in four of them, thus signalling that she
is turning the floor over to <$2>, not because <$1> is unwilling to continue speaking, but
because the norms of politeness of the situation, an interview, necessitate it (see Fox Tree
and Schrock, 2002). You know is also seen to cluster with like , in the particular order like +
you know , in each of these instances. Aijmer (2002) argues that the clustering of certain
pragmatic markers indicates that they share a similar function and also serves to underline
their phatic nature (see also the work of Diani (2004) in relation to I don’t know and I mean ).
TASK 7.3 CLUSTERS IN ACADEMIC SPOKEN ENGLISH
Table 7.7 illustrates the top ten most frequent two-, three- and four-word units in BASE.
Using Tables 7.5 and 7.6 as reference points, discuss the similarities and/or differences
between spoken academic British English versus written academic English and/or Irish
English casual conversation.
Table 7.7 The ten most frequent two-, three- and four-word clusters in BASE
| BASE | Two-word | Frequency | Three-word | Frequency | Four-word | Frequency |
|---|---|---|---|---|---|---|
| 1 | of the | 9958 | going to be | 975 | the end of the | 218 |
| 2 | in the | 7766 | one of the | 880 | at the end of | 195 |
| 3 | going to | 4184 | a lot of | 872 | is going to be | 191 |
| 4 | you know | 3722 | in terms of | 733 | if you want to | 170 |
| 5 | and the | 3659 | I’m going to | 649 | to be able to | 167 |
Finally, and as we pointed out in Chapter 1, individual items such as know and think
provide many of the building blocks for longer clusters that are of importance in the
interpersonal realm. For example, from Table 7.6 you know has the potential to form a
syntactic frame for a six-word PM: you know → do you know → do you know what →
do you know what I → do you know what I mean. Do you know what I mean occurs on 104
occasions in LCIE, making it by far the most frequent six-word unit, and on 120 occasions in
the Spoken BNC1994 (remember: LCIE is a one-million-word corpus and the Spoken BNC1994
is a ten-million-word corpus). The interpersonal nature of do you know what I mean is evident in
extract 7.6 where it acts to mark shared knowledge between speaker and listener(s).
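The reminder about corpus size can be made concrete by normalising the two raw counts to a common base. The short sketch below assumes the round corpus sizes stated above (one million and ten million words); the function name per_million is our own label for illustration.

```python
def per_million(raw_count, corpus_size):
    """Convert a raw frequency into occurrences per million words."""
    return raw_count * 1_000_000 / corpus_size


# Round corpus sizes as stated in the text (assumed, not exact word counts).
LCIE_WORDS = 1_000_000
SPOKEN_BNC1994_WORDS = 10_000_000

lcie = per_million(104, LCIE_WORDS)
bnc = per_million(120, SPOKEN_BNC1994_WORDS)

print(f"LCIE: {lcie:.1f} per million words")
print(f"Spoken BNC1994: {bnc:.1f} per million words")
print(f"Ratio LCIE to BNC: {lcie / bnc:.1f}")
```

On these assumed sizes, the raw counts of 104 and 120 look similar, but the normalised figures differ considerably, which is why raw counts from corpora of different sizes should not be compared directly.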
(7.6)
[ Context: Charity committee meeting; speakers numbered in order of appearance in the
extract.]
<$1> But ano= but what I think would happen when we get maybe Mike or Chris or
somebody helping, it’s not that they’re taking work off Wendy, but that we will
do more physical checks, do you know what I mean , with the stock control.
Wendy will do the computer bit and they’ll do the counting bit. That kind of thing.
And erm somebody like Mike or Chris is perfectly capable of <unclear> get their
bits of paper to check to tick off <unclear> things like that but Wendy will have
to check it and put it you know they could do some of the legwork but not yet.
<$2> No.
(Spoken BNC1994; File J9P)
In this extract we see that <$1> does not hand the turn over to <$2> but instead keeps
talking after the use of do you know what I mean which occurs in turn medial position.
Here, the pragmatic function of the PM is slightly different to that in extract 7.5. <$1>
appears to be discussing the changing of work practices, an often troublesome area in
the workplace, and uses the PM to invite <$2> to think about what has been suggested.
Interestingly, the transcriber has separated the PM from the rest of the utterance
using commas, which might suggest a short pause on the part of the speaker both before
and after the PM. This is due to the fixed nature of the phonology of multi-word units. In
other words, these units have to be said quickly in a single intonational unit (O’Keeffe
et al., 2007). Therefore, the commas represent the probable need for the speaker to take
a breath both before and after the multi-word unit. Do you know what I mean does not
appear to receive an immediate verbal response in extract 7.6, although a non-verbal one
may have occurred.

Table 7.7 Continued
| BASE | Two-word | Frequency | Three-word | Frequency | Four-word | Frequency |
|---|---|---|---|---|---|---|
| 6 | to be | 3612 | we’re going to | 605 | if you look at | 152 |
| 7 | sort of | 3601 | this is the | 509 | in terms of the | 136 |
| 8 | if you | 3595 | if you like | 494 | at the same time | 134 |
| 9 | to the | 3471 | you can see | 494 | the way in which | 129 |
| 10 | this is | 3053 | and this is | 492 | going to talk about | 123 |
The importance of the study of pragmatic markers is evidenced by their promi-
nence on spoken corpus word frequency lists. For example, we can see from Table 7.6
that items with the potential to function as PMs such as you know and I don’t know are
the most frequent two-word and three-word units respectively. Therefore, corpus word
frequency evidence highlights a considerable number of high frequency PMs that play
a crucial role in the organisation and management of spoken discourse. In terms of
variation, there appears to be some evidence that frequency patterns may be relatively
consistent across some varieties of English; Adolphs and Carter (2013), for example,
have found that I don’t know is also the most frequent three-word unit in the Spoken
BNC1994. These findings have clear implications for pedagogical awareness and inter-
vention strategies. We should be aware of features of naturally occurring speech such
as PMs and they should be added to classroom vocabulary lists due to both their fre-
quency of occurrence and their importance to successful interaction (see e.g. O’Keeffe
et al., 2007; Martinez and Schmitt, 2012). The tentative finding that the frequency of
two-word and three-word (or, indeed, six-word) units may be broadly consistent across
different varieties of English is also of potential pedagogic value, in that learners can be
quickly familiarised with a set number of everyday spoken language routines key to
successful spoken interaction across geographical space.
7.4 VAGUE LANGUAGE AND VARIATION
Vague language involves the purposeful use of words or phrases with general meaning to
refer to items in a non-specific, imprecise way (see also Chapters 8 and 9 ). This non-spe-
cific, imprecise use of language does not, however, signal that the language user is being, in
any way, sloppy or lazy in their use of language; instead vague language has been shown to
be highly interactive, prioritising interpersonal involvement above actual explicitness. Vague
language can be subdivided into a number of categories, including, but not limited to (see
also Channell, 1994):
Vague additives – for example, approximators ( about , ish , etc.) and, the focus of this
section, vague category markers ( and things like that, or whatever , etc.);
Lexical vagueness – quantifiers ( piles of , a few , etc.) and expressions such as yoke ,
thingy or whatchamacallit ;
Vagueness by implicature –
A: Can your mother help? She lives nearby.
B: She lives nearby but she’s in her seventies now. 3
(Speaker A wants Speaker B’s mother to help out domestically but Speaker B implies,
through the use of the vague expression she’s in her seventies , that she may not be in
a position to help.)
TASK 7.4 IDENTIFYING VAGUE LANGUAGE ITEMS
Extract 7.7 is part of the transcript from a cabinet meeting held in the White House on
12 February 2019 (source: https://factba.se/transcript/donald-trump-remarks-cabinet-meeting-february-13-2019).
Identify the vague language items used in the extract.
(7.7)
[ Context: <$1> = US President Trump.]
<$1> They’ve already announced, in some cases – and in many cases, they have
announced – they’re moving back into the country. They want to be a part of the
United States. It’s like a miracle in the United States, what’s happening. But we
have a lot of companies that have left. In many cases, they left our country and
they’re moving back. And that means a lot of jobs. Speaking of jobs, we have to
have more people coming into our country because our real number is about 3.6,
3.7. It took a little blip up during the shutdown and went up to 4. And 4 – any
country would take a 4. But we’re about 3.7; probably going lower. We need peo-
ple. So we want to have people come into our country, but we want to have them
come in through a merit system, and we want to have them come in legally. And
that’s going to be happening. We’re doing very well in that regard. But we have
tremendous numbers of companies. And you’ve been reporting on them. A lot of
car companies are coming back to the United States. We want to keep the job
boom going strong, and we must protect our economy.
(Cabinet Meeting, White House, 12 February 2019)
Similar to pragmatic markers, vague language has, through a process of grammatical-
isation, acquired textual and interpersonal functions which are distinct from the items’ origi-
nal meaning. In order to access vague reference, listeners rely on shared context; therefore,
vague language tends to be heavily context dependent. It relies on conversational participants
for interpretation and is therefore a strong indicator of shared knowledge and marker of
in-group membership. In order to examine pragmatic variation we will focus on a subcategory
of vague additives that are referred to as vague category markers (VCMs). VCMs are, typically,
a set of expressions, often clause or turn final, that consist of a conjunction, and or or , fol-
lowed by a noun phrase. VCMs that begin with and are traditionally referred to as adjunctives ,
whereas those that begin with or are known as disjunctives (see Overstreet and Yule, 1997a,
1997b).
4 In order to do this, we take a form-to-function approach and draw upon existing
research to identify six frequent examples of VCMs (see e.g. Walsh et al ., 2008; Vaughan et al .,
2017) and compare them across two corpora: the Limerick Corpus of Intimate Talk (LINT)
and the British Academic Spoken English (BASE) corpus. The tokens chosen for analysis
are: the adjunctive VCMs (and) (all) (that) kind of thing; (and) (all) (that) sort of thing; (and) (all)
(that) type of thing; and the disjunctive VCMs or whatever; or something; and or anything. The
normalised frequency results for these selected adjunctives are presented in Table 7.8.
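As a rough illustration of the form-to-function search described here, the six VCM patterns can be expressed as two regular expressions, with the bracketed elements (and), (all) and (that) treated as optional. This is a sketch under our own assumptions: the pattern names, the count_vcms function and the sample string are hypothetical, and every hit would still need to be checked manually, since a form-based search also retrieves non-vague uses (for example, or something else).

```python
import re
from collections import Counter

# Optional "and", "all" and "that", followed by kind/sort/type of thing.
ADJUNCTIVE = re.compile(
    r"\b(?:and\s+)?(?:all\s+)?(?:that\s+)?(kind|sort|type)\s+of\s+thing\b",
    re.IGNORECASE,
)
# "or" followed by one of the three disjunctive heads.
DISJUNCTIVE = re.compile(r"\bor\s+(whatever|something|anything)\b", re.IGNORECASE)


def count_vcms(text):
    """Count candidate adjunctive and disjunctive VCMs in a string."""
    counts = Counter()
    for match in ADJUNCTIVE.finditer(text):
        counts[f"{match.group(1).lower()} of thing"] += 1
    for match in DISJUNCTIVE.finditer(text):
        counts[f"or {match.group(1).lower()}"] += 1
    return counts


# Hypothetical sample built from fragments of extracts 7.8 and 7.9.
sample = (
    "yeah it's sort of islands and that sort of thing "
    "Like a milkshake or something . That kind of thing."
)
print(count_vcms(sample))
```

Note that sort of in sort of islands is not counted, because the pattern requires the full sequence ending in of thing.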
Table 7.8 shows that what have traditionally been referred to in the literature as adjunctives
are, overall, more common in spoken academic discourse than in the discourse of family
and close friends. Vague language in spoken academic discourse has a role to play in the
presentation and organisation of knowledge. It may also play a part in inducting students
into the community of practice of their chosen discipline. In addition, adjunctives have been
shown to be used to invite solidarity and to stress in-group membership and social similarity
(see e.g. Overstreet, 1999; Aijmer, 2013).
Extract 7.8 demonstrates the use of the VCM and that sort of thing (marked in bold) in
BASE. Other vague language items, primarily kind of and sort of , and the PM you know (see
Section 7.3) have also been highlighted in the extract.
(7.8)
[ Context: History of art seminar; <$1>, <$2>, <$4> = students; <$3> non-student.
Speakers numbered according to the order in which they occur in the extract.]
<$1> thank you and we thought that we we’ve divided his work into sort of periods kind
of the first one is has to do with parcels and the whole idea of wrapping things up
in a kind of rough manner like across like the bridge here and the and and in a
kind of surrealist yeah like that one like wrapping things up and then the second
period is all about beauty and that kind of renaissance you know sort of period
of beautiful drapery and and monumentality like like that one you see the that’s
kind of isn’t it orange
<$2> yeah
<$1> yeah it’s sort of islands and that sort of thing so he moves away from the what
we have called the parcel era
<$3> the what
<$1> parcel era it kind of
<$4> the parcel era
<$2> well we we did take it into this he started off by wrapping all sorts of objects
anything you know so a cup a can and these are all kind of in in articles
like you know later works which you know cost millions and are are are
you know are are really being done on a big scale he he we discussed his we
we you know his modus oper of you know he has this fixation for wrapping
things he likes small things and then he goes on and he starts wrapping like the
the Chicago museum and it’s just you know it it’s and so and he says oh I’ll just
wrap bigger things
(BASE: File ahsem007)

Table 7.8 Comparative frequencies of selected adjunctive VCMs in the LINT and BASE corpora
(normalised per million words)
| VCM | LINT | BASE |
|---|---|---|
| (and) (all) (that) kind of thing | 73 | 43 |
| (and) (all) (that) sort of thing | 10 | 83 |
| (and) (all) (that) type of thing | 2 | 7 |
| TOTAL | 85 | 133 |
In extract 7.8 <$1> uses the VCM with the noun phrase islands in islands and that sort of
thing . The students are discussing the artist Christo (and his later collaborations with Jeanne-
Claude), famous for wrapping everyday objects in fabric in order to turn them into sculptural
items. He started with everyday objects such as telephones, but later moved on to larger
collaborative projects such as the Reichstag in Berlin and the Pont Neuf in Paris, but also
islands and stretches of coastline. In this extract, <$1> uses islands and that sort of thing to
acknowledge that their fellow students have this shared knowledge and so there is no need
to say, for example, islands and the Reichstag and the Pont Neuf and stretches of coastline , as
it is assumed that this knowledge is implicit and shared. In doing this, <$1> is constructing
solidarity, in-group membership and social similarity among their peers. <$1> also frequently
alternates between kind of and sort of, markers that have been flagged as reducing social
distance and expressing a desire for a relaxed relationship between speakers and listeners (see
e.g. Holmes, 1993; Aijmer, 2002). Finally, you know is used on multiple occasions by <$2>
in the final turn in the extract, where it again clusters with like (see also extract 7.5) on one
occasion. On this occasion you know may function in a similar way to the vague items used
by <$1> in that the speaker may not feel the need to expand specifically on their ideas given
the shared knowledge that exists between the classmates (cf. Clancy, 2016).
TASK 7.5 ADJUNCTIVES IN COCA
Table 7.9 illustrates the frequency results for the search item * that type of thing in the
COCA corpus overall and in the individual spoken, fiction, magazine, newspaper and
academic sections. As can be seen, * that type of thing occurs predominantly in the spo-
ken component of COCA, with only one occurrence of the adjunctive in the academic
component. This appears to contradict the findings of Table 7.8 .
1. How might this apparent contradiction be explained? Pay particular attention to
corpus design when considering the answer to this question.
Table 7.9 Frequency counts for the item * that type of thing in COCA
Table 7.10 Comparative frequencies of selected disjunctive VCMs in the LINT and BASE corpora
(normalised per million words)
| VCM | LINT | BASE |
|---|---|---|
| or whatever | 174 | 237 |
| or something | 745 | 226 |
| or anything | 269 | 53 |
| TOTAL | 1188 | 516 |
In contrast to Table 7.8, Table 7.10 shows that disjunctives are notably more frequent
than adjunctives in both the discourse of intimates and spoken academic discourse.
In addition, disjunctives are more than twice as frequent in the speech of intimates as they
are in spoken academic discourse. In their study, Overstreet and Yule (1997a, 1997b)
found that while adjunctives were present in the conversation of non-familiars, disjunctives
were relatively rare. In extract 7.9, the speakers are discussing various liquid nutritional
supplements and <$1> is searching for the name of a particular brand. <$2>
offers a range of suggestions, including Ensure, which she states is like a milkshake or
something. Here, the or something is used as a mitigator or softener to indicate that there
are other, alternative options to what she has suggested and that what she has proposed
is not definitive.
(7.9)
[Context: Two female speakers talking about a sick relative. <$1> is in the 40–50 age
group and <$2> is in her twenties.]
<$2> There’s a few. There’s Complan and then there’s the Build-up.
<$1> There’s there’s something else like that now.
<$3> What’s the what’s the <$O> name of that one? </$O>
<$2> <$O> Ensure </$O> There’s Complan or Build-up or Ensure. Ensure like Ensure
is like a drink. Like a milkshake or something .
<$1> Ensure.
<$2> Ensure or Ensure <$O> Plus </$O>
<$1> <$O> Sip? </$O>
<$2> Fortisip?
<$1> What?
<$2> Fortisip
<$1> That’s the one.
(LCIE)
The disjunctive or something has been shown to function to mark utterance content
as inaccurate or approximate, or to indicate alternative options and express tentativeness
in relation to offers, proposals or requests (Overstreet, 1999; Clancy, 2016). This use of or
something as an indicator of the existence of alternative options is supported by <$1>’s
continued efforts to remember a particular brand Fortisip which the participants eventually
manage to negotiate together. In spoken academic settings there may be a reluctance,
especially on the part of students, to indicate inaccuracy, approximation or tentativeness
in their spoken interactions, given that their tutors/lecturers may be present and that they
might be graded on content. This may account for the lower frequency of the disjunctive
VCMs in Table 7.10 .
TASK 7.6 DISJUNCTIVES IN COCA
Table 7.11 illustrates the frequency results for or whatever in the overall COCA corpus
and its component parts. Although the frequency counts appear to support the hypothe-
sis that disjunctives are more frequent in spoken discourse than in academic discourse,
there is a much greater frequency difference between or whatever in the spoken and
academic components of COCA than is evident in Table 7.10 . How might this frequency
discrepancy be explained?
Table 7.11 Frequency counts for the item or whatever in COCA
Vague language has been extensively studied and markers of vague language are
seen as central to everyday, efficient discourse, especially in the case of spoken language
(see e.g. Channell, 1994; Cutting, 2007; Cheng and O’Keeffe, 2015; Haselow, 2017).
There is a variety of linguistic resources available to speakers to mark formality or infor-
mality, closeness or distance, or politeness or impoliteness. Vague language plays a role in
all of these social intricacies given, for example, its interpersonal role in softening expres-
sions, marking in-group membership or simply as a representation of a speaker attending
to a listener’s needs by not being pedantic. As Carter and McCarthy (2006: 202) remark,
vague language should not be viewed as a sign of careless thinking or sloppy expression
but rather of the ‘sensitivity and skill’ of a speaker. It is for these reasons that vague cate-
gory markers have been referred to as ‘exemplars of pragmatic encoding par excellence’
(Vaughan et al ., 2017: 212).
7.5 SPEECH ACTS AND VARIATION
In order to examine speech acts and variation within the same language, we have decided
to focus on gratitude. Without dwelling too long on something that we have previously
addressed in Chapter 3 , the use of corpora in the examination of speech acts has been
much debated. However, one certain way of overcoming the form-to-function mismatches
that characterise the automatic retrieval of linguistic phenomena such as speech acts from
corpora is the use of lexical hooks to search corpora (Rühlemann, 2010; Vaughan et al .,
2017; Clancy, 2018; Rühlemann and Clancy, 2018). In the specific case of speech acts,
these hooks are often referred to as illocutionary force indicating devices (IFIDs; see Chap-
ters 3 and 6 ). Therefore, this section will focus on three IFIDs commonly associated with
expressions of gratitude: thank , thanks and cheers .
TASK 7.7 SPEECH ACTS AND VARIATION
Based on your observations and language intuition, what are the three most commonly
used expressions of gratitude in English? Go to the Google Ngram Viewer
(https://books.google.com/ngrams) and search for these items by typing each of the three,
separated by a comma, into the search box. This task can, of course, be done using other
languages: this tool allows the user to search a variety of language corpora, such as
Chinese, French, German or Hebrew.
1) How might the results generated from Google Ngram Viewer, based on your
choice of expressions of gratitude, be interpreted?
2) Do they confirm or refute your intuitions about the most commonly used expres-
sions of gratitude?
In order to offer a little rationale for our choice of items, and to respond to Task 7.7,
thanks, thank you and cheers were entered into the Google Ngram Viewer
(https://books.google.com/ngrams). Without wishing to debate the pros and cons of the use of this corpus
tool (see e.g. O’Keeffe 2018), what we can see on an exploratory level from Figures 7.3 and
7.4 is that since approximately the 1980s, thanks, thank you and cheers have been enjoying
something of a renaissance in both American and British English, in that their frequency of
use has begun to rise.
We freely admit that this chapter is predominantly focused on spoken language, that
expressions such as thanks or cheers may be used in informal speech contexts,
representations of which are not widely contained in the Google Books corpus, and that
cheers may have changed its use from a speech act of toasting to one of gratitude (see
Schauer and Adolphs, 2006). Nevertheless, to borrow Scott’s (2017) keyword analogy, we are
not comparing apples with phone boxes. Therefore, in order to examine more thoroughly
the results from Figures 7.3 and 7.4, we have compared the frequency of thank, thanks and
cheers in the Spoken BNC1994 to the Spoken BNC2014. The counts for thank include
thank you, thank you very much and thank you (ever) so much and those for thanks include
thanks very much, thanks (ever) so much, thanks for that and thanks a million . The frequency
counts were generated using Lancaster University’s online Corpus Query Processor or
CQPweb interface (https://cqpweb.lancs.ac.uk/).
Figure 7.3 Frequency of use of thanks, thank you and cheers in American English, 1800 to 2008
Figure 7.4 Frequency of use of thanks, thank you and cheers in British English, 1800 to 2008
TASK 7.8 SPEECH ACTS AND VARIATION OVER TIME
Table 7.12 presents the frequency results, normalised per million words, for thank , thanks
and cheers in the Spoken BNC1994 and the Spoken BNC2014. 5
1) How have the frequency results for the use of the items changed over the 20-year
time period between the Spoken BNC1994 and the Spoken BNC2014?
2) Why might this be the case?
As can be seen, Table 7.12 demonstrates that the frequencies for both thank and thanks
have in fact dropped, though in the case of thanks not by very much, in the years between
the collection of the Spoken BNC1994, completed in 1994, and the Spoken BNC2014,
collected between 2012 and 2016. In contrast, the frequency of use per million words of
cheers has almost tripled. This approach to the study of variation is admittedly a broad-based
one, where all speakers in both corpora have been treated as a homogeneous whole and no
variation within the corpora has been addressed, at least not yet, nor have the items been
disambiguated for non-gratitude use; for example, We heard loud cheers coming from the
stadium or We had Sue to thank for that or She managed it no thanks to them . In order to
investigate more closely the possible reasons for these frequency changes, we present the
ten most frequent collocates for each of the three items in the Spoken BNC2014 in order
to give us a more contemporary picture of the present behaviour of these items. We are,
after all, interested in why the use of cheers has increased so dramatically over recent years.
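The size of these changes can be checked with some simple arithmetic on the normalised frequencies reported in Table 7.12. The sketch below is illustrative only; the dictionary layout and the function name change are our own.

```python
# Frequencies per million words from Table 7.12: (Spoken BNC1994, Spoken BNC2014).
freqs = {
    "thank": (512.55, 348.26),
    "thanks": (140.95, 110.92),
    "cheers": (13.52, 35.89),
}


def change(old, new):
    """Return the ratio new/old and the percentage change from old to new."""
    return new / old, (new - old) / old * 100


for item, (f1994, f2014) in freqs.items():
    ratio, pct = change(f1994, f2014)
    print(f"{item}: x{ratio:.2f} ({pct:+.1f}%)")
```

A ratio above 1 indicates an increase and a ratio below 1 a decrease, so the output shows at a glance which items have fallen and by how much cheers has risen.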
Collocation is, as we have described in Chapter 1, an approach to the study of a linguistic
item that considers the likelihood of words occurring next to or near one another.
The CQP corpus software allows us to generate lists of collocates and accompanies those
lists with log-likelihood scores. Therefore, the results for thank, thanks and cheers represent
collocation by significance: the higher the log-likelihood score, the stronger the evidence
that the co-occurrence of the items is not purely due to chance. The top ten collocates of
thank are illustrated in Table 7.13. The default parameters automatically set by CQP were used
for Tables 7.13 to 7.15: collocates are within -/+3 items of the node. Punctuation marks have
been excluded from the collocation lists for each item but extralinguistic information such
as UNCLEAR in Table 7.13 and NAME in Table 7.15 has been included.

Table 7.12 Frequencies of thank, thanks and cheers in the Spoken BNC1994 versus the Spoken
BNC2014 (normalised per million words)
Item      Spoken BNC1994   Spoken BNC2014
thank     512.55           348.26
thanks    140.95           110.92
cheers    13.52            35.89

Table 7.13 Top ten collocates for thank in the Spoken BNC2014
N    Collocate   Log-likelihood
1    you         11,603.608
2    thank       3,102.55
3    much        1,815.444
4    very        1,593.37
5    oh          439.593
6    UNCLEAR     432.154
7    please      389.254
8    thanks      388.794
9    okay        364.749
10   god         345.414

Table 7.14 Top ten collocates for thanks in the Spoken BNC2014
N    Collocate   Log-likelihood
1    for         473.827
2    thank       386.088
3    much        210.342
4    cheers      205.895
5    alright     202.285
6    okay        197.589
7    oh          192.166
8    fine        183.963
9    ’m          177.755
10   no          166.338
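To illustrate how a log-likelihood score for a collocate can be derived, the following Python sketch counts collocates within three tokens either side of a node and scores each one from a 2x2 contingency table (inside versus outside the window, collocate versus any other word). This is a generic textbook formulation under stated assumptions, and the calculation performed by CQP may differ in detail. The toy token list is invented for illustration, and the decision to keep only words that occur in the window more often than expected is our own simplification.

```python
import math
from collections import Counter


def log_likelihood(o11, o12, o21, o22):
    """Log-likelihood (G2) for a 2x2 table of observed counts."""
    n = o11 + o12 + o21 + o22
    rows = (o11 + o12, o21 + o22)
    cols = (o11 + o21, o12 + o22)
    observed = ((o11, o12), (o21, o22))
    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing
                expected = rows[i] * cols[j] / n
                total += o * math.log(o / expected)
    return 2 * total


def collocates(tokens, node, span=3):
    """Rank words found within +/- span tokens of the node by log-likelihood."""
    window = set()  # token positions near at least one occurrence of the node
    for i, tok in enumerate(tokens):
        if tok == node:
            for j in range(max(0, i - span), min(len(tokens), i + span + 1)):
                if j != i:
                    window.add(j)
    inside = Counter(tokens[j] for j in window)
    overall = Counter(tokens)
    n, w = len(tokens), len(window)
    scores = {}
    for word, o11 in inside.items():
        if o11 * n <= w * overall[word]:
            continue  # not more frequent near the node than expected
        o21 = overall[word] - o11  # occurrences outside the window
        scores[word] = log_likelihood(o11, w - o11, o21, (n - w) - o21)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Invented toy data, not drawn from the Spoken BNC2014.
toy_tokens = ("oh thank you very much okay thank you no thanks i 'm fine "
              "thanks cheers mate thank you so much").split()

for word, score in collocates(toy_tokens, "thank")[:5]:
    print(f"{word:<8} {score:.3f}")
```

A corpus this small yields very low scores; the point is only to show where the window size and the contingency counts enter the calculation.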
The collocates of thank reveal quite a lot about the patterns that occur with it. For example,
you is the most frequent collocate, the pronoun providing evidence of the interactivity of the
item, and is closely followed by much and very. This accounts for the fixed phrase thank you
very much, which is the third most frequent four-word unit in the Spoken BNC1994 (Adolphs and
Carter, 2013). We can also see that items such as thank (and, indeed, thanks and cheers) are
prone to repetition and reciprocation, as both thank and thanks are collocates. Schauer and
Adolphs (2006), for example, have demonstrated how the repeated and reciprocal use of expressions
of gratitude can be extended over several conversational turns, as discussed in Chapter 3.
Please is also a collocate and, as you might expect, occurs most frequently in the speaker
turn that immediately precedes the turn containing thank. There are two response tokens, oh
and okay (see Section 7.2), which most frequently precede thank, as in Oh thank you or
Okay thank you. However, I’m okay thank you, while not very frequent in the corpus (there are
12 occurrences), is nonetheless an example of a refusal rather than gratitude. Schauer and
Adolphs (2006: 129) emphasise the importance of knowing how to politely refuse an offer,
suggesting that ‘the ability to express gratitude and at the same time to refuse a proposition is
one of the main skills that students might need to possess in a native speaker context’.
Table 7.14 presents the top ten collocates for thanks in the Spoken BNC2014.
Thanks shares a number of traits with thank, but there are also some differences that
are worthy of note.
Similar to thank, thanks is prone to repetition and reciprocation and collocates with
both thank and also, in this case, cheers (in position 4). There are also other items we first
encountered in Table 7.13, such as much, oh and okay. In contrast to thank, for is the most
frequent collocate and this item is associated with a range of fixed phrases such as thanks
for that, thanks for the heads up, thanks for asking or thanks for getting back to me. There is,
of course, evidence of other phrases present in the collocation frequency list, such as the
obvious I’m fine thanks. Interestingly, there is no pronoun in the top ten, the item thanks in
itself perhaps embodying interactivity due to its inherent informality. Finally, thanks is also
associated with the speech act of refusal, with no occurring in position 10; 60 per cent of
these occurrences of no directly precede thanks.
There are three elements that distinguish cheers from thank and thanks. The first of
these is that, as well as being a noun, it has lexical meaning as a verb, which means that it can
be used with a third person -s. However, the Spoken BNC2014 contains only one example of
the use of cheers as a verb. The second element is that it can be used as a toast rather than as
an expression of gratitude, and we discuss this further in relation to its collocates here. The
third element is the frequency issue: as Table 7.12 has shown, in the past 20 years or so the use
of thank and thanks has become less frequent, whereas the use of cheers has almost
tripled in its frequency of use. The top ten collocates for cheers are illustrated in Table 7.15.
Table 7.15 Top ten collocates for cheers in the Spoken BNC2014
N    Collocate   Log-likelihood
1    cheers      4,174.703
2    thank       318.2
3    mate        231.132
4    thanks      204.799
5    NAME        89.882
6    dears       73.992
7    bye         69.538
8    mm          52.783
9    alright     43.368
10   guys        42.204
Again, Table 7.15 shows these items to be collocates of one another, demonstrating
their tendency towards repetition and reciprocity and indicating their interactivity. Cheers is
its own most frequent collocate, though of note here is that, within the collocation window of
-3 to +3, cheers shows a relatively even spread of frequency across all positions; in other
words, cheers does not necessarily show a pattern of immediately preceding or following
itself. What is most striking about cheers, in contrast to thank and thanks, is its tendency
to collocate strongly with terms of address (see Chapter 5): mate (position 3), NAME
(position 5), dears (position 6) and guys (position 10). Of these, dear is a term of endearment, and
mate and guys belong in the category of familiarisers; both categories are on the informal
end of the cline of terms of address and are used to bring people together through the
creation of a shared membership (see Leech, 1999).
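Observations about where a collocate falls relative to the node, such as the proportion of occurrences of no that directly precede thanks or the even spread of cheers around itself, rest on a count of collocate positions within the window. The following Python sketch shows one way such a positional profile could be computed. The function names and the toy token list are invented for illustration and do not reproduce the output of any particular corpus tool.

```python
from collections import Counter


def position_profile(tokens, node, collocate, span=3):
    """Count how often the collocate occurs at each offset from the node.

    Negative offsets are positions before the node, positive ones after it.
    """
    profile = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        for offset in range(-span, span + 1):
            j = i + offset
            if offset != 0 and 0 <= j < len(tokens) and tokens[j] == collocate:
                profile[offset] += 1
    return profile


def share_at(profile, offset):
    """Proportion of all window co-occurrences found at one offset."""
    total = sum(profile.values())
    return profile[offset] / total if total else 0.0


# Invented toy data, not drawn from the Spoken BNC2014.
toy_tokens = "tea no thanks sure no really thanks".split()

profile = position_profile(toy_tokens, "thanks", "no")
for offset in sorted(profile):
    print(f"offset {offset:+d}: {profile[offset]}")
print(f"share directly preceding the node: {share_at(profile, -1):.2f}")
```

A profile concentrated at offset -1 points to a fixed sequence such as no thanks, whereas a flat profile suggests looser co-occurrence within the window.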
In Chapter 3 we discussed the methodological approach of using the available metadata
from a corpus to drill down into IFIDs such as cheers in order to provide more fine-grained
analysis. For example, in the Spoken BNC2014, the occurrences of cheers guys
and cheers (my) dears, although too few to allow for any type of generalisation, are used
predominantly by females in the 19- to 29-year-old age group. The rhyming present in cheers
(my) dears is echoed by another phrase present in the Spoken BNC2014, cheers big ears,
and is evidence of creative play with language. According to Carter (2004: 108), linguistic
creativity and inventiveness is almost always contextually embedded ‘in so far as it pertains
to the social relations which obtain between participants’. The data gathered for the Spoken
BNC2014 is predominantly from informal speech contexts (Love et al., 2017) or, as detailed
on the BNC2014 website, ‘recorded in informal settings (typically at home) and took place
among friends and family members’, and, in the case of cheers mate and cheers guys, (my)
dears and big ears, the participants appear to be on a broadly similar sociocultural footing.
TASK 7.9 CHEERS AND SOCIOLINGUISTIC VARIATION
In the Spoken BNC2014, cheers mate occurs 20 times. Access the concordance lines
using the CQPweb interface and, using the corpus metadata, determine the predominant
gender, age and socioeconomic bracket of the users of this phrase.
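Once the concordance lines and their speaker metadata are to hand, the drill-down described above amounts to tallying each metadata category across the hits. The following Python sketch shows the idea. The records, the field names (gender, age_range, social_grade) and their values are invented placeholders; they are assumptions for illustration only and say nothing about the actual distribution of any phrase in the Spoken BNC2014 or about how its metadata fields are labelled.

```python
from collections import Counter

# Invented placeholder records: one dictionary per concordance line.
hits = [
    {"gender": "F", "age_range": "19-29", "social_grade": "B"},
    {"gender": "F", "age_range": "19-29", "social_grade": "C1"},
    {"gender": "M", "age_range": "30-39", "social_grade": "B"},
    {"gender": "F", "age_range": "40-49", "social_grade": "B"},
]


def breakdown(records, field):
    """Return (value, count, proportion) for one field, most frequent first."""
    counts = Counter(record[field] for record in records)
    total = sum(counts.values())
    return [(value, n, n / total) for value, n in counts.most_common()]


for field in ("gender", "age_range", "social_grade"):
    print(field)
    for value, n, share in breakdown(hits, field):
        print(f"  {value:<6} {n:>2}  {share:.0%}")
```

With as few as 20 occurrences, proportions of this kind describe the sample only and, as noted above, do not allow for generalisation.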
Finally, it should also be noted that in Table 7.15 cheers collocates with bye. This
connection, also noted by Schauer and Adolphs (2006), is a problematic one and highlights
the challenges associated with the corpus pragmatic analysis of IFIDs as well as pragmatic
annotation (see Chapters 3 and 6). In the case of cheers, it can be difficult to ascertain
whether it is an expression of gratitude or a discourse marker signalling leave-taking.
This does, however, point towards the possibility of cheers being involved in three different
speech act types: toast, gratitude and leave-taking. This, coupled with the tendency of cheers
towards heightened interactivity, as evidenced through its collocating with terms of address,
and with the spoken language characteristic of the Spoken BNC2014 (see Love et al.,
2017), may account for its rise in frequency in Table 7.12. Indeed, we might speculate
that the data collection devices utilised for the Spoken BNC2014, personal smart phones,
allowed for more interactivity given that smart phones are far less intrusive than the devices
employed as part of the collection of the demographically sampled component of the Spoken
BNC1994. The rise in frequency of use of cheers does, however, again raise a core
issue in this chapter: different varieties of the same language have different strategies,
both linguistic and non-linguistic, at their disposal in order to accomplish successful
pragmatic interaction. What also emerges from the chapter, and from intralingual pragmatic
research in general, is that users of different varieties have different perceptions of which
strategy is appropriate in which situation.
7.6 FURTHER READING
Aijmer, K., 2013. Understanding Pragmatic Markers: A Variational Pragmatic Approach.
Edinburgh: Edinburgh University Press.
In her book, Aijmer marries the traditional subset of pragmatic markers (items such as well,
actually or in fact) with the more recent approach of variational pragmatics, i.e. pragmatic
variation at a macro- and micro-level. Therefore, the book focuses on variation in pragmatic
markers at varietal, text and activity level. Using the ICE suite of corpora, Aijmer
demonstrates how the examination of PMs at these levels broadens our understanding of the
categorisation and function of these key pragmatic items.
Flöck, I., 2016. Requests in American and British English. Amsterdam: John Benjamins.
This volume examines the use of request strategies across both cultural and methodological
dimensions. From a cultural point of view, the structure of request strategies is compared
in naturally occurring conversation in British and American English. From a methodological
point of view, these strategies are compared in non-elicited data in the form of ICE-Great
Britain and the Santa Barbara Corpus of Spoken American English and elicited data in the
form of DCTs (see Chapter 2). The validity of the current use of DCTs as a method of
collecting data for pragmatic research is challenged, and suggestions aimed at modifying and
improving this data collection technique are posited.
McCarthy, M., 2015. ‘Tis mad yeah’: Turn openers in Irish and British English. In
C. Amador-Moreno, K. McCafferty and E. Vaughan (eds), Pragmatic Markers in Irish
English. Amsterdam: John Benjamins, pp. 156–175.
This chapter demonstrates how corpus linguistics can be used to highlight the important
connection between the turn-taking system and its pragmatic function. Building on
his (2002) work in relation to single-word lexical response tokens in British and North
American English, McCarthy encourages the broadening of what are traditionally considered
pragmatic markers to include turn-initial, non-minimal lexical response tokens
such as right, lovely, grand, etc. What emerges is that varieties of English share much in
common in terms of the items that realise pragmatic functions at turn openings; however,
each variety does show a preference for a distinct core of items at initial position
in the turn.
Vaughan, E., McCarthy, M. and Clancy, B., 2017. ‘Vague category markers as turn final
items in Irish English.’ World Englishes, 36(2), 208–223.
Although the title suggests a focus on one variety of English, this article compares intimate
corpus data from Irish and British English. In particular, the focus is on vague category
markers, such as (and) things/stuff (like that), and/or whatever and and so forth, which have
been identified in the previous literature as frequently occurring. These items are then used
as linguistic hooks to search the corpora. The findings show that VCMs in final position
frequently trigger speaker change but that their use as a trigger is more common in British
English than in Irish English, which tends to favour more traditional pragmatic markers such
as like and you know in turn final position.
NOTES
1 The term ‘architecture’ is attributed to the work of Seedhouse (2005).
2 Throughout the book, we mostly refer to these items as multi-word units. However, in the
WordSmith Tools software, these items are referred to as clusters.
3 Example adapted from Cheng and Warren (2003).
4 Vague category markers have also variously been referred to in the literature as general
extenders (Overstreet and Yule, 1997a, 1997b), generalised list completers (Jefferson,
1990), tags (Ward and Birner, 1993), terminal tags (Dines, 1980), extension particles
(Dubois, 1993), vague category identifiers (Channell, 1994) and vague extenders (Stubbe
and Holmes, 1995; Cheshire, 2007; Tagliamonte and Denis, 2010; Parvaresh et al., 2012;
Parvaresh, 2018).
5 These frequency differences are statistically significant. To check the statistical significance
of comparative frequency results, an online tool such as the ‘Log-likelihood and effect size
calculator’ (http://ucrel.lancs.ac.uk/llwizard.html) can be used.
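For readers who wish to see the arithmetic behind such a comparison, the following Python sketch implements a commonly used two-corpus form of the log-likelihood test, together with a simple log ratio as an effect size. It is an illustration under stated assumptions rather than a description of how any particular online calculator is implemented. The raw counts and corpus sizes in the example are invented round numbers, since the raw frequencies behind Table 7.12 are not given here.

```python
import math

# Chi-square critical value for 1 degree of freedom at p < 0.05.
CRITICAL_P05 = 3.84


def ll_compare(freq1, size1, freq2, size2):
    """Log-likelihood for one item's raw frequency in two corpora."""
    total = freq1 + freq2
    expected1 = size1 * total / (size1 + size2)
    expected2 = size2 * total / (size1 + size2)
    ll = 0.0
    for observed, expected in ((freq1, expected1), (freq2, expected2)):
        if observed > 0:  # a zero count contributes nothing
            ll += observed * math.log(observed / expected)
    return 2 * ll


def log_ratio(freq1, size1, freq2, size2):
    """Binary log of the relative-frequency ratio (corpus 2 over corpus 1)."""
    return math.log2((freq2 / size2) / (freq1 / size1))


# Invented example: 100 hits in a 10-million-word corpus versus
# 300 hits in another 10-million-word corpus.
score = ll_compare(100, 10_000_000, 300, 10_000_000)
print(f"log-likelihood = {score:.2f}, significant at p < 0.05: {score > CRITICAL_P05}")
print(f"log ratio = {log_ratio(100, 10_000_000, 300, 10_000_000):.2f}")
```

The test needs raw counts and corpus sizes, not per-million figures, because the strength of the evidence depends on how much data lies behind each normalised frequency.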