
Watt, D. & Gunn, B. (2016). 'The sound of 2066: A report commissioned by HSBC'. 26th September 2016.

  • City, University of London

Abstract

Report commissioned by HSBC as a tie-in to the integration of voice biometrics technology in their telephone banking operations.
The sound of 2066
A report commissioned by HSBC
Written by Dominic Watt and Brendan Gunn
An introduction
The homogenisation of English?
‘Informalisation’ of English: talking to machines and listening to Americans
Sounds of the city
Conclusions
Acknowledgements
An Introduction
HSBC is launching voice biometrics as an element
of its digital banking services.
The system verifies a caller’s identity using leading-edge
voiceprint technology, allowing customers access to their
accounts using a simple universal ‘pass phrase’.
As time goes on, voice-activated systems of this kind will be
an ever more central part of our lives. 50 years from now, in
2066, we will only rarely interact with machines by pressing
buttons, and the keyboard will have become obsolete.
Almost everyone can talk faster than they can type, and
talking is the most natural communication system we
possess. Speech recognition tools like Siri and Cortana are
already part of our everyday lives, but these are only the
beginning. Over the next decades the successors to these
systems will become ever more reliable and ‘smarter’,
as they take advantage of the boundless potential of the
internet to train themselves to anticipate users’ needs and
to respond efficiently to our commands.
Our current speech technologies perform well under difficult
conditions. They can cope with high levels of background noise, or with a speaker who has a head
cold or a sore throat. Strong regional or foreign accents don’t affect their performance because
the systems are trained to compensate for the numerous ways in which our speech varies. And
impressive as these tools already are, they are improving all the time. In the future, our devices
will understand everything we tell them. The way we interact with machines will converge on
how we talk to other people, to the point where there will be no obvious differences between
the two.
Balthazar Cohen, author of the ‘Totes Ridic-tionary’, described the internet as the place ‘where
language goes to die’. In reality it’s just the opposite. The web is an inexhaustible wellspring of
new words and phrases. Already we see how easily internet-inspired abbreviations like ‘LOL’
(laugh(ing) out loud), ‘FOMO’ (fear of missing out), ‘FOLO’ (fear of living offline), and ‘brb’ (be
right back) have been turned into words (LOL to rhyme with ‘doll’, ‘brb’ with ‘curb’). These aren’t
just confined to the speech of the young, either, as shown recently by the jokingly vengeful use
of ‘LOL’ by a Scottish judge as he passed down a prison sentence. Emojis have been embraced
as part of written English, to the extent that the Oxford Dictionaries UK Word of the Year in 2015
was the ‘Face with Tears of Joy’ symbol. We will find ways of integrating them into our speech
too. There is even the possibility that in the near future, our computers will themselves invent
new words and phrases, ones which we’ll start to use ourselves because they seem especially
useful or pithy.
We tend to think of computers as things that sit on our desks or that we carry around in
our pockets, but they are of course already all around us: in car engines, inside our washing
machines, or controlling the heating in our homes. Very soon all these systems will be connected
together. The era of the ‘internet of things’ is all but upon us. Our homes, workplaces and means
of transport will be ever more interconnected, with each appliance communicating with the other
devices in its local network, and with the wider world via the web. In a sense, we ourselves will
become elements of that network, while keeping executive control over the important decisions.
Smart technologies will learn and adapt by tracking how we humans change in our preferences
and our habits, and because we will give instructions using our voices they must of course keep
pace with changes in our speech and language.
Languages change constantly, and they do so whether or not we want them to. New words
replace old ones, grammatical rules arise and fade away, and the ways we pronounce vowels
and consonants are always shifting and mutating. English has changed enormously over its
1,500-year history. Even in the last 50 years we have seen big changes in the accents and
dialects of the language, including Standard English. This leads us to ask: what will English be
like 50 years from now?
In this report, we make a number of predictions about how some key accents of British English
might sound in half a century’s time. Some of the changes we identify have in fact already
started. In other cases we’re being more speculative, but by looking at how English has changed
over the last 50 years, we can identify patterns that seem to repeat. For one thing, people tend
to like to make talking as easy for themselves as they can, but without making life too hard for
the hearer. So they knock off sounds at the ends of words (‘tex’ for ‘text’, ‘vex’ for ‘vexed’), they
simplify complicated sequences of consonants (hardly anyone says ‘syoot’ for ‘suit’ any more),
and they rub the sharp corners off sounds by making them ‘softer’. For example, although we
say electric with a hard /k/ on the end, we say electricity with an /s/, and electrician with a ‘sh’ sound.
Languages also change when they come into contact with one another. English has borrowed
thousands of words from other languages: mainly French, Latin and Greek, but there are ‘loan
words’ from dozens of other languages in the mix. For instance, we wouldn’t say we’d spilled
chutney and shampoo on the veranda of the bungalow without first having borrowed these
words from Hindi.
Our speech and language patterns are absolutely central to our individual identities, and we
exercise ‘consumer choice’ over which new linguistic trends we buy into, much as we do when
choosing music or clothing. We adopt new ways of saying things because they’re fashionable
or cool, or because we want to sound like we’re a member of a particular group of people.
We use language to tell others something about ourselves in a way that costs nothing and is
very immediate: uttering just a few syllables can be enough to signal where you come from,
and what kind of social groups you identify with or admire. Young people often try very hard
to sound different from people of their parents’ generation. Using the right sort of words and
pronunciations can be an enormously powerful symbol of belonging, of being cool, of having the
right sort of knowledge, of being ‘now’. However, in time what was once the height of linguistic
fashion comes to seem stale, staid, and conventional, and so new trends must be followed by
those who want to seem the most up-to-date and street-smart.
We must always allow for the unexpected, too: by 2066 English may have altered in ways we
hadn’t seen coming. This endless cycle of innovation and renewal is what makes the study of
language change so fascinating.
The homogenisation of English?
We can think of the dialect map of the UK as a jigsaw in which the pieces were once very small.
Individual districts, towns and villages had their own dialects. Over the last century or so, the
jigsaw pieces have grown larger, as dialects have become more focussed on the bigger urban
centres such as Newcastle or Manchester. These days it can be harder to tell where someone
is from on the basis of his or her speech than it was a couple of generations ago: the dialect
distinctions between Yorkshire and Lancashire, or between Merseyside and north Wales,
are becoming more blurred. This is usually put down to greater mobility, with people moving
sometimes quite large distances to other towns and cities to study or find work, or relocating
from the cities into the countryside in search of a better quality of life or more affordable housing.
But it isn’t the case that we’re all starting to sound alike. As we’ll see below, new varieties
are taking root in different parts of the country. It’s mainly the traditional rural dialects that are
becoming less distinct from one another.
We’re not all becoming more standard in our speech, either. Over the last 50 years we have
also seen Standard English and Received Pronunciation (‘Queen’s English’) lose some of their
status. Where once it was more or less obligatory to speak these for anyone wishing to enter
the professions, the clergy, the upper ranks of the military, acting, or broadcasting, these days,
non-standard accents and dialects are much more widely accepted. We’ve come to realise
that speaking in such-and-such a way isn’t necessarily a sure sign of someone’s intelligence
or competence. This improves opportunities for people from a wider variety of social and
educational backgrounds. It’s sometimes forgotten that even the standard forms of English are
always changing. Today we laugh at the way announcers spoke in TV news programmes from
the 1960s because it seems so stiff and old-fashioned. It would sound odd if someone born in
1966 – say, David Cameron – were to speak like someone of his grandfather’s generation. We
don’t expect young members of the Royal Family to speak in the same way as old ones do. The
Queen’s English spoken by Prince George as he grows up is not going to be the same as the
Queen’s English spoken by the Queen.
Looking more globally, Chinese and Spanish seem set to become yet more influential worldwide,
leading to large numbers of words and phrases from these languages coming into mainstream
use in English. Other major languages, such as Japanese, Portuguese, Arabic or Russian, may
boost English vocabulary by donating names for new concepts.
‘Informalisation’ of English: talking to machines
and listening to Americans
As we’ve seen, high technology is a very rich source of new words in English. In turn, English
provides other languages with new terms they need in this area. Young people everywhere now
use the English words app, troll, or hashtag rather than the equivalents in their own languages.
English is the language of the latest trends in social media, and computer users know that being
in command of the latest terms will allow them to participate in a globally connected world.
Though the science that underlies systems such as Twitter and Facebook is advanced and
hugely complex, the innovators and designers behind these brands want to keep the image of
social media as relaxed and informal as possible. The terms that are used for common functions
and ways users can interact (like, friend, follow, retweet, block) are therefore short, simple
and memorable ones. The fact that so many innovations in computing come from California is
undoubtedly linked to this relaxed and unpretentious approach.
A preference for informal, chatty and jokey language in the technological and scientific domains
is a recent phenomenon, but it’s one which makes these areas seem more accessible and less
po-faced, and we are likely to see more and more of it. After all, there’s really no good reason
we shouldn’t name features on the surface of Pluto and its moon Charon after characters from
Star Wars, Star Trek or The Lord of the Rings, or call underground bacteria snottites because
they look like nasal mucus dangling from cave roofs, or name an Antarctic research vessel Boaty
McBoatface, just for the fun of it. A glance at the online Urban Dictionary testifies to the endless
creativity and humour of English speakers. Freeing ordinary language users up to invent and
share new words and phrases like this is a mark of how much more democratic and liberated our
linguistic lives have become.
With all of these factors in mind, we turn now to ask what the English of 2066 might sound like
in different cities around the country.
Sounds of the city
London
It’s often said that traditional working-class London speech
– Cockney – has more or less died out. We can now hear
a hybrid accent known as ‘Estuary English’ (EE), which
combines older London features with more standard-like
speech forms. EE is recognisably south-eastern, but it can
be very hard to locate a speaker within that region. It also
seems to blur the class divide, leading to accusations that
some middle-class speakers – politicians such as Nigel
Farage and celebrities like Jamie Oliver – ‘dumb down’ their
speech so as to conceal a privileged upbringing or to sound
more like they are ‘one of the people’. EE has similarities to
another newcomer on the UK dialect scene, ‘Multicultural
London English’ (MLE). MLE incorporates pronunciations
from Englishes spoken by people from ethnic minority
groups, particularly from the Caribbean, West African
and Asian communities. Given this mix, and the status
of London as the linguistically most influential city in the
English-speaking world, we can expect to see significant
changes between now and the middle of the century.
For example, there are signs that /h/ is being restored. Generations of Londoners have dropped
/h/ from the beginnings of words like hat, Highgate, Harrods, Hampstead Heath, or Henry
Higgins. Another feature of London speech is the treatment of the two ‘th’ dental consonants,
as in words like thin and this. We see either ‘TH-stopping’ (dis and dat) or ‘TH-fronting’ (fink for
‘think’, muvver for ‘mother’). In future we’re likely to see the standard ‘th’ sounds being lost
altogether. Fin and thin will no longer be distinguished even in careful speech, and bother will
always rhyme with hover. This may come as a relief to foreign learners of English, who struggle
with the dentals more than any other pair of sounds.
Saying dook for ‘duke’ or nooze for ‘news’ is already pretty firmly established in London, but
this habit, known as ‘yod-dropping’, may continue so that even words like cute or beauty are
affected, as they are in East Anglia, where they’re pronounced the same as coot and booty.
Simplifying clusters of consonants like this is one way English has changed over its history. We
don’t say the /k/ at the beginning of ‘knee’ or ‘knight’ any more, or the /w/ that used to occur at
the beginning of ‘wrong’ (these letters are now silent, but we haven’t ever bothered to change
the spelling). We’ve lost some other great consonant clusters since the earliest days of English:
the word for ‘to sneeze’ in Old English, for example, had a very sneezy-sounding /fn/ sequence at
the beginning.
/w/ and /r/ are already very similar for many southern English talkers (e.g. Roy Hodgson, Chris
Packham, Jonathan Ross), so the two may collapse together completely, so that wed and red are
no longer distinct. We may also see consonant+/r/ clusters smushing together into sounds more
like ‘ch’ and ‘j’, so trees and cheese, or dress and Jess, sound more alike.
At the ends of words, /r/ was dropped centuries ago, and /l/ is likely to follow suit by turning
into a vowel. So words like Paul, paw and pool could be indistinguishable, as they already are
in Cockney. Lastly, the glottal stop pronunciation of /t/ – a brief catch in the throat rather than a
sound which involves the tongue tip closing against the roof of the mouth – will be the default
pronunciation. People in 2066 will be mystified as to why Tony Blair, Ed Miliband and George
Osborne were slammed so mercilessly by the press for having been caught saying voters
without using a ‘proper’ /t/ in the middle.
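As a rough illustration of how mergers of this kind create new homophones and rhymes, the predicted changes can be modelled as simple find-and-replace rules over toy phonemic spellings. The transcription symbols, the word list and the function names below are all assumptions made for this sketch; they are not taken from the report, and real sound change is gradual and context-sensitive in ways that plain string replacement ignores.

```python
# Toy model of some predicted London sound changes.
# Everything here (symbols, words, function names) is a hypothetical
# illustration, not a transcription system used by the report's authors.

# Rough phonemic spellings: "T" = 'th' in thin, "D" = 'th' in bother,
# "j" = the 'y' glide in cute, "@" = the unstressed final vowel.
LEXICON = {
    "thin": "TIn",
    "fin": "fIn",
    "bother": "bQD@",
    "hover": "hQv@",
    "wed": "wEd",
    "red": "rEd",
    "cute": "kjut",
    "coot": "kut",
}

# Each rule rewrites one sound (or sequence) as another.
RULES = [
    ("T", "f"),   # TH-fronting, voiceless: thin -> fin
    ("D", "v"),   # TH-fronting, voiced: bother -> bovver
    ("r", "w"),   # /r/ collapsing into /w/: red -> wed
    ("ju", "u"),  # yod-dropping: cute -> coot
]


def apply_rules(form, rules=RULES):
    """Apply every rewrite rule, in order, to a toy phonemic form."""
    for old, new in rules:
        form = form.replace(old, new)
    return form


def homophones_after_change(word_a, word_b):
    """True if the two words sound identical once the rules have applied."""
    return apply_rules(LEXICON[word_a]) == apply_rules(LEXICON[word_b])


def rhyme_after_change(word_a, word_b):
    """Crude rhyme test: compare everything after the first sound."""
    return apply_rules(LEXICON[word_a])[1:] == apply_rules(LEXICON[word_b])[1:]


if __name__ == "__main__":
    for pair in [("thin", "fin"), ("wed", "red"), ("cute", "coot")]:
        print(pair, "merge:", homophones_after_change(*pair))
    print("bother/hover rhyme:", rhyme_after_change("bother", "hover"))
```

In this sketch each pair starts out distinct in the word list and only falls together after the rules run, which is the sense in which a merger removes a contrast from the accent.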
Liverpool
The Liverpool accent is highly distinctive but it’s not an especially old one. It mixes local
Lancashire features with ones imported from Ireland during the 19th century. The influence of
Liverpool speech is wide: there are towns on the coast of north Wales in which people speak
with accents which are strongly coloured by Scouse. All the same, Liverpool speech will probably
start to fall into line more closely with the accents of other major northern cities. The ‘tapped’ /r/
sound in words like green and brown, or four and five, is likely to go the way of this consonant in
Scottish or Yorkshire English.
One of the very distinctive things about Scouse is the way that /k/ and the other ‘stopped’
consonants /p/ and /t/ are produced. At the end of back you’ll hear a ‘ch’ sound like the one in
Scottish loch or German Bach. A lot of people say they dislike this habit, but it’s actually a very
natural sound change, and quite common across other languages. It’s quite possible that we’ll
see more of this softening of the stop consonants not just in Liverpool but in other accents
around the country.
Liverpool, like all the other northern cities, has an accent in which pairs of words like put and
putt are pronounced alike. A great number of the changes we see in current English involve a
levelling out of local differences, however, and it’s possible that by 2066 the northern accents
will have come into line with the global norm for these vowels. At present there are many
northerners who would wince at the thought of saying cup or bus anything like southerners or
Americans do, so as a compromise they may start to use some intermediate ‘fudged’ vowel in
these and other putt-class words instead. The very suggestion that the north and the south could
converge linguistically always meets with heated argument, but it’s not so outlandish an idea – in
fact, the process has already been happening for many centuries.
Glasgow
In Glasgow, and lowland Scotland generally, English sits at one end of a language spectrum. At
the far end is the Scots dialect, which is so different from most sorts of English that some call
Scots a full-blown language in its own right. It seems clear, though, that the urban Scots spoken
in Glasgow is on the wane. Surveys of Scottish schoolchildren show that they aren’t familiar with
many of the Scots words and phrases that their parents and grandparents would use (bampot,
clarty, glaikit, stooshie, and thousands of others). Some of the dialect words will remain, though
it’s impossible to say which will survive. Pronunciations like gless ‘glass’, hame ‘home’, bane
‘bone’, or fit ‘foot’ may soon come to seem too old-fashioned for young people to use.
Dropping of /r/ after a vowel is already well underway among working-class Glaswegians,
meaning that pairs of words like hut and hurt can now be hard to tell apart. As in London, word-
final /l/ is also disappearing (so Paul and paw are more alike), and the ‘th’ consonants are turning
into /f/ and /v/.
On the other hand, if a second independence referendum were to go in favour of Scotland’s
separation from the UK, the picture could be very different in the Glasgow of 2066. Because
language and identity are so closely tied together, it might be that the Scots language lobby
would step their efforts up a few gears, as a way of highlighting the separateness of Scotland’s
culture and heritage. Making the language of the new state seem as distinctive as possible is
exactly what the Norwegians did when they split from Denmark two hundred or so years ago.
One of the big unknowns when trying to map out how languages will develop in the future is
the effect of political upheavals. The history of English is full of these: think of the arrival of the
Vikings, or the Norman Conquest.
Newcastle
British people tend to nominate one of two accents when they’re asked which is the hardest
to understand. Glaswegian is one, and Geordie is the other. There are some in the north-east of
England who claim that Geordie and the dialect of Northumbria are the closest forms of English
to Anglo-Saxon. Though this is an exaggeration, there are features of Geordie which hark back to
when Middle English was spoken (hoose for ‘house’, neet for ‘night’, and so on).
These are becoming scarcer, though. The general pattern is for Geordie to sound more like other
northern dialects. The characteristic pronunciations of ‘face’ and ‘coat’ (‘fee-uss’, ‘coo-ut’) are
much less common than they were two or three generations back. These days, more generic
northern-sounding vowels are preferred. Over the next 50 years we predict that they will sound
close to what is found in southern England. The characteristic ‘hiccuping’ Geordie pronunciation
of /p/, /t/ and /k/ in words like caper, waiter, and baker may go the same way.
Geordies used to pronounce the vowel in words like ‘nurse’ as an ‘aw’ sound, so that shirt
sounded the same as short. Words like ‘talk’ were pronounced ‘taak’. These differences are
the basis of the story in which a Geordie with an injured leg goes to see the doctor. The doctor
bandages the Geordie’s leg and says, “Now then, do you think you can walk?” The Geordie
replies, in disbelief, “Walk? Ah can hordly waak!” (= “Work? I can hardly walk!”). These
pronunciations can still be heard when you’re oot and aboot in the Toon, but they now have an
old-fashioned flavour. ‘Walk’ now tends to rhyme with ‘fork’, and ‘work’ with ‘jerk’. However,
there’s a change going on in which the ‘jerk’ vowel is moving forward in the mouth. It seems to
be linked to the habit of pronouncing the ‘coat’ vowel as something like ‘er’. So we find jokey
spellings like ‘turtle’ for ‘total’, ‘terst’ for ‘toast’, ‘jerk’ for ‘joke’, ‘serp on a rerp’, and ‘The Perp’
(that’s the head of the Catholic church).
Manchester
Some of the same changes that we’ll see in Newcastle are also liable to take place in
Manchester. ‘Turtle’ for ‘total’ has spread westward through urban Yorkshire and already seems
to have crossed the Pennines into Manchester. The iconic vowel pronunciation at the end of
Manchester (something like ‘Manchest-or’) seems fairly new, but whether it will last is an open
question. Not all sound changes stick. Another feature of Manchester and other parts of the
north-west (though not Liverpool) is the vowel at the ends of words like happy and city. At the
moment, in Manchester it’s more ‘eh’-like than ‘ee’-like. The vowel in many British accents is
now firmly an ‘ee’ sound – happ-ee, rather than happ-ih. Mancunians may in time start to use the
‘happ-ee’ option, making them sound more like Scousers in this respect.
As mentioned earlier, the Liverpudlian habit of producing /k/ as the Scottish-like ‘ch’ is a very
natural thing to do, phonetically speaking. So is saying /t/ as an ‘s’-like sound, so that ‘mat’ and
‘mass’ sound very alike. It’s conceivable that Mancunians could start producing these sounds
the same way. This convergence might seem improbable, what with Mancs claiming to despise
Scousers and vice versa, but in reality the rivalry between the two cities isn’t necessarily a barrier
to their dialects becoming more similar. There are pairs of cities around the country in which
people say they loathe one another (e.g. Derby and Nottingham), but the dialects spoken in them
may become so alike that they’re hard to tell apart.
Birmingham
By virtue of being the closest to London of the cities listed above, Birmingham is likely to adopt
the new trends in London speech before the others do. Examples might include the following.
If we are right about the restoration of /h/ in London, we might expect this to trickle down to
Birmingham, so that by 2066 it’s being used in Brum with at least some consistency. Glottal stop
for /t/ will be the default pronunciation (except at the beginnings of words; tea will still need a
/t/, but won’t won’t!). TH-fronting (fing for ‘thing’, bovver for ‘bother’) has a firm foothold in the
Midlands already, and a /w/-like pronunciation of /r/ is also common. These forms will increase
in frequency, and the other features listed for London may also come to dominate Brummie.
We could see the phasing out of localised features like the ‘velar nasal plus’, where an audible
/ɡ/ is produced at the end of sing and wrong, and where singer (‘sing-guh’) and finger rhyme.
This habit is common in the West Midlands and in north-western cities including Manchester
and Liverpool. People in these areas often say that they think they’re using the correct, standard
way of saying ‘ng’ at the ends of words and syllables. In fact, it isn’t the way Standard English
speakers pronounce these words. Brummies are probably being influenced by the spelling here,
and so believe that the ‘proper’ pronunciation involves a sequence of two sounds at the end of
sing instead of just one.
As with the northern varieties described above, we may see a split between the words of the
put and putt sets, bringing the vowel system more closely into alignment with southern accents.
Conclusions
Over the course of the next fifty years, our lives will be transformed by technology at least as
much as they were over the past fifty years.
We may see the rate of change accelerate, with each decade bringing an ever wider range of
technologies to make our social and working lives easier, safer, and more efficient. The impact
of these developments on society will result in new ways of using language. We will need to
coin new terms for new inventions and concepts at a rapid pace, of course, but we will also
interact with one another, and with the machines that will surround us in all areas of our lives, in
ways that may at first feel unfamiliar. The era of voice-activated computer systems, which are
faster, smarter and more secure than ever before, is already upon us. These will not force us into
particular ways of speaking, because they are designed to be responsive to our vocal patterns.
They are not judgemental about how we speak and make no distinctions between accents
or dialects: to them, all languages and their subvarieties are equal, and there is no ‘correct’
or ‘incorrect’ way of speaking. We can talk to them however we please. In short, the latest
generation of secure voice biometrics systems will let you be you.
Acknowledgements
We would like to thank the following people for their input: Maciej Baranowski, David Britain,
Georgina Brown, Urszula Clark, John Coleman, Karen Corrigan, Volker Dellwo, Holly Dunnett,
Shivonne Gates, Philip Harrison, James Hoyle, Paul Kerswill, Adrian Leemann, Kirsty Malcolm,
Alan Reading, Richard Rhodes, Devyani Sharma, Jane Stuart-Smith, Kim Witten, and Jessica
Dominic Watt,
Author of the report
Senior Lecturer
Department of Language and Linguistic Science
Dominic Watt was appointed Lecturer in Forensic Speech
Science in 2007, and teaches mainly on its new MSc
programme in that subject.
Watt has an MA (Hons) from Edinburgh and a PhD from
Newcastle, and has held teaching and research positions in
phonetics, speech acoustics and audiology, phonology and
sociolinguistics at universities in Germany and around the
UK, including York (2000-2002) and Aberdeen, where he was
Director of the Phonetics Laboratory for five years.
Brendan Gunn
Co-author of the report
Brendan Gunn holds an MA and a PhD in linguistics. He
began working as a Dialogue and Dialect Coach in 1986
after leaving the University of Ulster where he was a
Lecturer in Linguistics.
Robert De Niro, Brad Pitt, Edward Norton, Aidan Quinn,
Cate Blanchett, Jim Sturgess, Heather Graham, Rupert
Grint, Julia Roberts, Richard Gere, Natalie Portman, Daniel
Day-Lewis, Penelope Cruz, Saoirse Ronan, Colin Farrell and
Stephen Rea are just some of the actors who have worked
with the world-renowned dialect and dialogue coach over the
last 25 years.