All content in this area was uploaded by Aleksandra Dobrego on Jul 28, 2020
UK Cognitive Linguistics conference,
27-29.07.2020
CONSENSUS
IN AN INTUITIVE CHUNKING TASK
Dobrego A., Konina A.,
Vetchinnikova S., Williams N.,
Mauranen A.
Speech segmentation
Sounds/syllables
(Cutler & Norris 1988)
Words/sentences
(Christiansen & Chater 2016)
Spontaneous speech?
Speech is processed in chunks
•Do you segment speech into smaller parts while listening?
Probably yes, due to working memory limitations (~4 units*)
*Cowan 2001
•How can we measure segmentation choices?
•Do we agree on these segments?
Our study
Aim: to investigate speech segmentation
into large high-level units
45 speakers of English
•32 females, aged 20-39
•no prior knowledge of linguistics
66 short audio extracts
•ELFA corpus*
•20-40 seconds each
Method: intuitive chunking task
*Mauranen 2008
Method:
ChunkitApp
Listen and see the text
Mark boundaries by tapping the screen
Background/feedback questionnaire
Yes/no comprehension questions
I’d ~like ~to ~point ~out ~that ~er ~er ~the ~profiles ~user
~profiles ~and ~the ~reasons ~for ~communicating ~are ~a
~little ~bit ~different ~er ~in ~the ~Honiara ~internet ~café
~and ~the ~nowadays ~internet ~centre ~than ~what ~they
~are ~in ~the ~rural ~e-mail ~stations ~but ~the ~main
~thing ~is ~to ~communicate ~with ~family ~members
~relatives ~mhm ~mhm ~the ~er ~family ~and ~kinship ~it’s
~still ~central ~thing ~in ~the ~Solomon ~Islands ~societies
~in ~all ~of ~them ~all ~these ~communities ~are ~very
~social ~and ~very ~communal ~there ~isn’t ~much ~privacy
~there ~isn’t ~
The output
•Between every two words:
•If marked -> 1 -> boundary
•If unmarked -> 0 -> non-boundary
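The coding scheme above can be sketched in a few lines of Python. This is only an illustration: the function name `encode_boundaries`, the slot numbering (slot i sits between word i and word i+1, counting from 0) and the tapped positions are assumptions made for the example, not ChunkitApp's actual output format.

```python
def encode_boundaries(words, marked_slots):
    """Return one 0/1 value per slot between consecutive words.

    Slot i lies between words[i] and words[i + 1].
    1 = the listener marked a boundary there, 0 = no boundary.
    """
    n_slots = len(words) - 1
    return [1 if i in marked_slots else 0 for i in range(n_slots)]


# A fragment of the extract shown on the slide.
words = "the main thing is to communicate with family members".split()

# Hypothetical listener: taps after "thing" (slot 2) and after "communicate" (slot 5).
vector = encode_boundaries(words, {2, 5})
print(vector)
```

Stacking one such vector per listener gives a slots-by-raters table of 0s and 1s, which is the kind of input that agreement measures operate on.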
Analysis 1: observed agreement
Number of raters in agreement
as a proportion of total possible
number of raters
0.904 => strong*
Perhaps it’s random?
*Landis and Koch 1977
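One possible reading of this definition, sketched in Python: for each slot, take the share of raters who gave the majority response (boundary or non-boundary), then average over slots. The slide does not spell out the exact computation, so this interpretation, the function name `observed_agreement` and the toy data are all assumptions.

```python
def observed_agreement(ratings):
    """Mean share of raters giving the majority response per slot.

    ratings: list of slots, each a list of 0/1 responses (one per rater).
    """
    per_slot = []
    for slot in ratings:
        n_boundary = sum(slot)
        # Size of the larger group: those who marked vs. those who did not.
        n_majority = max(n_boundary, len(slot) - n_boundary)
        per_slot.append(n_majority / len(slot))
    return sum(per_slot) / len(per_slot)


# Toy data (invented): 4 slots rated by 5 listeners.
toy = [
    [0, 0, 0, 0, 0],  # everyone agrees: no boundary
    [1, 1, 1, 1, 0],  # 4 of 5 mark a boundary
    [0, 0, 0, 1, 0],  # 4 of 5 leave it unmarked
    [1, 1, 0, 0, 1],  # 3 of 5 mark a boundary
]
print(round(observed_agreement(toy), 3))
```

Because most slots in running speech are left unmarked by most listeners, a measure like this can be high for unmarked slots alone, which is why the slide goes on to ask whether the agreement could be random.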
Analysis 2: chance agreement
The difference between the observed and
chance agreement divided by the agreement
attainable above chance
Fleiss’ Kappa (κ)
κ = 0.45 => moderate*
95% CI [0.451, 0.453]
*Landis and Koch 1977
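The verbal definition of Fleiss' kappa above corresponds to κ = (P_obs − P_chance) / (1 − P_chance). A minimal pure-Python sketch for the two-category case (boundary / non-boundary) follows; it assumes every slot is rated by the same number of listeners, and the toy data are invented. It is not the authors' analysis script.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for binary ratings.

    ratings: list of slots, each a list of 0/1 responses,
    with the same number of raters for every slot.
    """
    n_slots = len(ratings)
    n_raters = len(ratings[0])
    # Per slot: (number of 0 responses, number of 1 responses).
    counts = [(len(slot) - sum(slot), sum(slot)) for slot in ratings]

    # Observed agreement: share of agreeing rater pairs, averaged over slots.
    p_slots = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_obs = sum(p_slots) / n_slots

    # Chance agreement from the overall proportion of each category.
    total = n_slots * n_raters
    p_cat = [sum(row[j] for row in counts) / total for j in range(2)]
    p_chance = sum(p * p for p in p_cat)

    # Undefined (division by zero) if all responses fall into one category.
    return (p_obs - p_chance) / (1 - p_chance)


# Toy data (invented): 4 slots rated by 5 listeners.
toy = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 0, 1],
]
print(round(fleiss_kappa(toy), 3))
```

Unlike raw agreement, this statistic discounts the agreement expected from the overall rate of boundary marks, so a data set dominated by unmarked slots no longer looks strongly consistent by default.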
However…
Skewed distribution?*
Solution: compare to null distribution
* Feinstein & Cicchetti 1990
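The slide does not say how the null distribution was built. One common option, sketched here purely as an assumption, is a permutation test: shuffle each listener's marks across slots (keeping that listener's number of boundaries), recompute κ many times, and see where the observed κ falls. The names `null_kappas` and `fleiss_kappa`, the number of permutations and the toy data are all illustrative.

```python
import random


def fleiss_kappa(ratings):
    """Fleiss' kappa for binary ratings (slots x raters, equal raters per slot)."""
    n_slots, n_raters = len(ratings), len(ratings[0])
    counts = [(len(s) - sum(s), sum(s)) for s in ratings]
    p_obs = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_slots
    total = n_slots * n_raters
    p_chance = sum((sum(row[j] for row in counts) / total) ** 2 for j in range(2))
    return (p_obs - p_chance) / (1 - p_chance)


def null_kappas(ratings, n_permutations=200, seed=0):
    """Kappa values obtained after shuffling each rater's marks across slots."""
    rng = random.Random(seed)
    n_slots, n_raters = len(ratings), len(ratings[0])
    # One list of responses per rater.
    by_rater = [[slot[r] for slot in ratings] for r in range(n_raters)]
    kappas = []
    for _ in range(n_permutations):
        # rng.sample with the full length returns a shuffled copy.
        shuffled = [rng.sample(col, len(col)) for col in by_rater]
        permuted = [[col[i] for col in shuffled] for i in range(n_slots)]
        kappas.append(fleiss_kappa(permuted))
    return kappas


# Toy data (invented): 12 slots, 6 listeners, marks clustered on slots 3 and 8.
toy = [[0] * 6 for _ in range(12)]
toy[3] = [1, 1, 1, 1, 1, 0]
toy[8] = [1, 1, 1, 1, 0, 1]
toy[5] = [0, 0, 1, 0, 0, 0]

observed = fleiss_kappa(toy)
null = null_kappas(toy)
# Share of permutations reaching the observed kappa (a permutation p-value).
p_value = sum(k >= observed for k in null) / len(null)
print(round(observed, 3), p_value)
```

Shuffling within each listener keeps the overall skew toward non-boundaries intact, so the comparison isolates whether listeners place their marks at the same positions.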
Other languages?
Konina et al forthc.
Finnish κ = 0.43
Swedish κ = 0.41
Russian κ = 0.40
Therefore, our method captures consensus
in different languages
Other conditions?
INSTR κ = 0.55
NORM κ = 0.54
N OF SPEAKERS: κ = 0.53
Therefore, our method captures consensus
in different conditions
Dobrego et al forthc.
Konina et al forthc.
Other examples of κ?
Lab-based
American English, κ = 0.51
Online
Indian English, κ = 0.23
American English, κ = 0.43
Aim: to look at prosodic boundaries
in different cohorts of annotators*
*Cole et al 2017
Other methods?
Ventsov & Kasevich, 1994
Paper and pencil
working on a tablet is much faster
-> a better window into online processing
Other methods?
Cole, Mahrt, & Roy 2017
Language Markup and Experimental Design Software
aims to elicit naive prosodic analysis
-> instead, our method captures a natural process
Therefore,
We can operationalize speech segmentation
through agreement on segmentation choices
made by listeners
We can use the intuitive chunking task for this
purpose
The values for each language are quite close to
each other, but still vary across the sample –
perhaps language-specific differences?
Further directions…
•Explore intuitive chunking on L1 / L2 speakers
•Explore other ways to calculate agreement
•Explore individual differences in several languages
Thank you!