This article was downloaded by: [Glasgow Caledonian Univ.]
On: 8 February 2011
Access details: [subscription number 773513259]
Publisher: Psychology Press
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
Aphasiology
Publication details, including instructions for authors and subscription information:
http://www.informaworld.com/smpp/title~content=t713393920
Transcription-less analysis of aphasic discourse: A clinician's dream or a
possibility?
Linda Armstrong (a); Marian Brady (b); Catherine Mackenzie (c); John Norrie (d)
(a) Perth Royal Infirmary, UK
(b) Nursing, Midwifery and Allied Health Professions Research Unit, Glasgow Caledonian University, Glasgow, UK
(c) University of Strathclyde, Glasgow, UK
(d) University of Aberdeen, UK
To cite this Article: Armstrong, Linda, Brady, Marian, Mackenzie, Catherine and Norrie, John (2007) 'Transcription-less
analysis of aphasic discourse: A clinician's dream or a possibility?', Aphasiology, 21: 3, 355–374
To link to this Article: DOI: 10.1080/02687030600911310
URL: http://dx.doi.org/10.1080/02687030600911310
Full terms and conditions of use: http://www.informaworld.com/terms-and-conditions-of-access.pdf
This article may be used for research, teaching and private study purposes. Any substantial or
systematic reproduction, re-distribution, re-selling, loan or sub-licensing, systematic supply or
distribution in any form to anyone is expressly forbidden.
The publisher does not give any warranty express or implied or make any representation that the contents
will be complete or accurate or up to date. The accuracy of any instructions, formulae and drug doses
should be independently verified with primary sources. The publisher shall not be liable for any loss,
actions, claims, proceedings, demand or costs or damages whatsoever or howsoever caused arising directly
or indirectly in connection with or arising out of the use of this material.
© 2007 Psychology Press, an imprint of the Taylor & Francis Group, an informa business
http://www.psypress.com/aphasiology DOI: 10.1080/02687030600911310
Transcription-less analysis of aphasic discourse: A clinician’s
dream or a possibility?
Linda Armstrong
Perth Royal Infirmary, UK
Marian Brady
Nursing, Midwifery and Allied Health Professions Research Unit,
Glasgow Caledonian University, Glasgow, UK
Catherine Mackenzie
University of Strathclyde, Glasgow, UK
John Norrie
University of Aberdeen, UK
Background: Discourse analysis as a clinical tool in speech and language therapy
remains underused, at least partly because of the time-consuming nature of the process
of transcription that currently precedes it. If transcription-less discourse analysis were
valid and reliable, then there would be the clinical opportunity to use this method in
order to describe a person’s communication impairment (for example aphasia), to help
plan therapy and to measure outcomes.
Aims: This study aimed to address the potential of transcription-less discourse analysis
as a valid and reliable procedure for the measurement of gesture use, topic use, turn
taking, repair, conversational initiation, topic initiation, and concept use.
Methods & Procedures: Ten individuals with aphasia were audio- and video-recorded
participating in a number of discourse tasks from three different discourse genres
(conversation, procedural, and picture description). With the same analytical
frameworks, the resulting data were compared using transcription-based discourse analysis
and a transcription-less method in which the analysis was made directly from the
recordings.
Outcomes & Results: Validity was measured by comparing transcription-based and
transcription-less analyses. Overall the results from that comparison demonstrated the
potential of the latter method: none of the measures gave significant differences
between scores from the two methods. The main (non-significant) disparities related to
some aspects of gesture use and repair. The inter-rater reliability of the transcription-less
method was also acceptable in general. Reliability was measured by the intraclass
correlation coefficient (ICC) for the continuous measurements: it was strongest for the
gesture totals and varied among the attributes of turn taking and repair. For the
categorical measures (topic and conversation initiation and concept analysis) the
percentage agreement was very good.
Conclusions: These results indicate the potential availability of a valid and reliable
transcription-less approach to analysis that speech and language therapists can apply to
analyse their clients’ discourse.
Address correspondence to: Dr Marian Brady, Nursing, Midwifery and Allied Health Professions
Research Unit, Faculty of Health Building, Glasgow Caledonian University, Cowcaddens Road, Glasgow
G4 0BA, UK. E-mail: m.brady@gcal.ac.uk
Thanks are extended to the following organisations and people for their various valuable contributions
to this study: Department of Health, New and Emerging Technology Programme for funding; the
participants with aphasia; the transcription-less raters (Claire Higgins, Dorothy Russell, Kirsty
McLaughlan, Lesley Garret, and Sharon Nelson); Caitriona Hutton for additional transcription; and
the Speakability group in Forth Valley for their help in producing an aphasia-friendly information sheet
and consent form.
APHASIOLOGY, 2007, 21 (3/4), 355–374
Downloaded By: [Glasgow Caledonian Univ.] At: 11:06 8 February 2011
Discourse can be defined as ‘‘continuous stretches of language or a series of
connected sentences or related linguistic units that convey a message’’ (Cherney,
1998, p. 2). Furthermore, ‘‘discourse IS functional communication’’ (Cherney, 1998,
p. 5). Discourse analysis (DA), as a tool in speech and language therapy (SLT)
research and clinical intervention, is a relatively recent development when compared
to other forms of analysis, e.g., articulatory and syntactic. It developed from work in
sociolinguistics, e.g., Labov’s study of speech styles in New York in the 1960s and
work by Gumperz in the 1980s on interactional sociolinguistics. This branch of
linguistics has lagged behind others in finding a clinical application for several
reasons, including the need for portable and good-quality recording equipment, the
‘‘long-standing tendency of linguistic theory to idealize away from the data of
everyday speech’’ (Lesser & Milroy, 1993, p. 49) and the tendency of linguistic theory
and clinical practice to rely on abstracted, decontextualised samples of speech at
single-word or single-sentence levels. According to Lesser and Milroy (1993), it has
also suffered from a reliance on and trust in the quantitative paradigm. Cherney
(1998, p. 5) lists some of the uses that clinicians can make of DA as a tool that is
sensitive to change over time: objective and measurable impairment description; help
in underlying cognitive or linguistic process identification; and assistance in therapy
planning. DA may be based on either monologue or conversation. The use of DA in
SLT remains largely limited to research and within academic settings, mainly as a
result of the time-consuming nature of transcription on which DA is currently based:
‘‘the time required to transcribe and analyse lengthy conversations puts
conversational discourse analysis … out of reach for most practising clinicians’’ (Boles &
Lombard, 1998, p. 547); ‘‘it is often not the assessment of choice due to its apparent
time-consuming nature and the overwhelming number of options available’’
(Togher, 2001, p. 131).
Based on the second author’s research experience of DA, one minute of discourse
can take up to an hour to transcribe, depending on the severity of an individual’s
communication impairment and the depth of the planned analysis. In contrast, Boles
(1998, p. 270) reports in his study that ‘‘transcription and analysis took … 6–10
minutes for every 1 minute of live interaction’’. His analysis used software (rather
than the more usually clinically available manual analysis) to calculate a limited
number of variables (repairs, total words, and utterances per minute of
conversation). Togher (2001) finishes her paper by describing the challenge ahead of making
discourse analysis resources ‘‘practical and time efficient, and, therefore, easily used
in the assessment and management of people with neurological language
impairments’’ (p. 146).
Previous investigations of discourse ‘‘have frequently been limited to examining
one aspect of discourse within one discourse genre as produced by a small group of
participants’’ (Sherratt, 2004, p. 1), thus limiting a more general appreciation of how
people with communication disorders manage to produce spoken discourse which
implicates many processes and levels of language processing (but see Sherratt, 2007
this issue). Shadden (1998) suggests that a discourse sample should include tasks that
‘‘sample a diverse range of behaviors’’ (p. 9) but ‘‘what constitutes a representative
sample is still controversial’’ (Armstrong, 2000, p. 876).
There is now evidence available on the reliability of transcription-based DA (e.g.,
Brady, Mackenzie, & Armstrong, 2003; Brady, Armstrong, & Mackenzie, 2005),
from which the work described below may be considered a natural development.
Our recent research on turn taking in participants with dysarthria
(Comrie, Mackenzie, & McCall, 2001) and on the use of gesture in people with
right hemisphere brain damage (Brady & Mackenzie, 2001) has also indicated the potential
of a transcription-less method of DA. In Comrie et al. (2001), the turn-taking analysis
undertaken directly from audio recordings was based on ‘‘slow careful listening to the
recordings’’, as this approach was deemed to be ‘‘more viable in standard clinical
settings’’ (p. 387). However, transcription was used when analysis was difficult (i.e.,
when there were overlapping turns). Both intra-rater and inter-rater reliability of this
transcription-less method were measured using subsets of the conversational samples.
For the former a respectable mean of 90% agreement (range = 75–96.7%) was achieved
over seven aspects of turn taking, while for the latter the result was slightly lower
(mean = 86.7%, range = 72.6–97.5%). The lowest agreement for both sets of reliability
data was for frequency of within-turn pauses. Brady and Mackenzie (2001) profiled
gesture use following right hemisphere damage directly from video recordings. They
report intra-rater reliability at between 88% and 99%.
A transcription-less approach to DA would make this method of analysis more
accessible to SLTs working in clinical practice with people with aphasia (or indeed
with other communication-disordered client groups). The utility of analysis of
disordered communication beyond single-word or sentence level is now well
recognised and promoted (e.g., Royal College of Speech and Language Therapists,
2005). Increased accessibility of DA within everyday clinical settings would in turn
drive more functionally relevant outcomes, i.e., better identification of deficits evident
in everyday inter-personal interactions as well as more appropriate and finely tuned
targeting of therapy interventions and of evaluation of the effectiveness of those
interventions. The decreased time required for transcription-less DA approaches
would also facilitate the inclusion of greater numbers of participants in SLT clinical
experimental investigations, thereby potentially increasing the statistical power to
detect smaller treatment effects, which might still be worthwhile clinically.
The aim of this study was to begin to address the question of whether
transcription-less discourse analysis is valid and reliable. Specifically the objective of
this study was to compare transcription-less and transcription-based analyses of the
same discourse samples, using the same measures, elicited from people with aphasia.
METHOD
Participants
A total of 12 people with post-stroke mild–moderate language expression difficulties
associated with aphasia were recruited to the study, via their previous or current
contact with the first author’s SLT service. Any individuals with significant hearing
or visual impairments were excluded, but there were no additional exclusion or
inclusion criteria. Two participants’ data (participants 4 and 11) were excluded from
the study before analysis, as their degree of language expression difficulty was
greater than anticipated on recruitment. As the focus of this study was the method of
analysis rather than the nature of discourse in people with aphasia, there was no
requirement for age- and education-matched non-aphasic control participants. The
age range was 42–81 years (mean 64.1 years) and time since stroke 4–43 months
(mean 15.4 months). All left school at the minimum leaving age, with the exception
of participant 12 who had some additional further education. All were pre-morbidly
right-handed. Further details are outlined in Table 1. Their aphasia descriptions
were derived from a subjective evaluation of assessment results and case-note records by
the first author.
Procedure
Transcription-based and transcription-less raters. The second and third authors
undertook the transcription-based analyses. Five final-year undergraduate SLT
students from the Speech and Language Therapy Department, University of
Strathclyde, were recruited to undertake the transcription-less analysis. They
undertook a 5-hour training programme, delivered by the second author, during
which the methods of transcription-less analysis were explained and practice given.
One student subsequently withdrew and did not complete the analysis process. This
student population seemed appropriate to include in the study as transcription-less
raters, since they are potential users of this method and have the same, fresh
knowledge of discourse analysis. After these raters had completed their work, they
were invited to provide written feedback on their experience of transcription-less
discourse analysis. This was analysed qualitatively using the method described by
Miles and Huberman (1994, p. 58): ‘‘Initial data are collected, written up, and
reviewed line by line … categories or labels are generated … labels are reviewed and
… a slightly more abstract category is attributed to several incidents or
observations.’’
Discourse samples. The discourse elicitation sessions were recorded in private in
either a hospital clinic room or the participant’s own home (see Table 1), whichever
was preferable to and practical for the participant. Each session was both audio- and
video-recorded using digital equipment. The first author was the constant discourse
partner. The running order of the data collection sessions and their content are
outlined in Table 2.
The data-collection sessions began and ended with some samples from free
conversation. ‘‘Intro’’ was started with an opportunity to ask questions about the
procedure. ‘‘Close’’, the one at the end of the recording session, started with how the
participants felt the session had gone, but then broadened naturally. Five other
specific samples were elicited from each of the 10 participants. These represented
three different genres of discourse: conversation, procedural, and picture
description (see Table 2). The
tasks used had a range of stimulus variables and cognitive complexity, and a variable
degree of personal relevance (see Table 2). Very brief interludes of conversation
separated each target sample. Therefore the samples included both spontaneous and
semi-spontaneous speech (Prins & Bastiaanse, 2004).
TABLE 1
Participant characteristics
No | Sex | Age | Onset | Lesion, based on CT report | Aphasia | Setting | Ed
1 | M | 55 | 10 | Left hemisphere large intra-cranial haemorrhage evacuated. Arterio-venous malformation later removed. | Anomia | H | 1
2 | F | 76 | 4 | Left parietal region infarct. Wedge-shaped area of reduced attenuation involving white and grey matter. Ventricular system symmetrical and normal in size for age. | Anomia | C | 1
3 | M | 68 | 9 | Ventricular system symmetrical and normal in size. No ventricular or mid-line displacement. Area of reduced attenuation to left parietal region extending to involve the cortex in keeping with an infarct in the left middle cerebral artery territory. No other focal cerebral abnormality. No evidence of intracranial haemorrhage. | Anomia | C | 1
5 | F | 45 | 10 | Some effacement in the left cerebral hemisphere with large area of reduced attenuation in the left temporo-parietal region in keeping with acute middle cerebral artery territory infarct. Ventricular system normal in size and position. No acute intra-cranial haemorrhage. | Mild fluent aphasia | H | 1
6 | F | 81 | 13 | Left middle cerebral artery territory infarct. Generalised cerebral atrophy. Perpendicular areas of reduced attenuation, particularly adjacent to frontal horns of lateral ventricles in keeping with changes of cerebrovascular disease. On left side there is area of reduced attenuation deep to the insular extending lateral to the left lateral ventricle in keeping with infarct in the left MCA territory. The left middle cerebral artery shows slightly increased attenuation, which would be in keeping with thrombosis of the left middle cerebral artery. | - | H | 1
7 | M | 59 | 19 | Infarct in the left parietal lobe posterior to the central sulcus. | Anomia | C | 1
8 | M | 42 | 43 | Left fronto-parietal infarct. Left carotid artery dissection with left middle cerebral artery infarction. | | H | 1
9 | M | 74 | 35 | Extensive calcification in vertebral arteries and ICAs. Multiple areas of rather ill-defined low attenuation in deep white matter adjacent to left lateral ventricle and also lateral to head of caudate nucleus on that side. Appearances in keeping with recent infarction. No evidence of intracranial haemorrhage. Ventricles normal and symmetrical. | Anomia | C | 1
10 | F | 63 | 4 | Infarct of grey and white matter of left frontal lobe. | Anomia [and dyspraxia] | C | 1
12 | F | 78 | 7 | Generalised atrophy (age related). Large middle cerebral artery infarct. Peripheral branch middle cerebral artery hyperdense. ?Acute thrombosis. | Moderate fluent aphasia | H | 2
No = study number; Sex: M = male, F = female. Age = age in years at time of assessment.
Onset = time since onset in months. Aphasia = general classification of participants’ aphasia.
Setting = setting for assessment; H = home; C = clinical.
Ed = education, where 1 = left school at minimum school leaving age; 2 = additional education; 3 = third level education leading to a degree.
The number of discourse tasks used to create the discourse samples in this study
was carefully considered so as to provide samples of some length (the elicitation
sessions lasted at least 10 minutes and in some cases much longer). Whether or not
this meets the criterion of representativeness, the range of discourse tasks and the
time spent during the recording sessions will have provided participants with the
opportunity to use a wide variety of discourse behaviours.
Discourse features. In this study several aspects of discourse produced by the 10
participants were examined in the three genres of conversation, procedure, and
picture description. The features of discourse measured were gesture use, topic use,
turn taking, repair, conversational initiation, topic initiation, and concept use (see
Table 3 for a summary of measures, their component subcategories and the relevant
tasks). These mainly reflect those that have been used in our previous research
(primarily with neurologically normal people and people with right hemisphere brain
damage) and found to be reliable (e.g., Brady et al., 2003; Mackenzie, Brady, Norrie
& Poedjianto, 2007 this issue). They also represent different levels of processing and
are some of the discourse features that SLTs would commonly be interested in
measuring. Repair was included because of its clinical interest, but this aspect has
not been examined in our previous research. Further details on the discourse features
and the procedure for their documentation by the raters are available from the
second author.
Transcription-based and transcription-less analysis. The samples of discourse from
the ‘‘family’’, ‘‘day’’, ‘‘sandwich’’, ‘‘light’’, and ‘‘cookie’’ tasks and from ‘‘intro’’ and
‘‘close’’ were transcribed by the second author using a system based on the
Jeffersonian transcription system (Psathas & Anderson, 1990). The first author
checked the accuracy of the transcripts. This check revealed few discrepancies:
approximately 1% of total words. In almost all cases disagreement was concerned
with minor adjustments within or between words. The transcripts were analysed by
the second author using a transcription-based approach with analysis forms mostly
derived from those used in previous research (the repair analysis form was devised
for this study). Each discourse feature had its own form. Inter-rater reliability of this
approach was measured, involving 30% of the transcripts and ratings by the second
and third authors, as was intra-rater reliability (of the second author’s ratings).
TABLE 2
Discourse tasks
Discourse task | Task instruction | Abbreviation
Introductory conversation | None – free conversation | intro
Two conversational tasks | ‘‘Tell me about your family’’ | family
 | ‘‘How do you pass your day?’’ | day
Two procedural tasks | ‘‘How do you make a cheese sandwich?’’ | sandwich
 | ‘‘Tell me how to change a light bulb’’ | light
Picture description | ‘‘Tell me everything you see happening in the picture’’ (cookie theft; Goodglass, Kaplan, & Barresi, 2001) | cookie
Closing conversation | None – free conversation | close
Measurement of intra-rater reliability also involved 30% of the transcripts and here
the second analysis was made at least a week after the first one. The transcription-
less method raters analysed the discourse samples directly from the audio-visual
recordings, independently and in separate locations, reviewing the recordings as
often as they wished. They used the same analysis forms as the transcription-based
raters.
TABLE 3
Summary of discourse features analysed
Measure | Subcategories | Tasks applied to
Gesture use (based on LeMay, Davis, & Thomas, 1988) | Illustrative use of gesture, which classifies gestures as: batons (emphasis); ideographs (thought in progress); deictic movements (pointing); kinetographs (physical action); pictographs (descriptive); pictographs (locational); pictographs (quantitative); all above totalled | family, day, sandwich, light
Topic use (based on Mentis & Prutting, 1991) | main topic; subtopic; sub-subtopic; sub-subsubtopic; total of sub-divisions | family, day, sandwich, light, cookie
Turn taking (based on Comrie, 1999; see Comrie et al., 2001) | turn duration; % of major turns; % of major turns within participants’ contribution; minimum length of major turn; maximum length of major turn; mean length of major turn; mean length of turn-taking delay | intro, close
Repair (based on Milroy & Perkins, 1992) | Number of: trouble sources; self-initiated repair; other-initiated repair; self-repair; other-repair; failed repairs | family, day, sandwich, light, cookie
Conversational initiation | Frequency of initiation | intro, close
Topic initiation | Frequency of initiation | intro, close
Concept analysis (Nicholas & Brookshire, 1995) | Number of concepts: accurate and complete; accurate but incomplete; inaccurate; absent | cookie
Statistical analysis. To assess the agreement between the transcription-based and
transcription-less analyses as a test of the latter’s validity, for continuous
measurements (i.e., gesture use, topic use, turn taking, and repair) we calculated
the difference between the two analyses and 95% limits of agreement, i.e., the mean
difference ±1.96 standard deviations of the differences (Altman, 1991, p. 399ff). We
also report the mean (and standard deviation) for the two analyses, and a p value for
a test of whether the mean of the differences is zero. For the categorical measurements
(i.e., conversational and topic initiation, concept analysis) we calculated the percentage
agreement between the transcription-based and transcription-less analyses and the
simple kappa statistic (see Fleiss, 1981, for a readable description of the kappa
statistic).
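The limits-of-agreement calculation described above can be sketched in a few lines. The Python below is a minimal illustration and not the analysis code used in the study: the function name `limits_of_agreement`, the assumption that each transcription-based score is paired with one transcription-less score for the same sample, and the example numbers are all hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(trans_scores, tless_scores, z=1.96):
    """Return (mean difference, lower limit, upper limit) for paired scores.

    Differences are taken as transcription-based minus transcription-less,
    matching the "Trans-T-less" column convention used in the tables.
    """
    if len(trans_scores) != len(tless_scores):
        raise ValueError("paired scores must have equal length")
    diffs = [t - s for t, s in zip(trans_scores, tless_scores)]
    d_bar = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return d_bar, d_bar - z * sd, d_bar + z * sd


# Hypothetical gesture totals for five samples (not data from the study).
trans = [25, 30, 12, 18, 40]
tless = [20, 22, 10, 15, 28]

d_bar, lower, upper = limits_of_agreement(trans, tless)
print(f"mean difference {d_bar:.1f}, 95% limits ({lower:.1f}, {upper:.1f})")
```

If the resulting interval contains zero, the limits of agreement do not exclude a zero difference between the two methods for that measure.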
Inter-rater reliability of the transcription-less DA method (continuous measurements)
was assessed by calculating the intraclass correlation coefficient (ICC)
(Shrout & Fleiss, 1979) for the four raters’ measurements on each of the discourse
features produced by the 10 participants during the various discourse tasks. For the
categorical measurements, we calculated the percentage agreement and the simple
kappa statistic. Agreement using the latter was classified according to Landis and
Koch (1977): <0.2 ‘‘poor’’, >0.2–0.4 ‘‘fair’’, >0.4–0.6 ‘‘moderate’’, >0.6–0.8
‘‘good’’ and >0.8–1 ‘‘very good’’. We repeated similar analyses on subsets of three
samples to assess inter-rater reliability and intra-rater reliability for the
transcription-based method.
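For the categorical measures, percentage agreement and a simple (two-rater) kappa can be illustrated as follows. This is a sketch under stated assumptions rather than the procedure actually run for the study: the function names, the category codes, the example ratings, and the handling of a kappa of exactly 0.2 (placed in the ‘‘fair’’ band here) are all hypothetical choices.

```python
from collections import Counter


def percent_agreement(a, b):
    """Percentage of items on which two raters gave the same category."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Simple kappa for two raters: (observed - expected) / (1 - expected)."""
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each rater's marginal category proportions.
    p_exp = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    if p_exp == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_obs - p_exp) / (1 - p_exp)


def agreement_band(kappa):
    """Map kappa onto the bands quoted in the text (boundary handling assumed)."""
    if kappa < 0.2:
        return "poor"
    if kappa <= 0.4:
        return "fair"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "good"
    return "very good"


# Hypothetical concept ratings for eight concepts (not data from the study).
# AC = accurate and complete, AI = accurate but incomplete,
# IN = inaccurate, AB = absent.
rater_1 = ["AC", "AC", "AI", "IN", "AB", "AC", "AI", "AC"]
rater_2 = ["AC", "AC", "AI", "AI", "AB", "AC", "AC", "AC"]

kappa = cohen_kappa(rater_1, rater_2)
print(percent_agreement(rater_1, rater_2), round(kappa, 2), agreement_band(kappa))
```

Percentage agreement ignores agreement expected by chance, which is why kappa can be noticeably lower than the raw percentage for the same ratings.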
RESULTS
Reliability of the transcription-based method
For both intra-rater and inter-rater reliability of the transcription-based analysis
there were no statistically significant differences found between the original and
second analysis for the subset of samples re-analysed, nor between the original
analysis and that carried out by the second rater for the subset of samples analysed
(data not shown here, but available on request from the second author). This finding
confirms our earlier research, which demonstrated the reliability of this approach.
Validity and reliability of the transcription-less method
Results relating to the validity and reliability of the transcription-less method are
reported separately for each of the discourse features measured: gesture use in Table 4;
topic use in Table 5; turn taking in Table 6; repair in Table 7; conversational and topic
initiation in Table 8; and concept analysis in Table 9. Overall the results establish the
validity and inter-rater reliability of a transcription-less approach to DA. None of the
measures gave significant differences between scores from the two methods, thus
demonstrating validity. The main non-significant disparities related to some aspects of
gesture use and repair. The inter-rater reliability of the transcription-less method was
also acceptable in general: it was strongest for the gesture totals and varied among the
attributes of turn taking and repair. For the categorical measures (topic and
conversation initiation and concept analysis) the percentage agreement was very good.
Tables 4–7 show the results for the continuous measurements. For each of the
measures (column 1), the means and standard deviations are provided from both the
transcription-less DA (N = 40: 4 raters × 10 participants) and the transcription-based
DA (N = 10: 1 rater × 10 participants). These calculations are followed by those for
validity in the next columns and inter-rater reliability in the final column. Inter-rater
reliability results for the continuous measurements for gesture use, topic use, turn
taking, and repair are also shown graphically in Figures 1–4. In these figures the
mean scores for the transcription-based and transcription-less analyses are shown on
the y and x axes respectively. Intraclass coefficients are shown in square brackets.
Also shown is the line of equality (x = y). In Figure 3, mean scores on turn-taking
measures for the transcription-based and transcription-less methods have been
standardised to the transcription-less method standard deviation to allow them to be
shown graphically, as the results vary in scale according to the turn-taking measure
(e.g., duration was measured in hundreds and minimum length of major turns less
than 10).
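The ICC values in the final column of Tables 4–7 follow Shrout and Fleiss (1979), who describe several forms of the coefficient; the text does not state which form was used. The sketch below therefore assumes the two-way random-effects, single-rater form, ICC(2,1), purely for illustration. The function name `icc_2_1` and the ratings are hypothetical and are not data from this study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one per participant, each holding one
    score per rater (every participant scored by every rater).
    """
    n = len(ratings)      # number of participants
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)              # between-participants mean square
    jms = ss_cols / (k - 1)              # between-raters mean square
    ems = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Illustrative scores: six participants, each rated by four raters.
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

icc = icc_2_1(ratings)
print(round(icc, 2))
```

Because this form penalises systematic differences between raters, a rater who consistently scores higher than the others lowers the coefficient even when the rank ordering of participants is similar.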
TABLE 4
Gesture use: Validity and reliability measures for the transcription-less method
Measures and tasks | T-less mean (SD), N = 40 | Trans mean (SD), N = 10 | Trans–T-less mean difference (95% limit of agreement) | p value for mean | ICC
TOTALS
Family | 15.3 (11.6) | 25.1 (17.8) | 9.8 (−7.3, 27.0) | <0.0001 | 0.83
Day | 16.0 (16.9) | 33.4 (44.3) | 17.5 (−38.1, 73.0) | <0.0001 | 0.90
Sandwich | 10.9 (11.5) | 19.1 (24.8) | 8.2 (−19.2, 35.6) | <0.0001 | 0.85
Light | 13.4 (9.3) | 24.8 (19.2) | 11.4 (−11.1, 33.9) | <0.0001 | 0.80
COMPONENTS
Family
Baton | 10.9 (8.4) | 16.1 (10.7) | 5.1 (−10.2, 20.5) | <0.0001 | 0.72
Ideograph | 2.0 (2.5) | 2.3 (1.9) | 0.3 (−3.8, 4.5) | 0.61 | 0.28
Deictic | 1.1 (1.8) | 1.6 (2.0) | 0.4 (−2.9, 3.7) | 0.33 | 0.31
Kinetograph | 0.2 (0.5) | 0.4 (1.0) | 0.2 (−1.9, 2.4) | 0.024 | 0.39
Picto descriptive | 0.3 (0.6) | 0 (0) | −0.3 (−1.4, 0.8) | 0.0067 | 0.62
Picto locational | 0.4 (0.9) | 3.8 (6.1) | 3.4 (−7.0, 13.8) | <0.0001 | 0.75
Picto quantitative | 0.3 (0.8) | 0.9 (1.2) | 0.6 (−1.5, 2.6) | 0.0026 | 0.27
Day
Baton | 9.4 (10.5) | 23.2 (34.0) | 13.8 (−34.4, 62.0) | <0.0001 | 0.82
Ideograph | 2.9 (4.2) | 4.0 (6.8) | 1.1 (−8.9, 11.0) | 0.14 | 0.65
Deictic | 1.4 (1.9) | 2.7 (5.8) | 1.3 (−7.5, 10.1) | <0.0001 | 0.72
Kinetograph | 1.6 (2.5) | 2.5 (4.9) | 0.9 (−4.6, 6.4) | 0.0019 | 0.82
Picto descriptive | 0.3 (0.7) | 0.1 (0.3) | −0.2 (−1.8, 1.3) | 0.11 | 0.51
Picto locational | 0.1 (0.5) | 0.5 (0.9) | 0.4 (−1.6, 2.4) | 0.0015 | 0.20
Picto quantitative | 0.2 (0.5) | 0.4 (0.7) | 0.2 (−1.1, 1.4) | 0.037 | 0.69
Sandwich
Baton | 3.4 (3.9) | 6.2 (6.7) | 2.8 (−6.3, 11.9) | <0.0001 | 0.70
Ideograph | 1.5 (2.0) | 4.4 (5.5) | 2.9 (−5.0, 10.9) | <0.0001 | 0.66
Deictic | 0.1 (0.7) | 0.2 (0.4) | 0.1 (−1.2, 1.3) | 0.67 | −0.02
Kinetograph | 4.6 (8.4) | 6.4 (12.1) | 1.8 (−9.5, 13.1) | 0.17 | 0.70
Picto descriptive | 0.5 (0.8) | 0.2 (0.4) | −0.3 (−1.6, 1.0) | 0.10 | 0.17
Picto locational | 0.4 (0.9) | 1.4 (2.8) | 1.0 (−4.2, 6.2) | 0.0002 | 0.00
Picto quantitative | 0.4 (0.9) | 0.3 (1.0) | 0 (−1.3, 1.2) | 0.78 | 0.50
Light
Baton | 4.5 (6.4) | 10.8 (13.6) | 6.3 (−9.9, 22.5) | <0.0001 | 0.74
Ideograph | 1.7 (2.6) | 4.3 (6.6) | 2.6 (−8.3, 13.6) | <0.0001 | 0.29
Deictic | 1.6 (2.0) | 1.1 (1.1) | −0.5 (−4.4, 3.5) | 0.24 | 0.56
Kinetograph | 5.1 (4.0) | 7.2 (6.8) | 2.0 (−6.4, 10.5) | 0.0029 | 0.66
Picto descriptive | 0.3 (0.8) | 0.3 (0.7) | 0 (−1.9, 1.9) | 0.93 | −0.02
Picto locational | 0.2 (0.5) | 1.1 (1.4) | 0.9 (−1.5, 3.3) | <0.0001 | 0.02
Picto quantitative | 0.0 (0.2) | 0 (0) | 0 (−0.2, 0.1) | 0.55 | 0.00
ICC = intraclass correlation coefficient; Picto = pictograph; T-less = transcription-less method of DA;
Trans = transcription-based method of DA.
In Table 4 the results for gesture use are shown. When examined overall, gesture
use had good agreement (as measured by the intraclass correlation coefficient)
between the transcription-based and transcription-less raters (ICC between 0.80
and 0.90 for ‘‘family’’, ‘‘day’’, ‘‘sandwich’’, and ‘‘light’’ totals of gesture use).
Although there was good agreement between the methods, the transcription-based
rater consistently scored higher on the gestures (and with apparently more
variability) than the four transcription-less raters. The agreement between the two
analyses varied across the individual gesture types, with baton (ICC 0.72–0.82)
good, and kinetograph (0.39–0.82) reasonable, the ideograph and deictic mixed
(with no real pattern emerging), and the three pictographic-type gestures
(description, locational, and quantitative) poor for ‘‘sandwich’’ and ‘‘light’’ but
reasonable for the ‘‘family’’ and ‘‘day’’ tasks. Overall, none of the 95% individual
TABLE 5
Topic use: Validity and reliability measures for the transcription-less method
Measures and tasks   T-less (N=40) Mean(SD)   Trans (N=10) Mean(SD)   Trans–T-less mean difference (95% limit of agreement)   p value for mean   ICC
TOTALS
Family 3.65(2.49) 4.60(2.72) −0.95(−3.87,5.44) 0.061 0.51
Day 3.83(2.70) 4.40(3.20) 0.58(−4.09,5.24) 0.23 0.63
Sandwich 2.28(1.80) 1.20(1.48) −1.08(−4.00,1.82) 0.0010 0.58
Light 1.48(1.32) 1.10(0.99) −0.38(−2.93,2.18) 0.25 0.26
Cookie 3.03(2.14) 3.10(2.02) 0.08(−4.23,4.40) 0.86 0.41
COMPONENTS
Family
Main topic 1.10(0.30) 1.40(0.70) 0.30(−0.97,1.57) 0.0057 0.13
Subtopic 2.45(1.01) 2.60(1.07) 0.15(−1.96,2.26) 0.38 0.32
Sub-subtopic 0.95(1.28) 1.40(1.35) 0.45(−2.42,3.32) 0.060 0.35
Sub-subsubtopic 0.25(1.01) 0.60(1.26) 0.35(−1.56,2.26) 0.028 0.31
Day
Main topic 1.20(0.52) 1.20(0.42) 0.00(−1.33,1.33) 1.00 0.46
Subtopic 2.38(1.58) 2.90(2.23) 0.53(−2.68,3.73) 0.049 0.46
Sub-subtopic 1.40(1.61) 1.50(2.27) 0.10(−5.00,5.20) 0.81 0.54
Sub-subsubtopic 0.05(0.22) 0.00(0.00) −0.05(−0.48,0.38) 0.16 −0.03
Sandwich
Main topic 1.00(0.00) 1.00(0.00) 0.00(0.00,0.00) N/A N/A
Subtopic 1.75(1.33) 1.20(1.48) −0.55(−3.06,1.96) 0.0098 0.41
Sub-subtopic 0.50(0.93) 0.00(0.00) −0.50(−2.33,1.33) 0.0016 0.09
Sub-subsubtopic 0.03(0.16) 0.00(0.00) −0.03(−0.33,0.28) 0.32 0.00
Light
Main topic 1.00(0.00) 1.00(0.00) 0.00(0.00,0.00) N/A N/A
Subtopic 1.33(1.16) 1.10(0.99) −0.23(−2.64,2.19) 0.25 0.14
Sub-subtopic 0.15(0.48) 0.00(0.00) −0.15(−1.10,0.80) 0.057 −0.08
Sub-subsubtopic 0.00(0.00) 0.00(0.00) 0.00(0.00,0.00) N/A N/A
Cookie
Main topic 1.00(0.00) 1.00(0.00) 0.00(0.00,0.00) N/A N/A
Subtopic 2.35(1.42) 2.30(1.25) −0.05(−3.12,3.02) 0.84 0.43
Sub-subtopic 0.68(1.40) 0.80(1.23) 0.13(−2.65,2.90) 0.58 0.11
Sub-subsubtopic 0.00(0.00) 0.00(0.00) 0.00(0.00,0.00) N/A N/A
ICC – Intraclass correlation coefficient; T-less – transcription-less method of DA; Trans –
transcription-based method of DA.
limits of agreement (mean of differences between methods ±1.96 SD of these
differences) excluded a zero difference. Thus we can say that the interval did not
rule out the methods being equal. Figure 1 highlights some of these findings,
showing the mean scores for the two methods for the gesture totals (first four rows
of figures within Table 4).
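The limits-of-agreement calculation described above (mean of the paired differences ±1.96 SD of those differences) can be sketched in a few lines. This is a minimal illustration only, not the analysis code used in the study: the function names `limits_of_agreement` and `excludes_zero` and the paired counts are hypothetical, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev


def limits_of_agreement(trans, tless):
    """Return (mean difference, lower limit, upper limit) for paired scores.

    Differences are taken as Trans minus T-less, matching the table columns.
    """
    if len(trans) != len(tless):
        raise ValueError("scores must be paired")
    diffs = [a - b for a, b in zip(trans, tless)]
    d_mean = mean(diffs)
    d_sd = stdev(diffs)  # sample SD of the differences (assumption)
    return d_mean, d_mean - 1.96 * d_sd, d_mean + 1.96 * d_sd


def excludes_zero(lower, upper):
    """True when the interval lies entirely above or entirely below zero."""
    return lower > 0 or upper < 0


# Hypothetical paired gesture counts for six samples (not study data).
trans_counts = [5, 7, 3, 9, 6, 4]
tless_counts = [4, 6, 4, 7, 6, 3]

d_mean, lower, upper = limits_of_agreement(trans_counts, tless_counts)
print(f"mean difference {d_mean:.2f}, limits ({lower:.2f}, {upper:.2f})")
print("excludes zero:", excludes_zero(lower, upper))
```

When the interval spans zero, as in this made-up example, the data do not rule out the two methods giving equal scores, which is the reading applied to the tables here.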
In Table 5, topic use results are shown. In terms of agreement between the DA
methods (as a measure of validity), there was no tendency for them to differ
systematically (with the exception of the “sandwich” task, which produced lower
scores on the transcription-based analysis). The topic use measures on the “family”
task tended to be scored higher on transcription-based analysis in comparison with
transcription-less, but not always significantly so. The “light”, “cookie”, and
“day” tasks were more mixed, with little discernible pattern evident. Again,
overall, none of the 95% individual limits of agreement excluded a zero difference.
Figure 2 illustrates some of these findings. The reliability of the transcription-less
analysis was reasonable (see Table 5). ICC for “family”, “day”, “sandwich”,
“light”, and “cookie” task overall totals for topic use were concentrated in the
range 0.41 to 0.63, with one at 0.26. When looking at the individual components of
topic use (main topic, subtopic, sub-subtopic, and sub-subsubtopic), several of
these showed zero variability and so ICC and p values for the mean differences
were not calculable.
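To illustrate why an ICC cannot be calculated when scores show zero variability, the sketch below computes a one-way random-effects ICC from the between-sample and within-sample mean squares. The ICC form used in the study is not stated in this section, so the one-way form is an assumption, and the function name `icc_oneway` and the ratings are hypothetical.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC for a list of samples, each rated k times.

    Returns None when both mean squares are zero (no variability at all),
    because the ratio is then 0/0 and the coefficient is undefined.
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 samples, each with the same k >= 2 ratings")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # Between-sample mean square (BMS) and within-sample mean square (WMS).
    bms = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    wms = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    denominator = bms + (k - 1) * wms
    if denominator == 0:
        return None  # e.g. every rater scored exactly one main topic
    return (bms - wms) / denominator


# Hypothetical main-topic counts: four raters, all scoring 1 for every sample.
constant_scores = [[1, 1, 1, 1] for _ in range(10)]
print(icc_oneway(constant_scores))
```

The same formula yields a negative value when raters disagree more within a sample than samples differ from one another, which is how small negative ICCs can arise for rarely observed measures.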
On only 1 of the 14 turn-taking measures (see Table 6 and Figure 3) was there a
significant difference found in the mean of the differences of the transcription-based
and transcription-less raters: the minimum length of major turns in “intro”.
Inter-rater reliability for the transcription-less analysis was mixed, ranging from ICC of
1.00 and 0.99 for “intro” and “close” duration, through to zero ICC for mean length
TABLE 6
Turn-taking: Validity and reliability measures for the transcription-less method
Measures and tasks   T-less (N=40) Mean(SD)   Trans (N=10) Mean(SD)   Trans–T-less mean (95% limit of agreement)   p value for mean   ICC
Introduction
Duration 280(100) 277(106) 1(−25,27) 0.83 0.98
% major turns in intro 49(5) 50(3) 1(−12,14) 0.66 0.20
% major turns by participant 77(14) 78(8) 1(−30,31) 0.77 0.42
Mean length major turns 16(9) 14(6) −2(−15,12) 0.38 0.41
Min length major turns 2.3(1.5) 1.7(1.0) −1.3(−4.2,1.7) 0.0016 0.13
Max length major turns 48(29) 49(21) 0(−47,47) 0.97 0.50
Mean length delay 0.6(1.3) 0.4(1.3) −0.2(−3.4,3.1) 0.75 −0.02
Closing
Duration 351(147) 346(167) 0(−49,49) 1.00 1.00
% major turns in intro 50(9) 51(5) 2(−14,17) 0.46 0.17
% major turns by participant 79(15) 83(10) 5(−24,33) 0.23 0.14
Mean length major turns 32(34) 24(14) −8(−57,41) 0.32 0.37
Min length major turns 3.8(3.6) 2.2(1.0) −1.6(−8.3,5.1) 0.10 0.14
Max length major turns 88(80) 101(95) 13(−119,145) 0.39 0.54
Mean length delay 2.3(8.0) 1.2(1.8) −1.0(−11.7,9.7) 0.64 −0.05
ICC – Intraclass correlation coefficient; T-less – transcription-less method of DA; Trans –
transcription-based method of DA.
delay. Again, overall, none of the 95% individual limits of agreement excluded a zero
difference.
For repair measures (see Table 7), in terms of agreements between the two DA
methods, once again none of the 95% individual limits of agreement excluded a zero
difference. The three measures of numbers of trouble sources, self-initiated repairs,
and self-repairs were all consistently higher on the transcription-based than
transcription-less analysis (see also Figure 4 which illustrates these results). There
was a varied pattern of reliability for the transcription-less method, with high ICCs
for trouble sources at approximately 0.7, and poor (typically around 0.1–0.2) for
other-initiated repairs.
TABLE 7
Repair: Validity and reliability measures for the transcription-less method
Measures and tasks   T-less (N=40) Mean(SD)   Trans (N=10) Mean(SD)   Trans–T-less mean difference (95% limit of agreement)   p value for mean   ICC
Family
Trouble sources 6.0(3.7) 13.5(8.3) 7.5(−6.9,21.9) <0.0001 0.42
Self-initiated 5.7(3.6) 13.4(8.3) 7.8(−6.6,22.1) <0.0001 0.38
Other-initiated 0.4(1.0) 0.1(0.3) −0.3(−1.9,1.4) 0.26 0.14
Self-repair 4.7(3.3) 12.3(8.4) 7.6(−6.5,21.7) <0.0001 0.34
Other-repair 0.4(0.8) 0.3(0.5) −0.1(−1.4,1.1) 0.49 0.15
Fail 0.7(1.1) 0.9(0.8) 0.2(−1.3,1.7) 0.28 0.69
Day
Trouble sources 5.1(3.3) 17.7(12.3) 12.6(−6.9,32.2) <0.0001 0.65
Self-initiated 4.7(3.3) 17.6(12.3) 12.9(−7.0,32.8) <0.0001 0.52
Other-initiated 0.2(0.6) 0.1(0.3) −0.1(−0.8,0.6) 0.50 0.47
Self-repair 4.2(2.9) 16.7(12.1) 12.5(−6.9,32.0) <0.0001 0.64
Other-repair 0.2(0.6) 0.3(1.0) 0.1(−0.9,1.0) 0.44 0.69
Fail 0.7(1.0) 0.6(0.7) −0.1(−1.6,1.5) 0.81 0.40
Sandwich
Trouble sources 2.8(2.7) 6.7(7.5) 4.0(−6.2,14.2) <0.0001 0.82
Self-initiated 2.6(2.6) 6.7(7.5) 4.1(−6.3,14.6) <0.0001 0.75
Other-initiated 0.2(0.5) 0(0) −0.2(−1.0,0.6) 0.14 0.05
Self-repair 1.8(2.3) 6.1(7.3) 4.3(−6.1,14.7) <0.0001 0.34
Other-repair 0.4(0.6) 0.1(0.3) −0.3(−1.2,0.7) 0.038 0.82
Fail 0.6(1.2) 0.5(0.9) −0.1(−2.0,1.9) 0.75 0.59
Light
Trouble sources 3.1(2.3) 6.4(5.4) 3.6(−5.0,12.3) <0.0001 0.72
Self-initiated 2.9(2.1) 6.7(5.4) 3.8(−4.5,12.2) <0.0001 0.72
Other-initiated 0.2(0.6) 0(0) −0.2(−1.1,0.7) 0.16 0.14
Self-repair 2.3(1.8) 6.2(5.3) 3.9(−4.2,12.0) <0.0001 0.61
Other-repair 0.3(0.6) 0.3(0.7) 0(−1.5,1.6) 0.68 0.49
Fail 0.4(0.7) 0.2(0.4) −0.2(−2.1,1.7) 0.075 0.72
Cookie
Trouble sources 4.8(3.8) 11.8(9.1) 7.1(−5.8,19.9) <0.0001 0.71
Self-initiated 4.6(3.8) 11.7(8.8) 7.1(−5.3,19.5) <0.0001 0.74
Other-initiated 0.1(0.4) 0.1(0.3) 0(−0.8,0.8) 0.82 −0.07
Self-repair 3.6(3.5) 10.4(7.7) 6.8(−4.4,18.1) <0.0001 0.71
Other-repair 0.3(0.5) 0.2(0.4) −0.1(−0.5,0.3) 0.26 0.79
Fail 0.8(1.2) 1.2(2.4) 0.4(−3.1,3.8) 0.11 0.48
ICC – Intraclass correlation coefficient; T-less – transcription-less method of DA; Trans –
transcription-based method of DA.
Table 8 shows the results for conversational and topic initiation, and Table 9
for concept analysis. Validity on these categorical measures was established by
percentage agreement between the transcription-based and transcription-less
analyses. Inter-rater reliability for the transcription-less analysis is shown in the
final column. Conversational and topic initiation (see Table 8) showed a
reasonable percentage of agreement for both the reliability of the transcription-
less method and in terms of agreement between the two methods (although the
kappa statistics for both comparisons were low). For concept analysis (see
Table 9), there was a consistently high percentage of agreement across the seven
concepts for both the reliability of the transcription-less analysis and for the
comparison between the two types of analysis. For the comparison of methods,
the kappa statistics were still low (less than 0.13) but for the reliability of the
transcription-less method they ranged between 0.60 and 0.90 (i.e., between “good”
and “very good” agreement).
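A high percentage agreement can coexist with a low kappa because kappa discounts the agreement expected by chance, which is large when one category dominates. The sketch below illustrates this with Cohen's kappa for two raters. The kappa variant used in the study is not stated in this section, so this is an assumption, and the function names and ratings are hypothetical rather than study data.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of items on which the two raters chose the same category."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be paired and non-empty")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    # Chance agreement from each rater's marginal category proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2
    if expected == 1:
        return float("nan")  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical "initiated?" judgements for ten samples; "no" dominates.
rater_a = ["no"] * 8 + ["yes", "no"]
rater_b = ["no"] * 8 + ["no", "yes"]

print(f"agreement: {percent_agreement(rater_a, rater_b):.0%}")
print(f"kappa: {cohen_kappa(rater_a, rater_b):.2f}")
```

In this made-up case the raters agree on 8 of 10 items, yet chance agreement from the marginals is slightly higher still, so kappa falls just below zero.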
Transcription-less rater feedback
Representative examples of the transcription-less raters’ comments are shown in
Table 10. Their comments, although often differing in content, appeared to fall
within three different categories: ease/difficulty of analysis; problems with analysis;
and clinical usefulness. This analysis reflects individual variation within the raters’
experience of this new method of DA, but also some common themes. These will be
used in the discussion to compare the quantitative results with the raters’ perception
of their analysis of the various discourse features.
DISCUSSION
The discussion that follows examines in detail the results from this study, with
particular focus on measures of validity and reliability for the transcription-less
method of DA. The experience of transcription-less DA as described by the raters
receives some further attention before the strengths and weaknesses of the study are
outlined. We finish with some suggestions for clinical implications and further
research.
TABLE 8
Conversational and topic initiation: Validity and reliability measures for the transcription-less method
Type   Time   Initiated (T-less, Trans)   Not initiated (T-less, Trans)   % agreement between methods*   % agreement within t-less**
Conversation Intro 11(28%) 0(0%) 29(73%) 10(100%) 73% 65%
Conversation Close 9(23%) 1(10%) 31(78%) 9(90%) 83% 68%
Topic Intro 24(60%) 6(60%) 16(40%) 4(40%) 70% 57%
Topic Close 24(60%) 9(90%) 16(40%) 1(10%) 65% 47%
*kappa coefficients for t-less–t-based agreement ranged from −0.24 to 0.13.
**kappa coefficients for agreement between transcription-less raters ranged from −0.05 to 0.13.
TABLE 9
Concept analysis: Validity and reliability measures for the transcription-less method
Concept   Accurate and complete (T-less, Trans)   Accurate but incomplete (T-less, Trans)   Inaccurate (T-less, Trans)   Absent (T-less, Trans)   % agree between methods*   % agree within t-less**
Woman washing 13(33%) 4(40%) 7(18%) 3(30%) 3(8%) 0(0%) 17(43%) 3(30%) 78% 83%
Sink overflowing 18(45%) 5(50%) 11(28%) 5(50%) 10(25%) 0(0%) 1(3%) 0(0%) 68% 53%
Boy on stool 20(50%) 4(40%) 5(13%) 2(20%) 7(18%) 2(20%) 8(20%) 2(20%) 90% 90%
Boy stealing 20(50%) 5(50%) 1(3%) 0(0%) 5(13%) 1(10%) 14(35%) 5(50%) 85% 85%
Stool tipping 17(43%) 5(50%) 5(13%) 0(0%) 5(13%) 1(10%) 13(33%) 4(40%) 75% 75%
Girl reaching 14(35%) 4(40%) 6(15%) 0(0%) 1(3%) 0(0%) 19(48%) 5(50%) 78% 72%
Woman not noticing 6(15%) 3(30%) 2(5%) 0(0%) 3(8%) 0(0%) 29(37%) 7(70%) 85% 88%
*kappa coefficients for t-less–t-based agreement ranged from −0.07 to 0.06.
**kappa coefficients for agreement between transcription-less raters ranged from 0.60 to 0.90.
Transcription-based vs transcription-less discourse analysis
The validity and reliability results overall do indicate the potential for a
transcription-less method of discourse analysis, but some of the discourse features
measured produced more encouraging results than others. Each is discussed
separately below, with quantitative results also being examined in the light of
comments made by the transcription-less raters.
Gesture use. The results relating to the validity and reliability of a transcription-
less method of analysing gesture were positive. However, two of the gesture types
(baton and locational pictograph) produced higher counts in the transcription-based
Figure 2. Topic use (totals)
Figure 1. Gesture use (totals)
than the transcription-less method. This finding is not surprising, as the movements
involved in baton and locational gestures are often rapid repetitive movements that
are difficult to count accurately. The provision of additional training, including
clearer instruction on how to distinguish between a gesture used as a repeated baton
and one used as a repeated locational pictograph, might have been helpful. The use of the
transcript to clearly identify the start and end of a sequence of gestures might also
have aided a higher count of these rapid movements by the transcription-based
rater. While the measures made by the transcription-based rater were overall similar
Figure 4. Repair (trouble sources, self initiated, and self repair)
Figure 3. Turn-taking (duration, % major turns at intro, % major turns participant, mean min & max
length major turns and mean length delay for both intro and close)
TABLE 10
Transcription-less raters’ comments
Discourse feature   Comments
Gesture use
Ease/difficulty
“relatively easy to complete for some patients and very difficult for others” (rater 1)
“found this the most time-consuming, need to constantly refer to the definitions of each gesture” (rater 2)
Problems with analysis
“fuller explanation of the categories would have been useful … often felt … no available category seemed to fit the gesture” (rater 1)
Clinical usefulness
“this was an effective way of analysing gesture use. I would definitely use this in clinic” (rater 4)
Topic use
Ease/difficulty
“I didn’t find this section quite as difficult as I had anticipated, although I did encounter some problems” (rater 1)
“I found this part quite difficult” (rater 3)
Turn-taking
Ease/difficulty
“I found this section fairly straightforward” (rater 1)
“difficult to be accurate regarding no. of words per turn” (rater 2)
“this was very time consuming” (rater 3)
Problems with analysis
“It was also harder to count words in longer turns and in participants who spoke rapidly” (rater 1)
Clinical usefulness
“it did give a helpful insight to how much content the speaker was using” (rater 3)
“counting the number of words per participant’s major turn was definitely much quicker than carrying out a full transcription” (rater 4)
Repair
Ease/difficulty
“some instances of repair were very obvious while others were harder to identify and more ambiguous” (rater 1)
“became easier over time” (rater 2)
“this was straightforward” (rater 3)
Problems with analysis
“difficult to assess whether the person has stopped to think or paused due to a word finding difficulty” (rater 2)
Clinical usefulness
“a lot quicker than traditional transcription methods” (rater 3)
Initiation (topic/conversational)
Ease/difficulty
“I was unsure about this section and did not feel confident when the participant had started a new topic” (rater 1)
Concept analysis
Ease/difficulty
“with such clear guidelines and examples this was the easiest part” (rater 3)
Clinical usefulness
“I found this a very useful form for analysing the cookie theft picture. I think this would be really easy to use in clinic” (rater 4)
General comments
Usefulness of the follow-up session
“the examples [topic coherence] given in the follow up session were very useful” (rater 1)
Forms and guidelines
“stricter guidelines [for turn-taking analysis] would also have been useful” (rater 1)
Analysis time
“all analyses took longer than the 12 hours. However if carrying out in future would be less time-consuming … likely to be more accurate on next attempt also!” (rater 2)
to the transcription-less raters (thus reflecting the possible confusion between batons
and pictographs), there was consistency in rating among the transcription-less raters.
It is interesting to note that the transcription-less raters varied in how they described
their experience of analysing gesture, from it being the “most interesting part of the
analysis” to some difficulty in finding categories for gestures made. Thus one
possible explanation for the lack of agreement between the two types of analysis in
some of the individual gesture types might be individual variation in analysis ability
for this discourse feature, which could be addressed via further focused training
especially on particular gesture types.
Topic use. For several of the topic use measures there was no variation between the
transcription-based and transcription-less methods. This reflects the nature of the
variable (main topic) where raters were consistently identifying the one main topic
within discourse samples across tasks and genres. As with gesture use, some of the
measures produced more agreement between the different methods than others.
For this feature there is less evidence that the transcription-based rater noted more
than the transcription-less raters. The transcription-less raters again varied in their
perception of the difficulty of this analysis.
Turn-taking. The one significant difference found between the two DA methods
for turn-taking measures (on minimum length of major turns during the
introductory conversation) highlights a possible area where more training would
be helpful for future raters using the transcription-less method. While the difference
on this measure was not significant for the closing conversation, it did produce a p
value for the mean of 0.10. This reinforces the possible need for revisiting the
training for this measure. One of the transcription-less raters did identify a problem
in being accurate in counting the number of words per turn.
Repair. The comments shown in Table 10 in relation to gesture use may again be
used to explain some of the discrepancies in the analysis of repair. However
individual variation can also be found in the comments for features on which the two
types of analysis were very similar (e.g., turn taking). Alternative explanations for
the pattern of agreement in the analyses of gesture use and repair include that these
particular aspects of DA may require more training than that offered, or that these
are in some way more “fine-grained” analyses than some of the other discourse
features included in the study, and so amount of experience may be fundamental to
higher ratings by the transcription-based rater compared to the transcription-less
raters. Indeed one of the raters noted that this analysis became easier over time.
Categorical measures. Measures of initiation of topic and conversation produced
reasonable reliability and validity results, but these aspects of discourse may be more
appropriately measured in less-structured discourse sampling. The transcription-less
raters found the concept analysis both easy to analyse and clinically useful. These
comments are reflected in the reliability and validity measures of this analysis.
The experience of transcription-less discourse analysis
The comments made by the transcription-less raters provided some rich qualitative
information relating to their experience during this study. These can be used with the
quantitative data to determine how to further examine the potential of transcription-
less DA as a tool for use in everyday clinical practice. As mentioned above, the
comments show some individual variation in the ease or difficulty with which the
analysis of the different measures was undertaken. They also indicate some areas in
which the training may be enhanced, either in terms of training materials or
guidelines. Generally comments relating to the perceived clinical usefulness of this
method were positive, with the proviso that it too was time-consuming for the
inexperienced clinician.
Strengths and limitations of the study
This study included a range of discourse genres and discourse features as well as
an adequate sample length and demonstrated further evidence of reliability of
the transcription-based method, some initial evidence on inter-rater reliability of
transcription-less DA, as well as validity of the transcription-less method. The results
therefore do indicate the skills of SLTs as expert listeners and observers of
communicative interaction. Combining these skills with the use of low-level
technological audio and visual digital recordings (using digital cameras) demonstrates
the potential availability and validity of a transcription-less method with which
to analyse their clients’ discourse. The main limitations of this study were (a) the
small number of raters involved in the analysis and (b), arguably, that different
raters were used for the two types of DA. These factors might restrict the
generalisability of findings from the study. However it did achieve its aim, to begin to
address the question of whether transcription-less discourse analysis is valid and
reliable, and its objective, to compare transcription-less and transcription-based
analyses of the same discourse samples, using the same measures, elicited from
people with aphasia.
Clinical implications and implications for future research
This study has therefore shown the potential of transcription-less DA. Results were
promising over a range of discourse features and measures that were analysed from
the same samples. These findings imply that in the future DA could be used as an
everyday clinical tool, as the need for the time-consuming task of transcription prior to
analysis could be abolished. That is not to say, however, that clinicians would
necessarily dispense with transcription wholesale. They may continue to choose to
transcribe sections of discourse samples for particular reasons, such as detailed
grammatical analysis. As in Boles’ (1998) study mentioned earlier, SLTs may be
more likely to focus clinically on one or only a few aspects of discourse at any time
(in comparison with this study, in which raters were concerned with several aspects
within the one analysis). The reduction of discourse features under analysis would of
course reduce the effort required.
Many research questions remain in the evaluation of transcription-less DA as a
valid and reliable clinical tool. These include the intra-rater reliability of the method
(which will be measured more effectively in a study involving a larger number of
raters), the content and length of training required for qualified clinicians (possibly
especially in terms of repair and gesture use), and the wider applicability of the
method to other SLT client groups (both developmental and acquired) who present
with problems at discourse level.
REFERENCES
Altman, D. G. (1991). Practical statistics for medical research. London: Chapman & Hall.
Armstrong, E. (2000). Aphasic discourse analysis: The story so far. Aphasiology, 14, 875–892.
Boles, L. (1998). Conversational discourse analysis as a method for evaluating progress in aphasia: A case
report. Journal of Communication Disorders, 31, 261–274.
Boles, L., & Bombard, T. (1998). Conversational discourse analysis: Appropriate and useful sample sizes.
Aphasiology, 12, 547–560.
Brady, M., Armstrong, L., & Mackenzie, C. (2005). Further evidence on topic use following right
hemisphere brain damage: Procedural and descriptive discourse. Aphasiology, 19, 731–747.
Brady, M., & Mackenzie, C. (2001). Gesture use following right hemisphere brain damage. International
Journal of Language and Communication Disorders, 36(suppl), 35–40.
Brady, M., Mackenzie, C., & Armstrong, L. (2003). Topic use following right hemisphere brain damage
during three semi-structured conversational discourse samples. Aphasiology, 17, 881–904.
Cherney, L. R. (1998). Pragmatics and discourse: An introduction. In L. R. Cherney, B. B. Shadden, &
C. A. Coelho (Eds.), Analyzing discourse in communicatively impaired adults (pp. 1–7). Gaithersburg,
MD: Aspen Publishers Inc.
Comrie, P. (1999). A study of the influence of dysarthria on conversational contributions. Unpublished BSc
thesis: University of Strathclyde, UK.
Comrie, P., Mackenzie, C., & McCall, J. (2001). The influence of acquired dysarthria on conversational
turn taking. Clinical Linguistics and Phonetics, 15, 383–398.
Fleiss, J. L. (1981). Statistical methods for rates and proportions (2nd ed.). New York: Wiley.
Goodglass, H., Kaplan, E., & Barresi, B. (2001). The Boston diagnostic aphasia examination. Baltimore:
Lippincott Williams & Wilkins.
Landis, J. R., & Koch, G. (1977). The measurement of observer agreement for categorical data.
Biometrics, 33, 159–174.
LeMay, A., David, R., & Thomas, A. P. (1988). The use of spontaneous gesture by aphasic patients.
Aphasiology, 2, 137–145.
Lesser, R., & Milroy, L. (1993). Linguistics and aphasia: Psycholinguistic and pragmatic aspects of
intervention. London: Longman.
Mackenzie, C., Brady, M., Norrie, J., & Poedjianto, N. (2007). Picture description in neurologically
normal adults: Concepts and topic coherence. Aphasiology, 21, 340–354.
Mentis, M., & Prutting, C. A. (1991). Analysis of topic as illustrated in a head-injured and normal adult.
Journal of Speech & Hearing Research, 34, 583–595.
Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis (2nd ed.). Thousand Oaks, CA: Sage
Publications.
Milroy, L., & Perkins, L. (1992). Repair strategies in aphasic discourse: Towards a collaborative model.
Clinical Linguistics and Phonetics, 6, 27–40.
Nicholas, L. E., & Brookshire, R. H. (1995). Presence, completeness, and accuracy of main concepts in the
connected speech of non-brain-damaged adults and adults with aphasia. Journal of Speech and Hearing
Research, 38, 145–156.
Prins, R., & Bastiaanse, R. (2004). Analysing the spontaneous speech of aphasic speakers. Aphasiology, 18,
1075–1091.
Psathas, G., & Anderson, T. (1990). The practices of transcription in conversation analysis. Semiotica, 78,
75–99.
Royal College of Speech and Language Therapists (2005). RCSLT clinical guidelines. Bicester, UK:
Speechmark Publishing Ltd.
Shadden, B. B. (1998). Obtaining the discourse sample. In L. R. Cherney, B. B. Shadden, & C. A. Coelho
(Eds.), Analyzing discourse in communicatively impaired adults (pp. 9–34). Gaithersburg, MD: Aspen
Publishers Inc.
Sherratt, S. (2004). The whole, the parts and the interaction: connections between different levels of discourse
production. Proceedings of the International Association of Logopaedics and Phoniatrics Conference,
Brisbane, Australia, September.
Sherratt, S. (2007). Multi-level discourse analysis: A feasible approach. Aphasiology, 21, 375–393.
Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations used in assessing rater reliability. Psychological
Bulletin, 86, 420–428.
Togher, L. (2001). Discourse sampling in the 21st century. Journal of Communication Disorders, 34,
131–150.