Scientific Reports | (2020) 10:17177
A critical evaluation of visual proportion of Gleason 4 and maximum cancer core length
quantified by histopathologists
Lina Maria Carmona Echeverria1,2*, Aiman Haider3, Alex Freeman3,
Urszula Stopka‑Farooqui1, Avi Rosenfeld4, Benjamin S. Simpson1, Yipeng Hu5,
David Hawkes5, Hayley Pye1, Susan Heavey1, Vasilis Stavrinides1,2, Joseph M. Norris1,2,
Ahmed El‑Shater Bosaily2,6, Cristina Cardona Barrena1, Simon Bott7, Louise Brown8,
Nick Burns‑Cox9, Tim Dudderidge10, Alastair Henderson11, Richard Hindley12,
Richard Kaplan8, Alex Kirkham5,13, Robert Oldroyd14, Maneesh Ghei15, Raj Persad16,
Shonit Punwani5,13, Derek Rosario17, Iqbal Shergill18, Mathias Winkler19,
Hashim U. Ahmed19,20, Mark Emberton2 & Hayley C. Whitaker1
Gleason score 7 prostate cancer with a higher proportion of pattern 4 (G4) has been linked to genomic
heterogeneity and poorer patient outcome. The current assessment of G4 proportion uses estimation
by a pathologist, with a higher proportion of G4 more likely to trigger additional imaging and
treatment over active surveillance. This estimation method has been shown to have inter‑observer
variability. Fifteen patients with Prostate Grade Group (GG) 2 (Gleason 3 + 4) and fifteen patients with
GG3 (Gleason 4 + 3) disease were selected from the PROMIS study with 192 haematoxylin and eosin‑stained
slides scanned. Two experienced uropathologists assessed the maximum cancer core length
(MCCL) and G4 proportion using the current standard method (visual estimation) followed by detailed
digital manual annotation of each G4 area and measurement of MCCL (planimetric estimation) using
freely available software by the same two experts. We aimed to compare visual estimation of G4
and MCCL to a pathologist‑driven digital measurement. We show that the visual and digital MCCL
measurement differs by up to 2 mm in 76.6% (23/30) with a high degree of agreement between the two
measurements; visual gave a median MCCL of 10 ± 2.70 mm (IQR 4, range 5–15 mm) compared to
digital of 9.88 ± 3.09 mm (IQR 3.82, range 5.01–15.7 mm) (p = 0.64). The visual method for assessing
G4 proportion over‑estimates in all patients, compared to digital measurements [median 11.2% (IQR
38.75, range 4.7–17.9%) vs 30.4% (IQR 18.37, range 12.9–50.76%)]. The discordance was higher as
the amount of G4 increased (Bias 18.71, CI 33.87–48.75, r 0.7, p < 0.0001). Further work on assessing
actual G4 burden calibrated to clinical outcomes might lead to the use of differing G4 thresholds
of significance if the visual estimation is used or by incorporating semi‑automated methods for G4
Gleason pattern 4 (G4) prostate cancer is genetically distinct from Gleason pattern 3 and correlates with worse
cancer control outcomes either on active surveillance or following active treatment1,2. In 2013, Pierorazio et al.
retrospectively reviewed 7850 radical prostatectomy specimens to investigate the short-term biochemical
outcome using a prognostic-based scoring system called the Prostate Grading Group (GG). By separating the
Gleason sum 7 group into 3 + 4 and 4 + 3, the authors found that men with 4 + 3 had a worse outcome, defined as
biochemical recurrence-free survival3. These findings were further validated and were subsequently endorsed by
the 2014 International Society of Urological Pathology Consensus Conference and the World Health
Organization (WHO)4–6. Additionally, there is some uncertainty about whether %G4 in 3 + 4 cancers is also relevant to
management and outcome7–9.
This new classification system calls for improved categorisation of the percentage of G4 (%G4) in Prostate
Cancer (PCa) to allow for better risk stratification and inform treatment decisions7,9–12. The distinction between
Gleason 3 + 4 (GG2) and 4 + 3 (GG3) is made when %G4 falls below or above 50%, respectively, as visually
estimated by a uropathologist5. Additionally, the maximum amount of cancer in any core (maximum cancer
core length, MCCL) has been used as a proxy for tumour volume estimation and can be used to define clinical
Most histological prostate cancer burden studies have been performed in radical prostatectomy specimens
or on men who have undergone transrectal systematic biopsies. The Prostate MR Imaging Study (PROMIS)
includes men who are biopsy naïve whose prostates were systematically sampled every 5 mm, providing a unique
opportunity to perform an in-depth pathologist-driven annotation and digital analysis of the pathological slides
and compare this to the visually-reported %G4 and MCCL15.
In this study, we aimed to compare %G4 and MCCL within standard practice, estimated by a pathologist, to a
calculated burden from digitally annotated slides by the same pathologists on thirty patients from the PROMIS
study with GG2 and GG3 PCa.
Comparison between visual and digital MCCL. When comparing visual versus digital MCCL, in 23 of
the 30 patients the difference was up to ± 2 mm; taking into account the positive and negative values, the median
difference was 0.58 mm (range −4.12 to + 5.52 mm, t-test, p = 0.64) (Fig. 1A,B). Seven patients had
measurements that differed by ≥ 2 mm between digital and visual estimation. When viewed as a density plot, there was a
tendency to overestimate MCCL in the 3 + 4 group and under-estimate in the 4 + 3 group when using the visual
method (Fig. 1C). To understand the degree of agreement between the two measurements, a Bland–Altman test
was performed16. There was no systematic difference (bias) between the visual and digital assessment of MCCL,
and there was no correlation between increasing MCCL and the level of disagreement between the two
measurements (Supplementary Figure S1).
Gleason 4. The visual %G4 overestimated %G4 burden when compared to the digital assessment in all cases
(Fig. 2A). The 4 + 3 group had a mean difference of + 26.6% (range 9.6–41.9%) compared to + 10.8% (range
1.3–24.9%) for the 3 + 4 group (t-test, p = 1.9 × 10^−5). The average %G4 in the patients graded 3 + 4 was 11.2%
(range 4.7–17.9%) compared to 30.4% (range 12.9–50.6%) in the 4 + 3 group (t-test, p < 0.0001). When
pathologists were asked to assess the overall Gleason score based on the digital images (visual %G4), two patients were
downgraded from their original clinical grading of 4 + 3 to 3 + 4 by both pathologists (see yellow bars of patients
23 and 18 in Fig. 2A).
Using the established 50% G4 threshold to designate a 4 + 3 cancer, and based on the digital %G4 (blue bars),
only one patient (number 19 in Fig. 2A) would be classified as 4 + 3. When dividing the digital %G4 into quartiles,
two patients in the original 4 + 3 group had less %G4 than the upper quartile of the 3 + 4 group (18 and 30). In
other words, these two patients had less %G4 than the men with the highest %G4 compromise in the original
3 + 4 group. Figure 2B shows the Bland–Altman analysis, showing that there was a bias towards overestimation in
the visual estimations as all values are located above the line of complete agreement (complete agreement would
result in a zero value). The disagreement was larger when more than 20% of G4 was present (R 0.79, p < 0.0001).
Examination of the index block (block with the highest Gleason score and MCCL) revealed the same
findings as previously seen with all tumour-containing cores (Fig. 3A). The visual assessment of digitised images
downgraded four patients' index block from 4 + 3 to 3 + 4 (patients 18, 30, 16, and 23). When examining the
digital %G4, only two patients reached the 50% G4 threshold (27 and 19), and so would be the only two patients
with 4 + 3 disease based on digital measurement. The Bland–Altman analysis revealed a similar trend to that of
the overall %G4 analysis. One measurement had complete agreement between the digital and visual estimate
(Patient 6 in Fig. 3A). One patient had a higher digital estimation compared to the visual estimation (Patient 4 in
Fig. 3A). This is represented by the only dot in the negative area of Fig. 3B. The disagreement between
measurements increased as the amount of %G4 increased (R 0.6, p < 0.0001).
When patients were classified using the clinical significance criteria used in PROMIS, in which MCCL and
Gleason score were combined to derive definitions 1 (≥ 4 + 3 or ≥ 6 mm) and 2 (≥ 3 + 4 or ≥ 4 mm), the digital
analysis reclassified four patients' index block as lower risk13. When all blocks were compared using this system,
20 patients had a discrepancy between the visual and digital classification, leading to reclassification to higher or
lower risk in six and fourteen patients, respectively (Supplementary Figure S2).
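The reclassification logic described above can be illustrated with a minimal Python sketch. It assumes that definition 1 means Gleason ≥ 4 + 3 or MCCL ≥ 6 mm, that definition 2 means Gleason ≥ 3 + 4 or MCCL ≥ 4 mm, and that a Gleason 7 cancer is called 4 + 3 once %G4 reaches 50%. The names `Finding`, `gleason_from_percent_g4`, `meets_definition_1` and `meets_definition_2` are hypothetical, and the patient values are invented for illustration rather than taken from the study data.

```python
from dataclasses import dataclass

# Ordering of the Gleason scores discussed in the text, lowest risk first.
GLEASON_RANK = {"3+3": 0, "3+4": 1, "4+3": 2}


@dataclass
class Finding:
    gleason: str      # Gleason score as a string, e.g. "3+4"
    mccl_mm: float    # maximum cancer core length in millimetres


def gleason_from_percent_g4(percent_g4: float) -> str:
    """Assumed rule: a Gleason 7 cancer is 4+3 once %G4 reaches 50%, else 3+4."""
    if not 0 < percent_g4 < 100:
        raise ValueError("expected a Gleason 7 cancer with 0 < %G4 < 100")
    return "4+3" if percent_g4 >= 50 else "3+4"


def meets_definition_1(f: Finding) -> bool:
    """PROMIS definition 1 as read here: Gleason >= 4+3 or MCCL >= 6 mm."""
    return GLEASON_RANK[f.gleason] >= GLEASON_RANK["4+3"] or f.mccl_mm >= 6.0


def meets_definition_2(f: Finding) -> bool:
    """PROMIS definition 2 as read here: Gleason >= 3+4 or MCCL >= 4 mm."""
    return GLEASON_RANK[f.gleason] >= GLEASON_RANK["3+4"] or f.mccl_mm >= 4.0


# Hypothetical patient: visual estimate versus digital measurement.
visual = Finding(gleason_from_percent_g4(60), mccl_mm=6.5)
digital = Finding(gleason_from_percent_g4(35), mccl_mm=5.4)

print(meets_definition_1(visual), meets_definition_1(digital))  # True False
print(meets_definition_2(visual), meets_definition_2(digital))  # True True
```

In this invented case the digital values move the patient out of definition 1 while definition 2 is still met, which is the kind of shift to lower risk reported for some index blocks.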
We have presented an in-depth analysis of 30 men from the PROMIS trial, to establish the level of agreement
between the gold standard visual estimation of MCCL and %G4, compared to digitally annotated images.
Limitations to this study include: The presence of cribriform pattern was not recorded separately in this study or
included in the final analysis. In addition, the pathologists retrospectively assessed %G4 on annotated images,
introducing potential bias in their assessment. Finally, no long term follow up currently exists for the PROMIS
study, so we are unable to determine the prognostic significance of our findings.
Figure 1. Objective measurement of MCCL shows a discrepancy with visual measurement and pathologist
estimation. (A) MCCL difference between visual and digital MCCL shows under-estimation in visual compared
to digital MCCL. Bar plot of visual MCCL in yellow and digital MCCL in blue, organised by Gleason score.
MCCL is plotted on the y-axis; each patient is plotted on the x-axis. Red dashed lines represent a threshold
of 6 mm as the MCCL criterion for significance (PROMIS definition 1). Patients highlighted in red were over
or underestimated in the original visual measurement. (B) Waterfall plot representing the difference between
visual and digital measurements as digital MCCL − visual MCCL by Gleason score (y-axis), patients plotted on
the x-axis. Visual Gleason score is represented in yellow for 3 + 4 and blue for 4 + 3. Bars with a negative value
represent measurements where the visual MCCL was shorter than the digital MCCL (underestimation). Bars
with a positive value represent cases where the visual MCCL was higher than the digital MCCL. The difference in
80% of cases is ± 2 mm (n = 24), red dashed line at −2 and 2 mm difference. (C) Density plots representing the
MCCL distribution between visual and digital images by Gleason scores. The y-axis represents the kernel density
estimation. The x-axis contains MCCL values. The visual MCCL is represented in yellow and the digital
measurement in blue. 4 + 3. The mean visual MCCL was 9.53 mm (5–15 mm) and the mean digital MCCL was
A threshold of 4 mm and 6 mm has been shown to correlate with 95% of lesions that have a volume higher
than 0.2 mL or 0.5 mL, respectively13. Demetrios et al. found that MCCL greater than 10 mm can predict T3
disease and large tumour volumes with a hazard ratio (HR) of 5.73 (ref. 14). Using these thresholds and taking into
account the difference in the MCCL measurements, there is a potential impact on the treatment options offered.
For instance, patients reclassified as having < 6 mm MCCL could be candidates for active surveillance instead of
radical therapy (Patients 6, 18, 19 and 20) (Fig. 1A,B). Interestingly, the visual measurement of men with 3 + 4
disease was more likely to be greater compared to men with 4 + 3 disease (Fig. 1C). Despite these differences,
the Bland–Altman analysis showed good concordance between the two measurements; thus, the accuracy of the
MCCL is not compromised when a digital tool is used.
Figure 2. Visual Gleason 4 appraisal overestimates burden of disease. (A) Bar plot of the proportion of
Gleason 4 estimation average between two uropathologists (yellow) and digital estimation (blue). %G4 is
plotted on the y-axis; each patient is plotted on the x-axis. A threshold of 50% G4 for clinical significance is
shown as a red dashed line. Patient number on the x-axis is highlighted in bold and underlined if the digital
measurement of their %G4 would lead to reclassification based on the digital value. Patient marked with *
has ≥ 50% G4 in the digital measurement. (B) Bland–Altman plot representing the difference in measurement in
the y-axis as visual %G4 − digital %G4. The x-axis represents the mean %G4 measurement of both techniques
as (visual %G4 + digital %G4)/2. The bold black line represents complete agreement at 0. The purple dashed
line corresponds to the bias at 18.71; the dotted purple line corresponds to the bias confidence interval
(33.87–48.75). Dash and dotted blue lines correspond to the upper and lower limit of agreement and confidence
intervals are plotted with dotted blue lines. Upper limit of agreement: 41.31 (33.87–48.75), lower limit of
agreement: −3.87 (−11.31 to 3.56). Regression line is plotted as a continuous blue line.
In our study, the visual estimation of %G4 differed from the digital one; accurate measurement of the G4
burden has been shown to help risk-stratify patients9,17. In a study by de Souza et al., 20% of Gleason 3 + 4 tumours
had more extensive G4 disease than the first quartile of 4 + 3 tumours in radical prostatectomy specimens18. In
2014, Huang et al. found that 45% of men with ≤ 5% of G4 in prostate biopsy had insignificant cancer in radical
prostatectomy7. Additionally, several papers have shown that tumours with lower %G4 behave closer to GG1
Figure 3. Objective measurement of Gleason 4 burden shows a discrepancy between visual measurement
and the digital measurement for the index block. (A) Visual %G4 for the index block of 30 patients shown in
yellow overlaid with digital %G4 in blue. Patients separated by original Gleason grade grouping, 3 + 4 or
4 + 3, and organized by visual %G4. A threshold of 50% G4 for clinical significance is shown as a red dashed
line. Patient number on the x-axis highlighted in bold and underlined if the objective measurement of their
%G4 would cause reclassification. (B) Bland–Altman plot representing the difference in measurement in the
y-axis as visual %G4 − digital %G4. The x-axis represents the mean %G4 measurement of both techniques as
(visual %G4 + digital %G4)/2. The bold black line represents complete agreement at 0. The purple dashed line
corresponds to the bias at 14.36; the dotted purple line corresponds to the bias confidence interval (9.78–18.94).
Dash and dotted blue lines correspond to the upper and lower limit of agreement and confidence intervals are
plotted with dotted blue lines. Upper limit of agreement: 38.40 (30.49–46.32), lower limit of agreement: −9.67
(−17.59 to −1.76). The regression line is plotted as a continuous blue line.
In this study, we found that visual estimation always overestimated the amount of G4 compared to a digitally
calculated %G4. For all of these patients, reclassification of the %G4 would potentially lead to a change in
treatment options, and imaging follow up. For example, patient 18 was reclassified after digital assessment and would
be downgraded from 4 + 3 of > 6 mm to 3 + 4 of < 6 mm (Fig. 2A). The same was found when we examined the
index block only.
Integration of %G4 reporting in biopsies and radical prostatectomy specimens is already recommended6.
The findings of our study suggest that a re-assessment of %G4 estimation may be required. Reclassification of
G4 could lead to a re-evaluation of previously published biomarker and clinical studies and redefine the
reference standard for research. The heterogeneity of studies of the prognostic importance of Gleason 3 + 4 disease
as compared with Gleason 4 + 3 disease may be a reflection of uncertainty about how much G4 pattern disease
is actually shown in specimens and is particularly relevant to treatments such as radiotherapy or ablation where
there is no whole mount radical prostatectomy specimen to analyse.
As we move toward the inclusion of digital pathology in standard clinical practice, it will be essential to
investigate the differences between human and digital estimation of key pathological parameters and the potential
impact this could have on patient care. This will involve adapting the current visual classification to
digitally-derived grading. This study does not aim to highlight human error or criticize visual estimation of the
pathologists but to encourage the use of technology to improve our understanding of MCCL and G4 burden in prostate
cancer, and to seek novel methods to quantify and study the disease. Whilst this type of analysis would currently
be challenging to embed directly into clinical practice due to the time taken to contour each region, work is
already ongoing to automate this process23–28. Identifying relatively overlooked elements, such as %G4, improves
the accuracy of the models used in machine learning29; as such, future algorithms can be trained to specifically
identify %G4, rather than GG alone.
Further research is also needed to develop and validate new thresholds of the burden of G4 against large
cohorts with medium and long-term cancer control outcomes.
Materials and methods
Patients. Two-hundred and twenty-six patients from University College London Hospital took part in the
PROMIS trial. Men underwent 5 mm sampling using a transperineal template mapping procedure. Of 113 men
with Gleason 7 PCa, 85 had significant disease (PROMIS definition 1: Gleason 4 + 3 or MCCL ≥ 6 mm). 15
patients with Gleason 3 + 4 and 15 patients with 4 + 3 disease were selected from the 85, using a random number
generator (Table 1; Fig. 4A). A mean of 14.2 ± 8.05 cores per patient (IQR 9, range 2–34) were taken. 192 H&E
slides from these 30 patients were scanned using a NanoZoomer-SQ digital slide scanner (Hamamatsu).
Digital scan annotations and data collection. Two experienced UCH uropathologists with 16 years'
(AF) and 1.5 years' experience (AH) were involved in this study. The 30 cases included in this study were
originally reported by AF as part of the PROMIS trial. The pathologists were blinded to the PROMIS Gleason score;
scans were shown in random order and assessed by both uropathologists (AF/AH) using NDP.View 2
software. Each slide was systematically assessed as follows: 1. Each core was numbered from left to right. 2. Length
of cancer was measured (Fig. 4B). 3. Areas containing any cancer were contoured in yellow (Fig. 4C). 4. Areas
containing G4 were contoured in black (Fig. 4C).
Table 1. Gleason 7 patients in the PROMIS cohort and 30 selected patients for in-depth analysis. Table
comparing the Gleason 7 patients from University College London (UCH) within the PROMIS study. UCH
PROMIS cohort is on the left, selected patients on the right. Number of patients per group by Gleason score in
each cohort as n = , percentage in parenthesis. Mean value for age, prostate volume, presenting PSA and PSA
density, with range in parenthesis. Age is denoted in years, prostate volume in cubic centimetres (cc), PSA in
ng/dL and PSA density calculated as PSA/prostate volume. Likert scores are presented as number of patients
and percentage in parenthesis, Likert NA when no Likert score was given. *p-value obtained using an unpaired
t-test, **if using Mann–Whitney test.
| | UCH PROMIS cohort (4 + 3 or ≥ 6 mm MCCL), 3 + 4 | UCH PROMIS cohort, 4 + 3 | p value (3 + 4 vs 4 + 3) | Selected 30 patients, 3 + 4 | Selected 30 patients, 4 + 3 | p value (3 + 4 vs 4 + 3) |
|---|---|---|---|---|---|---|
| n | 67 (78%) | 18 (22%) | | 15 (50%) | 15 (50%) | |
| Age (years) | 63 (43–77) | 64 (48–79) | 0.44* | 62 (50–72) | 65 (48–79) | 0.30* |
| Prostate volume (cc) | 38.34 (16–83) | 38.18 (26–55) | 0.65** | 34 (21–62) | 38 (26–55) | 0.11** |
| PSA (ng/dL) | 7.46 (1.30–13) | 10.76 (5.7–15) | < 0.0001* | 7.60 (4.9–10.1) | 10.74 (6.2–15) | 0.0005* |
| PSA density (PSAd) | 0.22 (0.06–0.59) | 0.29 (0.11–0.53) | 0.002** | 0.24 (0.10–0.38) | 0.29 (0.11–0.53) | 0.14* |
| Likert 2 | 1 (1.4%) | 0 | | 0 | 0 | |
| Likert 3 | 8 (11.9%) | 3 (16.6%) | | 1 (6.6%) | 0 | |
| Likert 4 | 21 (31.3%) | 3 (16.6%) | | 6 (40%) | 4 (26.6%) | |
| Likert 5 | 5 (7.46%) | 12 (66.6%) | | 8 (53.3%) | 11 (73.3%) | |
| Likert NA | 4 (5.87%) | 0 | | 0 | 1 (6.6%) | |
The MCCL was reported prospectively by the pathologists during the trial using the integrated ruler in the
microscope; this measurement was assigned as 'visual' MCCL. In PROMIS, the MCCL was reported by taking
into account intervening benign glands (ISUP) and measuring cancer only. For the purposes of this study, the
ISUP measurement was used. The 'digital' MCCL was derived as follows: If a core was straight, a single
measurement was performed. If there was any curvature, manual sequential measurements were performed along the
core axes and combined to give the final measurement.
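As a minimal sketch of the digital MCCL rule, the Python below assumes that "combined" means summing the sequential segment lengths along a curved core, and that the MCCL is the longest such total across a patient's cores. The function names are hypothetical. The curved-core segments are the three values quoted in the Figure 4B caption, while the straight core is an invented value.

```python
from math import fsum


def core_length_mm(segments_mm):
    """Cancer length of one core: a straight core has a single segment,
    a curved core has several sequential segments that are summed."""
    if not segments_mm:
        raise ValueError("a core needs at least one measured segment")
    if any(s < 0 for s in segments_mm):
        raise ValueError("segment lengths must be non-negative")
    return fsum(segments_mm)


def digital_mccl_mm(cores):
    """Maximum cancer core length: the longest core total for one patient."""
    return max(core_length_mm(segments) for segments in cores)


curved_core = [2.53, 2.11, 4.48]   # segments quoted in the Figure 4B caption
straight_core = [6.2]              # hypothetical single measurement

print(f"{core_length_mm(curved_core):.2f} mm")                     # 9.12 mm
print(f"{digital_mccl_mm([straight_core, curved_core]):.2f} mm")   # 9.12 mm
```

Summing the segments reproduces the 9.12 mm total given for the curved core in Figure 4B.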
%G4 was not collected as part of the original trial; pathologists retrospectively visually estimated the %G4
per patient to the closest 10% using the annotated images. This was assigned as 'visual' %G4. For digital %G4,
the software performs instant area measurements. The resulting areas (for each of the yellow and black contours) were
prospectively recorded, and an objective percentage of G4 was calculated as shown in equation 1 (Fig. 4D). This
total was assigned as 'digital' %G4. The index block was also analysed separately. The index
block was defined as the block with the highest Gleason score and MCCL in combination with concordance
with the index lesion on mpMRI.
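Equation 1 is shown only in Fig. 4D and is not reproduced in the text, so the following Python sketch rests on an assumption: that digital %G4 is the total black-contour (G4) area divided by the total yellow-contour (cancer) area, times 100, with the G4 regions lying inside the cancer contours. The function name and the example areas are hypothetical.

```python
def digital_percent_g4(cancer_areas, g4_areas):
    """Assumed form of equation 1: 100 * total G4 area / total cancer area.

    cancer_areas: areas of the yellow (any cancer) contours for one patient
    g4_areas:     areas of the black (Gleason 4) contours for the same patient
    Both lists must use the same unit, e.g. mm^2.
    """
    total_cancer = sum(cancer_areas)
    total_g4 = sum(g4_areas)
    if total_cancer <= 0:
        raise ValueError("total cancer area must be positive")
    if total_g4 < 0 or total_g4 > total_cancer:
        raise ValueError("G4 area must lie between 0 and the cancer area")
    return 100.0 * total_g4 / total_cancer


# Hypothetical contour areas (mm^2) for one patient across two cores.
cancer = [4.0, 6.0]
g4 = [1.0, 2.0]

digital = digital_percent_g4(cancer, g4)
visual = 50                     # hypothetical visual estimate, closest 10%
print(digital)                  # 30.0
print(visual - digital)         # 20.0, i.e. visual minus digital
```

The difference is taken as visual minus digital to match the direction used on the y-axis of the Bland–Altman plots.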
Statistical analysis. Patients were divided according to the original Gleason score from the PROMIS trial
into 3 + 4 and 4 + 3. The routinely performed 'visual' estimation for both measurements was used as the reference
standard for all comparisons. When comparing two groups meeting normal distribution (Shapiro–Wilk test)
and same variances (F-test), a Student's t-test was applied. Whenever data were not normally distributed, a
Mann–Whitney test was performed. To quantify the agreement between the two methods, the Bland–Altman method
was performed. The visual method was used as a standard for comparison; bias was defined as the average of
the difference between the two methods. Limits of agreement were calculated at 95% CI. All analyses were made
using R: A Language and Environment for Statistical Computing30. The Bland–Altman analysis was performed
using the blandr package for R31.
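The Bland–Altman quantities described above can be sketched in a few lines of Python. This is not the blandr implementation used in the study. It assumes the common convention of bias ± 1.96 standard deviations for the 95% limits of agreement and does not compute the confidence intervals around the bias or the limits. The function name and the example values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(visual, digital, z=1.96):
    """Bias and limits of agreement for paired measurements.

    Differences are taken as visual - digital, so a positive bias means
    the visual method reads higher than the digital one.
    """
    if len(visual) != len(digital) or len(visual) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [v - d for v, d in zip(visual, digital)]
    bias = mean(diffs)            # average of the differences
    sd = stdev(diffs)             # sample standard deviation of differences
    return {
        "bias": bias,
        "lower_loa": bias - z * sd,
        "upper_loa": bias + z * sd,
        # (mean of the two methods, difference) pairs for the scatter plot
        "points": [((v + d) / 2, v - d) for v, d in zip(visual, digital)],
    }


# Hypothetical %G4 values for four patients.
visual_g4 = [30, 40, 50, 60]
digital_g4 = [20, 25, 30, 35]

result = bland_altman(visual_g4, digital_g4)
print(result["bias"])                                   # 17.5
print(round(result["lower_loa"], 2), round(result["upper_loa"], 2))
```

Here the differences grow with the mean of the two methods, which is the pattern the regression line in the Bland–Altman plots is meant to expose.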
Ethical approval. All clinical samples were collected from University College London Hospital NHS Trust
patients who had provided informed consent. Ethics committee approval was granted by National Research
Ethics Service Committee London (reference 11/LO/0185). Access to biobank samples was obtained [reference
(EC/21.16)]. All analyses were performed in accordance with relevant guidelines and regulations.
Figure 4. Patient selection and methods of digital manual annotation. (A) Euler diagram representing patient
selection process for 30 patients for in-depth analysis. (B) NDP.View 2 image of scanned H&E slide of prostate
cores from transperineal biopsies, where nuclei are shown in blue, and other structures in pink. From left to
right, MCCL measurement in a straight core of 8.5 mm. Approximate visual pathologist measurement marked
with a red line (7.76 mm). Following the axis of the core, three measurements in black of 2.53 mm, 2.11 mm and
4.48 mm for a total of 9.12 mm for the digital measurement. (C) Three prostate cores, areas with cancer were
contoured in yellow, areas with Gleason 4 were contoured in black. Close up of contours shown in black box.
Non-contoured areas correspond to benign prostatic tissue. (D) Equation used to derive percentage Gleason 4.
Received: 30 April 2020; Accepted: 28 August 2020
1. Rubin, M. A., Girelli, G. & Demichelis, F. Genomic correlates to the newly proposed grading prognostic groups for prostate cancer. Eur. Urol. 69, 557–560 (2016).
2. Sowalsky, A. G. et al. Gleason score 7 prostate cancers emerge through branched evolution of clonal Gleason pattern 3 and 4. Clin. Cancer Res. 23, 3823–3833 (2017).
3. Pierorazio, P. M., Walsh, P. C., Partin, A. W., Epstein, J. I. & Epstein, J. Prognostic Gleason grade grouping: data based on the modified Gleason scoring system. BJU Int. 111, 753–760 (2013).
4. Epstein, J. I. et al. A contemporary prostate cancer grading system: a validated alternative to the Gleason score. Eur. Urol. 69,
5. Epstein, J. I. et al. The 2014 International Society of Urological Pathology (ISUP) consensus conference on Gleason grading of prostatic carcinoma: definition of grading patterns and proposal for a new grading system. Am. J. Surg. Pathol. 40, 244–252 (2016).
6. Humphrey, P. A., Moch, H., Cubilla, A. L., Ulbright, T. M. & Reuter, V. E. The 2016 WHO classification of tumours of the urinary system and male genital organs - Part B: prostate and bladder tumours. Eur. Urol. 70, 106–119 (2016).
7. Huang, C. C. et al. Gleason score 3+4=7 prostate cancer with minimal quantity of Gleason pattern 4 on needle biopsy is associated with low-risk tumor in radical prostatectomy specimen. Am. J. Surg. Pathol. 38, 1096–1101 (2014).
8. Sato, S. et al. Cases having a Gleason Score 3+4=7 with <5% of Gleason pattern 4 in prostate needle biopsy show similar failure-free survival and adverse pathology prevalence to Gleason Score 6 cases in a radical prostatectomy cohort. Am. J. Surg. Pathol. 43,
9. Sauter, G. et al. Clinical utility of quantitative Gleason grading in prostate biopsies and prostatectomy specimens. Eur. Urol. 69,
10. Cole, A. I. et al. Prognostic value of percent Gleason grade 4 at prostate biopsy in predicting prostatectomy pathology and recurrence. J. Urol. 196, 405–411 (2016).
11. Stark, J. R. et al. Gleason score and lethal prostate cancer: does 3 + 4 = 4 + 3? J. Clin. Oncol. 27, 3459–3464 (2009).
12. Berney, D. M. et al. The percentage of high grade disease in prostate biopsies significantly improves on grade groups in prediction of prostate cancer death. Histopathology 75(4), 589–597 (2019).
13. Ahmed, H. U. et al. Characterizing clinically significant prostate cancer using template prostate mapping biopsy. J. Urol. 186,
14. Simopoulos, D. N. et al. Cancer core length from targeted biopsy: an index of prostate cancer volume and pathological stage. BJU Int. https://doi.org/10.1111/bju.14691 (2019).
15. Ahmed, H. U. et al. Diagnostic accuracy of multi-parametric MRI and TRUS biopsy in prostate cancer (PROMIS): a paired validating confirmatory study. Lancet 389, 815–822 (2017).
16. Bland, J. M. & Altman, D. G. Applying the right statistics: analyses of measurement studies. Ultrasound Obstet. Gynecol. 22, 85–93
17. Sharma, M. & Miyamoto, H. Percent Gleason pattern 4 in stratifying the prognosis of patients with intermediate-risk prostate cancer. Transl. Androl. Urol. 7, S484–S489 (2018).
18. de Souza, M. F., de Azevedo Araujo, A. L. C., da Silva, M. T. & Athanazio, D. A. The Gleason pattern 4 in radical prostatectomy specimens in current practice - quantification, morphology and concordance with biopsy. Ann. Diagn. Pathol. 34, 13–17 (2018).
19. Berg, K. D., Roder, M. A., Brasso, K., Vainer, B. & Iversen, P. Primary Gleason pattern in biopsy Gleason score 7 is predictive of adverse histopathological features and biochemical failure following radical prostatectomy. Scand. J. Urol. 48, 168–176 (2014).
20. Helpap, B. et al. The significance of accurate determination of Gleason score for therapeutic options and prognosis of prostate cancer. Pathol. Oncol. Res. 22, 349–356 (2016).
21. Miyake, H. et al. Prognostic significance of primary Gleason pattern in Japanese men with Gleason score 7 prostate cancer treated with radical prostatectomy. Urol. Oncol. Semin. Orig. Invest. 31, 1511–1516 (2013).
22. Khoddami, S. M. et al. Predictive value of primary Gleason pattern 4 in patients with Gleason score 7 tumours treated with radical prostatectomy. BJU Int. 94, 42–46 (2004).
23. Esteva, A. et al. A guide to deep learning in healthcare. Nat. Med. 25(1), 24–29 (2019).
24. Ström, P. et al. Artificial intelligence for diagnosis and grading of prostate cancer in biopsies: a population-based, diagnostic study. Lancet Oncol. https://doi.org/10.1016/S1470-2045(19)30738-7 (2020).
25. Nir, G. et al. Automatic grading of prostate cancer in digitized histopathology images: learning from multiple experts. Med. Image Anal. 50, 167–180 (2018).
26. Campanella, G. et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat. Med. 25, 1301–1309 (2019).
27. Arvaniti, E. et al. Automated Gleason grading of prostate cancer tissue microarrays via deep learning. Sci. Rep. 8, 12054 (2018).
28. Nagpal, K. et al. Development and validation of a deep learning algorithm for improving Gleason scoring of prostate cancer. npj Digit. Med. 2, 1–10 (2019).
29. Rosenfeld, A., Graham, D. G., Hamoudi, R., Butawan, R., Eneh, V., Khan, S. et al. MIAT: a novel attribute selection approach to better predict upper gastrointestinal cancer. in Proceedings of the 2015 IEEE International Conference on Data Science and Advanced Analytics, DSAA 2015 (Institute of Electrical and Electronics Engineers Inc., 2015). https://doi.org/10.1109/DSAA.2015.7344866.
30. Team RC. R: A Language and Environment for Statistical Computing (2019). https://www.r-project.org/.
31. Datta, D. blandr: A Bland–Altman Method Comparison Package for R (2017). https://doi.org/10.5281/zenodo.824514.
L.M.C.E., H.C.W., H.U.A., and M.E. conceived and designed the study. L.M.C.E., U.S., A.H., A.F., C.C.B. collected
the data. L.M.C.E. analysed the data with guidance from Y.H., A.R. and L.B. All authors were involved in writing
the paper and had final approval of the submitted and published versions.
Ahmed currently receives funding from the Wellcome Trust, Prostate Cancer UK, Medical Research Council
(UK), Cancer Research UK, The Urology Foundation, BMA Foundation, Imperial Healthcare Charity, Sonacare
Inc., Trod Medical and Sophiris Biocorp for trials in prostate cancer. Ahmed is a paid medical consultant for
Sophiris Biocorp, Sonacare Inc., BTG and Boston for trials work and proctoring. Emberton receives funding from
NIHR-i4i, MRC, Cancer Research UK, Sonacare Inc., and Sophiris Biocorp for trials in prostate cancer. Emberton
is a medical consultant to Sonacare Inc., Sophiris Biocorp, Steba Biotech, Exact Imaging and Profound Medical.
Ahmed and Emberton are proctors for HIFU and paid for training other surgeons in this procedure. Ahmed is
a proctor for cryotherapy using the Galil/BTG system. Emberton is a proctor for Irreversible Electroporation
(Nanoknife). The rest of the authors have no conflict of interest.
Supplementary information is available for this paper at https://doi.org/10.1038/s41598-020-73524-z.
Correspondence and requests for materials should be addressed to L.M.C.E.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
Open Access This article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons licence, and indicate if changes were made. The images or other third party material in this
article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the
material. If material is not included in the article's Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2020