Background: The International Ovarian Tumour Analysis (IOTA) group have developed the ADNEX (The Assessment of Different NEoplasias in the adneXa) model to predict the risk that an ovarian mass is benign, borderline, stage I, stages II–IV or metastatic. We aimed to externally validate the ADNEX model in the hands of examiners with varied training and experience.
Methods: This was a multicentre cross-sectional cohort study for diagnostic accuracy. Patients were recruited from three cancer centres in Europe. Patients who underwent transvaginal ultrasonography and had a histological diagnosis of surgically removed tissue were included. The diagnostic performance of the ADNEX model with and without the use of CA125 as a predictor was calculated.
Results: Data from 610 women were analysed. The overall prevalence of malignancy was 30%. The area under the receiver operator curve (AUC) for the ADNEX diagnostic performance to differentiate between benign and malignant masses was 0.937 (95% CI: 0.915–0.954) when CA125 was included, and 0.925 (95% CI: 0.902–0.943) when CA125 was excluded. The calibration plots suggest good correspondence between the total predicted risk of malignancy and the observed proportion of malignancies. The model showed good discrimination between the different subtypes.
Conclusions: The ADNEX model retains its performance on external validation in the hands of ultrasound examiners with varied training and experience.
British Journal of Cancer advance online publication, 2 August 2016; doi:10.1038/bjc.2016.227 www.bjcancer.com.
Evaluating the risk of ovarian cancer before
surgery using the ADNEX model: a
multicentre external validation study
A Sayasneh*(1,2,10), L Ferrara(3,4,10), B De Cock(5), S Saso(3), M Al-Memar(3), S Johnson(6), J Kaijser(7), J Carvalho(3), R Husicka(3), A Smith(8), C Stalder(3), MC Blanco(4), G Ettore(4), B Van Calster(5), D Timmerman(5,9) and T Bourne(1,3,5)
1 Department of Surgery and Cancer, Hammersmith Campus, Imperial College London, Du Cane Road, London W12 0HS, UK;
2 Department of Obstetrics and Gynaecology, Guy’s and St Thomas’ Hospital, Westminster Bridge Road, London SE1 7EH, UK;
3 Early Pregnancy and Acute Gynecology Unit, Queen Charlotte’s and Chelsea Hospital, Imperial College London, Du Cane Road, London W12 0HS, UK;
4 Department of Obstetrics and Gynecology, Garibaldi Nesima Hospital, Via Palermo 636, Catania 95122, Italy;
5 KU Leuven, Department of Development and Regeneration, Herestraat 49, Box 805, Leuven 3000, Belgium;
6 Southampton University Hospitals, Princess Anne Hospital, Southampton SO16 5YA, UK;
7 Department of Obstetrics and Gynecology, Ikazia Ziekenhuis Rotterdam, Montessoriweg 1, Rotterdam 3083 AN, The Netherlands;
8 Ultrasound Scan Department, Queen Charlotte’s and Chelsea Hospital, Imperial College London, Du Cane Road, London W12 0HS, UK and
9 Department of Obstetrics and Gynecology, University Hospitals Leuven, Herestraat 49, Box 7003, 3000 Leuven, Belgium
*Correspondence: A Sayasneh; E-mail: a.sayasneh@imperial.ac.uk
10 These authors contributed equally to this work.
Received 13 December 2015; revised 4 June 2016; accepted 1 July 2016
© 2016 Cancer Research UK. All rights reserved 0007-0920/16
FULL PAPER
Keywords: diagnostic imaging; ovarian neoplasm; statistical models; ultrasonography
British Journal of Cancer (2016), 1–7 | doi: 10.1038/bjc.2016.227
Advance Online Publication: 2 August 2016
According to the latest statistics from the National Cancer Institute
in the United States, 12.1 per 100 000 women developed ovarian cancer
per year between 2008 and 2012, with a mortality of 7.7 per
100 000 women (Howlader et al, 2015). The overall 5-year survival
is estimated to be approximately 45.6% for all stages of the disease (Howlader
et al, 2015). However, for early localised ovarian cancers, the 5-year
survival exceeds 90% (Howlader et al, 2015). Early diagnosis and
centralised management are thought to be key
factors to optimise survival (Bristow et al, 2013, 2014; Howlader
et al, 2015). For early diagnosis, previous trials to evaluate ovarian
cancer screening have not been successful (Kobayashi et al, 2008;
Buys et al, 2011). However, recently, the United Kingdom
Collaborative Trial of Ovarian Cancer Screening (UKCTOCS)
showed that screening using the risk of ovarian cancer algorithm
(ROCA) doubled the number of detected primary invasive
epithelial ovarian or tubal cancers (iEOCs) compared with a fixed
cutoff of CA125 (Menon et al, 2015). The researchers also reported
a significant mortality reduction with annual multimodal screening
(MMS) when prevalent cases were excluded. However, the effect of
this mortality reduction on final ovarian cancer screening cost
effectiveness requires longer-term follow-up of the study patients
(Jacobs et al, 2015).
A further important aspect of clinical management is that an
accurate diagnosis is made when a woman presents with an
ovarian mass. This is essential if women with cancer are to be
referred to specialist oncology services. The International Ovarian
Tumour Analysis group (IOTA) have developed and validated
models and rules to characterise ovarian masses as benign or
malignant (Timmerman et al, 2005, 2010a, b; Van Holsbeke et al,
2012). These models and rules have also been validated in the
hands of less experienced (level II) ultrasound examiners (Sayasneh
et al, 2013a,b).
The IOTA group has developed the multiclass ADNEX
(The Assessment of Different NEoplasias in the adneXa) model
that can differentiate between benign tumours, borderline tumours,
early-stage primary cancers, late-stage primary cancers (stages II–
IV) and secondary metastatic cancers (Van Calster et al, 2014). The
ADNEX model is based on three clinical (including CA125) and six
ultrasound parameters (Van Calster et al, 2014), and also offers risk
calculation without CA125. The model was developed and
temporally validated using parameters collected by experienced
(or level III) ultrasound examiners, equivalent to a UK consultant
level with a special interest in gynaecological ultrasonography
(Education and Practical Standards Committee, European
Federation of Societies for Ultrasound in Medicine and Biology
(EFSUMB), 2006; Van Calster et al, 2014). This model should
make the management of ovarian masses more efficient, as it
allows patients to be triaged to the correct management pathway,
whether for conservative follow-up, surgery at a general gynaecology
unit or management at high-volume specialised cancer centres.
Correctly classifying the subtype of malignancy is also of critical
importance as borderline ovarian tumours and early-stage ovarian
cancers can be treated less aggressively, leading to the possibility of
fertility preservation in younger women (Hennessy et al, 2009; Darai
et al, 2013). On the other hand, metastatic ovarian cancers should be
managed according to the origin of the primary cancer (Hennessy
et al, 2009).
The primary aim of this project was to externally validate the
ADNEX model. The secondary aim was to assess the performance of
the model by level II examiners with varied training (nonconsultant
doctors (MDs) and sonographers) (Education and Practical
Standards Committee, European Federation of Societies for
Ultrasound in Medicine and Biology (EFSUMB), 2006; Van
Calster et al, 2014). We hypothesised that the discriminatory
performance of ADNEX would be retained, that is, it would be
similar to the validation performance in the original ADNEX study.
MATERIALS AND METHODS
Setting and design. This was a multicentre cross-sectional cohort
study for diagnostic accuracy. Data were collected prospectively,
with the purpose of developing and validating ultrasound-based
prediction models from transvaginal ultrasound examinations
performed by level II ultrasound examiners (nonconsultant
gynaecology specialists, gynaecology trainee doctors and gynaecology
sonographers) (Education and Practical Standards Committee,
European Federation of Societies for Ultrasound in Medicine and
Biology (EFSUMB), 2006; The Royal College of Radiologists (RCR)
Board of the Faculty of Clinical Radiology, 2012). The ultrasound
examiners were blind to the results of the reference test, that is, the
final histological outcome or, in the event of cancer, the stage of the
disease. The ADNEX model was applied by a single investigator
(AS) using a dedicated Excel spreadsheet. Patients were recruited
from three cancer centres (Queen Charlotte’s and Chelsea Hospital
(QCCH), London, UK; Princess Anne Hospital (PAH), Southampton,
UK; and Garibaldi Nesima Hospital (GNH), Catania,
Italy). The study was approved as a service evaluation audit at the
UK centres and as a validation study by the hospital authority at
the Italian centre. The guidelines of the TRIPOD (Transparent
Reporting of a multivariable prediction model for Individual
Prognosis or Diagnosis) initiative were used (Collins et al, 2015).
Patients were recruited consecutively from September 2010 to
November 2014 at QCCH, from May 2012 to May 2014 at PAH
and from September 2012 to February 2015 at GNH. Patients at
QCCH and PAH were also recruited to the IOTA 4 study
(Sayasneh et al, 2013a,b). Transvaginal ultrasonography was
performed using the standardised approach previously published
by the IOTA group (Timmerman et al, 2000, 2010b). Transabdominal
ultrasonography was undertaken when a large mass could
not be fully evaluated transvaginally (Timmerman et al, 2010b).
Participants and data collection. The inclusion criteria were
patients presenting with at least one adnexal mass who underwent
transvaginal ultrasonography at one of the participating centres.
For bilateral adnexal masses, the mass with the most complex
ultrasound features was included (Timmerman et al, 2000, 2010b).
If both masses had similar ultrasound morphology, the largest
mass or the one most easily accessible by ultrasonography was
included (Timmerman et al, 2010b).
The exclusion criteria were (1) pregnancy, (2) patients examined
by a consultant, (3) refusal of transvaginal ultrasonography, (4)
cytology rather than histology as an outcome and (5) failure to
undergo surgery within 120 days of the ultrasound examination. At
PAH, 8 cases were included in the final analysis, although they had
the ultrasound examination more than 120 days before surgery.
These cases underwent a CT scan within 120 days, confirming the
persistent presence of the mass.
The NHS Caldicott report guidelines were followed in all steps
of data handling (Great Britain; Department of Health, 1997).
At QCCH and GNH, a secure electronic data collection system was
used (Astraia Software, Munich, Germany). A unique identifier
was generated automatically for each patient’s record. Dedicated
data collection forms and Excel sheets were used at PAH. Serum
CA125 was measured at the clinician’s discretion or according to clinical
practice in each centre, using the Abbott Architect CA125 II (Abbott
Park, IL, USA) immunoassay kit at QCCH and GNH, and the UniCel
DxI Immunoassay System (Beckman Coulter Inc., Brea, CA, USA)
assay at PAH.
The ADNEX model. The ADNEX model contains three clinical
and six ultrasound predictors: age (in years), serum CA125 level
(U ml−1), type of centre (oncology centres vs other hospitals),
maximum diameter of lesion (in mm), proportion of solid tissue,
more than 10 cyst locules (yes or no), number of papillary
projections (0, 1, 2, 3 or >3), acoustic shadows (yes or no) and
ascites (yes or no) (Van Calster et al, 2014). Oncology centres were
defined as ‘tertiary referral centres with a specific gynaecology
oncology unit’. The proportion of solid tissue is obtained as the
ratio of the maximum diameter of the largest solid component and
the maximum diameter of the lesion. The ADNEX model is
available online and in mobile applications (www.iotagroup.org/
adnexmodel/) (Van Calster et al, 2014). The ADNEX model can
still be calculated without including the serum CA125 value. In this
study we calculated the performance of the ADNEX model with
and without CA125. The temporal validation of the model with
CA125 in the original paper yielded an area under the receiver
operator curve (AUC) of 0.943 (0.934–0.952) to discriminate
benign from malignant tumours. The model without CA125 had
an AUC of 0.932 (0.922–0.941). Validation AUCs between all pairs
of the five categories varied between 0.71 (stage I cancer vs
secondary metastatic cancer) and 0.99 (benign tumours vs late
stage primary cancer). We applied the model exactly as presented
in the original publication, that is, without any changes to the
model formula or coefficients.
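
As a minimal illustration of the inputs described above, the sketch below collects the nine ADNEX predictors in one record and derives the proportion of solid tissue as the stated ratio of diameters. The class name, the field names and the coding of ">3" papillary projections as 4 are hypothetical choices made for this example. The model formula and coefficients are deliberately not reproduced, as they are not given in this paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdnexPredictors:
    """Hypothetical container for the nine ADNEX predictors (names are illustrative)."""
    age_years: float
    ca125_u_per_ml: Optional[float]   # None when serum CA125 was not measured
    oncology_centre: bool             # True for tertiary referral oncology centres
    max_lesion_diameter_mm: float
    max_solid_diameter_mm: float      # 0 when the lesion has no solid component
    more_than_10_locules: bool
    papillary_projections: int        # 0, 1, 2, 3, or 4 standing for "more than 3" (assumed coding)
    acoustic_shadows: bool
    ascites: bool

    def proportion_solid(self) -> float:
        """Largest solid component diameter divided by the maximum lesion diameter."""
        if self.max_lesion_diameter_mm <= 0:
            raise ValueError("lesion diameter must be positive")
        return self.max_solid_diameter_mm / self.max_lesion_diameter_mm
```

For a 72 mm lesion whose largest solid component measures 36 mm, `proportion_solid()` gives 0.5. Leaving `ca125_u_per_ml` empty corresponds to using the model variant without CA125.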
Reference tests. The reference standard was the histopathological
diagnosis of the mass after surgical removal. The excised tissues
underwent histological examination at the local centre. Tumours
were classified according to the WHO (World Health Organisation)
classification of tumours and malignant tumours were staged
according to the FIGO (International Federation of Gynaecology
and Obstetrics) criteria (Tavassoli et al, 2003; Heintz et al, 2006).
Histological classification was performed without knowledge of the
ADNEX results or clinical and ultrasound findings for the patient.
The final diagnosis was categorised into five types: benign,
borderline, stage I invasive, stage II–IV invasive and secondary
metastatic cancer.
Statistical analysis. There were missing values for serum CA125
and for the presence of >10 cyst locules (loc10). Missing values
were handled differently for serum CA125 and loc10. The number
of missing values for the latter variable was small (3%), and hence
these were dealt with using single stochastic imputation based on
logistic regression. Missing loc10 values were predicted by a logistic
regression model with Firth correction with the following
predictors: age, maximum diameter of the lesion, proportion of
solid tissue, number of papillations, presence of acoustic shadows,
ascites, type of ovarian tumour and type of operator. The missing
serum CA125 values were handled with multiple stochastic
imputation using predictive mean matching regression. As the
distribution of serum CA125 was heavily skewed, the log–log
transformation of CA125 was used (i.e., log(log(CA125))). In this
imputation model, age, maximum diameter of the lesion,
proportion of solid tissue, loc10, number of papillations, presence
of acoustic shadows, ascites, type of ovarian tumour, hospital and
operator type were used as predictors. Using this approach, the
missing values were replaced by 100 plausible values, leading to
100 completed data sets. Imputed values were back transformed to
the original scale. For the ADNEX model with CA125, each of the
100 completed data sets was analysed separately and their results
combined using Rubin’s Rules (Rubin, 1987).
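
The two numerical steps in this paragraph, the log–log transformation of CA125 and the pooling of per-data-set results, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the pooling shown is the standard form of Rubin's rules (mean of the estimates, with total variance equal to the within-imputation variance plus (1 + 1/m) times the between-imputation variance), and the scale on which each statistic was pooled in the study is not specified in this section.

```python
import math
from statistics import mean, variance


def loglog(ca125: float) -> float:
    """log(log(CA125)); only defined for CA125 values greater than 1."""
    return math.log(math.log(ca125))


def inverse_loglog(value: float) -> float:
    """Back-transform an imputed value to the original CA125 scale."""
    return math.exp(math.exp(value))


def pool_rubin(estimates, within_variances):
    """Combine m per-data-set estimates and their variances with Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(within_variances)
    between = variance(estimates)            # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total
```

In the setting described above, `estimates` would hold 100 values, one per completed data set. The square root of the returned total variance is the pooled standard error.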
External validation of the ADNEX model with and without
CA125 was performed by evaluating discrimination and calibration
performance. The AUC was calculated for the basic discrimination
between benign and malignant tumours using the total risk of
malignancy (i.e., the sum of the estimated risks of the four
malignant subtypes). The 95% confidence intervals for differences
in AUCs were computed based on 1000 bootstrap samples, where
for each bootstrap sample the same patients were selected across
the imputed data sets (Musoro et al, 2014). In addition, AUCs were
computed for each pair of tumour types using the conditional risk
method (Van Calster et al, 2012b). Finally, the polytomous
discrimination index was calculated (Van Calster et al, 2012a) that
estimates the average proportion of correctly classified patients by
the model when presented with five patients, one with each tumour
type. Sensitivity and specificity were calculated using a 1%, 5%,
10%, 15%, 20% and 30% cutoff denoting the total risk of
malignancy. Calibration of the predicted probabilities was assessed
through use of calibration plots that show the relation between the
observed and predicted probabilities for malignant tumours. The
calibration curve was estimated by using a loess smoother (Van
Calster et al, 2016).
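
The discrimination measures described above can be sketched in a few lines. The code below is an illustrative sketch, not the analysis code used in the study: the function names and the class labels in the dictionary are hypothetical, the AUC is computed by direct pairwise comparison (equivalent to the Mann–Whitney statistic), and a mass is classed as malignant when its total risk is at or above the cutoff.

```python
def total_malignancy_risk(risks):
    """Sum the four malignant-subtype risks from a dict of five class risks."""
    return sum(value for label, value in risks.items() if label != "benign")


def auc(scores_malignant, scores_benign):
    """Probability that a malignant case scores above a benign one (ties count 0.5)."""
    wins = 0.0
    for m in scores_malignant:
        for b in scores_benign:
            wins += 1.0 if m > b else 0.5 if m == b else 0.0
    return wins / (len(scores_malignant) * len(scores_benign))


def sens_spec(scores_malignant, scores_benign, cutoff):
    """Sensitivity and specificity when risk >= cutoff is called malignant."""
    true_pos = sum(s >= cutoff for s in scores_malignant)
    true_neg = sum(s < cutoff for s in scores_benign)
    return true_pos / len(scores_malignant), true_neg / len(scores_benign)
```

With multiply imputed CA125 values, these quantities would be computed in each completed data set and then pooled, which is why reported values need not correspond to whole patient counts.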
RESULTS
During the study period, 751 women underwent ultrasonography
by level II examiners (one associate specialist in gynaecology,
12 resident gynaecology trainees and 29 sonographers) for a pelvic
mass and went through the surgical management pathway.
Of these, 141 women were excluded from the final analysis for
the following reasons: 65 women were examined by a consultant,
26 women had no histology result (14 only cytology, 12 no
cytology or histology), 24 women had surgery >120 days from the
characterising ultrasound scan, 15 women were pregnant, 5 women
only had a transabdominal scan, 5 women had no surgery
performed (declined or were not medically fit) and finally
1 woman who had a recurrence of cervical cancer in the pelvis a
few years after radical hysterectomy and underwent a bilateral
salpingo-oophorectomy was excluded as the tumour was not
considered adnexal. Supplementary Table 1 presents exclusions for
each centre. In the final analysis, 610 women were included
(Supplementary Figure 1). Of these patients, 142 (23%) had a
missing CA125 level and 17 (3%) had a missing value for loc10.
Supplementary Table 2 presents the numbers of missing values for
each of the study centres. The prevalence of malignancy was 30%
(n=182), with 33% for QCCH, 32% for PAH and 19% for GNH.
There were 42 (7%) borderline tumours, 47 (8%) stage I primary
ovarian cancers, 69 (11%) stage II–IV primary ovarian cancers and
24 (4%) secondary metastatic cancers (see Supplementary Table 3
for a breakdown per centre). The median age was 47 years with
352 (58%) premenopausal and 258 (42%) postmenopausal women.
Table 1 shows descriptive statistics of the ADNEX predictors per
tumour subtype. Supplementary Tables 4–6 show descriptive
statistics per centre.
The calibration plots suggest good correspondence between the
total predicted risk of malignancy and the observed proportion of
malignant tumours, both for the ADNEX model with and without
CA125 (Figure 1).
The AUC to differentiate between benign and malignant masses
was 0.937 (95% CI: 0.915–0.954) for ADNEX with CA125 and
0.925 (95% CI: 0.902–0.943) for ADNEX without CA125 (Figure 2
and Table 2). The model with CA125 showed slightly better
performance (AUC difference: 0.012, 95% CI: 0.006–0.020). At risk
cutoffs of 1%, 10% and 30%, sensitivities were 100%, 97% and 86%
for ADNEX with CA125 (Table 3). Corresponding specificities
were 12%, 68% and 84%. As in the original study, centre
differences were observed with centre-specific AUCs for ADNEX
with CA125 that varied from 0.90 for PAH to 0.99 for GNH
(Table 2). The AUC was higher for premenopausal women (0.94)
than for postmenopausal women (0.90) (Table 2): 0.939 vs 0.899
for the model with CA125 (difference 0.04, 95% CI 0.009 to
0.084) and 0.935 vs 0.873 for the model without CA125 (difference
0.062, 95% CI 0.012 to 0.116).
When tumours were classified into benign, borderline, stage I
invasive, stage II–IV invasive and secondary metastatic, the
model showed good discrimination between the different subtypes
(Table 4). For example, discrimination between benign and
stage II–IV tumours was near perfect for the model with CA125
(AUC 0.99). In comparison, the model had most difficulties
discriminating between borderline and stage I tumours
(AUC 0.78), though its performance was still good. The model
without CA125 mainly had lower AUCs for stage II–IV tumours
vs other groups, in particular vs secondary metastatic cancers
(AUC 0.88 for model with CA125, AUC 0.77 for model without
CA125). The polytomous discrimination index (PDI) was 0.58 for
ADNEX with CA125 and 0.52 for ADNEX without CA125
(Table 4), whereas PDI for random performance would be 0.20 for
five categories.
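
To make the pairwise AUCs and the PDI reference value concrete, the sketch below shows one way to compute an AUC for a pair of tumour types from multiclass risks. It assumes that the conditional risk method rescales the two relevant risks so that they sum to one, and that only patients whose true diagnosis is one of the two types enter the comparison. The function names and data layout are hypothetical. The PDI itself is not implemented here; only its chance level of one divided by the number of categories is shown.

```python
def conditional_risk(risks, class_a, class_b):
    """Risk of class_a given that the tumour is either class_a or class_b."""
    return risks[class_a] / (risks[class_a] + risks[class_b])


def pairwise_auc(patients, class_a, class_b):
    """AUC for class_a vs class_b; patients is a list of (true_class, risks) pairs."""
    scores_a = [conditional_risk(r, class_a, class_b) for t, r in patients if t == class_a]
    scores_b = [conditional_risk(r, class_a, class_b) for t, r in patients if t == class_b]
    wins = 0.0
    for a in scores_a:
        for b in scores_b:
            wins += 1.0 if a > b else 0.5 if a == b else 0.0
    return wins / (len(scores_a) * len(scores_b))


def random_pdi(n_classes):
    """PDI expected from a model that assigns risks at random."""
    return 1.0 / n_classes
```

For the five ADNEX categories, `random_pdi(5)` returns 0.20, the chance level quoted above.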
DISCUSSION
In this study, we have shown that in the hands of level II
ultrasound examiners, the ADNEX model was able to discriminate
between benign and malignant masses with a very similar level of
performance to that achieved by experienced ultrasound examiners
in the original ADNEX temporal validation study published by the
IOTA group (Van Calster et al, 2014). In our external validation
study using a 10% cutoff to define malignancy, the ADNEX model
achieved a sensitivity of 97.3% and a specificity of 67.7% compared
with 96.5% and 71.3% in the original study (Van Calster et al,
2014). The optimal cutoff for selecting patients for conservative
management may vary (e.g., between 1 and 5%) depending on the
health-care system, cost of surgery and surgical risk factors
(age, previous medical and surgical history). However, as this study
only included patients who underwent surgical management, we
cannot conclude which cutoff is optimal for conservative
management. This will be investigated in the IOTA5 study
(https://clinicaltrials.gov/ct2/show/NCT01698632). In contrast, in
a tertiary centre it may be preferable to have a lower false positive
rate, and a cutoff value of 30% may be more appropriate (Van
Calster et al, 2015).
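
As a rough check on what a cutoff means in practice, the reported sensitivity and specificity can be converted back into approximate patient counts. The sketch below does this for ADNEX with CA125 at the 10% cutoff, using the 182 malignant and 428 benign masses in this study. The function name is hypothetical, and the result is approximate because the reported percentages are rounded and were pooled across imputed data sets.

```python
def expected_counts(n_malignant, n_benign, sensitivity, specificity):
    """Approximate test-positive counts implied by a sensitivity/specificity pair."""
    true_pos = sensitivity * n_malignant
    false_pos = (1.0 - specificity) * n_benign
    return true_pos, false_pos


# Values reported in this study for ADNEX with CA125 at the 10% cutoff.
tp, fp = expected_counts(182, 428, 0.973, 0.677)
flagged = tp + fp  # patients with a predicted risk at or above the cutoff
```

The total comes to roughly 315 patients, in line with the number of patients at or above the 10% cutoff given in Table 3. About 138 of them would be benign masses classed as malignant.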
To the best of our knowledge, this is the first external validation
study of the IOTA ADNEX model. Furthermore, the validation
was carried out by level II ultrasound examiners, whereas in the
previous IOTA development and temporal validation study (Van
Calster et al, 2014), the ultrasound scan parameters were collected
by experienced level III examiners. A strength of our study is that it
is multicentre, and as it includes level II examiners with varied
training and experience (sonographers and medical doctors), we
think the performance of the ADNEX model in this study is likely
to be generalisable. Another strength of our study is the robust
selection of the reference test, as only cases with a histological
outcome were included. However, this may also be seen as a
weakness in relation to the potential performance of the ADNEX
model for masses that are selected for conservative management as
these were not included in the study. This is an issue that applies to
most, if not all, of the diagnostic research carried out to date on
ovarian masses. The previously mentioned IOTA 5 study should
give us useful information on the diagnostic performance of
ADNEX and the long-term behaviour of these masses.
A potential limitation is the use of different assay kits for serum
CA125 measurements; however, the inconsistency in CA125 levels
Table 1. Descriptive information about the patients and masses included in the study according to tumour subtype
All patients | Statistic | Benign (n=428) | Borderline (n=42) | Stage I OC (n=47) | Stage II–IV OC (n=69) | Secondary metastasis (n=24)
Age, years | Median (IQR) | 43 (31–55) | 47 (30–56) | 57 (48–68) | 62 (53–72) | 55 (49–69)
CA125, IU l−1 | Median (IQR) | 20 (12–39) | 28 (21–64) | 92 (35–209) | 485 (136–1083) | 66 (33–129)
Max lesion diameter, mm | Median (IQR) | 72 (51–95) | 128 (91–174) | 146 (109–180) | 110 (76–140) | 90 (73–135)
Presence of solid parts | N (%) | 142 (33%) | 30 (71%) | 46 (98%) | 69 (100%) | 22 (92%)
Proportion of solid tissue, if present | Median (IQR) | 0.36 (0.18–0.78) | 0.37 (0.19–0.47) | 0.43 (0.30–0.67) | 0.59 (0.41–1.00) | 1.00 (0.58–1.00)
More than 10 locules | N (%) | 31 (7%) | 14 (33%) | 13 (28%) | 11 (16%) | 7 (29%)
Number of papillations: 0 | N (%) | 371 (87%) | 26 (62%) | 33 (70%) | 52 (75%) | 21 (88%)
Number of papillations: 1 | N (%) | 31 (7%) | 6 (14%) | 1 (2%) | 8 (12%) | 0 (0%)
Number of papillations: 2 | N (%) | 12 (3%) | 2 (5%) | 5 (11%) | 1 (1%) | 2 (8%)
Number of papillations: 3 | N (%) | 3 (1%) | 2 (5%) | 2 (4%) | 0 (0%) | 1 (4%)
Number of papillations: >3 | N (%) | 11 (3%) | 6 (14%) | 6 (13%) | 8 (12%) | 0 (0%)
Acoustic shadows | N (%) | 94 (22%) | 0 (0%) | 6 (13%) | 1 (1%) | 1 (4%)
Ascites | N (%) | 6 (1%) | 1 (2%) | 3 (6%) | 23 (33%) | 7 (29%)
Abbreviations: CA125 = cancer antigen 125; IQR = interquartile range; OC = ovarian cancer.
Figure 1. (A) Calibration plot for the ADNEX model with serum CA125. (B) Calibration plot for the ADNEX model without serum CA125. [Each panel plots the observed proportion (y-axis) against the predicted probability (x-axis), showing the ideal line and the flexible calibration (loess) curve, with benign and malignant cases marked.]
resulting from this is thought to be limited (Davelaar et al, 1998).
Furthermore, the variety of CA125 assay kits used in the study
reflects clinical reality, which again means that the results are more
likely to be reproducible (Van Calster et al, 2014). A further
possible limitation of the study is that all three participating
hospitals were referral centres for gynaecological cancers, resulting
in there being a relatively high prevalence of malignant disease in
the study population. Accordingly, it is possible that our findings
may have limitations when trying to predict test performance
either in primary care or secondary gynaecology units. However, it
should be noted that in the original ADNEX study the prevalence
of malignancy ranged from 0 to 66% in the 24 participating centres
(Van Calster et al, 2014), and hence this makes it more likely that
results will be generalisable. Furthermore, ADNEX explicitly
corrects its prediction for type of centre (oncology centres vs
other centres). In this sense, the potential for selection bias is
accounted for by the model.
Finally, having no centralised histopathology review in our
study may have led to bias. For example, distinguishing borderline
tumours from benign tumours or even stage I cancer may be
challenging for pathologists, where disagreement can occur and
this may give inaccurate diagnostic performance results for the
ADNEX model in these cases (Van Calster et al, 2014). However,
as all the histopathology departments involved in this study were
tertiary referral centres for gynaecological cancers, in the event of a
discrepancy (including discrepancies in the referring units) a local
review at the tertiary centre would have been held to resolve the
disagreement. Furthermore, centralised review of pathology was
discontinued in IOTA studies as it was shown in initial studies that
there were minimal differences between local and central reports
(Timmerman et al, 2005).
It is worth noting that we have observed variation in the
ADNEX performance between centres that is comparable to the
one observed in the original IOTA validation study (Van Calster
et al, 2014). This variation could be explained by the differences in
Table 3. The overall sensitivity and specificity (benign vs malignant) of the ADNEX model with and without the inclusion of serum CA125
Cutoff | Patients with risk ≥ cutoff, N (%) | Sensitivity with 95% CI | Specificity with 95% CI
ADNEX with CA125
1% | 559 (91.6%) | 100.0% (97.4–100.0) | 11.9% (9.1–15.5)
3% | 479 (78.5%) | 100.0% (97.4–100.0) | 30.6% (26.3–35.3)
5% | 383 (62.8%) | 99.0% (94.9–99.8) | 53.2% (48.2–58.1)
10% | 315 (51.6%) | 97.3% (93.5–98.9) | 67.7% (63.0–72.0)
15% | 281 (46.1%) | 94.4% (90.0–97.0) | 75.2% (70.7–79.2)
20% | 253 (41.5%) | 90.6% (85.2–94.1) | 79.3% (75.1–83.0)
30% | 226 (37.0%) | 86.3% (80.4–90.6) | 83.9% (80.1–87.2)
ADNEX without CA125
1% | 557 (91.3%) | 100.0% (97.4–100.0) | 12.4% (9.5–16.0)
3% | 490 (80.3%) | 100.0% (97.4–100.0) | 28.0% (23.9–32.6)
5% | 374 (61.3%) | 98.9% (95.7–99.7) | 54.7% (49.9–59.3)
10% | 317 (52.0%) | 96.7% (92.9–98.5) | 67.1% (62.5–71.3)
15% | 289 (47.4%) | 94.5% (90.1–97.0) | 72.7% (68.2–76.7)
20% | 261 (42.8%) | 90.7% (85.5–94.1) | 77.6% (73.4–81.3)
30% | 225 (36.9%) | 84.6% (78.6–89.2) | 83.4% (80.0–86.6)
Abbreviations: ADNEX = The Assessment of Different NEoplasias in the adneXa; CA125 = cancer antigen 125; CI = confidence interval. When using a 1% or 3% cutoff, confidence limits are calculated through use of Wilson’s score confidence interval method with continuity correction (Newcombe, 1998). For the other cutoffs, confidence limits are calculated using logistic regression to combine results after multiple imputation.
Figure 2. Receiver operating curves for the ADNEX model with and without serum CA125 levels to discriminate between benign and malignant masses. [Plot of sensitivity (y-axis) against 1-specificity (x-axis) for ADNEX with CA125 (AUC=0.937) and ADNEX without CA125 (AUC=0.925), with the 1%, 3%, 5%, 10%, 15%, 20% and 30% risk cutoffs marked on each curve.]
Table 4. Pairwise AUCs and PDI of the ADNEX model with and without serum CA125
Discrimination measure | ADNEX with CA125 | ADNEX without CA125
Polytomous discrimination index (PDI) | 0.59 | 0.52
AUC benign vs borderline | 0.88 | 0.88
AUC benign vs stage I OC | 0.95 | 0.94
AUC benign vs stage II–IV OC | 0.99 | 0.97
AUC benign vs secondary metastasis | 0.96 | 0.95
AUC borderline vs stage I OC | 0.78 | 0.78
AUC borderline vs stage II–IV OC | 0.94 | 0.91
AUC borderline vs secondary metastasis | 0.92 | 0.93
AUC stage I OC vs stage II–IV OC | 0.83 | 0.79
AUC stage I OC vs secondary metastasis | 0.81 | 0.83
AUC stage II–IV OC vs secondary metastasis | 0.88 | 0.77
Abbreviations: ADNEX = The Assessment of Different NEoplasias in the adneXa; AUC = area under the receiver operating curve; CA125 = cancer antigen 125; OC = ovarian cancer.
Table 2. The area under the receiver operator curve for the discrimination between benign and malignant lesions for ADNEX with and without CA125 according to type of centre and sonographer
Subgroup | ADNEX with CA125: AUC | 95% CI | ADNEX without CA125: AUC | 95% CI
All patients | 0.937 | 0.915–0.954 | 0.925 | 0.902–0.943
Centre
QCCH | 0.942 | 0.913–0.962 | 0.931 | 0.900–0.953
PAH | 0.900 | 0.841–0.938 | 0.889 | 0.828–0.930
GNH | 0.990 | 0.959–0.998 | 0.983 | 0.950–0.995
Operator profession
MD | 0.939 | 0.917–0.956 | 0.924 | 0.900–0.943
Sonographer | 0.912 | 0.809–0.962 | 0.916 | 0.818–0.964
Menopausal status
Premenopausal | 0.939 | 0.901–0.963 | 0.935 | 0.901–0.958
Postmenopausal | 0.899 | 0.855–0.931 | 0.873 | 0.824–0.910
Abbreviations: ADNEX = The Assessment of Different NEoplasias in the adneXa; AUC = area under the receiver operating curve; CA125 = cancer antigen 125; CI = confidence interval; MD = medically qualified doctor; QCCH = Queen Charlotte’s and Chelsea Hospital; PAH = Princess Anne Hospital; GNH = Garibaldi Nesima Hospital.
the case mix between these centres with a higher number of
secondary metastatic cancers in PAH compared with QCCH and
GNH. It is important to investigate heterogeneity between centres,
but this data set is not ideal for that purpose, because doing so requires
a larger database drawn from a large number of centres.
In our study, the classification of the level of experience of the
ultrasound examiners (level II) was based on the recommendations
published by the European Federation of Societies for Ultrasound
in Medicine and Biology (Education and Practical Standards
Committee, European Federation of Societies for Ultrasound in
Medicine and Biology (EFSUMB), 2006) and by the Royal College
of Radiologists (The Royal College of Radiologists (RCR) Board of
the Faculty of Clinical Radiology, 2012). As guidance, a level III
examiner in the United Kingdom equates to a consultant with a
special interest in gynaecological ultrasonography (The Royal
College of Radiologists (RCR) Board of the Faculty of Clinical
Radiology, 2012). We acknowledge that this approach has
limitations, as some level II examiners may be as competent as
a level III examiner; indeed, the boundaries between these levels
are recognised as difficult to distinguish and may overlap
(The Royal College of Radiologists (RCR) Board of the Faculty of
Clinical Radiology, 2012). As in the earlier validation of the IOTA
model LR2 in the hands of level II examiners
(Sayasneh et al, 2013b), we found that the AUC for the ADNEX model
was slightly higher when the scans were performed by doctors
than when they were performed by sonographers (Table 2).
By characterising the type of malignancy (borderline, primary stage I
cancer, primary stage II–IV cancer or secondary metastatic), the
ADNEX model offers the possibility of a more personalised diagnosis in
the event of an ovarian mass. This may enable fertility-preserving
surgery in some women, help plan the most appropriate
surgical approach (laparoscopy or laparotomy) in others or direct
attention to the primary site of malignancy in the event of metastasis.
Although the ADNEX model gives absolute risks, relative risks
can be computed to compare each predicted risk with the background risk
for an individual patient (Van Calster et al, 2015). External validation is a
critical step for any diagnostic test before it can be introduced into
clinical practice. We have shown that the performance of the ADNEX
model is retained in units with different patient populations to the
original study, and that it performs well in the hands of examiners with
different levels of experience and background training. Our findings
suggest that the ADNEX model has the potential to improve
management decisions in daily clinical practice for women with
adnexal tumours.
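The relation between absolute and relative risks mentioned above can be sketched as a simple ratio: the relative risk for each tumour type is the predicted absolute risk divided by the background (baseline) risk for that type. The example below is a minimal illustration under that assumption. The baseline and predicted values are hypothetical numbers chosen for easy arithmetic, not figures from this study or from the ADNEX publications, and the function name is our own.

```python
def relative_risks(predicted, baseline):
    """Divide each predicted absolute risk by the baseline risk of that class."""
    return {outcome: predicted[outcome] / baseline[outcome] for outcome in predicted}


# Hypothetical baseline (background) risks; they sum to 1.
baseline_risk = {
    "benign": 0.70,
    "borderline": 0.05,
    "stage I": 0.05,
    "stage II-IV": 0.15,
    "metastatic": 0.05,
}

# Hypothetical model output for one patient; the risks also sum to 1.
predicted_risk = {
    "benign": 0.35,
    "borderline": 0.10,
    "stage I": 0.15,
    "stage II-IV": 0.30,
    "metastatic": 0.10,
}

if __name__ == "__main__":
    for outcome, rr in relative_risks(predicted_risk, baseline_risk).items():
        print(f"{outcome}: absolute {predicted_risk[outcome]:.2f}, relative {rr:.1f}")
```

In this invented case, a 15% absolute risk of stage I cancer looks modest on its own but is three times the assumed background risk, which is the kind of comparison a relative risk is meant to make visible.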
ACKNOWLEDGEMENTS
TB is supported by the National Institute for Health Research
(NIHR) Biomedical Research Centre based at Imperial College
Healthcare NHS Trust and Imperial College London. The views
expressed are those of the author(s) and not necessarily those of
the NHS, the NIHR or the Department of Health. DT is Senior
Clinical Investigator of the Research Foundation - Flanders
(Belgium) (FWO). Research was supported by FWO Grants
G049312N and G0B4716N and by Internal Funds KU Leuven
Grant C24/15/037.
CONFLICT OF INTEREST
TB reports that clinical research in his department (QCCH,
Imperial College London Healthcare NHS Trust) is supported by
Samsung Medison and Roche Diagnostics. The remaining authors
declare no conflict of interest.
REFERENCES
Bristow RE, Chang J, Ziogas A, Anton-Culver H (2013) Adherence to
treatment guidelines for ovarian cancer as a measure of quality care. Obstet
Gynecol 121(6): 1226–1234.
Bristow RE, Chang J, Ziogas A, Randall LM, Anton-Culver H (2014) High-
volume ovarian cancer care: survival impact and disparities in access for
advanced-stage disease. Gynecol Oncol 132(2): 403–410.
Buys SS, Partridge E, Black A, Johnson CC, Lamerato L, Isaacs C, Reding DJ,
Greenlee RT, Yokochi LA, Kessel B, Crawford ED, Church TR,
Andriole GL, Weissfeld JL, Fouad MN, Chia D, O’Brien B, Ragard LR,
Clapp JD, Rathmell JM, Riley TL, Hartge P, Pinsky PF, Zhu CS,
Izmirlian G, Kramer BS, Miller AB, Xu JL, Prorok PC, Gohagan JK,
Berg CD. PLCO Project Team (2011) Effect of screening on ovarian cancer
mortality: the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer
Screening Randomized Controlled Trial. JAMA 305(22): 2295–2303.
Collins GS, Reitsma JB, Altman DG, Moons KG (2015) Transparent reporting
of a multivariable prediction model for individual prognosis or diagnosis
(TRIPOD): the TRIPOD statement. Brit J Obstet Gynaecol 122(3): 434–443.
Darai E, Fauvet R, Uzan C, Gouy S, Duvillard P, Morice P (2013) Fertility
and borderline ovarian tumor: a systematic review of conservative
management, risk of recurrence and alternative options. Hum Reprod
Update 19(2): 151–166.
Davelaar EM, van Kamp GJ, Verstraeten RA, Kenemans P (1998) Comparison
of seven immunoassays for the quantification of CA 125 antigen in serum.
Clin Chem 44(7): 1417–1422.
Department of Health (1997) The Caldicott Committee Report on the Review
of Patient-Identifiable Information. Department of Health: Great Britain.
Education and Practical Standards Committee, European Federation of
Societies for Ultrasound in Medicine and Biology (EFSUMB) (2006)
Minimum training recommendations for the practice of medical
ultrasound. Ultraschall Med 27(1): 79–105.
Heintz AP, Odicino F, Maisonneuve P, Quinn MA, Benedet JL, Creasman WT,
Ngan HY, Pecorelli S, Beller U (2006) Carcinoma of the ovary. FIGO 26th
Annual Report on the Results of Treatment in Gynecological Cancer. Int J
Gynaecol Obstet 95(Suppl 1): S161–S192.
Hennessy BT, Coleman RL, Markman M (2009) Ovarian cancer. Lancet
374(9698): 1371–1382.
Howlader N, Noone AM, Krapcho M, Garshell J, Miller D, Altekruse SF,
Kosary CL, Yu M, Ruhl J, Tatalovich Z, Mariotto A, Lewis DR, Chen HS,
Feuer EJ, Cronin KA (2015) SEER Cancer Statistics Review, 1975-2012.
Vol. 2015. National Cancer Institute: Bethesda, MD.
Jacobs IJ, Menon U, Ryan A, Gentry-Maharaj A, Burnell M, Kalsi JK, Amso
NN, Apostolidou S, Benjamin E, Cruickshank D, Crump DN, Davies SK,
Dawnay A, Dobbs S, Fletcher G, Ford J, Godfrey K, Gunu R, Habib M,
Hallett R, Herod J, Jenkins H, Karpinskyj C, Leeson S, Lewis SJ, Liston
WR, Lopes A, Mould T, Murdoch J, Oram D, Rabideau DJ, Reynolds K,
Scott I, Seif MW, Sharma A, Singh N, Taylor J, Warburton F,
Widschwendter M, Williamson K, Woolas R, Fallowfield L, McGuire AJ,
Campbell S, Parmar M, Skates SJ (2015) Ovarian cancer screening
and mortality in the UK Collaborative Trial of Ovarian Cancer
Screening (UKCTOCS): a randomised controlled trial. Lancet 387:
945–956.
Kobayashi H, Yamada Y, Sado T, Sakata M, Yoshida S, Kawaguchi R,
Kanayama S, Shigetomi H, Haruta S, Tsuji Y, Ueda S, Kitanaka T (2008)
A randomized study of screening for ovarian cancer: a multicenter study
in Japan. Int J Gynecol Cancer 18(3): 414–420.
Menon U, Ryan A, Kalsi J, Gentry-Maharaj A, Dawnay A, Habib M,
Apostolidou S, Singh N, Benjamin E, Burnell M, Davies S, Sharma A,
Gunu R, Godfrey K, Lopes A, Oram D, Herod J, Williamson K, Seif MW,
Jenkins H, Mould T, Woolas R, Murdoch JB, Dobbs S, Amso NN,
Leeson S, Cruickshank D, Scott I, Fallowfield L, Widschwendter M,
Reynolds K, McGuire A, Campbell S, Parmar M, Skates SJ, Jacobs I (2015)
Risk algorithm using serial biomarker measurements doubles the number
of screen-detected cancers compared with a single-threshold rule in the
United Kingdom Collaborative Trial of Ovarian Cancer Screening. J Clinic
Oncol 33(18): 2062–2071.
Musoro JZ, Zwinderman AH, Puhan MA, Ter Riet G, Geskus RB (2014)
Validation of prediction models based on lasso regression with multiply
imputed data. BMC Med Res Methodol 14(1): 116.
Newcombe RG (1998) Two-sided confidence intervals for the single
proportion comparison of seven methods. Stat Med 17(8): 857–872.
Rubin DB (1987) Multiple Imputation for Nonresponse in Surveys. John Wiley
& Sons, Inc.: Hoboken, NJ, USA.
Sayasneh A, Kaijser J, Preisler J, Johnson S, Stalder C, Husicka R, Guha S,
Naji O, Abdallah Y, Raslan F, Drought A, Smith AA, Fotopoulou C,
Ghaem-Maghami S, Van Calster B, Timmerman D, Bourne T (2013a)
A multicenter prospective external validation of the diagnostic
performance of IOTA simple descriptors and rules to characterize ovarian
masses. Gynecol Oncol 130(1): 140–146.
Sayasneh A, Wynants L, Preisler J, Kaijser J, Johnson S, Stalder C, Husicka R,
Abdallah Y, Raslan F, Drought A, Smith AA, Ghaem-Maghami S, Epstein E,
Van Calster B, Timmerman D, Bourne T (2013b) Multicentre external
validation of IOTA prediction models and RMI by operators with varied
training. Br J Cancer 108(12): 2448–2454.
Tavassoli FA, Devilee P. International Agency for Research on Cancer (2003)
Pathology and Genetics of Tumours of the Breast and Female Genital
Organs. International Agency for Research on Cancer: Lyon.
The Royal College of Radiologists (RCR) Board of the Faculty of Clinical
Radiology (2012) Ultrasound Training Recommendations for Medical and
Surgical Specialties. London. Available at https://www.rcr.ac.uk/sites/
default/files/publication/BFCR(12)17_ultrasound_training.pdf (last
accessed June 2015).
Timmerman D, Ameye L, Fischerova D, Epstein E, Melis GB, Guerriero S,
Holsbeke CV, Savelli L, Fruscio R, Lissoni AA, Testa AC, Veldman J,
Vergote I, Huffel SV, Bourne T, Valentin L (2010a) Simple ultrasound
rules to distinguish between benign and malignant adnexal masses before
surgery: prospective validation by IOTA group. BMJ 341: c6839.
Timmerman D, Testa A, Bourne T, Ferrazzi E, Ameye L, Konstantinovic M,
Van Calster B, Collins W, Vergote I, Van Huffel S, Valentin L (2005)
A logistic regression model to distinguish between the benign and
malignant adnexal mass before surgery: a multicenter study by the
International Ovarian Tumor Analysis (IOTA) group. J Clin Oncol 23:
8794–8801.
Timmerman D, Valentin L, Bourne TH, Collins WP, Verrelst H, Vergote I.
International Ovarian Tumor Analysis (IOTA) Group (2000) Terms,
definitions and measurements to describe the sonographic features of
adnexal tumors: a consensus opinion from the International Ovarian
Tumor Analysis (IOTA) Group. Ultrasound Obstet Gynecol 16(5):
500–505.
Timmerman D, Van Calster B, Testa AC, Guerriero S, Fischerova D,
Lissoni AA, Van Holsbeke C, Fruscio R, Czekierdowski A, Jurkovic D,
Savelli L, Vergote I, Bourne T, Van Huffel S, Valentin L (2010b) Ovarian
cancer prediction in adnexal masses using ultrasound-based logistic
regression models: a temporal and external validation study by the IOTA
group. Ultrasound Obstet Gynecol 36(2): 226–234.
Van Calster B, Nieboer D, Vergouwe Y, De Cock B, Pencina M, Steyerberg
EW (2016) A calibration hierarchy for risk models was defined: from
utopia to empirical data. J Clin Epidemiol 74: 167–176.
Van Calster B, Van Belle V, Vergouwe Y, Timmerman D, Van Huffel S,
Steyerberg EW (2012a) Extending the c-statistic to nominal polytomous
outcomes: the polytomous discrimination index. Stat Med 31(23):
2610–2626.
Van Calster B, Van Hoorde K, Froyman W, Kaijser J, Wynants L, Landolfo C,
Anthoulakis C, Vergote I, Bourne T, Timmerman D (2015) Practical
guidance for applying the ADNEX model from the IOTA group to
discriminate between different subtypes of adnexal tumors. Facts Views
Vis ObGyn 7(1): 32–41.
Van Calster B, Van Hoorde K, Valentin L, Testa AC, Fischerova D,
Van Holsbeke C, Savelli L, Franchi D, Epstein E, Kaijser J, Van Belle V,
Czekierdowski A, Guerriero S, Fruscio R, Lanzani C, Scala F, Bourne T,
Timmerman D. International Ovarian Tumour Analysis Group (2014)
Evaluating the risk of ovarian cancer before surgery using the ADNEX
model to differentiate between benign, borderline, early and advanced
stage invasive, and secondary metastatic tumours: prospective multicentre
diagnostic study. BMJ 349: g5920.
Van Calster B, Vergouwe Y, Looman CW, Van Belle V, Timmerman D,
Steyerberg EW (2012b) Assessing the discriminative ability of risk
models for more than two outcome categories. Eur J Epidemiol 27(10):
761–770.
Van Holsbeke C, Van Calster B, Bourne T, Ajossa S, Testa AC, Guerriero S,
Fruscio R, Lissoni AA, Czekierdowski A, Savelli L, Van Huffel S,
Valentin L, Timmerman D (2012) External validation of diagnostic
models to estimate the risk of malignancy in adnexal masses. Clin Cancer
Res 18(3): 815–825.
This work is published under the standard license to publish agreement.
After 12 months the work will become freely available and
the license terms will switch to a Creative Commons
Attribution-NonCommercial-Share Alike 4.0 Unported License.
Supplementary Information accompanies this paper on British Journal of Cancer website (http://www.nature.com/bjc)
... Our research found that the proportion of papillary in the ovarian cancer group was much more significant than in the benign group. According to Sayasneh et al.' while the rate of malignant tumors with papillary was 38% in borderline and 30% in stage I cancers (Sayasneh et al., 2016). IOTA's report showed that 14% were benign tumors and 30.2% were malignant tumors that had papillary on ultrasound. ...
... Our findings were similar to those of others, with rates of solid components of 11% and 87% in benign and malignant tumors, respectively, but lower than those of Sayasneh et al. This difference could be related to the fact that the author Sayasneh's study was conducted in 3 European oncology cancer with a greater sample size than ours (Sayasneh et al., 2016). ...
... The difference in the area under ROC in the model with and without CA125 was low. This difference was not significant in our study, and it was similar to the studies of Van Calster and A Sayasneh (Van Calster et al., 2015;Sayasneh et al., 2016). ...
Article
Full-text available
Objective: This study aimed to assess the effectiveness and determine the optimal cut-off point of the ADNEX model in women presenting with a pelvic or adnexal tumor. Method: All women presented with adnexal mass and were scheduled for operation at Hue University of Medicine and Pharmacy Hospital and Hue Central Hospital, Vietnam during June 2019 – May 2021 were included and categorized according to their histopathologic reports into ovarian cancer groups and benign ovarian tumor groups. Multivariable logistic regression was used to explore for potential predictors. The ADNEX model with and without CA125 was used to assess the risk of ovarian cancer preoperative. The goldden standard to evaluate the accuracy of ultrasonography using the ADNEX model was the pathological report. In addition, the accuracy as well as optimum cut-off point of the ADNEX model was estimated with and without CA125. Results: A total of 461 participants were included in analysis and predictive model development, 65 patients in ovarian cancer group and 361 in benign tumor group. The ADNEX model combined with CA125 proved to be a useful predictor with an area under ROC of 0.961 (0.940 – 0.977) with Youden’s index of 0.8395, p < 0.001. The ADNEX model without CA125 also had high predictive value between benign and malignant tumors, with an area under ROC of 0.956 (0.933 – 0.973) with Youden’s index of 0.8551, p < 0.001. Cut-off of the ADNEX with CA125 was 13.5 and without CA125 was 13.1 for sensitivities were 90.8 (81.0 – 96.5) and 93.9 (85.0 – 97.5), specificities 93.2 (90.2 – 95.5) and 91.67 (88.5 – 94.2). The difference in the predictive value of malignancy-risk between the ADNEX model with CA125, without CA125 was not statistically significant, p=0.4883. Conclusion: The ADNEX model, with or without the combining marker CA 125, provides a valuable predictive value for ovarian tumor malignancy preoperative.
... In comparison to the IOTA ADNEX model, the simple rules of IOTA and non-IOTA models perform poorly when it comes to identifying BOTs and Stage I OCs (43,44). Consistent with previous studies (44,45), the IOTA ADNEX performed excellently in terms of detecting most types of adnexal masses in this work (an AUC of 0.697 to 0.977 was observed). Nevertheless, the model performed poorly at distinguishing between an ovarian borderline tumor and a Stage I OC (AUC, 0.758), between an ovarian borderline and a metastatic tumor (AUC, 0.773), between a Stage I OC and a metastatic tumor (AUC, 0.710), between a Stage I OC and a Stage II-IV OC (AUC, 0.734), and between a Stage II-IV OC and a metastatic tumor (AUC, 0.697). ...
Article
Full-text available
Objective This work was designed to investigate the performance of the International Ovarian Tumor Analysis (IOTA) ADNEX (Assessment of Different NEoplasias in the adneXa) model combined with human epithelial protein 4 (HE4) for early ovarian cancer (OC) detection. Methods A total of 376 women who were hospitalized and operated on in Women and Children’s Hospital of Chongqing Medical University were selected. Ultrasonographic images, cancer antigen-125 (CA 125) levels, and HE4 levels were obtained. All cases were analyzed and the histopathological diagnosis serves as the reference standard. Based on the IOTA ADNEX model post-processing software, the risk prediction value was calculated. We analyzed receiver operating characteristic curves to determine whether the IOTA ADNEX model alone or combined with HE4 provided better diagnostic accuracy. Results The area under the curve (AUC) of the ADNEX model alone or combined with HE4 in predicting benign and malignant ovarian tumors was 0.914 (95% CI, 0.881–0.941) and 0.916 (95% CI, 0.883–0.942), respectively. With the cutoff risk of 10%, the ADNEX model had a sensitivity of 0.93 (95% CI, 0.87–0.97) and a specificity of 0.73 (95% CI, 0.67–0.78), while combined with HE4, it had a sensitivity of 0.90 (95% CI, 0.84–0.95) and a specificity of 0.81 (95% CI, 0.76–0.86). The IOTA ADNEX model combined with HE4 was better at improving the accuracy of the differential diagnosis between different OCs than the IOTA ADNEX model alone. A significant difference was found in separating borderline masses from Stage II–IV OC ( p = 0.0257). Conclusions A combination of the IOTA ADNEX model and HE4 can improve the specificity of diagnosis of ovarian benign and malignant tumors and increase the sensitivity and effectiveness of the differential diagnosis of Stage II–IV OC and borderline tumors.
... This limits our ability to use these systems during pregnancy. [22][23][24] Recently, Andreotti et al. published consensus guidelines for the ovarianadnexal reporting and data system (O-RADS US). To eliminate ambiguity, increase diagnostic accuracy and provide consistent interpretations and management recommendations, this working group stratified the risk of malignancy into six categories (0-5) based on reliable predictive descriptors obtained from prospective studies using IOTA (5905 patients, 24 centers, 10 countries): maximum diameter, external contour, inner margin, acoustic shadows, solid component, papillary projections, hyperechoic components, color score, fluid descriptors and other. ...
Article
Full-text available
This review summarizes the evidence-based recommendations for how to approach and laparoscopically treat adnexal masses during pregnancy. We conducted a comprehensive review of studies related to the laparoscopic management of adnexal masses during pregnancy. Selected studies were independently reviewed by two authors. The overall incidence of ovarian tumors in pregnancy ranges between 0.05% and 5.7%, of which less than 5% are malignant. Diagnosis is based mainly on routine transvaginal ultrasound. More than 64% of simple cysts, less than 6 cm in diameter, will spontaneously resolve in less than 16 weeks. However, for persistent and complex tumors, the risk of acute complications can reach up to 9%. Surgical indications are similar to those in the non-gravidic setting, and include acute complications (torsion, rupture, hemorrhage), suspected malignancy and large (over 6 cm) persistent masses. Surgery must be scheduled between 16 and 20 weeks to allow for the spontaneous resolution of functional cysts. Furthermore, within that period, pregnancy becomes independent of the corpus luteum and enlargement of the uterus gives sufficient exposure for the surgery to be performed safely. A recent meta-analysis found that, compared to open surgery, laparoscopy is associated with significantly less preterm labor, blood loss and hospital stay, without differences in pregnancy loss or preterm birth rate. Since the main concerns about maternal-fetal safety are related to increased intraperitoneal pressure and the effects of hypercarbia (maternal hypertensive complications, fetal acidosis), a lower CO2 pressure (10 to 12 mmHg) and reduced operative times (less than 30 minutes) are recommended.
... Notably, IOTA-LR1 and LR2 models have shown good diagnostic performance, with reported AUCs of 0.96 and 0.95 respectively, higher than those obtained for tumor-marker based models in our cohort [45]. ADNEX model combines ultrasound features with CA125, and has a sensitivity of 96.5%, a specificity of 71.3%, and a 0.94 AUC [46]. The ESGO/ISUOG/IOTA/ESGE Consensus Statement on Preoperative Diagnosis of Ovarian Tumors reports that ultrasound-based diagnostic models (IOTA simple rules or ADNEX) are preferable to CA125 level, HE4 level or ROMA [28]. ...
Article
Full-text available
(1) OBJECTIVE: To assess the performance of CA125, HE4, ROMA index and CPH-I index to preoperatively identify epithelial ovarian cancer (EOC) or metastatic cancer in the ovary (MCO). (2) METHODS: single center retrospective study, including women with a diagnosis of adnexal mass. We obtained the AUC, sensitivity, specificity and predictive values were of HE4, CA125, ROMA and CPH-I for the diagnosis of EOC and MCO. Subgroup analysis for women harboring adnexal masses with inconclusive diagnosis of malignancy by ultrasound features and Stage I EOC was performed. (3) RESULTS: 1071 patients were included, 852 (79.6%) presented benign/borderline tumors and 219 (20.4%) presented EOC/MCO. AUC for HE4 was higher than for CA125 (0.91 vs. 0.87). No differences were seen between AUC of ROMA and CPH-I, but they were both higher than HE4 AUC. None of the tumor markers alone achieved a sensitivity of 90%; HE4 was highly specific (93.5%). ROMA showed a sensitivity and specificity of 91.1% and 84.6% respectively, while CPH-I showed a sensitivity of 91.1% with 79.2% specificity. For patients with inconclusive diagnosis of malignancy by ultrasound features and with Stage I EOC, ROMA showed the best diagnostic performance (4) CONCLUSIONS: ROMA and CPH-I perform better than tumor markers alone to identify patients harboring EOC or MCO. They can be helpful to assess the risk of malignancy of adnexal masses, especially in cases where ultrasonographic diagnosis is challenging (stage I EOC, inconclusive diagnosis of malignancy by ultrasound features).
Article
Objectives In patients with an ovarian mass, a risk of malignancy assessment is used to decide whether referral to an oncology hospital is indicated. Risk assessment strategies do not perform optimally, resulting in either referral of patients with a benign mass or patients with a malignant mass not being referred. This process may affect the psychological well-being of patients. We evaluated cancer-specific distress during work-up for an ovarian mass, and patients’ perceptions during work-up, referral, and treatment. Methods Patients with an ovarian mass scheduled for surgery were enrolled. Using questionnaires we measured (1) cancer-specific distress using the cancer worry scale, (2) patients’ preferences regarding referral (evaluated pre-operatively), and (3) patients’ experiences with work-up and treatment (evaluated post-operatively). A cancer worry scale score of ≥14 was considered as clinically significant cancer-specific distress. Results A total of 417 patients were included, of whom 220 (53%) were treated at a general hospital and 197 (47%) at an oncology hospital. Overall, 57% had a cancer worry scale score of ≥14 and this was higher in referred patients (69%) than in patients treated at a general hospital (43%). 53% of the patients stated that the cancer risk should not be higher than 25% to undergo surgery at a general hospital. 96% of all patients were satisfied with the overall work-up and treatment. No difference in satisfaction was observed between patients correctly (not) referred and patients incorrectly (not) referred. Conclusions Relatively many patients with an ovarian mass experienced high cancer-specific distress during work-up. Nevertheless, patients were satisfied with the treatment, regardless of the final diagnosis and the location of treatment. Moreover, patients preferred to be referred even if there was only a relatively low probability of having ovarian cancer. 
Patients’ preferences should be taken into account when deciding on optimal cut-offs for risk assessment strategies.
Article
Full-text available
Appropriate clinical management of adnexal masses requires a detailed diagnosis. We retrospectively collected ultrasound images of 1559 cases from the first Center of Chinese PLA General Hospital and developed a fully automatic deep learning (DL) model system to diagnose adnexal masses. The DL system contained five models: a detector, a mass segmentor, a papillary segmentor, a type classifier, and a pathological subtype classifier. To test the DL system, 462 cases from another two hospitals were recruited. The DL system identified benign, borderline, and malignant tumors with macro-F1 scores that varied from 0.684 to 0.791, a benefit to preventing both delayed and overextensive treatment. The macro-F1 scores of the pathological subtype classifier to categorize the benign masses varied from 0.714 to 0.831. The detailed classification can inform clinicians of the corresponding complications of each pathological subtype of benign tumors. The distinguishment between borderline and malignant tumors and inflammation from other subtypes of benign tumors need further study. The accuracy and sensitivity of the DL system were comparable to that of the expert and intermediate sonographers and exceeded that of the junior sonographer.
Article
Objective: Previous work suggested that the ultrasound-based benign Simple Descriptors can reliably exclude malignancy in a large proportion of women presenting with an adnexal mass. We aim to validate a modified version of the Benign Simple Descriptors (BD), and we introduce a two-step strategy to estimate the risk of malignancy: if the BDs do not apply, the ADNEX model is used to estimate the risk of malignancy. Methods: This is a retrospective analysis using the data from the 2-year interim analysis of the IOTA5 study, in which consecutive patients with at least one adnexal mass were recruited irrespective of subsequent management (conservative or surgery). The main outcome was classification of tumors as benign or malignant, based on histology or on clinical and ultrasound information during one year of follow-up. Multiple imputation was used when outcome based on follow-up was uncertain according to predefined criteria. Results: 8519 patients were recruited at 36 centers between 2012 and 2015. We included all masses that were not already in follow-up at recruitment from 17 centers with good quality surgical and follow-up data, leaving 4905 patients for statistical analysis. 3441 (70%) tumors were benign, 978 (20%) malignant, and 486 (10%) uncertain. The BDs were applicable in 1798/4905 (37%) tumors, and 1786 (99.3%) of these were benign. The two-step strategy based on ADNEX without CA125 had an area under the receiver operating characteristic curve (AUC) of 0.94 (95% CI, 0.91-0.95). The risk of malignancy was slightly underestimated, but calibration varied between centers. A sensitivity analysis in which we expanded the definition of uncertain outcome resulted in 1419 (29%) tumors with uncertain outcome and an AUC of the two-step strategy without CA125 of 0.93 (95% CI, 0.91-0.95). Conclusion: A large proportion of adnexal masses can be classified as benign by the BDs. For the remaining masses the ADNEX model can be used to estimate the risk of malignancy. 
This two-step strategy is convenient for clinical use. This article is protected by copyright. All rights reserved.
Article
Objective: This study aimed to compare the ability of the O-RADS and ADNEX models to classify benign or malignant adnexal lesions. Methods: This retrospective single-center study included women who underwent surgery for adnexal lesions. Two gynecologists independently categorized the adnexal lesions according to the O-RADS and ADNEX models. Four additional readers were included to validate the new quick-access O-RADS flowchart. Results: Among the 322 patients included in this study, 264 (82.0%) had a benign diagnosis, and 58 (18.0%) had a malignant diagnosis. The malignant rates of O-RADS 2, O-RADS 3, O-RADS 4, and O-RADS 5 were 0%, 3.0%, 37.7%, and 78.9%, respectively. The AUC of the O-RADS in the 322 patients was 0.93. On comparing the O-RADS and ADNEX models in the remaining 281 patients, the AUCs of the O-RADS, ADNEX model with CA125, and ADNEX model without CA125 were 0.92, 0.95, and 0.94, respectively. When setting a uniform cutoff of ≥ 10% (≥ O-RADS 4) to predict malignancy, the O-RADS had higher sensitivity than the ADNEX model (96.6% vs. 91.4%), and relatively similar specificity. In addition, the readers with the quick-access flowchart spent less time categorizing O-RADS than the readers with only the original O-RADS table (mean analysis time: 99 min 15 s vs. 111 min 55 s). Conclusions: The O-RADS classification of the adnexal lesions as benign or malignant was comparable to that of the ADNEX model and had higher sensitivity at the 10% cutoff value. A quick-access O-RADS flowchart was helpful in O-RADS categorization and might shorten the analysis time. Key points: • Both O-RADS and ADNEX models had good diagnostic performance in distinguishing adnexal malignancy, and O-RADS had higher sensitivity than ADNEX model in uniform 10% cutoff to predict malignancy. • Quick-access O-RADS flowchart was developed to help review O-RADS classification and might help reduce the analysis time.
Article
To evaluate the accuracy of the assessment of different neoplasias in the adnexa (ADNEX) model in the differential diagnosis of malignant and benign ovarian tumors, the optimal cutoff value and the accuracy in diagnosing ovarian tumors at different stages, PubMed, Web of Science and Cochrane Library databases were retrieved to search literature with per-patient analysis until publication of the last study in November 2021. STATA 14.1, Meta-Disc 1.4 and Revman software 5.3 were used in the performance of meta-analysis. To explore sources of heterogeneity, a subgroup analysis was conducted for the ADNEX model. The pooled sensitivity, specificity, diagnostic odds ratio, positive likelihood, negative likelihood ratio and area under the summary receiver operating characteristic curve were 0.91 (95% confidence interval [CI]: 0.89–0.93), 0.84 (95% CI: 0.80–0.88), 55.55 (95% CI: 40.47–76.26), 5.71 (95% CI: 4.49–7.26), 0.10 (95% CI: 0.08–0.13) and 0.94 (95% CI: 0.92–0.96) in differentiating benign and malignant ovarian tumors, respectively. The area under the curve in identifying benign, borderline, stage I and stages II–IV were 0.93, 0.73, 0.27 and 0.92. The ADNEX model had high diagnostic performance was influential in the diagnosis of benign and stage II–IV ovarian tumors.
Article
Full-text available
Background: Ovarian cancer has a poor prognosis, with just 40% of patients surviving 5 years. We designed this trial to establish the effect of early detection by screening on ovarian cancer mortality. Methods: In this randomised controlled trial, we recruited postmenopausal women aged 50-74 years from 13 centres in National Health Service Trusts in England, Wales, and Northern Ireland. Exclusion criteria were previous bilateral oophorectomy or ovarian malignancy, increased risk of familial ovarian cancer, and active non-ovarian malignancy. The trial management system confirmed eligibility and randomly allocated participants in blocks of 32 using computer-generated random numbers to annual multimodal screening (MMS) with serum CA125 interpreted with use of the risk of ovarian cancer algorithm, annual transvaginal ultrasound screening (USS), or no screening, in a 1:1:2 ratio. The primary outcome was death due to ovarian cancer by Dec 31, 2014, comparing MMS and USS separately with no screening, ascertained by an outcomes committee masked to randomisation group. All analyses were by modified intention to screen, excluding the small number of women we discovered after randomisation to have a bilateral oophorectomy, have ovarian cancer, or had exited the registry before recruitment. Investigators and participants were aware of screening type. This trial is registered with ClinicalTrials.gov, number NCT00058032. Findings: Between June 1, 2001, and Oct 21, 2005, we randomly allocated 202 638 women: 50 640 (25·0%) to MMS, 50 639 (25·0%) to USS, and 101 359 (50·0%) to no screening. 202 546 (>99·9%) women were eligible for analysis: 50 624 (>99·9%) women in the MMS group, 50 623 (>99·9%) in the USS group, and 101 299 (>99·9%) in the no screening group. Screening ended on Dec 31, 2011, and included 345 570 MMS and 327 775 USS annual screening episodes. 
At a median follow-up of 11·1 years (IQR 10·0-12·0), we diagnosed ovarian cancer in 1282 (0·6%) women: 338 (0·7%) in the MMS group, 314 (0·6%) in the USS group, and 630 (0·6%) in the no screening group. Of these women, 148 (0·29%) women in the MMS group, 154 (0·30%) in the USS group, and 347 (0·34%) in the no screening group had died of ovarian cancer. The primary analysis using a Cox proportional hazards model gave a mortality reduction over years 0-14 of 15% (95% CI -3 to 30; p=0·10) with MMS and 11% (-7 to 27; p=0·21) with USS. The Royston-Parmar flexible parametric model showed that in the MMS group, this mortality effect was made up of 8% (-20 to 31) in years 0-7 and 23% (1-46) in years 7-14, and in the USS group, of 2% (-27 to 26) in years 0-7 and 21% (-2 to 42) in years 7-14. A prespecified analysis of death from ovarian cancer of MMS versus no screening with exclusion of prevalent cases showed significantly different death rates (p=0·021), with an overall average mortality reduction of 20% (-2 to 40) and a reduction of 8% (-27 to 43) in years 0-7 and 28% (-3 to 49) in years 7-14 in favour of MMS. Interpretation: Although the mortality reduction was not significant in the primary analysis, we noted a significant mortality reduction with MMS when prevalent cases were excluded. We noted encouraging evidence of a mortality reduction in years 7-14, but further follow-up is needed before firm conclusions can be reached on the efficacy and cost-effectiveness of ovarian cancer screening. Funding: Medical Research Council, Cancer Research UK, Department of Health, The Eve Appeal.
Article
Cancer screening strategies have commonly adopted single-biomarker thresholds to identify abnormality. We investigated the impact of serial biomarker change interpreted through a risk algorithm on cancer detection rates. In the United Kingdom Collaborative Trial of Ovarian Cancer Screening, 46,237 women, age 50 years or older underwent incidence screening by using the multimodal strategy (MMS) in which annual serum cancer antigen 125 (CA-125) was interpreted with the risk of ovarian cancer algorithm (ROCA). Women were triaged by the ROCA: normal risk, returned to annual screening; intermediate risk, repeat CA-125; and elevated risk, repeat CA-125 and transvaginal ultrasound. Women with persistently increased risk were clinically evaluated. All participants were followed through national cancer and/or death registries. Performance characteristics of a single-threshold rule and the ROCA were compared by using receiver operating characteristic curves. After 296,911 women-years of annual incidence screening, 640 women underwent surgery. Of those, 133 had primary invasive epithelial ovarian or tubal cancers (iEOCs). In all, 22 interval iEOCs occurred within 1 year of screening, of which one was detected by ROCA but was managed conservatively after clinical assessment. The sensitivity and specificity of MMS for detection of iEOCs were 85.8% (95% CI, 79.3% to 90.9%) and 99.8% (95% CI, 99.8% to 99.8%), respectively, with 4.8 surgeries per iEOC. ROCA alone detected 87.1% (135 of 155) of the iEOCs. Using fixed CA-125 cutoffs at the last annual screen of more than 35, more than 30, and more than 22 U/mL would have identified 41.3% (64 of 155), 48.4% (75 of 155), and 66.5% (103 of 155), respectively. The area under the curve for ROCA (0.915) was significantly (P = .0027) higher than that for a single-threshold rule (0.869). Screening by using ROCA doubled the number of screen-detected iEOCs compared with a fixed cutoff. 
In the context of cancer screening, reliance on predefined single-threshold rules may result in biomarkers of value being discarded. © 2015 by American Society of Clinical Oncology.
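The detection percentages in this abstract can be reproduced directly from the counts it reports. The sketch below is only an arithmetic check; the dictionary keys and the helper `percent_detected` are illustrative names, and treating the >35 U/mL cutoff as the comparator for "doubled" is an assumption, since the abstract does not say which fixed cutoff is meant.

```python
TOTAL_IEOC = 155  # denominator used in the abstract's detection fractions

# Numbers of iEOCs identified by each rule, as reported in the abstract.
detected = {
    "ROCA": 135,
    "CA-125 > 35 U/mL": 64,
    "CA-125 > 30 U/mL": 75,
    "CA-125 > 22 U/mL": 103,
}


def percent_detected(n_detected: int, n_total: int) -> float:
    """Return the detection rate as a percentage rounded to one decimal place."""
    return round(100.0 * n_detected / n_total, 1)


rates = {rule: percent_detected(n, TOTAL_IEOC) for rule, n in detected.items()}

# Ratio of ROCA detections to detections at the >35 U/mL cutoff (assumed comparator).
roca_vs_35 = detected["ROCA"] / detected["CA-125 > 35 U/mL"]

for rule, rate in rates.items():
    print(f"{rule}: {rate}%")
print(f"ROCA vs > 35 U/mL: {roca_vs_35:.2f}x")
```

Under that assumption the ratio is about 2.1, which is consistent with the statement that ROCA doubled the number of screen-detected cancers.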
Article
All gynecologists are faced with ovarian tumors on a regular basis, and the accurate preoperative diagnosis of these masses is important because appropriate management depends on the type of tumor. Recently, the International Ovarian Tumor Analysis (IOTA) consortium published the Assessment of Different NEoplasias in the adneXa (ADNEX) model, the first risk model that differentiates between benign and four types of malignant ovarian tumors: borderline, stage I cancer, stage II-IV cancer, and secondary metastatic cancer. This approach is novel compared to existing tools that only differentiate between benign and malignant tumors, and therefore questions may arise on how ADNEX can be used in clinical practice. In the present paper, we first provide an in-depth discussion about the predictors used in ADNEX and the ability for risk prediction with different tumor histologies. Furthermore, we formulate suggestions about the selection and interpretation of risk cut-offs for patient stratification and choice of appropriate clinical management. This is illustrated with a few example patients. We cannot propose a generally applicable algorithm with fixed cut-offs, because (as with any risk model) this depends on the specific clinical setting in which the model will be used. Nevertheless, this paper provides guidance on how the ADNEX model may be adopted into clinical practice.
Article
Background: Prediction models are developed to aid healthcare providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision-making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. Materials and methods: The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) initiative developed a set of recommendations for the reporting of studies developing, validating or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, healthcare professionals and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. Results: The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. Conclusions: To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org).
Article
In prognostic studies, the lasso technique is attractive since it improves the quality of predictions by shrinking regression coefficients, compared to predictions based on a model fitted via unpenalized maximum likelihood. Since some coefficients are set to zero, parsimony is achieved as well. It is unclear whether the performance of a model fitted using the lasso still shows some optimism. Bootstrap methods have been advocated to quantify optimism and generalize model performance to new subjects. It is unclear how resampling should be performed in the presence of multiply imputed data. The data were based on a cohort of Chronic Obstructive Pulmonary Disease patients. We constructed models to predict Chronic Respiratory Questionnaire dyspnea 6 months ahead. Optimism of the lasso model was investigated by comparing 4 approaches of handling multiply imputed data in the bootstrap procedure, using the study data and simulated data sets. In the first 3 approaches, data sets that had been completed via multiple imputation (MI) were resampled, while the fourth approach resampled the incomplete data set and then performed MI. The discriminative model performance of the lasso was optimistic. There was suboptimal calibration due to over-shrinkage. The estimate of optimism was sensitive to the choice of handling imputed data in the bootstrap resampling procedure. Resampling the completed data sets underestimates optimism, especially if, within a bootstrap step, selected individuals differ over the imputed data sets. Incorporating the MI procedure in the validation yields estimates of optimism that are closer to the true value, albeit slightly too large. Performance of prognostic models constructed using the lasso technique can be optimistic as well. Results of the internal validation are sensitive to how bootstrap resampling is performed.
Article
Interest in screening for ovarian cancer, which is common in developed countries, has grown in recent years. This study, which seems to be the first prospective randomized report of ovarian cancer screening, was designed to establish a better strategy for detecting early cancers. Asymptomatic postmenopausal women, seen in the years 1985 to 1999, were randomly assigned to an intervention group (n = 41,688) or to a control group (n = 40,799) and were followed up for an average of 9.2 years. The original goal was to offer annual screens comprising pelvic ultrasonography and a serum cancer antigen 125 (CA-125) test to women in the intervention group. Those with abnormal ultrasound findings or elevated CA-125 were referred to a gynecologic oncologist for surgical assessment. In late 2002 when the code was broken, 27 cancers were found in screened women and 8 more were diagnosed outside the screening program. Rates of detecting ovarian cancer were 0.31 per 1000 at the prevalent screen and ranged from 0.38 to 0.74 per 1000 at subsequent screens. Rates increased on successive screens. Ovarian cancer developed in 32 control women. Fewer women in the screened group than in the control group had advanced-stage disease. The proportion of stage I ovarian cancer was higher in screened than in control women (63% vs. 38%), but the difference fell short of statistical significance. The histologic type of index cancers was similar in the screened and control groups, as were tumor grade and the use of adjuvant chemotherapy. More women in the screening group than in the control group had no disease or only microscopic disease, but this difference also was not statistically significant. Continuing follow-up of this cohort is expected to provide further information about the effects of screening for ovarian cancer on asymptomatic postmenopausal women.
Article
Objective: Calibrated risk models are vital for valid decision support. We define four levels of calibration and describe implications for model development and external validation of predictions. Study design and setting: We present results based on simulated datasets. Results: A common definition of calibration is "having an event rate of R % among patients with a predicted risk of R %", which we refer to as 'moderate calibration'. Weaker forms of calibration only require the average predicted risk (mean calibration) or the average prediction effects (weak calibration) to be correct. 'Strong calibration' requires that the event rate equals the predicted risk for every covariate pattern. This implies that the model is fully correct for the validation setting. We argue that this is unrealistic: the model type may be incorrect, at model development the linear predictor is only asymptotically unbiased, and all nonlinear and interaction effects should be correctly modeled. In addition, we prove that moderate calibration guarantees non-harmful decision-making. Finally, results indicate that a flexible assessment of calibration in small validation datasets is problematic. Conclusion: Strong calibration is desirable for individualized decision support, but unrealistic and counter-productive by stimulating the development of overly complex models. Model development and external validation should focus on moderate calibration.
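The two weaker calibration levels named in this abstract can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not the authors' code: mean calibration is checked by comparing the average predicted risk with the observed event rate, and weak calibration is checked by fitting a logistic recalibration model, logit P(y = 1) = a + b · logit(predicted risk), where a near 0 and b near 1 indicate adequate weak calibration. The function names, the simulated data and the joint Newton-Raphson fit of a and b are all choices made here for illustration.

```python
import math
import random


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def expit(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def mean_calibration(pred, y):
    """Return (mean predicted risk, observed event rate)."""
    return sum(pred) / len(pred), sum(y) / len(y)


def weak_calibration(pred, y, iterations=25):
    """Fit logit(P(y=1)) = a + b * logit(pred) by Newton-Raphson; return (a, b)."""
    x = [logit(p) for p in pred]
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = expit(a + b * xi)
            w = mu * (1.0 - mu)
            g0 += yi - mu            # gradient with respect to the intercept
            g1 += (yi - mu) * xi     # gradient with respect to the slope
            h00 += w                 # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det   # Newton step: inverse information times gradient
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Simulated validation set in which outcomes are drawn from the predicted risks,
# so the predictions are well calibrated by construction.
random.seed(0)
risks = [random.uniform(0.05, 0.95) for _ in range(5000)]
outcomes = [1 if random.random() < p else 0 for p in risks]

mean_pred, event_rate = mean_calibration(risks, outcomes)
intercept, slope = weak_calibration(risks, outcomes)

# Overconfident predictions: the same risks pushed towards 0 and 1 on the logit scale.
overconfident = [expit(2.0 * logit(p)) for p in risks]
intercept_over, slope_over = weak_calibration(overconfident, outcomes)

print(f"mean predicted risk {mean_pred:.3f}, observed event rate {event_rate:.3f}")
print(f"calibration intercept {intercept:.3f}, slope {slope:.3f}")
print(f"overconfident model: intercept {intercept_over:.3f}, slope {slope_over:.3f}")
```

Doubling the predictions on the logit scale halves the fitted slope, which is the usual signature of predictions that are too extreme. Moderate calibration, the level the abstract recommends as the target, would additionally require a flexible calibration curve and is not covered by this sketch.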
Article
Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). © 2015 Royal College of Obstetricians and Gynaecologists.