AI-based multi-modal integration of clinical
characteristics, lab tests and chest CTs improves
COVID-19 outcome prediction of hospitalized patients
Nathalie Lassau1,2, Samy Ammari1,2, Emilie Chouzenoux3, Hugo Gortais4, Paul
Herent5, Matthieu Devilder4, Samer Soliman4, Olivier Meyrignac2, Marie-Pauline
Talabard4, Jean-Philippe Lamarque1,2, Remy Dubois5, Nicolas Loiseau5, Paul
Trichelair5, Etienne Bendjebbar5, Gabriel Garcia1, Corinne Balleyguier1,2, Mansouria
Merad6, Annabelle Stoclin7, Simon Jegou5, Franck Griscelli8, Nicolas Tetelboum1,
Yingping Li2,3, Sagar Verma3, Matthieu Terris3, Tasnim Dardouri3, Kavya Gupta3,
Ana Neacsu3, Frank Chemouni7, Meriem Sefta5, Paul Jehanno5, Imad Bousaid9,
Yannick Boursin9, Emmanuel Planchet9, Mikael Azoulay9, Jocelyn Dachary5, Fabien
Brulport5, Adrian Gonzalez5, Olivier Dehaene5, Jean-Baptiste Schiratti5, Kathryn
Schutte5, Jean-Christophe Pesquet3, Hugues Talbot3, Elodie Pronier5, Gilles
Wainrib5, Thomas Clozel5, Fabrice Barlesi6, Marie-France Bellin2,4, Michael G. B.
Blum5*.
1. Imaging Department, Gustave Roussy, Université Paris-Saclay, Villejuif, F-94805, France
2. BioMaps, UMR 1281, INSERM, CEA, CNRS, Université Paris-Saclay, Villejuif, F-94805, France
3. Centre de Vision Numérique, Université Paris-Saclay, CentraleSupélec, Inria, 91190 Gif-sur-Yvette, France
4. Radiology Department, Hôpital de Bicêtre, AP-HP, Université Paris-Saclay, Le Kremlin-Bicêtre, France
5. Owkin Lab, Owkin, Inc., New York, NY, USA
6. Département d'Oncologie Médicale, Gustave Roussy, Université Paris-Saclay, Villejuif, F-94805, France
7. Département de Soins Intensifs, Gustave Roussy, Université Paris-Saclay, Villejuif, F-94805, France
8. Département de Biologie, Gustave Roussy, Université Paris-Saclay, Villejuif, F-94805, France
9. Direction de la Transformation Numérique et des Systèmes d'Information, Gustave Roussy, 94800 Villejuif, France
Corresponding author: michael.blum@owkin.com
With 15% of severe cases among hospitalized patients1, the SARS-CoV-2 pandemic
has put tremendous pressure on Intensive Care Units, and made the identification of
early predictors of future severity a public health priority. We collected
clinical and
biological data, as well as CT scan images and radiology reports from 1,003
coronavirus-infected patients from two French hospitals. Radiologists' manual CT
annotations were also available. We first identified 11 clinical variables and 3 types of
radiologist-reported features significantly associated with prognosis. Next, focusing
on the CT images, we trained deep learning models to automatically segment the
scans and reproduce radiologists' annotations. We also built CT image-based deep
learning models that predicted future severity better than models based on the
radiologists' scan reports. Finally, we showed that including CT scan features
alongside the clinical and biological data yielded more accurate predictions than
using clinical and biological data alone. These findings show that CT scans provide
insightful early predictors of future severity.
Previous studies have demonstrated that risk factors for severe evolution include
demographic variables such as age, comorbidities, and biological variables measured within
2 days of patient admission2–4. Beyond clinical and biological variables, computed
tomography (CT) scans are also potential sources of information: the degree of pulmonary
inflammation is associated with clinical symptoms and severity5,6, and the extent of lung
abnormality is predictive of severe disease evolution7,8. Here we evaluated to what extent
visual or AI-based analysis of CT scans at patient admission added information about future
severe disease evolution once clinical and biological data had been taken into account.
A total of 1,003 patients from Kremlin-Bicêtre (KB, Paris, France) and Gustave Roussy
(IGR, Villejuif, France) were enrolled in the study. Clinical, biological, and CT scan images
and reports were collected at hospital admission. Additionally, 292 CT scans were later
annotated manually by radiologists (see supplementary materials). Summary statistics for
the clinical, biological, and CT scan data are provided in Figure 1.
Figure 1: Population description for the KB and IGR hospitals. Among the 1,003 patients of the
study, biological and clinical variables were available for 989 individuals. Categorical variables are
expressed as percentages [number available]. Continuous variables are shown as median (IQR) [number
available]. Associations with severity are reported as p-values for each center, and pooled p-values
were obtained by combining the per-center p-values with Stouffer's method. The p-values shown are not
adjusted for multiplicity. Variables and pooled p-values are in bold when the variable remains
significant after Bonferroni adjustment for multiple testing across the 63 variables. For continuous
variables, odds ratios are computed for an increase of one standard deviation of the variable. KB odds
ratios are in blue, IGR odds ratios in red.
Figure 2: Axial chest CT scans and segmentation of COVID-19 radiology patterns, as provided by
AI-segment, for three patients with COVID-19. Green/transparent: healthy lung; blue: GGO; yellow: crazy
paving; red: consolidation. (Top) 67-year-old woman with diffuse distribution and multiple large regions
of subpleural GGO with consolidation in the right and left lower lobes. Estimated disease extent by AI:
69%/47% (right/left). Radiologist report: critical stage of COVID-19 (stage 5). (Middle) 56-year-old man
with diffuse distribution and multiple large regions of subpleural GGO with superimposed intralobular and
interlobular septal thickening (crazy paving). Estimated disease extent by AI: 51%/68% (right/left).
Radiologist report: severe stage of COVID-19 (stage 4). (Bottom) 70-year-old woman with minimal
impairment and multiple small regions of subpleural GGO with consolidation in the right lower lobe.
Estimated disease extent by AI: 13%/7% (left/right). Radiologist report: moderate stage of COVID-19
(stage 2).
Coronavirus progression is evaluated by the World Health Organization on a 1 to 10 scale,
with severe scores of 5 or more corresponding to an oxygen flow rate of 15 L/min or higher,
the need for mechanical ventilation, or patient death9. We first evaluated how clinical and
biological variables measured at admission were associated with future severe progression
(score of 5 or more). These variables were available for 989 individuals, and we computed
the severity odds ratio for each individual variable, at each hospital center (Figure 1).
When combining association results from the two centers, we found 11 variables significantly
associated with severity (P < 0.05/63 to account for testing 63 variables, Figure 1): age
(Odds Ratio [OR] KB 1.66 (1.41-1.96), OR IGR 1.32 (0.90-1.93), PStouffer = 5.75e-10), sex
(OR KB 1.95 (1.41-2.69), OR IGR 1.04 (0.50-2.15), PStouffer = 6.10e-05), hypertension
(OR KB 1.84 (1.35-2.51), OR IGR 1.09 (0.50-2.36), PStouffer = 1.15e-04), chronic kidney
disease (OR KB 2.51 (1.62-3.69), OR IGR 16.29 (1.89-140.12), PStouffer = 6.66e-06),
respiratory rate (OR KB 1.34 (1.13-1.59), OR IGR 3.37 (1.28-8.86), PStouffer = 2.10e-04),
oxygen saturation (OR KB 0.38 (0.31-0.47), OR IGR 0.35 (0.20-0.63), PStouffer = 2.79e-21),
diastolic pressure (OR KB 0.70 (0.53-0.83), OR IGR 0.76 (0.51-1.11), PStouffer = 1.35e-05),
CRP (OR KB 1.47 (1.25-1.72), OR IGR 1.50 (1.04-2.16), PStouffer = 4.13e-07), LDH
(OR KB 2.05 (1.65-2.54), OR IGR 2.53 (1.42-4.53), PStouffer = 4.38e-12), polynuclear
neutrophils (OR KB 1.36 (1.13-1.60), OR IGR 1.15 (0.81-1.64), PStouffer = 1.25e-04), and urea
(OR KB 1.70 (1.43-2.01), OR IGR 2.13 (1.33-2.42), PStouffer = 9.49e-11). This confirms the
prognostic value of these 11 clinical and biological markers reported in the literature2,4,10–14.
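As an illustration of how per-center association results can be pooled, the sketch below combines two one-sided p-values with Stouffer's method and checks significance against the Bonferroni threshold used here (0.05/63). It is a minimal example with hypothetical p-values, not the authors' exact implementation, which may weight centers and handle effect directions differently.

```python
import numpy as np
from scipy.stats import norm

def stouffer_combine(p_values, weights=None):
    """Combine one-sided p-values from independent centers via Stouffer's Z."""
    p = np.asarray(p_values, dtype=float)
    w = np.ones_like(p) if weights is None else np.asarray(weights, dtype=float)
    z = norm.isf(p)                                  # per-center z-scores
    z_combined = np.sum(w * z) / np.sqrt(np.sum(w ** 2))
    return norm.sf(z_combined)                       # combined p-value

# Hypothetical per-center p-values for one variable (KB, IGR).
p_pooled = stouffer_combine([1e-6, 0.03])
bonferroni_threshold = 0.05 / 63                     # 63 variables tested
print(f"pooled p = {p_pooled:.2e}, significant: {p_pooled < bonferroni_threshold}")
```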
We then assessed the predictive value of features from admission radiology reports and
found three significant features: (i) extent of disease (OR KB 2.37 (1.97-2.86), OR IGR 1.64
(1.12-2.38), PStouffer = 8.50e-21) and (ii) crazy paving (OR KB 2.50 (1.82-3.44), OR IGR 2.28
(1.07-4.88), PStouffer = 3.10e-09), both associated with greater severity, and (iii) peripheral
topography, associated with lesser severity (OR KB 0.54 (0.39-0.74), OR IGR 0.61
(0.26-1.42), PStouffer = 9.47e-05). This confirms the reported negative impact of disease
extent7,15,16. We hypothesize that peripheral topography has a positive impact on prognosis
because peripheral lesions may be less extensive.
We next trained a deep neural network called AI-segment (Supp Figure 1) to segment
radiological patterns and provide automatic quantification18,19 of their volume, expressed as
a percentage of the full lung volume. These patterns included the three distinguishable
features that appear as disease severity progresses17: ground glass opacity (GGO), crazy
paving, and finally consolidation. AI-segment was trained on 161 patients from KB and
evaluated on 132 patients from IGR, of which 14 were fully annotated and 118 partially
annotated. The mean absolute error in volume prediction for the fully annotated scans was
6.94% for GGO, 1.01% for consolidation, and 7.21% for healthy lung (no crazy paving was
present in these scans). On the larger cohort of partially annotated scans, the accuracy with
respect to the radiologist score was 78% for GGO, 67% for crazy paving, and 74% for
consolidation (for a 1% detection threshold on the AI-segment result, Supp Table 1).
AI-segment also accurately quantified the disease extent (Supp Figure 3), and its visual
results were consistent with radiologist observations (see Figure 2 for three representative
cases).
We lastly evaluated whether AI-segment, trained on CT scans, provided finer information about
future severity than the radiologists' scan reports. Using predicted volumes from AI-segment,
we found that GGO (OR KB 1.80 (1.50-2.16), OR IGR 1.70 (1.18-2.43), PStouffer = 3.45e-11),
crazy paving (OR KB 1.57 (1.26-1.97), OR IGR 1.38 (0.95-1.99), PStouffer = 7.27e-05),
consolidation (OR KB 1.86 (1.53-2.25), OR IGR 1.87 (1.26-2.77), PStouffer = 1.43e-11), and
extent of disease (OR KB 2.14 (1.77-2.60), OR IGR 1.87 (1.28-2.73), PStouffer = 3.13e-16)
were all associated with severity (accounting for multiple testing). This confirms that
automatic estimation of lesion volumes adds more precise measures of future severity to the
radiologists' scan reports (Supp Table 2)8.
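To illustrate how lesion volumes can be expressed as a percentage of lung volume from a voxel-wise segmentation, here is a minimal sketch. The class indices, array shapes, and the assumption of equal voxel volumes are hypothetical, and the actual AI-segment post-processing may differ (e.g., voxel-spacing weighting).

```python
import numpy as np

def lesion_volume_percent(lesion_mask: np.ndarray, lung_mask: np.ndarray) -> float:
    """Percentage of the lung volume occupied by one lesion type.

    Both masks are boolean 3D arrays of identical shape; voxel counts are
    proportional to volume when the voxel grid is (approximately) isotropic.
    """
    lung_voxels = lung_mask.sum()
    if lung_voxels == 0:
        return 0.0
    return 100.0 * np.logical_and(lesion_mask, lung_mask).sum() / lung_voxels

# Hypothetical per-voxel class prediction: 0 healthy, 1 GGO, 2 crazy paving,
# 3 consolidation, together with a crude right-lung mask.
pred = np.random.randint(0, 4, size=(60, 256, 256))
right_lung = np.zeros_like(pred, dtype=bool)
right_lung[:, :, 128:] = True
ggo_right = lesion_volume_percent(pred == 1, right_lung)
print(f"GGO extent in right lung: {ggo_right:.1f}%")
```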
We next evaluated the prognostic value of CT scans alone through three different models.
The first model, called report, included variables from the radiological report only. The
second was based on the automatic lesion volumes measured by AI-segment. The third, called
AI-severity, used a weakly supervised approach with no radiologist-provided annotations
(Supp Figure 2)20. All three models were trained on 646 KB patients, tested on 150 KB
validation patients, and validated on the independent IGR dataset of 137 patients (Figure 3).
On the validation set from KB hospital, report was outperformed by AI-severity but not by
AI-segment (AUC AI-severity = 0.76, AUC AI-segment = 0.68, AUC report = 0.72). On the
independent IGR validation set, both AI-segment and AI-severity outperformed the report model
(AUC AI-severity = 0.70, AUC AI-segment = 0.68, AUC report = 0.66). Our follow-up analyses
revealed that the predictive performance of AI-severity was strong in part because the
internal representation of the neural network captures clinical features from the lung CTs,
such as age, on top of the known COVID-19 radiology features (see the interpretability of
AI-severity in Supp Material).
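The sketch below illustrates the weakly supervised principle behind AI-severity: a pretrained 2D feature extractor embeds each (lung-masked, windowed) slice, the slice features are average-pooled into one vector per scan, and a penalized logistic regression predicts severity from that vector. The ImageNet ResNet-50 backbone and the random tensors are placeholders; the actual model ensembles EfficientNet-B0 and a MoCo v2-pretrained ResNet-50, as detailed in the supplementary material.

```python
import numpy as np
import torch
from torchvision.models import resnet50, ResNet50_Weights  # torchvision >= 0.13
from sklearn.linear_model import LogisticRegression

# Stand-in feature extractor: ImageNet ResNet-50 without its classification head.
backbone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
backbone.fc = torch.nn.Identity()
backbone.eval()

@torch.no_grad()
def scan_embedding(slices: torch.Tensor) -> np.ndarray:
    """slices: (n_slices, 3, H, W) tensor, already lung-masked and windowed."""
    feats = backbone(slices)             # (n_slices, 2048) slice features
    return feats.mean(dim=0).numpy()     # average pooling over slices

# Hypothetical toy dataset: 8 scans with random slices and binary severity labels.
X = np.stack([scan_embedding(torch.rand(16, 3, 224, 224)) for _ in range(8)])
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])
clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000).fit(X, y)
print(clf.predict_proba(X)[:, 1])        # predicted severity probabilities
```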
Lastly, we evaluated whether CT scans have prognostic value beyond what can be inferred
from clinical and biological characteristics alone. We therefore compared the performance of
trimodal CT scan / clinical / biological models to bimodal clinical / biological models. We
compared model performances for three outcomes: our initial WHO-defined high-severity
outcome of "oxygen flow rate of 15 L/min or higher, or need for mechanical ventilation, or
death", as well as two other outcomes studied in the literature, "death or ICU admission"
and "death". We built a trimodal version of report, AI-segment, and AI-severity, adding
clinical and biological information to the original CT scan-based models through a greedy
search approach that selects optimal variables (Supp Figure 4). All three trimodal models
performed consistently better than the bimodal biological/clinical model (Figure 3 and
Supp Table 3), whether trimodal report, AI-segment, or AI-severity (mean AUC increase of
0.02-0.03). They also outperformed clinical/biological models from the literature (the
Colombi et al. model7 and the MIT COVID Analytics model). Of note, the fact that models
trained on patients from the KB hospital performed well when evaluated on the IGR hospital
is evidence of their robustness, especially since these two hospitals receive patients with
very different comorbidities (85% of cancer patients at IGR versus 7% at KB). Taken together,
these consistent results confirm the added prognostic value of CT scans. Importantly, while
trimodal AI-severity generally outperformed trimodal report across all outcomes, and trimodal
AI-segment sometimes outperformed report, the AUC difference was always modest (maximum
increase of 0.03 for AI-severity vs report, and 0.02 for AI-segment vs report), showing that
the incorporation of CT scan analyses, regardless of the method, is the strongest performance
booster. Therefore, beyond AI modeling, our study shows that a
composite scoring system integrating selected radiological measurements with key clinical
and biological variables provides accurate predictions and can rapidly become a reference
scoring approach for severity prediction.
Figure 3: Receiver operating characteristic (ROC) curves of the models that predict
severity. Models were evaluated on two distinct validation sets consisting of 150 patients from
KB (left panels) and 137 patients from IGR (right panels).
Our retrospective study, conducted at two French hospitals, shows that markers of future
disease severity are present within routine CT scans performed at admission and that these
markers can be identified and quantified via AI-based scoring, providing useful and
interpretable elements for prognosis.
Acknowledgements
We would like to thank J.-Y. Berthou, H. Berry, and Ph. Gesnouin from Inria and B.
Schmauch, G. Rouzaud, and R. Patel from Owkin for their support.
 
Author Contributions
N.L., S.A., E.C., P.H., R.M., N.L., P.T., E.B., M.S., A.S., F.C., S.J., M.S., I.B., J.D., J.C.P., H.T., E.P., G.W., T.C., F.B., M.F.B., and M.B. conceived the idea of this paper.
N.L., S.A., E.C., H.G., P.H., M.D., S.S., O.M., M.P.T., J.P.L., R.M., N.L., P.T., E.B., G.G., C.B., S.J., F.G., N.T., Y.L., T.D., K.G., A.N., M.T., S.V., M.S., I.B., Y.B., E.P., M.A., J.D., F.B., A.G., J.D., J.C.P., H.T., E.P., G.W., T.C., F.B., M.F.B., and M.B. participated in the acquisition and processing of the data.
N.L., S.A., E.C., P.H., R.M., N.L., P.T., E.B., S.J., M.S., P.J., I.B., J.D., J.C.P., H.T., E.P., G.W., T.C., M.F.B., and M.B. implemented the analysis.
N.L., S.A., E.C., P.H., R.M., N.L., P.T., E.B., S.J., M.S., I.B., J.D., J.C.P., H.T., E.P., G.W., T.C., M.F.B., and M.B. contributed to the writing of the manuscript.
 
Competing Interests statement
The authors declare the following competing interests:
Employment: Michael Blum, Paul Herent, Rémy Dubois, Nicolas Loiseau, Paul Trichelair,
Etienne Bendjebbar, Simon Jégou, Meriem Sefta, Paul Jehanno, Fabien Brulport, Olivier
Dehaene, Jean-Baptiste Schiratti, Kathryn Schutte, Elodie Pronier, Jocelyn Dachary, and
Adrian Gonzalez are employed by Owkin.
Co-founders of Owkin, Inc.: Thomas Clozel and Gilles Wainrib.
References
1. Guan, W.-J. et al. Clinical Characteristics of Coronavirus Disease 2019 in China. N. Engl. J. Med. 382, 1708–1720 (2020).
2. Zhou, F. et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet 395, 1054–1062 (2020).
3. Richardson, S. et al. Presenting Characteristics, Comorbidities, and Outcomes Among 5700 Patients Hospitalized With COVID-19 in the New York City Area. JAMA (2020).
4. Yang, J. et al. Prevalence of comorbidities and its effects in coronavirus disease 2019 patients: a systematic review and meta-analysis. Int. J. Infect. Dis. 94, 91–95 (2020).
5. Wu, J. et al. Chest CT Findings in Patients With Coronavirus Disease 2019 and Its Relationship With Clinical Features. Invest. Radiol. 55, 257–261 (2020).
6. Zhao, W., Zhong, Z., Xie, X., Yu, Q. & Liu, J. Relation Between Chest CT Findings and Clinical Conditions of Coronavirus Disease (COVID-19) Pneumonia: A Multicenter Study. AJR Am. J. Roentgenol. 214, 1072–1077 (2020).
7. Colombi, D. et al. Well-aerated Lung on Admitting Chest CT to Predict Adverse Outcome in COVID-19 Pneumonia. Radiology 201433 (2020).
8. Zhang, K. et al. Clinically Applicable AI System for Accurate Diagnosis, Quantitative Measurements and Prognosis of COVID-19 Pneumonia Using Computed Tomography. Cell (2020).
9. World Health Organization. Clinical management of severe acute respiratory infection when COVID-19 is suspected. https://www.who.int/publications-detail/clinical-management-of-severe-acute-respiratory-infection-when-novel-coronavirus-(ncov)-infection-is-suspected.
10. Livingston, E. & Bucher, K. Coronavirus Disease 2019 (COVID-19) in Italy. JAMA (2020).
11. Wu, C. et al. Risk Factors Associated With Acute Respiratory Distress Syndrome and Death in Patients With Coronavirus Disease 2019 Pneumonia in Wuhan, China. JAMA Intern. Med. (2020).
12. Ruan, Q., Yang, K., Wang, W., Jiang, L. & Song, J. Clinical predictors of mortality due to COVID-19 based on an analysis of data of 150 patients from Wuhan, China. Intensive Care Med. 46, 846–848 (2020).
13. Huang, R. et al. Clinical Findings of Patients with Coronavirus Disease 2019 in Jiangsu Province, China: A Retrospective, Multi-Center Study. PLoS Negl. Trop. Dis. 14, e0008280 (2020).
14. Wu, Z. & McGoogan, J. M. Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72 314 Cases From the Chinese Center for Disease Control and Prevention. JAMA (2020).
15. Yuan, M., Yin, W., Tao, Z., Tan, W. & Hu, Y. Association of radiologic findings with mortality of patients infected with 2019 novel coronavirus in Wuhan, China. PLoS One 15, e0230548 (2020).
16. Zhang, R. et al. CT features of SARS-CoV-2 pneumonia according to clinical presentation: a retrospective analysis of 120 consecutive patients from Wuhan city. Eur. Radiol. (2020).
17. Wang, Y. et al. Temporal Changes of CT Findings in 90 Patients with COVID-19 Pneumonia: A Longitudinal Study. Radiology 200843 (2020).
18. Ronneberger, O., Fischer, P. & Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, 234–241 (Springer International Publishing, 2015).
19. Hara, K., Kataoka, H. & Satoh, Y. Learning spatio-temporal features with 3D residual networks for action recognition. In 2017 IEEE International Conference on Computer Vision Workshops (ICCVW), 3154–3160 (2017).
20. Courtiol, P. et al. Deep learning-based classification of mesothelioma improves prediction of patient outcome. Nat. Med. 25, 1519–1525 (2019).
21. Dai, M. et al. Patients with cancer appear more vulnerable to SARS-CoV-2: a multi-center study during the COVID-19 outbreak. Cancer Discov. (2020).
Supplementary material of "AI-based multi-modal integration of clinical characteristics, lab tests and chest CTs improves COVID-19 outcome prediction of hospitalized patients"
Description of the retrospective study
Data were collected at two French hospitals: Kremlin-Bicêtre Hospital (KB, AP-HP, Paris) and
Gustave Roussy Hospital (IGR, Villejuif). CT scans and clinical and biological data were
collected within the first 2 days after hospital admission.
This study received the approval of both hospitals' ethics committees, and we submitted a
declaration to the French National Commission for Data Processing and Liberties (CNIL, N° INDS
MR5413020420) in order to register the study in the medical studies database and comply with
the General Data Protection Regulation (GDPR) requirements. An information letter was also sent
to all patients included in the study.
Inclusion criteria were (1) date of admission at hospital (from the 12th of February to the 20th of
March at Kremlin-Bicêtre and from the 2nd of March to the 24th of April at Institut Gustave
Roussy) and (2) a positive diagnosis of COVID-19. Patients were considered positive either
because of a positive reverse transcription polymerase chain reaction (RT-PCR) test based on
nasal or lower respiratory tract specimens or, for RT-PCR-negative patients, a CT scan with a
typical appearance of COVID-19 as defined by the ACR criteria1. Children and pregnant women
were excluded from the study.
The clinical and laboratory data were obtained from detailed medical records, cleaned, and
formatted retrospectively by 10 radiologists with 3 to 20 years of experience (5 radiologists at IGR
and 5 at KB). Data from the clinical examination include: sex, age, body weight and height, body
mass index, heart rate, body temperature, oxygen saturation, blood pressure, respiratory rate,
and a list of symptoms including cough, sputum, chest pain, muscle pain, abdominal pain or
diarrhoea, and dyspnea. Health and medical history data include the presence or absence of
comorbidities (systemic hypertension, diabetes mellitus, asthma, heart disease, emphysema,
immunodeficiency) and smoker status. Laboratory data include alanine, conjugated bilirubin, total
bilirubin, creatine kinase, CRP, ferritin, haemoglobin, LDH, leucocytes, lymphocytes, monocytes,
platelets, polynuclear neutrophils, and urea.
Chest computed tomography (CT) imaging
CT scan acquisition
Three different models of CT scanners were used: two General Electric CT scanners (Discovery
CT750 HD and Optima 660; GE Medical Systems, Milwaukee, USA) and a Siemens CT
scanner (Somatom Drive; Siemens Medical Solutions, Forchheim). All patients were
scanned in a supine position during breath-holding at full inspiration. The acquisition and
reconstruction parameters were: 120 kV tube voltage with automatic tube current
modulation (100-350 mAs), 1 mm slice thickness without interslice gap, and
filtered back-projection (FBP) reconstruction (Somatom Drive) or blended FBP/iterative
reconstruction (Discovery or Optima). Axial images with a slice thickness of 1 mm were used for
coronal and sagittal reconstructions.
The scans were independently examined by experienced radiologists using a standard
workstation in the clinical image archiving and transmission system. All radiologists
were informed of the patients' clinical status (suspicion of COVID-19, clinical signs of severity).
Definition of CT Features
COVID-19-associated CT imaging features identified by radiologists were defined following
the ACR recommendations1. The term parenchymal opacification is applied to any homogeneous
increase in lung density on chest CT. When this parenchymal opacification is dense enough
to obscure the vessel margins, airway walls, and other parenchymal structures, it is
called consolidation. Ground-glass attenuation is defined as an increase in lung density that is
not sufficient to obscure the vessels, with preservation of bronchial and vascular margins. The
crazy-paving pattern was defined as ground-glass opacification with associated interlobular septal
thickening2.
For 959 patients, CT imaging characteristics were evaluated and the following findings were
reported: ground glass opacity (rounded / non-rounded / absent), consolidation (rounded /
non-rounded / absent), interlobular septal thickening or "crazy paving" (present / absent),
subpleural line, lymph node enlargement, pleural effusion, and pericardial effusion,
according to morphological descriptors based on recommendations of the Fleischner
Nomenclature Committee2.
The CT findings were examined in terms of location, distribution, size, and type. The
location refers to the different lobes and segments involved (lower, middle, or upper). The
distribution was described as peripheral (outer third of the lung), central (inner two-thirds), or
both central and peripheral.
The assessment of the size and extent of lung involvement was based on a visual
classification of lung anatomy according to the evaluation criteria established by the French
Society of Radiology (SFR)3. The size of the lesions was assessed as the volume of lung
affected: absent / minimal (<10%) / moderate (10-25%) / extensive (25-50%) / severe (50-75%)
/ critical (>75%). The coding absent / minimal / moderate / extensive / severe / critical was
based on a quantitative variable with values of 0 / 1 / 2 / 3 / 4 / 5.
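As a small illustration, the mapping from an estimated percentage of affected lung to this 0-5 ordinal code can be written as follows; the band boundaries follow the text above and Supp Fig 3, while the handling of exact boundary values is an assumption.

```python
def extent_score(percent_affected: float) -> int:
    """Map a disease-extent percentage to the 0-5 ordinal coding.

    0: absent (0%), 1: minimal (<10%), 2: moderate (10-25%),
    3: extensive (25-50%), 4: severe (50-75%), 5: critical (>75%).
    """
    if percent_affected <= 0:
        return 0
    thresholds = [10, 25, 50, 75]  # upper bounds of codes 1-4
    for score, upper in enumerate(thresholds, start=1):
        if percent_affected < upper:
            return score
    return 5

print([extent_score(p) for p in (0, 5, 18, 40, 60, 90)])  # [0, 1, 2, 3, 4, 5]
```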
Automatic extraction from radiological report
Radiological features were automatically extracted from the radiology reports using optical
character recognition (OCR) and regular expressions.
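A minimal sketch of the regular-expression step is shown below; the report wording, feature names, and negation handling are hypothetical (the actual extraction operated on French reports after OCR and is certainly more elaborate).

```python
import re

def extract_report_features(report_text: str) -> dict:
    """Toy extraction of presence/absence features from a free-text report."""
    text = report_text.lower()
    patterns = {
        "ground_glass": r"ground[- ]glass|ggo",
        "crazy_paving": r"crazy[- ]paving",
        "consolidation": r"consolidation",
    }
    features = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        # Very naive negation handling: "no"/"without" shortly before the finding.
        negated = match and re.search(
            r"(?:no|without)\s+(?:\w+\s+){0,2}(?:" + pattern + r")", text
        )
        features[name] = bool(match) and not bool(negated)
    return features

print(extract_report_features("Bilateral ground-glass opacities without consolidation."))
# {'ground_glass': True, 'crazy_paving': False, 'consolidation': False}
```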
Annotation of CT scans by radiologists to train the AI-segment volumetry model
Two radiologists (4 and 9 years of experience) examined and annotated 292 anonymized
chest scans independently, without access to the patients' clinical data or COVID-19 PCR
results. All CT images were viewed with lung window parameters (width, 1500 HU; level,
-550 HU) using the SPYD software developed by Owkin. Regions of interest were annotated
by the radiologists in four distinct classes: healthy pulmonary parenchyma, ground glass
opacity, consolidation, and crazy paving. One AI and imaging PhD student provided full 3D
annotation of the four classes on 22 anonymized chest scans using the 3D Slicer software.
The presence of organomegaly, when present, was also recorded as a binary class. When
multiple CT images were available for a single patient, the scan to analyze was selected
using the SPYD software.
Machine learning models
Models for segmentation of CT scans (AI-segment)
In the proposed pipeline for lesion segmentation from CT scans, called AI-segment, we
deployed three segmentation networks: a 3D ResNet50 (ref. 4), a 2.5D U-Net, and a 2D U-Net
(ref. 5). These are three powerful convolutional neural networks that have achieved
state-of-the-art performance in numerous medical image segmentation tasks. U-Net consists of
convolution, max-pooling, ReLU activation, concatenation, and up-sampling layers organized
into contraction, bottleneck, and expansion sections. ResNet contains convolution, max-pooling,
batch normalization, and ReLU layers that are grouped in multiple bottleneck blocks.
All models were trained on CT scans provided by Kremlin-Bicêtre (KB) and evaluated on
annotated CT scans from Institut Gustave Roussy (IGR). The dataset was divided into two
categories: Fully Annotated Scans (FAS), composed of 22 scans (8 from KB and 14 from
IGR), and Partially Annotated Scans (PAS), composed of 292 scans (153 from KB and 118
from IGR).
The 2D U-Net was trained for left/right lung segmentation, while the 3D ResNet50 and the 2.5D
U-Net were used for lesion segmentation. The 3D ResNet50 was trained on the 8 KB FAS. We used
stochastic gradient descent for parameter optimization, with a learning rate starting at 0.1
and decayed by a factor of 0.1 every 20 epochs. The network was trained for a total of 100
epochs. For the 2.5D U-Net, the Adam optimization algorithm was used with learning rate,
weight decay, gradient clipping, and learning rate decay parameters set to 1e-3, 1e-8, 1e-1, and 0.1, respectively
(the decay being applied at epochs 90 and 150), for 300 epochs. While the validation set
remained the same as for the 3D ResNet50, the 153 KB PAS were added to the 8 KB FAS in the
training set. PAS were only added to the 2.5D U-Net training set because the incompleteness of
the annotated volumes (on average, 16 annotated slices per PAS) would not satisfy the
volumetric input requirements of the 3D ResNet50. Finally, for the left/right lung
segmentation, the 2D U-Net was trained on the 8 KB FAS. As for the 2.5D U-Net, the Adam
optimization algorithm was used with learning rate, weight decay, gradient clipping, and
learning rate decay parameters set to 1e-3, 1e-8, 1e-1, and 0.1 (applied at epoch 70),
respectively, over 104 epochs. Both the 2.5D U-Net and the 2D U-Net use affine transformations
and contrast changes for data augmentation, while the 3D ResNet50 uses affine transformations,
contrast changes, thin-plate splines, and flipping. The 3D ResNet50 and the 2.5D U-Net are
trained by minimizing the cross-entropy loss, and the 2D U-Net minimizes the binary
cross-entropy loss. All training was performed on NVIDIA Tesla V100 GPUs using the PyTorch
framework. During the validation phase, ensemble inference6 is performed on all the available scans.
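To make the ensemble-inference step concrete, here is a minimal sketch that merges the class-probability maps of two segmentation models by geometric mean and then restricts the result to the lungs with the left/right lung mask, following the pipeline of Supp Fig 1; the tensor shapes, renormalization step, and masking convention are assumptions rather than the exact implementation.

```python
import torch

def ensemble_segmentation(probs_a: torch.Tensor,
                          probs_b: torch.Tensor,
                          lung_mask: torch.Tensor) -> torch.Tensor:
    """Merge two per-voxel class-probability maps and keep lesions inside the lungs.

    probs_a, probs_b: (C, D, H, W) softmax outputs of the two models.
    lung_mask: (D, H, W) binary mask of the left+right lungs.
    """
    eps = 1e-8
    merged = torch.sqrt((probs_a + eps) * (probs_b + eps))   # geometric mean
    merged = merged / merged.sum(dim=0, keepdim=True)        # renormalize over classes
    labels = merged.argmax(dim=0)                            # per-voxel class
    return labels * lung_mask.long()                         # zero out non-lung voxels

# Hypothetical example with 4 classes (healthy, GGO, crazy paving, consolidation).
pa = torch.softmax(torch.rand(4, 8, 64, 64), dim=0)
pb = torch.softmax(torch.rand(4, 8, 64, 64), dim=0)
mask = torch.rand(8, 64, 64) > 0.5
print(ensemble_segmentation(pa, pb, mask).shape)             # torch.Size([8, 64, 64])
```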
Models for severity classification based on CT scans (AI-severity)
The AI-severity model is defined as an ensemble of four sub-models, as illustrated in Supp
Fig 2. Each of these sub-models is designed to predict disease severity from CT scans.
Since they do not require expert annotations at the slice level, these sub-models fall within
the scope of weakly supervised learning. The preprocessing of the data consisted in resizing the
CT scans to 10 mm pixel spacing along the vertical axis and obtaining a segmentation of the
lungs using a pre-trained U-Net algorithm7. Each sub-model is composed of two blocks: a
deep neural network called the feature extractor, and a logistic regression. CT scans may contain
confounding elements such as catheters (EKG monitoring, oxygenation tubing, etc.) that are easily
detectable on a CT and can bias the prediction of severity (i.e., the model could predict the
presence of a technical device associated with severity instead of the radiological features
associated with severity). To ensure that these biases do not affect the features, the lung
segmentation mask was applied before the features were extracted. As a result, only the
lungs were visible to the feature extractor.
Two of the sub-models used an EfficientNet-B0 (ref. 8) pre-trained on the public ImageNet
database as feature extractor, while the other two used a ResNet50 (ref. 9) pre-trained with
MoCo v2 (ref. 10) on one million CT scan slices from both DeepLesion11 and LIDC12. Each of
these networks provides an embedding of the slices of the input CT scan into a
lower-dimensional feature space (1280 dimensions for EfficientNet-B0 and 2048 for ResNet50
with MoCo v2). A windowing step selecting specific ranges of intensities was also applied to
the CT scans before feature extraction. For the two sub-models based on EfficientNet-B0, the
image intensities were clipped to the (-1000 HU, 200 HU) and (-1000 HU, 600 HU) ranges,
respectively. For one of the remaining two sub-models (based on ResNet50 with MoCo v2), the
(-1350 HU, 150 HU) range was used, whereas for the last one, a combination of the following
ranges was used: (-1000 HU, 0 HU), (0 HU, 1000 HU), and (-1000 HU, 4000 HU). Finally, for
each of these sub-models, a logistic regression (with ridge penalty) was used to predict
disease severity from the averaged features. For the ResNet50-based sub-models, a
principal component analysis (PCA) with 40 components was used to reduce the
dimensionality of the feature space before the logistic regression was applied. All the
sub-models were equally weighted in the ensemble, and the disease severity predictions of
the AI-severity model were obtained by averaging the predictions of the sub-models in the
ensemble.
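The following sketch summarizes one sub-model and the final averaging: HU windowing of the slices, averaged deep features per scan (here random placeholders), PCA, a ridge-penalized logistic regression, and a mean of the sub-model probabilities. The feature dimensions, cohort size, and data are hypothetical; only the overall structure mirrors the description above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def window_hu(volume: np.ndarray, low: float, high: float) -> np.ndarray:
    """Clip a CT volume (in Hounsfield units) to an intensity window and rescale to [0, 1]."""
    clipped = np.clip(volume, low, high)
    return (clipped - low) / (high - low)

windowed_slice = window_hu(np.full((512, 512), -300.0), low=-1000, high=600)

# Placeholder averaged deep features for 100 scans (2048-d, as for ResNet50) and labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2048))
y = rng.integers(0, 2, size=100)

# One ResNet50-style sub-model: PCA (40 components) + ridge logistic regression.
sub_model = make_pipeline(PCA(n_components=40),
                          LogisticRegression(penalty="l2", C=1.0, max_iter=1000))
sub_model.fit(X, y)

# AI-severity-style ensembling: average the probabilities of the sub-models.
sub_models = [sub_model]                       # in practice: four sub-models
probs = np.mean([m.predict_proba(X)[:, 1] for m in sub_models], axis=0)
```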
Interpretability of AI-severity
An interpretability study was conducted on AI-severity to get a better understanding of its
performance. The correlations between the internal representations of the sub-models (i.e.,
the inputs of the logistic regressions) and radiological and clinical variables were analyzed.
When the target of the logistic regression was replaced by variables from the radiology
reports, AUCs on the KB validation set of 150 patients were 94.1% for disease extent
(threshold >2), 71.4% for crazy paving, 67.1% for consolidation, and 74.8% for GGO, showing
that the feature extractors correctly captured part of the radiology signal. More
interestingly, it was also possible to correlate the internal representations with clinical
variables such as age (AUC 85.1% with a threshold of 60 years), sex (AUC 85.2%), or oxygen
saturation (AUC 76.2%, threshold 90%). As a comparison, a logistic regression trained on the
radiology report variables only reaches AUC scores of 70.0%, 59.9%, and 67.8%, respectively.
This gap shows that the internal representations of the AI-severity neural networks capture
clinical information directly from the CT scans.
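As a sketch of this probing procedure, one can fit a logistic regression on the frozen internal features to predict a binarized clinical variable (e.g., age above 60) and report a cross-validated AUC; the features and labels below are random placeholders, and the single-split evaluation used in the study may differ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
features = rng.normal(size=(150, 512))      # frozen sub-model features (placeholder)
age = rng.integers(20, 95, size=150)        # placeholder ages
target = (age > 60).astype(int)             # binarized clinical variable

probe = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
auc = cross_val_score(probe, features, target, cv=5, scoring="roc_auc").mean()
print(f"probing AUC for age > 60: {auc:.2f}")   # ~0.5 on random features
```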
Models for multimodal integration
The models used to predict the outcome from multiple modalities are logistic regressions,
trained with 5-fold cross-validation on the training dataset of 646 patients from KB,
stratified by age and outcome. Variables that were filled in for fewer than 300 patients
(conjugated bilirubin and alanine) were not used. For the remaining variables, missing
values were simply replaced by the average over the patients of the training set. L2
regularization was applied to the weights of the models. The regularization coefficient
was chosen by comparing the results obtained in cross-validation with different values,
ranging from 0.01 to 100; the value maximizing the average AUC over the 5 folds was
selected. We used pandas and scikit-learn13 to manipulate the data and run the machine
learning algorithms.
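A minimal scikit-learn sketch of this setup is given below: training-set mean imputation, an L2-penalized logistic regression, and selection of the regularization strength by maximizing the mean AUC over 5 stratified folds. The data, the exact C grid, and the stratification (sklearn's StratifiedKFold stratifies by outcome only, not by age) are assumptions.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

# Placeholder multimodal design matrix (clinical + biological + CT features) with missing values.
rng = np.random.default_rng(2)
X = rng.normal(size=(646, 20))
X[rng.random(X.shape) < 0.1] = np.nan        # ~10% missing values
y = rng.integers(0, 2, size=646)

pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),      # training-set mean imputation
    ("clf", LogisticRegression(penalty="l2", max_iter=1000)),
])
grid = GridSearchCV(
    pipeline,
    param_grid={"clf__C": [0.01, 0.1, 1.0, 10.0, 100.0]},
    scoring="roc_auc",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```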
Selection of clinical and biological variables added to the models based on CT scan
variables
Clinical and biological variables were selected through a forward feature selection technique
(Supp Fig 4). At baseline (left of the figure), a model was trained in cross-validation using
only a fixed set of variables. Three initial sets were considered here: the radiologist report,
AI-Lung, and AI volumetry. The variables encoded in the radiologist report include a
presence/absence coding of ground glass opacity (GGO), rounded GGO, crazy paving,
consolidation, rounded consolidation, peripheral topography, and inferior predominance,
as well as disease extent, which is a semi-automatic assessment of the amount of lesions in
the lung. The AI-Lung set consists of the single-variable severity output of the neural
network model (AI-severity), and the AI volumetry set consists of the automatic quantification
of the ground glass, consolidation, and crazy paving patterns by AI-segment, together with the
automatic quantification of disease extent. For comparison, the procedure was also performed
starting from an empty set of variables (clinical only).
The added prognostic value of every clinical or biological variable was then assessed
separately by training a new model using this variable in addition to the previous set. The
variable resulting in the largest AUC score was added to the selection. This procedure was
repeated for 20 iterations. For every initial selection, the performance of the models increased
quickly at first (left part of Supp Fig 4) and then reached a plateau (right half of the figure),
indicating that the variables added after the tenth iteration did not significantly increase the
predictive power of the models. Thus, in every case, only the ten best clinical and biological
variables were selected.
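A compact sketch of this greedy forward selection is shown below: starting from a fixed set of CT-derived columns, each remaining clinical or biological variable is tentatively added, the cross-validated AUC is computed, and the best variable is kept; the loop runs for a fixed number of iterations. The data, column names, and number of iterations are placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def greedy_forward_selection(df, y, base_cols, candidate_cols, n_iter=10, cv=5):
    """Iteratively add the candidate variable that maximizes cross-validated AUC."""
    selected = list(base_cols)
    remaining = list(candidate_cols)
    model = LogisticRegression(penalty="l2", max_iter=1000)
    for _ in range(n_iter):
        scores = {
            col: cross_val_score(model, df[selected + [col]], y,
                                 cv=cv, scoring="roc_auc").mean()
            for col in remaining
        }
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected

# Placeholder data: one CT-derived score plus candidate clinical/biological variables.
rng = np.random.default_rng(3)
df = pd.DataFrame(rng.normal(size=(300, 6)),
                  columns=["ct_score", "age", "crp", "ldh", "oxygen_sat", "urea"])
y = rng.integers(0, 2, size=300)
print(greedy_forward_selection(df, y, ["ct_score"],
                               ["age", "crp", "ldh", "oxygen_sat", "urea"], n_iter=3))
```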
Training and evaluation of models
To predict severity, models were trained on 646 patients from KB, which included the
training set of AI-segment, and evaluated on two distinct evaluation sets of 150 patients
from KB and 137 patients from IGR. The predictions were obtained with the logistic regression
approach.
We evaluated the models that predict severity using the area under the ROC curve (AUC), and
differences between AUC values were tested with DeLong's test14.
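The AUC itself can be computed with scikit-learn; the statistical comparison in the paper relies on the fast DeLong algorithm14, which is not reproduced here. As an illustrative substitute only, the sketch below compares two models' AUCs with a paired bootstrap on placeholder validation predictions.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_difference(y_true, scores_a, scores_b, n_boot=2000, seed=0):
    """Paired bootstrap comparison of two AUCs (illustrative substitute for DeLong's test)."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    scores_a, scores_b = np.asarray(scores_a), np.asarray(scores_b)
    diffs, n = [], len(y_true)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        if len(np.unique(y_true[idx])) < 2:      # need both classes for an AUC
            continue
        diffs.append(roc_auc_score(y_true[idx], scores_a[idx]) -
                     roc_auc_score(y_true[idx], scores_b[idx]))
    diffs = np.array(diffs)
    # Two-sided bootstrap p-value for the null hypothesis of equal AUCs.
    p = 2 * min((diffs <= 0).mean(), (diffs >= 0).mean())
    return diffs.mean(), p

y = np.random.default_rng(4).integers(0, 2, size=150)     # placeholder labels
s_a, s_b = np.random.rand(150), np.random.rand(150)       # placeholder model scores
print(bootstrap_auc_difference(y, s_a, s_b))
```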
We evaluated the segmentation model AI-segment using the mean absolute error, defined as the
average, over the available fully annotated CT scans of the validation set, of the absolute
difference between the ground-truth percentage of each lesion type (deduced from the
annotations) and the estimated percentage. We also evaluated the detection accuracy per lesion
type with respect to the reported radiologist scores, defined as the percentage of classes
(GGO; crazy paving; consolidation) correctly predicted by AI-segment on the validation set. A
given lesion type is considered present in the AI-segment result when its estimated volume,
averaged over both lungs, is above a given threshold (here, we report results for 1% and 2%
thresholds).
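The two segmentation metrics described above can be written in a few lines; the arrays below are placeholders, and the exact aggregation (e.g., maximum versus average over lungs) follows the wording above only approximately.

```python
import numpy as np

def volume_mae(true_percent, pred_percent) -> float:
    """Mean absolute error between ground-truth and predicted lesion percentages."""
    return float(np.mean(np.abs(np.asarray(true_percent) - np.asarray(pred_percent))))

def detection_accuracy(pred_percent, reported_present, threshold: float = 1.0) -> float:
    """Accuracy of 'lesion present' decisions (predicted volume above threshold, in %)."""
    predicted_present = np.asarray(pred_percent) > threshold
    return float(np.mean(predicted_present == np.asarray(reported_present)))

# Placeholder example for one lesion type over a small validation set.
true_pct = np.array([12.0, 0.0, 35.0, 4.0])
pred_pct = np.array([10.5, 0.4, 30.0, 6.0])
radiologist_says_present = np.array([True, False, True, True])
print(volume_mae(true_pct, pred_pct))
print(detection_accuracy(pred_pct, radiologist_says_present, threshold=1.0))
```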
Benchmark models
We used the clinical and biological variables previously proposed in a multivariate risk score
for severity, defined as ICU admission or death, and retrained a logistic regression model
using these variables15. We also considered the MIT COVID Analytics calculator as a risk score
for mortality (https://www.covidanalytics.io/mortality_calculator).
Supplementary Figures and Tables
Supp Fig. 1: AI-segment architecture - Proposed pipeline to generate lesion volumetry estimates
from patient CT scans using an ensemble of segmentation networks. Normalized patient scans are
provided to our trained 2.5D U-Net and 3D ResNet50. The masks predicted by both models are
then merged by geometric mean. In parallel, we segment the left and right lungs from the patient
scans using a dedicated U-Net. Finally, the left-right lung mask is used to mask out lesions in
the left and right lungs from the ensemble output. This pipeline exploits the complementary
features learned by a weak model (2.5D U-Net) and a strong one (3D ResNet50).
Supp Fig. 2: AI-severity model to predict severity from 3D chest CT scans.
Two different pipelines were used: one using ResNet50 (trained with MoCo v2 on 1 million public
CT scan slices) as encoder (models 1 & 2) and one using EfficientNet-B0 as encoder (models 3 & 4).
Supp Fig. 3: Boxplot of the automatic quantification of disease extent by AI-segment versus
disease extent as estimated by a radiologist. The coding of the report is as follows: 0 (0% of
lesions), 1 (<10% of lesions), 2 (between 10 and 25% of lesions), 3 (between 25 and 50% of lesions),
4 (between 50 and 75% of lesions), 5 (more than 75% of lesions).
Supp Fig. 4: AUC as a function of the number of clinical and biological variables added to the
multimodal model. Variables included in the models consist of CT scan variables only, and a
greedy algorithm then adds clinical or biological variables iteratively. At each step of the
algorithm, the variable that results in the largest increase in AUC score is added.
                           GGO      Crazy paving   Consolidation
Accuracy (1% threshold)    0.7829   0.6712         0.7419
Accuracy (2% threshold)    0.7679   0.6712         0.7558

Supp Table 1: Detection accuracy computed for the binary decision "presence or absence of a lesion type" for AI-segment (threshold on the predicted disease extent, maximum over both lungs), compared to the standardized radiologist report, on the IGR cohort.
Variable             Center   Odds ratio (95% CI)   P-value     P-value Stouffer
GGO AI               KB       1.80 (1.50-2.16)      2.86e-10    3.45e-11
GGO AI               IGR      1.70 (1.18-2.43)      0.00424
Crazy paving AI      KB       1.57 (1.26-1.97)      6.37e-05    7.27e-05
Crazy paving AI      IGR      1.38 (0.95-1.99)      0.08712
Consolidation AI     KB       1.86 (1.53-2.25)      2.49e-10    1.43e-11
Consolidation AI     IGR      1.87 (1.26-2.77)      0.00196
Disease extent AI    KB       2.14 (1.77-2.60)      7.11e-15    3.13e-16
Disease extent AI    IGR      1.87 (1.28-2.73)      0.00109

Supp Table 2: Association of the lesion volumes inferred by AI-segment with severity.
Supp Table 3: AUC values for the different models on the different sets. Each model was trained on
646 patients from KB. Results are reported on the validation set from KB (150 patients) and the
external validation set from IGR (137 patients), as well as on the training set using 5-fold
cross-validation stratified by outcome and age (CV KB).
References of the Supplementary Material
1. Simpson, S. et al. Radiological Society of North America Expert Consensus Statement on Reporting Chest CT Findings Related to COVID-19. Endorsed by the Society of Thoracic Radiology, the American College of Radiology, and RSNA. Radiology: Cardiothoracic Imaging 2, e200152 (2020).
2. Hansell, D. M. et al. Fleischner Society: glossary of terms for thoracic imaging. Radiology 246, 697–722 (2008).
3. La Société d'Imagerie Thoracique propose un compte-rendu structuré de scanner thoracique pour les patients suspects de COVID-19. SFR e-Bulletin https://ebulletin.radiologie.fr/actualites-covid-19/societe-dimagerie-thoracique-propose-compte-rendu-structure-scanner-thoracique (2020).
4. Hara, K., Kataoka, H. & Satoh, Y. Learning Spatio-Temporal Features with 3D Residual Networks for Action Recognition. In 2017 IEEE International Conference on Computer Vision Workshops (ICCVW), 3154–3160 (2017).
5. Ronneberger, O., Fischer, P. & Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, 234–241 (Springer International Publishing, 2015).
6. Baldeon Calisto, M. & Lai-Yuen, S. K. AdaEn-Net: An ensemble of adaptive 2D-3D Fully Convolutional Networks for medical image segmentation. Neural Netw. 126, 76–94 (2020).
7. Hofmanninger, J. et al. Automatic lung segmentation in routine imaging is a data diversity problem, not a methodology problem. arXiv [eess.IV] (2020).
8. Tan, M. & Le, Q. V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv [cs.LG] (2019).
9. He, K., Zhang, X., Ren, S. & Sun, J. Deep Residual Learning for Image Recognition. arXiv [cs.CV] (2015).
10. Chen, X., Fan, H., Girshick, R. & He, K. Improved Baselines with Momentum Contrastive Learning. arXiv [cs.CV] (2020).
11. Yan, K., Wang, X., Lu, L. & Summers, R. M. DeepLesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning. J. Med. Imaging (Bellingham) 5, 036501 (2018).
12. LIDC-IDRI - The Cancer Imaging Archive (TCIA) Public Access. https://wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI.
13. Pedregosa, F. et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, 2825–2830 (2011).
14. Sun, X. & Xu, W. Fast implementation of DeLong's algorithm for comparing the areas under correlated receiver operating characteristic curves. IEEE Signal Process. Lett. 21, 1389–1393 (2014).
15. Colombi, D. et al. Well-aerated Lung on Admitting Chest CT to Predict Adverse Outcome in COVID-19 Pneumonia. Radiology 201433 (2020).
... At present, there are hundreds of papers in preprint servers and medical journals employing machine learning methodologies in an attempt to bridge the gaps in the diagnosis, triage, and management of COVID-19; eight of them have integrated both radiological and clinical data. [1][2][3][4][5][6][7][8] However, most of these studies were found to have little clinical utility, producing a credibility crisis in the realm of artificial intelligence in healthcare. A recent review by Roberts et al. found that, after screening more than 400 machine learning models using various risk and bias assessment tools, none of the evaluated machine learning models had sufficiently fulfilled all of the following: (1) documentation of reproducible methods, (2) adherence to best practices in the development of a model, and (3) external validation that could justify claims of applicability. ...
... Therefore, we believe that some of the ways investigators can address the questions surrounding the credibility of a machine learning model are (1) to state the clinical context, which include patient demographics, geography, and timeframe, of the training and testing datasets that were used (2) to provide the resources for other centers to create or fine-tune models specific to their contexts, (3) to be explicit about the appropriate level of the model's generalizability based on results of external validation studies, (4) to explore strategies that either build in and/or evaluate the explainability of models, and (5) to externally validate the performance of the model on different subpopulations of the sample and explore the fairness of the model in underrepresented patient groups. ...
... 30-day mortality is chosen as the target outcome in accordance with clinical precedence. [4][5][6] The F1-score finds an equal balance between PPV (precision) and sensitivity (recall), which gives a better indication of model performance for unbalanced dataset (mortality is relatively rare compared to survival). Reporting all metrics allows assessment of how the models might perform in populations with a different COVID-19 related mortality distribution. ...
Preprint
Full-text available
The unprecedented global crisis brought about by the COVID-19 pandemic has sparked numerous efforts to create predictive models for the detection and prognostication of SARS-CoV-2 infections with the goal of helping health systems allocate resources. Machine learning models, in particular, hold promise for their ability to leverage patient clinical information and medical images for prediction. However, most of the published COVID-19 prediction models thus far have little clinical utility due to methodological flaws and lack of appropriate validation. In this paper, we describe our methodology to develop and validate multi-modal models for COVID-19 mortality prediction using multi-center patient data. The models for COVID-19 mortality prediction were developed using retrospective data from Madrid, Spain (N=2547) and were externally validated in patient cohorts from a community hospital in New Jersey, USA (N=242) and an academic center in Seoul, Republic of Korea (N=336). The models we developed performed differently across various clinical settings, underscoring the need for a guided strategy when employing machine learning for clinical decision-making. We demonstrated that using features from both the structured electronic health records and chest X-ray imaging data resulted in better 30-day-mortality prediction performance across all three datasets (areas under the receiver operating characteristic curves: 0.85 (95% confidence interval: 0.83-0.87), 0.76 (0.70-0.82), and 0.95 (0.92-0.98)). We discuss the rationale for the decisions made at every step in developing the models and have made our code available to the research community. We employed the best machine learning practices for clinical model development. Our goal is to create a toolkit that would assist investigators and organizations in building multi-modal models for prediction, classification and/or optimization.
... These models were developed for predicting the severity of outcomes including: death or need for ventilation 72,78,79, need for intensive care unit (ICU) admission 63,73,77-79, progression to acute respiratory distress syndrome 80, length of hospital stay 51,74, likelihood of conversion to severe disease 64,65,75 and the extent of lung infection 76. Most papers used models based on a multivariable Cox proportional hazards model 51,72,78,79, logistic regression 65,73-75,80, linear regression 75,76 or random forest 74,77, or compared a wide variety of machine learning models such as tree-based methods, support vector machines, neural networks and nearest-neighbour clustering 63,64. ...
... Predictors from radiological data were extracted using either handcrafted radiomic features 63,64,68-70,72,74,75,77-80 or deep learning 51,66,70,71,73,76. Clinical data included basic observations, serology and comorbidities. ...
... Clinical data included basic observations, serology and comorbidities. Only eight models integrated both radiological and clinical data 62,63,69,72,73,77-79. ...
Article
Full-text available
Machine learning methods offer great promise for fast and accurate detection and prognostication of coronavirus disease 2019 (COVID-19) from standard-of-care chest radiographs (CXR) and chest computed tomography (CT) images. Many articles have been published in 2020 describing new machine learning-based models for both of these tasks, but it is unclear which are of potential clinical utility. In this systematic review, we consider all published papers and preprints, for the period from 1 January 2020 to 3 October 2020, which describe new machine learning models for the diagnosis or prognosis of COVID-19 from CXR or CT images. All manuscripts uploaded to bioRxiv, medRxiv and arXiv along with all entries in EMBASE and MEDLINE in this timeframe are considered. Our search identified 2,212 studies, of which 415 were included after initial screening and, after quality screening, 62 studies were included in this systematic review. Our review finds that none of the models identified are of potential clinical use due to methodological flaws and/or underlying biases. This is a major weakness, given the urgency with which validated COVID-19 models are needed. To address this, we give many recommendations which, if followed, will solve these issues and lead to higher-quality model development and well-documented manuscripts.
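The model class cited most often in the excerpts above is the multivariable Cox proportional hazards model. The sketch below fits such a model with the lifelines package; the package choice, the covariate names and the generated data are assumptions for illustration, not the setup of any reviewed paper.

    # Minimal sketch: multivariable Cox proportional hazards model for a severity outcome.
    import numpy as np
    import pandas as pd
    from lifelines import CoxPHFitter

    rng = np.random.default_rng(1)
    n = 300
    df = pd.DataFrame({
        "age": rng.normal(65, 12, n),                 # hypothetical clinical covariate
        "lung_involvement_pct": rng.uniform(0, 80, n),  # hypothetical CT covariate
        "time_days": rng.exponential(20, n),          # follow-up time in days
        "event": rng.binomial(1, 0.2, n),             # 1 = death/ventilation observed
    })

    cph = CoxPHFitter()
    cph.fit(df, duration_col="time_days", event_col="event")
    cph.print_summary()   # hazard ratios with confidence intervals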
... Two developed models estimating mortality risk or need for intubation (AUC=0·70 and accuracy=0·81, respectively) 42,43, two predicted the length of hospital stay (AUC=0·97 and 0·88, respectively) 36,44, one estimated the likelihood of conversion to severe disease (AUC=0·86) 45, and one predicted the extent of lung infection using CXR (correlation=0·78) 46. ...
... 36,44, one estimated the likelihood of conversion to severe disease (AUC=0·86) 45, and one predicted the extent of lung infection using CXR (correlation=0·78) 46. Predictors from radiological data were extracted using either handcrafted radiomic features 42,44,45 or deep learning 36,43,46. Clinical data included basic observations, serology and comorbidities. ...
... Including both internal and external validation allows more insight into the generalisability of the algorithm. We find that 23/29 papers consider internal validation only, with 6/29 using external validation 21,27,28,36,42,43. Five used truly external test datasets and one tested on the same data the algorithm was trained on 21. ...
Preprint
Full-text available
Background: Machine learning methods offer great potential for fast and accurate detection and prognostication of COVID-19 from standard-of-care chest radiographs (CXR) and computed tomography (CT) images. In this systematic review we critically evaluate the machine learning methodologies employed in the rapidly growing literature. Methods: In this systematic review we reviewed EMBASE via OVID, MEDLINE via PubMed, bioRxiv, medRxiv and arXiv for published papers and preprints uploaded from Jan 1, 2020 to June 24, 2020. Studies which consider machine learning models for the diagnosis or prognosis of COVID-19 from CXR or CT images were included. A methodology quality review of each paper was performed against established benchmarks to ensure the review focusses only on high-quality reproducible papers. This study is registered with PROSPERO [CRD42020188887]. Interpretation: Our review finds that none of the developed models discussed are of potential clinical use due to methodological flaws and underlying biases. This is a major weakness, given the urgency with which validated COVID-19 models are needed. Typically, we find that the documentation of a model's development is not sufficient to make the results reproducible and therefore of 168 candidate papers only 29 are deemed to be reproducible and subsequently considered in this review. We therefore encourage authors to use established machine learning checklists to ensure sufficient documentation is made available, and to follow the PROBAST (prediction model risk of bias assessment tool) framework to determine the underlying biases in their model development process and to mitigate these where possible. This is key to safe clinical implementation which is urgently needed.
... AI has been surveyed for COVID-19 in both diagnosis and digital contact tracing [62]. The study in [69] used AI-based multi-modal integration of clinical characteristics, lab tests and chest CTs to improve COVID-19 outcome prediction for hospitalized patients, 15% of whom were severe cases. ...
Article
Full-text available
The current pandemic caused by the COVID-19 virus requires more effort, experience, and science-sharing to overcome the damage caused by the pathogen. The fast and wide human-to-human transmission of the COVID-19 virus demands a significant role of the newest technologies in the form of local and global computing and information sharing, data privacy, and accurate tests. The advancements of deep neural networks, cloud computing solutions, blockchain technology, and beyond 5G (B5G) communication have contributed to the better management of the COVID-19 impacts on society. This paper reviews recent attempts to tackle the COVID-19 situation using these technological advancements.
... Prior studies describe the separate use of DL algorithms, disease volume, and radiomics for diagnosis, disease severity, treatment response, outcome (death), oxygen supplementation, intubation and ICU admission in patients with SARS-CoV-2 pneumonia. 5,6,17-19 Although the performance of our DL algorithm and radiomics approach is similar to prior reports, besides documenting the influence of motion artifacts, we demonstrate both the comparative and additive value of DL-based and radiomics features in predicting outcome and the need for ICU admission. In contrast to the previously reported subjective grading of disease extent in each lobe, a tedious and time-consuming process, we demonstrate that quantitative lung lobe-level information on the volume and percentage of affected lung is superior for assessing disease severity and predicting patient outcome. ...
Article
Purpose Comparison of a deep learning algorithm, radiomics and subjective assessment of chest CT for predicting outcome (death or recovery) and intensive care unit (ICU) admission in patients with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. Methods The multicenter, ethical-committee-approved, retrospective study included non-contrast-enhanced chest CT of 221 SARS-CoV-2-positive patients from Italy (n = 196 patients; mean age 64 ± 16 years) and Denmark (n = 25; mean age 69 ± 13 years). A thoracic radiologist graded the presence, type and extent of pulmonary opacities and the severity of motion artifacts in each lung lobe on all chest CTs. Thin-section CT images were processed with the CT Pneumonia Analysis Prototype (Siemens Healthineers), which yielded segmentation masks from a deep learning (DL) algorithm to derive features of lung abnormalities such as opacity scores and mean HU, as well as the volume and percentage of all-attenuation and high-attenuation (opacities > −200 HU) opacities. Separately, whole-lung radiomics were obtained for all CT exams. Analysis of variance and multiple logistic regression were performed for data analysis. Results Moderate to severe respiratory motion artifacts affected nearly one-quarter of chest CTs. Subjective severity assessment, DL-based features and radiomics predicted patient outcome (AUC 0.76 vs AUC 0.88 vs AUC 0.83) and need for ICU admission (AUC 0.77 vs AUC 0.80 vs AUC 0.82). Excluding chest CTs with motion artifacts, the performance of DL-based and radiomics features improved for predicting ICU admission. Conclusion DL-based and radiomics features of pulmonary opacities from chest CT were superior to subjective assessment for differentiating patients with favorable and adverse outcomes.
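The DL-derived features described above (volume and percentage of high-attenuation opacities within the lung) boil down to a Hounsfield-unit threshold applied inside a lung segmentation mask. The sketch below illustrates that computation on synthetic inputs; the voxel spacing and the placeholder arrays are assumptions, and this is not the vendor prototype's implementation.

    # Minimal sketch: quantify high-attenuation opacities (> -200 HU) inside a lung mask.
    import numpy as np

    ct_hu = np.random.default_rng(2).normal(-700, 250, size=(64, 128, 128))  # fake CT volume (HU)
    lung_mask = np.ones_like(ct_hu, dtype=bool)                              # fake whole-lung mask
    voxel_volume_ml = 0.7 * 0.7 * 1.0 / 1000.0                               # assumed spacing, mm^3 -> mL

    lung_voxels = ct_hu[lung_mask]
    high_atten = lung_voxels > -200                      # opacity threshold from the abstract

    opacity_volume_ml = high_atten.sum() * voxel_volume_ml
    opacity_pct = 100.0 * high_atten.mean()
    mean_hu = lung_voxels.mean()

    print(f"high-attenuation volume: {opacity_volume_ml:.1f} mL "
          f"({opacity_pct:.1f}% of lung), mean HU {mean_hu:.0f}")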
Article
Background The novel coronavirus (COVID-19) has presented a significant and urgent threat to global health, and there has been a need to identify prognostic factors in COVID-19 patients. The aim of this study was to determine whether chest CT characteristics had any prognostic value in patients with COVID-19. Methods A retrospective analysis of COVID-19 patients who underwent a chest CT scan was performed in four medical centers. The prognostic value of chest CT results was assessed using a multivariable survival analysis with the Cox model. The characteristics included in the model were the degree of lung involvement, ground-glass opacities, nodular consolidations, linear consolidations, a peripheral topography, predominantly inferior lung involvement, pleural effusion, and crazy paving. The model was also adjusted for age, sex, and the center in which the patient was hospitalized. The primary endpoint was 30-day in-hospital mortality. A second model used a composite endpoint of admission to an intensive care unit or 30-day in-hospital mortality. Results A total of 515 patients with available follow-up information were included. Advanced age, a degree of pulmonary involvement ≥ 50% (hazard ratio 2.25 [95% CI: 1.378 to 3.671], p = 0.001), nodular consolidations and pleural effusions were associated with lower 30-day in-hospital survival rates. An exploratory subgroup analysis showed a 60.6% mortality rate in patients over 75 with ≥ 50% lung involvement on a CT scan. Conclusions Chest CT findings such as a percentage of pulmonary involvement ≥ 50%, pleural effusion and nodular consolidation were strongly associated with 30-day mortality in COVID-19 patients. CT examinations are essential for the assessment of severe COVID-19 patients, and their results must be considered when making care management decisions.
Article
Coronavirus disease 2019 (COVID-19) emerged in 2019 and disseminated around the world rapidly. Computed tomography (CT) imaging has been proven to be an important tool for screening, disease quantification and staging. The latter is of extreme importance for organizational anticipation (availability of intensive care unit beds, patient management planning) as well as for accelerating drug development through rapid, reproducible and quantified assessment of treatment response. Although there are currently no specific guidelines for staging patients, CT is used together with some clinical and biological biomarkers. In this study, we collected a multi-center cohort and investigated the use of medical imaging and artificial intelligence for disease quantification, staging and outcome prediction. Our approach relies on automatic deep learning-based disease quantification using an ensemble of architectures, and a data-driven consensus for the staging and outcome prediction of the patients, fusing imaging biomarkers with clinical and biological attributes. Highly promising results on multiple external/independent evaluation cohorts, as well as comparisons with expert human readers, demonstrate the potential of our approach.
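The abstract above mentions disease quantification with an ensemble of segmentation architectures. A common way to ensemble such models is to average their per-voxel probabilities before thresholding; the toy PyTorch sketch below illustrates that idea only, with tiny placeholder networks standing in for trained models, and is not the authors' pipeline.

    # Minimal sketch: average per-pixel probabilities from several segmentation models.
    import torch
    import torch.nn as nn

    model_a = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 1, 1))
    model_b = nn.Sequential(nn.Conv2d(1, 4, 5, padding=2), nn.ReLU(), nn.Conv2d(4, 1, 1))
    models = [model_a, model_b]                      # placeholders for trained architectures

    ct_slice = torch.randn(1, 1, 128, 128)           # fake normalized CT slice

    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(m(ct_slice)) for m in models]).mean(dim=0)

    lesion_mask = probs > 0.5                        # consensus disease segmentation
    lesion_pct = 100.0 * lesion_mask.float().mean().item()
    print(f"estimated diseased fraction of slice: {lesion_pct:.1f}%")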
Article
Full-text available
Routine screening CT for the identification of coronavirus disease 19 (COVID-19) pneumonia is currently not recommended by most radiology societies. However, the number of CT examinations performed in persons under investigation for COVID-19 has increased. We also anticipate that some patients will have incidentally detected findings that could be attributable to COVID-19 pneumonia, requiring radiologists to decide whether or not to mention COVID-19 specifically as a differential diagnostic possibility. We aim to provide guidance to radiologists in reporting CT findings potentially attributable to COVID-19 pneumonia, including standardized language to reduce reporting variability when addressing the possibility of COVID-19. When typical or indeterminate features of COVID-19 pneumonia are present in endemic areas as an incidental finding, we recommend contacting the referring providers to discuss the likelihood of viral infection. These incidental findings do not necessarily need to be reported as COVID-19 pneumonia. In this setting, using the term viral pneumonia can be a reasonable and inclusive alternative. However, if one opts to use the term COVID-19 in the incidental setting, consider the provided standardized reporting language. In addition, practice patterns may vary, and this document is meant to serve as a guide. Consultation with clinical colleagues at each institution is suggested to establish a consensus reporting approach. The goal of this expert consensus is to help radiologists recognize findings of COVID-19 pneumonia and aid their communication with other health care providers, assisting management of patients during this pandemic.
Article
Full-text available
Radiologic characteristics of 2019 novel coronavirus (2019-nCoV) infected pneumonia (NCIP), which had not been fully understood, are especially important for diagnosing and predicting prognosis. We retrospectively studied 27 consecutive patients with confirmed NCIP; the clinical characteristics and CT image findings were collected, and the association of radiologic findings with patient mortality was evaluated. The 27 patients included 12 men and 15 women, with a median age of 60 years (IQR 47–69). 17 patients were discharged after recovery and 10 patients died in hospital. The median age of the mortality group was higher than that of the survival group (68 (IQR 63–73) vs 55 (IQR 35–60), P = 0.003). The comorbidity rate in the mortality group was significantly higher than in the survival group (80% vs 29%, P = 0.018). The predominant CT characteristics consisted of ground-glass opacity (67%), bilateral involvement (86%), both peripheral and central distribution (74%), and lower zone involvement (96%). The median CT score of the mortality group was higher than that of the survival group (30 (IQR 7–13) vs 12 (IQR 11–43), P = 0.021), with more frequent consolidation (40% vs 6%, P = 0.047) and air bronchogram (60% vs 12%, P = 0.025). An optimal cutoff value of a CT score of 24.5 had a sensitivity of 85.6% and a specificity of 84.5% for the prediction of mortality. 2019-nCoV was more likely to infect elderly people with chronic comorbidities. CT findings of NCIP were characterized by predominant ground-glass opacities mixed with consolidations, mainly peripheral or combined peripheral and central distributions, with the bilateral and lower lung zones mostly involved. A simple CT scoring method was capable of predicting mortality.
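The abstract above reports an "optimal" CT-score cutoff with its sensitivity and specificity. One standard way to derive such a cutoff is to scan the ROC curve and maximize Youden's J (sensitivity + specificity − 1); the sketch below illustrates this on synthetic scores and labels, and is not a reanalysis of the cited cohort.

    # Minimal sketch: choose a CT-score cutoff by maximizing Youden's J on the ROC curve.
    import numpy as np
    from sklearn.metrics import roc_curve

    rng = np.random.default_rng(3)
    died = rng.binomial(1, 0.3, 200)                     # synthetic mortality labels
    ct_score = rng.normal(12, 6, 200) + 14 * died        # synthetic CT severity scores

    fpr, tpr, thresholds = roc_curve(died, ct_score)
    j = tpr - fpr                                        # Youden's J at each threshold
    best = int(np.argmax(j))

    print(f"cutoff={thresholds[best]:.1f}  "
          f"sensitivity={tpr[best]:.1%}  specificity={1 - fpr[best]:.1%}")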
Article
Full-text available
Background: An outbreak of coronavirus disease 2019 (COVID-19) occurred in Wuhan, China; the epidemic is more widespread than initially estimated, with cases now confirmed in multiple countries. Aims: The aim of this meta-analysis was to assess the prevalence of comorbidities in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)-infected patients and the risk of underlying diseases in severe patients compared to non-severe patients. Methods: A literature search was conducted using the databases PubMed, EMBASE, and Web of Science through February 25, 2020. Odds ratios (ORs) and 95% confidence intervals (CIs) were pooled using random-effects models. Results: Seven studies were included in the meta-analysis, comprising 1,576 infected patients. The results showed the most prevalent clinical symptom was fever (91.3%, 95% CI: 86–97%), followed by cough (67.7%, 95% CI: 59–76%), fatigue (51.0%, 95% CI: 34–68%) and dyspnea (30.4%, 95% CI: 21–40%). The most prevalent comorbidities were hypertension (21.1%, 95% CI: 13.0–27.2%) and diabetes (9.7%, 95% CI: 7.2–12.2%), followed by cardiovascular disease (8.4%, 95% CI: 3.8–13.8%) and respiratory system disease (1.5%, 95% CI: 0.9–2.1%). When severe and non-severe patients were compared, the pooled ORs for hypertension, respiratory system disease, and cardiovascular disease were 2.36 (95% CI: 1.46–3.83), 2.46 (95% CI: 1.76–3.44) and 3.42 (95% CI: 1.88–6.22), respectively. Conclusion: We assessed the prevalence of comorbidities in COVID-19 patients and found that underlying diseases, including hypertension, respiratory system disease and cardiovascular disease, may be risk factors for severe disease compared with non-severe disease.
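The pooled odds ratios above come from a random-effects meta-analysis. The sketch below shows a standard DerSimonian-Laird pooling of log-odds ratios; the per-study ORs and confidence intervals are made-up placeholders, not the studies included in that meta-analysis.

    # Minimal sketch: DerSimonian-Laird random-effects pooling of odds ratios.
    import numpy as np

    or_est = np.array([2.1, 2.8, 1.9, 3.5])              # hypothetical study ORs
    ci_low = np.array([1.2, 1.5, 1.0, 1.8])
    ci_high = np.array([3.7, 5.2, 3.6, 6.8])

    y = np.log(or_est)                                    # log-odds ratios
    se = (np.log(ci_high) - np.log(ci_low)) / (2 * 1.96)  # SE back-calculated from 95% CI
    w = 1 / se**2                                         # fixed-effect weights

    y_fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - y_fixed) ** 2)                    # Cochran's Q
    k = len(y)
    tau2 = max(0.0, (q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))

    w_re = 1 / (se**2 + tau2)                             # random-effects weights
    y_re = np.sum(w_re * y) / np.sum(w_re)
    se_re = np.sqrt(1 / np.sum(w_re))

    print(f"pooled OR {np.exp(y_re):.2f} "
          f"(95% CI {np.exp(y_re - 1.96*se_re):.2f}-{np.exp(y_re + 1.96*se_re):.2f})")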
Article
The novel COVID-19 outbreak has affected more than 200 countries and territories as of March 2020. Given that patients with cancer are generally more vulnerable to infections, a systematic analysis of diverse cohorts of patients with cancer affected by COVID-19 is needed. We performed a multicenter study including 105 patients with cancer and 536 age-matched noncancer patients confirmed with COVID-19. Our results showed that COVID-19 patients with cancer had higher risks of all severe outcomes. Patients with hematologic cancer, lung cancer, or metastatic cancer (stage IV) had the highest frequency of severe events. Patients with nonmetastatic cancer experienced frequencies of severe conditions similar to those observed in patients without cancer. Patients who received surgery had higher risks of severe events, whereas patients who underwent only radiotherapy did not demonstrate significant differences in severe events when compared with patients without cancer. These findings indicate that patients with cancer appear more vulnerable to the SARS-CoV-2 outbreak. Significance: Because this is the first large cohort study on this topic, our report will provide much-needed information that will benefit patients with cancer globally. As such, we believe it is extremely important that our study be disseminated widely to alert clinicians and patients.
Article
Background Computed tomography (CT) of patients with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) disease depicts the extent of lung involvement in COVID-19 pneumonia. Purpose The aim of the study was to determine the value of quantification of the well-aerated lung obtained at baseline chest CT for determining prognosis in patients with COVID-19 pneumonia. Materials and Methods Patients who underwent chest CT for suspected COVID-19 pneumonia at emergency department admission between February 17 and March 10, 2020, were retrospectively analyzed. Patients with negative reverse-transcription polymerase chain reaction (RT-PCR) for SARS-CoV-2 in nasopharyngeal swabs, negative chest CT, or incomplete clinical data were excluded. CT was analyzed for quantification of well-aerated lung visually (%V-WAL) and by open-source software (%S-WAL and absolute volume, VOL-WAL). Clinical parameters included demographics, comorbidities, symptoms and symptom duration, oxygen saturation and laboratory values. Logistic regression was used to evaluate the relationship between clinical parameters and CT metrics and patient outcome (ICU admission/death vs. no ICU admission/death). The area under the receiver operating characteristic curve (AUC) was calculated to determine model performance. Results The study included 236 patients (59 women, 25%; median age, 68 years). A %V-WAL < 73% (OR, 5.4; 95% CI, 2.7-10.8; P < 0.001), %S-WAL < 71% (OR, 3.8; 95% CI, 1.9-7.5; P < 0.001), and VOL-WAL < 2.9 L (OR, 2.6; 95% CI, 1.2-5.8; P < 0.01) were predictors of ICU admission/death. In comparison with the model containing only clinical parameters (AUC, 0.83), all three quantitative models showed higher diagnostic performance (AUC, 0.86 for all models). The models containing %V-WAL < 73% and VOL-WAL < 2.9 L were superior to the model containing only clinical parameters (P = 0.04 for both models). Conclusion In patients with confirmed COVID-19 pneumonia, visual and software-based quantification of the extent of CT lung abnormality were predictors of ICU admission or death.
Article
Objectives To characterize the chest computed tomography (CT) findings of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pneumonia according to clinical severity. We compared the CT features of common cases and severe cases, symptomatic patients and asymptomatic patients, and febrile and afebrile patients. Methods This was a retrospective analysis of the clinical and thoracic CT features of 120 consecutive patients with confirmed SARS-CoV-2 pneumonia admitted to a tertiary university hospital between January 10 and February 10, 2020, in Wuhan city, China. Results On admission, the patients generally complained of fever, cough, shortness of breath, and myalgia or fatigue, with diarrhea often present in severe cases. Severe patients were 20 years older on average and had comorbidities and an elevated lactate dehydrogenase (LDH) level. There were no differences in the CT findings between asymptomatic and symptomatic common-type patients or between afebrile and febrile patients, defined according to Chinese National Health Commission guidelines. Conclusions The clinical and CT features at admission may enable clinicians to promptly evaluate the prognosis of patients with SARS-CoV-2 pneumonia. Clinicians should be aware that clinically silent cases may present with CT features similar to those of symptomatic common-type patients. Key Points • The clinical features and predominant patterns of abnormalities on CT for asymptomatic, typical common, and severe cases are summarized. These findings may help clinicians to identify severe patients quickly at admission. • Clinicians should be cautious that the CT findings of afebrile/asymptomatic patients are not milder than those of other types of patients. These patients should also be quarantined. • The use of chest CT as the main screening method in epidemic areas is recommended.
Article
Background CT may play a central role in the diagnosis and management of COVID-19 pneumonia. Purpose To perform a longitudinal study analyzing the serial CT findings over time in patients with COVID-19 pneumonia. Materials and Methods From January 16 to February 17, 2020, 90 patients (male:female, 33:57; mean age, 45 years) with COVID-19 pneumonia were prospectively enrolled and followed up until they were discharged or died, or until the end of the study. A total of 366 CT scans were acquired and reviewed by 2 groups of radiologists for the patterns and distribution of lung abnormalities, total CT scores and the number of zones involved. Those features were analyzed for temporal change. Results CT scores and the number of zones involved progressed rapidly, peaked during illness days 6-11 (median: 5 and 5), and were followed by persistence at high levels. The predominant pattern of abnormalities after symptom onset was ground-glass opacity (35/78 [45%] to 49/79 [62%] in different periods). The percentage of mixed pattern peaked (30/78 [38%]) on illness days 12-17, and became the second most predominant pattern thereafter. Pure ground-glass opacity was the most prevalent subtype of ground-glass opacity after symptom onset (20/50 [40%] to 20/28 [71%]). The percentage of ground-glass opacity with irregular linear opacity peaked on illness days 6-11 (14/50 [28%]) and became the second most prevalent subtype thereafter. The distribution of lesions was predominantly bilateral and subpleural. 66/70 (94%) of discharged patients had residual disease on final CT scans (median CT scores and zones involved: 4 and 4), with ground-glass opacity (42/70 [60%]) and pure ground-glass opacity (31/42 [74%]) the most common pattern and subtype. Conclusion The extent of lung abnormalities on CT peaked during illness days 6-11. The temporal changes of the diverse CT manifestations followed a specific pattern, which might indicate the progression and recovery of the illness.
Article
Background Since December, 2019, Wuhan, China, has experienced an outbreak of coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Epidemiological and clinical characteristics of patients with COVID-19 have been reported, but risk factors for mortality and a detailed clinical course of illness, including viral shedding, have not been well described. Methods In this retrospective, multicentre cohort study, we included all adult inpatients (≥18 years old) with laboratory-confirmed COVID-19 from Jinyintan Hospital and Wuhan Pulmonary Hospital (Wuhan, China) who had been discharged or had died by Jan 31, 2020. Demographic, clinical, treatment, and laboratory data, including serial samples for viral RNA detection, were extracted from electronic medical records and compared between survivors and non-survivors. We used univariable and multivariable logistic regression methods to explore the risk factors associated with in-hospital death. Findings 191 patients (135 from Jinyintan Hospital and 56 from Wuhan Pulmonary Hospital) were included in this study, of whom 137 were discharged and 54 died in hospital. 91 (48%) patients had a comorbidity, with hypertension being the most common (58 [30%] patients), followed by diabetes (36 [19%] patients) and coronary heart disease (15 [8%] patients). Multivariable regression showed increasing odds of in-hospital death associated with older age (odds ratio 1·10, 95% CI 1·03–1·17, per year increase; p=0·0043), higher Sequential Organ Failure Assessment (SOFA) score (5·65, 2·61–12·23; p<0·0001), and d-dimer greater than 1 μg/mL (18·42, 2·64–128·55; p=0·0033) on admission. Median duration of viral shedding was 20·0 days (IQR 17·0–24·0) in survivors, but SARS-CoV-2 was detectable until death in non-survivors. The longest observed duration of viral shedding in survivors was 37 days. Interpretation The potential risk factors of older age, high SOFA score, and d-dimer greater than 1 μg/mL could help clinicians to identify patients with poor prognosis at an early stage. Prolonged viral shedding provides the rationale for a strategy of isolation of infected patients and optimal antiviral interventions in the future. Funding Chinese Academy of Medical Sciences Innovation Fund for Medical Sciences; National Science Grant for Distinguished Young Scholars; National Key Research and Development Program of China; The Beijing Science and Technology Project; and Major Projects of National Science and Technology on New Drug Creation and Development.
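The study above reports per-unit odds ratios from a multivariable logistic regression (e.g. odds of in-hospital death per year of age). The sketch below shows how such odds ratios and confidence intervals can be obtained with statsmodels; the simulated cohort, coefficients and variable names are placeholders, not the Wuhan data.

    # Minimal sketch: multivariable logistic regression with per-unit odds ratios.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    n = 191
    df = pd.DataFrame({
        "age": rng.normal(60, 14, n),
        "sofa": rng.integers(0, 12, n),
        "d_dimer_high": rng.binomial(1, 0.4, n),   # d-dimer above threshold (0/1)
    })
    logit = -8 + 0.08 * df["age"] + 0.5 * df["sofa"] + 1.5 * df["d_dimer_high"]
    df["death"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

    X = sm.add_constant(df[["age", "sofa", "d_dimer_high"]])
    fit = sm.Logit(df["death"], X).fit(disp=0)

    odds_ratios = np.exp(fit.params)                # per-unit odds ratios
    conf = np.exp(fit.conf_int())                   # 95% CIs on the OR scale
    print(pd.concat([odds_ratios.rename("OR"), conf], axis=1))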