Multi_Scale_Tools: A Python Library to Exploit Multi-Scale Whole Slide Images
Niccolò Marini 1,2*, Sebastian Otálora 1,2, Damian Podareanu 3, Mart van Rijthoven 4, Jeroen van der Laak 4,5, Francesco Ciompi 4, Henning Müller 1,6 and Manfredo Atzori 1,7

1 Information Systems Institute, University of Applied Sciences Western Switzerland (HES-SO Valais), Sierre, Switzerland; 2 Centre Universitaire d'Informatique, University of Geneva, Carouge, Switzerland; 3 SURFsara, Amsterdam, Netherlands; 4 Department of Pathology, Radboud University Medical Center, Nijmegen, Netherlands; 5 Center for Medical Image Science and Visualization, Linköping University, Linköping, Sweden; 6 Medical Faculty, University of Geneva, Geneva, Switzerland; 7 Department of Neurosciences, University of Padua, Padua, Italy
Algorithms proposed in computational pathology can automatically analyze digitized tissue samples of histopathological images to help diagnose diseases. Tissue samples are scanned at high resolution and usually saved as images with several magnification levels, namely whole slide images (WSIs). Convolutional neural networks (CNNs) represent the state-of-the-art computer vision methods targeting the analysis of histopathology images, aiming for detection, classification and segmentation. However, the development of CNNs that work with multi-scale images such as WSIs is still an open challenge. The image characteristics and the CNN properties impose architecture designs that are not trivial. Therefore, single-scale CNN architectures are still often used. This paper presents Multi_Scale_Tools, a library aiming to facilitate exploiting the multi-scale structure of WSIs. Multi_Scale_Tools currently includes four components: a pre-processing component, a scale detector, a multi-scale CNN for classification and a multi-scale CNN for segmentation of the images. The pre-processing component includes methods to extract patches at several magnification levels. The scale detector identifies the magnification level of images that do not contain this information, such as images from the scientific literature. The multi-scale CNNs are trained by combining features and predictions that originate from different magnification levels. The components are developed using private datasets, including colon and breast cancer tissue samples. They are tested on private and public external data sources, such as The Cancer Genome Atlas (TCGA). The results of the library demonstrate its effectiveness and applicability. The scale detector accurately predicts multiple levels of image magnification and generalizes well to independent external data. The multi-scale CNNs outperform the single-magnification CNNs for both classification and segmentation tasks. The code is developed in Python and will be made publicly available upon publication. It aims to be easy to use and easy to extend with additional functions.
Keywords: multi-scale approaches, computational pathology, scale detection, classification, segmentation, deep learning
Edited by: Nianyin Zeng, Xiamen University, China
Reviewed by: Heimo Müller, Medical University of Graz, Austria; Han Li, Xiamen University, China
*Correspondence: Niccolò Marini, niccolo.marini@hevs.ch
Specialty section: This article was submitted to Digital Public Health, a section of the journal Frontiers in Computer Science
Received: 23 March 2021; Accepted: 07 July 2021; Published: 09 August 2021
Citation: Marini N, Otálora S, Podareanu D, van Rijthoven M, van der Laak J, Ciompi F, Müller H and Atzori M (2021) Multi_Scale_Tools: A Python Library to Exploit Multi-Scale Whole Slide Images. Front. Comput. Sci. 3:684521. doi: 10.3389/fcomp.2021.684521
1 INTRODUCTION
The implicit multi-scale structure of digitized histopathological
images represents an open challenge in computational pathology.
Training machine learning algorithms that can simultaneously
learn both microscopic and macroscopic tissue structures comes
with technical and computational challenges that are not yet well
studied.
As of 2021, histopathology represents the gold standard to diagnose many diseases, including cancer (Aeffner et al., 2017; Rorke, 1997). Histopathology images include several tissue structures, ranging from microscopic entities (such as single cell nuclei) to macroscopic components (such as tumor bulks). Whole Slide Images (WSIs) are digitized histopathology images that are scanned at high resolution and stored in a multi-scale (pyramidal) format. WSI resolution is related to the spatial resolution and the optical resolution used to scan the images (Wu et al., 2010). The spatial resolution is the minimum distance at which the scanner can still distinguish two objects, measured in μm per pixel (Sellaro et al., 2013). The optical resolution (or magnification) is the magnification factor (x) of the lens within the scanner (Sellaro et al., 2013). Currently, the de facto standard spatial resolutions adopted to scan tissue samples (for example in The Cancer Genome Atlas) are usually 0.23–0.25 μm (magnification 40x) or 0.46–0.50 μm (magnification 20x). Tissue samples such as surgical resection samples (or specimens) are often approximately 20 mm × 15 mm in size (see the DICOM WSI documentation: http://dicom.nema.org/Dicom/DICOMWSI/, retrieved 13th of November, 2020), while samples such as biopsies are approximately 2 mm × 6 mm in size. The size of the samples combined with the spatial resolution of the scanners leads to gigapixel images: image size can reach 200,000 × 200,000 pixels, meaning gigabytes of pixel data. The multi-scale WSI format (Figure 1) includes several magnification levels of the sample (each with a different spatial resolution), stored in a pyramid, usually varying between 1.25x and 40x. The baseline image of the pyramid is the one at the highest resolution. The multi-scale structure of the images allows pathologists to analyze the image from the lowest to the highest magnification level. Pathologists analyze the images by first identifying a few regions of interest and afterwards zooming through them to visualize different details of the tissue (Schmitz et al., 2019). Each magnification level includes different types of information (Molin et al., 2016), since tissue structures appear in different ways according to their magnification level. Therefore, it is essential to detect an abnormality within the specific range of levels at which it is visible. The characteristics of microscopes and scanners often lead to a scale-dependent analysis. For example, at middle magnification levels (such as 5–10x) it is possible to distinguish glands, while at the highest ones (such as 20–40x) it is possible to better resolve cells. Figure 2 includes examples of tissues scanned at different magnification levels.
Computational pathology is the computational analysis of digital images obtained through scanning slides of cells and tissues (van der Laak et al., 2021). Currently, deep Convolutional Neural Networks (CNNs) are the state-of-the-art machine learning algorithms in computational pathology tasks, in particular for classification (del Toro et al., 2017; Arvaniti and Claassen, 2018; Coudray et al., 2018; Komura and Ishikawa, 2018; Ren et al., 2018; Campanella et al., 2019; Roy et al., 2019; Iizuka et al., 2020) and segmentation (Ronneberger et al., 2015; Paramanandam et al., 2016; Naylor et al., 2017; Naylor et al., 2018; Wang et al., 2019) of images.
Their success relies on automatically learning the relevant features from the input data.

FIGURE 1 | An example of the WSI format, including multiple magnification levels. The size of each image of the pyramid is reported under the magnification level, in pixels.

FIGURE 2 | An example of tissue represented at multiple magnification levels (5x, 10x, 20x, 40x). The tissues come from colon, prostate and lung cancer images.

However, CNNs usually cannot
easily handle the multi-scale structure of the images, since they are not scale-equivariant by design (Marcos et al., 2018; Zhu et al., 2019) and because of WSI size. The equivariance property of a transformation means that, when the transformation is applied, it is possible to predict how the representation will change (Lenc and Vedaldi, 2015; Tensmeyer and Martinez, 2016). This is not normally true for CNNs: if a scale transformation is applied to the input data, it is usually not possible to predict its effect on the output of the CNN. Knowledge about the scale is essential for the model to identify diseases, since the same tissue structures, represented at different scales, include different information (Janowczyk and Madabhushi, 2016). CNNs can identify abnormalities in tissues, but the information and the features related to the abnormalities are not the same for each scale representation (Jimenez-del Toro et al., 2017). Therefore, the proper scale must be selected to train CNNs (Gecer et al., 2018; Otálora et al., 2018b). Unfortunately, scale information is not always available in images. This is the case, for instance, of pictures taken with standard cameras or altered in compression and resolution, such as images downloaded from the web or images included in scientific articles. Furthermore, modern hardware (Graphics Processing Units, GPUs) cannot easily handle WSIs, due to their large pixel size and the limited video random access memory available to temporarily store them. The combination of different magnification levels leads to even larger inputs, making the images even harder to analyze.
The characteristics of WSIs can lead to modifications of CNN architectures, both for classification (Jimenez-del Toro et al., 2017; Lai and Deng, 2017; Gecer et al., 2018; Yang et al., 2019; Hashimoto et al., 2020) and segmentation (Ronneberger et al., 2015; Li et al., 2017; Salvi and Molinari, 2018; Schmitz et al., 2019; van Rijthoven et al., 2020), such as multi-branch networks (Yang et al., 2019; Hashimoto et al., 2020; Jain and Massoud, 2020), multiple receptive field convolutional neural networks (Han et al., 2017; Lai and Deng, 2017; Ullah, 2017; Li et al., 2019; Zhang et al., 2020) and U-Net based networks (Bozkurt et al., 2018; van Rijthoven et al., 2020). The modification of architectures to include multiple scales is prevalent in medical imaging: examples of architecture modifications can also be found in other modalities, such as MRI (Zeng et al., 2021a) and gold immunochromatographic strip (GICS) images (Zeng et al., 2019; Zeng et al., 2021b).
The code library (called Multi_Scale_Tools) described in this paper contributes to alleviating the mentioned problems by presenting tools that allow handling and exploiting the multi-scale structure of histopathological images in end-to-end CNN architectures. The library includes pre-processing tools to extract multi-scale patches, a scale detector, a component to train a multi-scale CNN classifier and a component to train a multi-scale CNN for segmentation. The tools are platform-independent and developed in Python. The code is publicly available at https://github.com/sara-nl/multi-scale-tools. Multi_Scale_Tools aims to be easy to use and easy to extend with additional functions.
2 METHODS
The library includes four components: a pre-processing tool, a scale detector tool, a component to train a multi-scale CNN classifier and a component to train a multi-scale segmentation CNN. Each tool is described in a dedicated subsection as follows:
- Pre-processing component, Sub-section 2.1
- Scale detector, Sub-section 2.2
- Multi-scale CNN for classification, Sub-section 2.3
- Multi-scale CNN for segmentation, Sub-section 2.4
2.1 Pre-Processing Component
The pre-processing component allows researchers to generate multi-scale input data. The component includes two parametric and scalable methods to extract patches from the different magnification levels of a WSI: the grid extraction method and the multi-center extraction method. Both methods need a WSI and the corresponding tissue mask as input, and both produce images and metadata as output. The grid extraction methods (Patch_Extractor_Dense_Grid.py, Patch_Extractor_Dense_Grid_Strong_Labels.py) extract patches from one magnification level (Figure 3). The tissue mask is split into a grid of patches according to the following parameters: magnification level, mask magnification, patch size, and stride between the patches. The output of the method is a set of patches selected according to the parameters. The multi-center extraction methods (Patch_Extractor_Dense_Centroids.py, Patch_Extractor_Dense_Centroids_Strong_Labels.py) extract patches from multiple magnification levels. According to the highest magnification level chosen by the user, the tissue mask is split into a grid (as done in the functions previously described). The patches within this grid are called centroids. Each centroid is used to generate the coordinates for a patch at a lower magnification level, so that the latter includes the centroid (the patch at the highest magnification level) in its central section. The methods' output is a set of tuples, each one including patches at different magnification levels (Figure 4).
Compared with other patch extraction methods, such as the one presented in Lu et al. (2021), this pre-processing component has two main characteristics. The first one is that the component extracts patches from multiple magnification levels of the WSIs, pairing the patches coming from the same region of the image. The second one is that the component allows extracting patches at an arbitrary magnification level, even if that magnification level is not stored in the WSI. Usually, patch extractor methods extract patches only from the magnification levels stored in the WSI format (M_a), such as 40x, 20x, 10x, 5x, 2.5x and 1.25x. This process is driven by the input parameters, which include both the patch size (P_w) and the wanted magnification (M_w). The method extracts a patch of size P_a from a magnification M_a stored in the WSI and afterwards resizes the patch to P_w, following the proportion

P_w : M_w = P_a : M_a.   (1)
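As an illustration of Eq. 1, the following minimal sketch reads a patch at an arbitrary magnification using OpenSlide. It is only a sketch under stated assumptions (an OpenSlide-readable slide, the magnification of level 0 passed as base_mag, a hypothetical function name); it is not the library's actual implementation.

```python
import openslide  # assumed backend; the library may use a different reader

def extract_patch_at_magnification(slide_path, x, y, wanted_mag,
                                   patch_size=224, base_mag=40.0):
    """Illustrative sketch of Eq. 1 (hypothetical helper, not the library API).

    Reads a patch of P_w x P_w pixels at an arbitrary magnification M_w,
    assuming `base_mag` is the magnification of pyramid level 0.
    """
    slide = openslide.OpenSlide(slide_path)
    # Magnification M_a of each stored level, derived from its downsample.
    level_mags = [base_mag / d for d in slide.level_downsamples]
    # Pick the least detailed stored level that is at least as detailed as
    # M_w (raises ValueError if M_w exceeds the base magnification).
    level = min((i for i, m in enumerate(level_mags) if m >= wanted_mag),
                key=lambda i: level_mags[i])
    mag_a = level_mags[level]
    # Eq. 1: P_w : M_w = P_a : M_a  =>  P_a = P_w * M_a / M_w
    size_a = round(patch_size * mag_a / wanted_mag)
    # (x, y) are level-0 coordinates of the patch's upper left corner.
    patch = slide.read_region((x, y), level, (size_a, size_a)).convert("RGB")
    return patch.resize((patch_size, patch_size))
```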
In both methods, only patches from tissue regions are extracted and saved, using tissue masks to distinguish between patches from tissue regions and patches from the background. The methods are developed to work with masks including tissue and, in case they are available, with pixel-wise annotated masks. Plain tissue masks are generated using the HistoQC tool (Janowczyk et al., 2019); the HistoQC configuration adopted is reported in the repository. In the case of pixel-wise annotations, the masks must first be converted to an RGB image.

Besides the patches, the methods also save metadata files (csv files). The metadata includes the magnification level at which the patches are extracted and the x and y coordinates of each patch's upper left corner. The scripts are developed to be multi-threaded, in order to exploit hardware architectures with multiple cores. In the Supplementary Materials section, the parameters of the scripts are described in more detail.
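The coordinate metadata is what later allows pairing patches from different magnification levels into tuples (see Section 2.3). A hedged sketch of such pairing is shown below; the file names and the column schema are illustrative assumptions, not the library's actual metadata format.

```python
import pandas as pd

# Hypothetical metadata files written by the extractors; the column names
# (patch_path, x, y) are assumptions, not the library's schema.
high = pd.read_csv("patches_10x.csv")   # one row per extracted 10x patch
low = pd.read_csv("patches_5x.csv")     # one row per extracted 5x patch

# Pair patches covering the same region, assuming for simplicity that
# paired patches record the same centroid coordinates in the metadata.
pairs = high.merge(low, on=["x", "y"], suffixes=("_10x", "_5x"))
pairs[["patch_path_10x", "patch_path_5x"]].to_csv("tuples.csv", index=False)
```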
2.2 Scale Detector
The scale detector tool is a CNN trained to estimate the magnification level of a given patch or image. This task has previously been explored for prostate and breast tissue (Otálora et al., 2018a; Otálora et al., 2018b), and similar approaches have recently been extended to different organs in the TCGA repository (Zaveri et al., 2020). The tool involves the scripts related to the training of the models (the input data generation, the training and testing modules) and a module to use the detector as a standalone component that performs magnification inference for new images. The models are trained in a fully-supervised fashion. Therefore, the training scripts need a set of patches and the corresponding magnification levels as input, provided in csv files that include the patch path and the corresponding magnification level. Two scripts are provided to generate the input files, assuming that the patches were previously generated with the pre-processing component described in the previous section. The first script (Create_csv_from_partitions.py) splits the WSIs into partitions: it generates three files (the input data for the training, validation and testing partitions) starting from three files (previously prepared by the user) that include the names of the WSIs. The second script (Create_csv.py) generates an input data csv starting from a list of files. The model is trained (Train_regressor.py) and tested (Test_regressor.py) with several magnification levels that the user can choose (in this paper, 5x, 8x, 10x, 15x, 20x, 30x and 40x were used). Training the model with patches from a small, discrete set of scales can lead to regressors that are precise when estimating magnifications close to the input scales, and less precise when the scales are far from them. Therefore, a scale augmentation technique was applied to patches and labels during training (in addition to more standard augmentation techniques such as rotation, flipping and color augmentation).
To perform scale augmentation, the image is randomly cropped by a factor and resized to the original patch size; the same factor is applied to perturb the magnification label.

FIGURE 3 | An example of the grid extraction method. The patches in green are selected since they contain enough tissue.

The scale detector component also includes a module to import and use the model in code (regression.py). The component works as a standalone module (with the required parameters), but its functions can also be loaded from the Python module. The Supplementary Materials section includes a more thorough description of the parameters for the scripts.
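For illustration, a minimal sketch of the scale augmentation described above is given below. The crop-factor range and the function name are assumptions; the library's actual augmentation code may differ.

```python
import random
from PIL import Image

def scale_augment(patch: Image.Image, magnification: float,
                  max_zoom: float = 1.25):
    """Sketch of the scale augmentation described above (assumed details).

    Randomly crops the patch by a zoom factor, resizes it back to the
    original size, and perturbs the magnification label by the same factor,
    so the regressor sees a continuum of scales during training.
    """
    w, h = patch.size
    factor = random.uniform(1.0, max_zoom)
    new_w, new_h = int(w / factor), int(h / factor)
    left = random.randint(0, w - new_w)
    top = random.randint(0, h - new_h)
    cropped = patch.crop((left, top, left + new_w, top + new_h))
    augmented = cropped.resize((w, h), Image.BILINEAR)
    # Zooming in by `factor` makes the patch look like a higher magnification.
    return augmented, magnification * factor
```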
2.3 Multi-Scale CNN for Classification
The multi-scale CNN component includes scripts to train a multi-scale CNN for classification in a fully supervised fashion. Two different multi-scale CNN architectures and two training variants are proposed and compared with a single-scale CNN. The multi-scale CNN architectures are composed of multiple branches (one for each magnification level), trained with patches that come from several magnifications. Each branch is fed with patches from a specific magnification level. The first architecture combines the features of each CNN branch (the output of the convolutional layers). The scripts developed to train and test these models are Fully_supervised_training_combine_features_multi.py and Fully_supervised_testing_combine_features_multi.py. The second architecture combines the classifier predictions (the output of each CNN's fully-connected layers). The corresponding scripts are Fully_supervised_training_combine_probs_multi.py and Fully_supervised_testing_combine_probs_multi.py. Both architectures are presented in two variants, optimizing respectively one and multiple loss functions.

In the first variant (one loss function), the input is a set of tuples of patches from several magnification levels (one patch for each level), generated using the multi-center extraction tool (presented in Section 2.1). The input tuples are generated with a script (Generate_csv_multicenter.py) that exploits the coordinates of the patches (stored in the metadata) to generate the tuples (stored in a csv file). The tuple label corresponds to the class of the centroid patch (the patch from the highest level within the tuple). Therefore, the model outputs only the class of the tuple. Only one loss function is minimized in this variant, i.e. the categorical cross-entropy between the CNN output and the patch ground truth. Figure 5 summarizes the CNN architecture.

In the second variant (multiple loss functions), the input is a set of tuples of patches from several magnification levels (one patch for each level), previously generated using the grid extraction method (presented in Section 2.1).

FIGURE 4 | An example of the multi-center extraction method. The grid is made according to the highest magnification level selected by the user. The patch is the centroid for patches at lower magnification levels.

The input tuples are generated with a script (Generate_csv_upper.py) that exploits the coordinates of the patches (stored in the metadata) to generate the tuples (stored in a csv file). The tuple labels correspond to the classes of the patches. The model has n + 1 outputs: the class for each of the n magnification levels and the class of the whole tuple. In this variant, n + 1 loss functions are minimized (n being the number of magnification levels considered). The n loss functions are the categorical cross-entropy between the output of each scale branch and the tuple labels. The other loss term is the categorical cross-entropy between the output of the network (after combining the features or the predictions of the single branches) and the tuple labels. Figure 6 summarizes the CNN architecture. The Supplementary Materials section includes a more thorough description of the parameters.
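To make the two variants concrete, the following sketch outlines the feature-combination architecture with n + 1 outputs in PyTorch. The class name, the ResNet-18 backbone and the feature concatenation are illustrative assumptions; the actual scripts listed above define the library's real implementation.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class MultiScaleFeatureCNN(nn.Module):
    """Sketch of the feature-combination variant with n + 1 outputs."""

    def __init__(self, n_classes: int, n_scales: int = 2):
        super().__init__()
        # One backbone per magnification level; weights are not shared.
        self.branches = nn.ModuleList()
        self.branch_heads = nn.ModuleList()
        feat_dim = 512  # ResNet-18 feature size (assumed backbone)
        for _ in range(n_scales):
            backbone = models.resnet18(weights=None)  # torchvision >= 0.13
            backbone.fc = nn.Identity()               # keep the conv features
            self.branches.append(backbone)
            self.branch_heads.append(nn.Linear(feat_dim, n_classes))
        # Head applied to the concatenated multi-scale features.
        self.combined_head = nn.Linear(feat_dim * n_scales, n_classes)

    def forward(self, patches):
        # `patches` holds one batch of patches per magnification level.
        feats = [b(x) for b, x in zip(self.branches, patches)]
        branch_logits = [h(f) for h, f in zip(self.branch_heads, feats)]
        combined_logits = self.combined_head(torch.cat(feats, dim=1))
        return branch_logits, combined_logits

# n + 1 losses: one per scale branch plus one for the combined output.
def multi_loss(branch_logits, combined_logits, labels,
               criterion=nn.CrossEntropyLoss()):
    loss = criterion(combined_logits, labels)
    for logits in branch_logits:
        loss = loss + criterion(logits, labels)
    return loss
```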
2.4 Multi-Scale CNN for Segmentation
This component includes HookNet (van Rijthoven et al., 2020), a multi-scale CNN for semantic segmentation. HookNet combines information from low-resolution patches (large field of view) and high-resolution patches (small field of view) to semantically segment the image, using multiple branches. The low-resolution patches come from lower magnification levels and provide context information, while the high-resolution patches come from higher magnification levels and provide more fine-grained information. The network is composed of two branches of encoder-decoder models: the context branch (fed with low-resolution patches) and the target branch (fed with high-resolution patches). The two branches are fed with concentric multi-field-of-view multi-resolution (MFMR) patches (284 × 284 pixels in size). The branches share the same architecture (an encoder-decoder CNN based on U-Net) but do not share their weights. HookNet is thoroughly described in a dedicated article (van Rijthoven et al., 2020).
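For intuition, the sketch below reads a pair of concentric target/context patches of the kind consumed by a HookNet-style model. The helper is hypothetical (OpenSlide backend, pyramid level indices as stand-ins for magnifications) and is not part of the HookNet API.

```python
import openslide

def concentric_pair(slide_path, cx, cy, patch=284,
                    target_level=0, context_level=2):
    """Read concentric target/context patches centered on the same point.

    (cx, cy) are level-0 coordinates of the shared center; the 284-pixel
    size follows the description above, the levels are assumptions.
    """
    slide = openslide.OpenSlide(slide_path)

    def read(level):
        down = slide.level_downsamples[level]
        half = int(patch * down / 2)
        # read_region expects level-0 coordinates of the upper left corner.
        return slide.read_region((cx - half, cy - half), level,
                                 (patch, patch)).convert("RGB")

    return read(target_level), read(context_level)
```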
2.5 Datasets
The following datasets are used to develop the Multi_Scale_Tools components:
- Colon dataset, Sub-section 2.5.1, used for the pre-processing component, the scale detector and the multi-scale CNN for classification
- Breast dataset, Sub-section 2.5.2, used for the multi-scale CNN for segmentation
- Prostate dataset, Sub-section 2.5.3, used for the scale detector
- Lung dataset, Sub-section 2.5.4, used for the scale detector and the multi-scale CNN for segmentation
2.5.1 Colon Dataset
The colon dataset is a subset of the ExaMode colon dataset. This subset includes 148 WSIs (provided by the Department of Pathology of Cannizzaro Hospital, Catania, Italy), stained with Hematoxylin and Eosin (H&E).

FIGURE 5 | The first multi-scale CNN architecture, in which features from different scale branches are combined, optimizing only one loss function (A) or n + 1 loss functions (B).

The images are digitized with an Aperio scanner: some of the images are scanned with a maximum spatial resolution of 0.50 μm per pixel (20x), while the others are scanned with a spatial resolution of 0.25 μm per pixel (40x). The images are pixel-wise annotated by a pathologist. The annotations include five classes: cancer, high-grade dysplasia, low-grade dysplasia, hyperplastic polyp and non-informative tissue.
2.5.2 Breast Dataset
The breast dataset (provided by the Department of Pathology of Radboud University Medical Center, Nijmegen, Netherlands) is a private dataset including 86 WSIs, stained with H&E. The images are digitized with a 3DHistech scanner, with a spatial resolution of 0.25 μm per pixel (40x). The images are pixel-wise annotated by a pathologist. 6,279 regions are annotated, with the following classes: ductal carcinoma in situ (DCIS), invasive ductal carcinoma (IDC), invasive lobular carcinoma (ILC), benign epithelium (BE), other, and fat.
2.5.3 Prostate Dataset
The prostate dataset is a subset of the publicly available database offered by The Cancer Genome Atlas (TCGA-PRAD) and includes 20 WSIs, stained with H&E. The images come from several sources and are digitized with different scanners, with a spatial resolution of 0.25 μm per pixel (40x). The images come without pixel-wise annotations.
2.5.4 Lung Dataset
The lung dataset is a subset of the publicly available database offered by The Cancer Genome Atlas Lung Squamous Cell Carcinoma dataset (TCGA-LUSC) and includes 27 WSIs, stained with H&E. The images come from several sources and are digitized with different scanners, with a spatial resolution of 0.25 μm per pixel (40x). The images originally come without pixel-wise annotations, but a medical expert from Radboudumc pixel-wise annotated them with four classes: tertiary lymphoid structures (TLS), germinal centers (GC), tumor, and other.
3 EXPERIMENTS AND RESULTS
This section presents the assessment of the components of the Multi_Scale_Tools library in dedicated subsections as follows:
- Pre-processing component assessment, Sub-section 3.1
- Scale detector assessment, Sub-section 3.2
- Multi-scale CNN for classification assessment, Sub-section 3.3
- Multi-scale CNN for segmentation assessment, Sub-section 3.4
- Library organization, Sub-section 3.5

FIGURE 6 | The second multi-scale CNN architecture, in which predictions from different scale branches are combined, optimizing only one loss function (A) or n + 1 loss functions (B).
3.1 Pre-Processing Tool Assessment
The pre-processing component allows extracting a large number of patches from multiple magnification levels with scalable performance. The patch extractor components (grid and multi-center methods) are tested on WSIs scanned with Aperio (.svs), 3DHistech (.mrxs) and Hamamatsu (.ndpi) scanners, on data coming from different tissues (colon, prostate and lung) and datasets. Table 1 reports the number of patches extracted. The upper part of the table includes the number of patches extracted with the grid extraction method, considering four different magnification levels (5x, 10x, 20x, 40x). The lower part of the table includes the number of patches extracted with the multi-center extraction method, considering two possible combinations of magnification levels (5x/10x, 5x/10x/20x). In both cases, patches are extracted with a patch size of 224 × 224 pixels without any stride. The methods' performance is evaluated in terms of scalability, since the methods are designed to work on multi-core hardware. Table 2 reports the time results obtained with the grid method (upper part) and with the multi-center method (lower part). The evaluation considers the amount of time needed to extract the patches from the colon dataset, using several numbers of threads. The results show that both methods benefit from multi-core hardware, reducing the time needed to pre-process data.
3.2 Scale Detector Assessment
The scale detector shows high performance in estimating the magnification level of patches that come from different tissues. The detector is trained with patches from the colon dataset and tested with patches from three different tissues. The performance of the models is assessed with the coefficient of determination (R²), the mean squared error (MSE), Cohen's κ-score (McHugh, 2012) and the balanced accuracy. The experimental setup and the metric descriptions are presented in detail in the Supplementary Material; Table 3 summarizes the results. The highest performance is reached on the colon test partition, but the scale detector also performs well on the other tissues. The scale detector makes almost perfect scale estimations on the colon dataset (whose data come from the same medical source and include the same tissue type), in both the regression and the classification metrics. It also makes reasonably good scale estimations on the prostate data, in both the regression and the classification metrics, and on the lung dataset, where the performance is nevertheless the lowest. The fact that the regressor shows exceptionally high performance on colon data and good performance on other tissues means that it has learnt to distinguish colon morphology represented at different magnification levels very well, and that the learnt knowledge generalizes well to other tissues too. Even though tissues from different organs share similar structures (glands, stroma, etc.), the morphology of these structures differs across organs, as for prostate and colon glands. Training the regressor with patches from several organs may close this gap, guaranteeing very high performance for different types of tissue.
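These metrics can be reproduced with standard scikit-learn functions, as in the hedged sketch below; the mapping of regressor outputs to the closest magnification class is an assumption about the evaluation protocol, and the arrays are hypothetical.

```python
from sklearn.metrics import (r2_score, mean_squared_error,
                             balanced_accuracy_score, cohen_kappa_score)

# Hypothetical arrays: ground-truth and predicted magnification levels.
y_true = [5, 10, 20, 40, 10, 5]
y_pred = [5.2, 9.6, 21.0, 38.5, 10.3, 5.1]  # raw regressor outputs

# Regression metrics are computed on the raw outputs.
r2 = r2_score(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)

# Classification metrics require mapping each prediction to the closest
# magnification class among those used for training.
classes = [5, 8, 10, 15, 20, 30, 40]
y_pred_cls = [min(classes, key=lambda c: abs(c - p)) for p in y_pred]
bal_acc = balanced_accuracy_score(y_true, y_pred_cls)
kappa = cohen_kappa_score(y_true, y_pred_cls)
```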
TABLE 1 | Number of patches extracted with the grid extraction method (above) and with the multi-center method (below), at different magnification levels.

Grid
Dataset / Magnification       5x       10x      20x       40x         Total
Colon dataset (148 WSIs)      15,514   67,592   279,964   1,127,190   1,490,260
Prostate dataset (20 WSIs)    11,468   46,676   187,254   743,583     988,981
Lung dataset (27 WSIs)        22,124   90,307   365,398   886,298     1,364,127

Multi-center
Dataset / Magnification       5x/10x    5x/10x/20x   Total
Colon dataset                 135,184   839,892      975,076
Prostate dataset              93,352    561,762      655,114
Lung dataset                  180,614   1,096,194    1,276,808
TABLE 2 | Time needed to extract the patches (in seconds), varying the number of threads, using the grid extraction method (above) and the multi-center method (below). The methods are evaluated on the colon dataset (148 WSIs). The number of patches extracted by each method is reported in Table 1.

Magnification / N_threads   10            20            30            40            50
Grid extraction
5x                          408 ± 3       317 ± 7       285 ± 5       255 ± 5       238 ± 10
10x                         553 ± 5       429 ± 6       389 ± 5       384 ± 8       371 ± 8
20x                         1,295 ± 100   969 ± 113     876 ± 69      872 ± 41      869 ± 15
Multi-center extraction
5x/10x                      1,662 ± 30    1,180 ± 119   1,071 ± 50    1,039 ± 18    1,022 ± 14
5x/10x/20x                  6,604 ± 104   5,745 ± 45    5,283 ± 161   4,814 ± 137   4,549 ± 82
3.3 Multi-Scale CNN for Classification Assessment
The multi-scale CNNs show higher performance in fully supervised classification than the single-scale CNNs. Several configurations of the multi-scale CNN architectures are evaluated. They involve variations in the optimization strategy (one or multiple loss functions), in the magnification levels (combinations of 5x, 10x, 20x) and in how information from the scales is combined (combining the single-scale predictions or the single-scale features). Table 4 summarizes the results obtained. The CNNs are trained and tested with the colon dataset, which comes with pixel-wise annotations made by a pathologist. The performance of the models is assessed with Cohen's κ-score and the balanced accuracy. More detailed descriptions of the experimental setup and the metrics adopted are presented in the Supplementary Material. In the presented experiment, the best multi-scale CNN architecture is the one that combines features from the 5x/10x magnification levels and is trained optimizing n + 1 loss functions. It outperforms the best single-scale CNN (trained with patches acquired at 5x) in terms of balanced accuracy, while the κ-scores of the two architectures are comparable. The characteristics of the classes involved can explain why the CNNs trained combining patches from 5x/10x reach the highest results. These classes show morphologies involving several alterations of the gland structure. Glands can usually be identified at low magnification levels, such as 5x/10x, while at 20x the cells are visible. For this reason, the CNNs show high performance with patches from magnifications 5x/10x, while including patches from 20x decreases the performance. The fact that the discriminant characteristics are identified in a range of scales may explain why the combination of the features shows higher performance than the combination of the predictions.
3.4 Multi-Scale CNN for Segmentation Assessment
The multi-scale CNN (HookNet) shows higher tissue segmentation performance than single-scale CNNs (U-Net). The model is trained and tested with the breast and lung datasets, comparing it with models trained with images from a single magnification level. The performance of the models is assessed with the F1 score and the macro F1 score. More detailed descriptions of the experimental setup and the metrics adopted are presented in the Supplementary Material. Table 5 and Table 6 summarize the results obtained on the breast dataset and on the lung dataset, respectively. For both tissues, HookNet shows a higher overall performance, while some of the single-scale U-Nets perform better on some segmentation tasks (such as breast DCIS or lung TLS). This result can be interpreted as a consequence of the characteristics of the task; therefore, the user should choose the proper magnification levels to combine, depending on the problem.
3.5 Library Organization
The source code for the library is available on GitHub at https://github.com/sara-nl/multi-scale-tools (retrieved 11th of January, 2021), while the HookNet code is available at https://github.com/DIAGNijmegen/pathology-hooknet (retrieved 19th of June, 2021). The library can be deployed as a Python package directly from the repository or as a Docker container that can be downloaded from https://surfdrive.surf.nl/files/index.php/s/PBBnjwzwMragAGd (the multiscale folder). Interaction with the library is done through a model class and an Inference class (https://github.com/computationalpathologygroup/hooknet/blob/fcba7824ed982f663789f0c617a4ed65bedebb85/source/inference.py#L20). The model instantiation depends on the choice of algorithms. For a more detailed explanation of the hyperparameters and other options, please browse the Readme file (https://github.com/sara-nl/multi-scale-tools/blob/master/README.md). An example can be found at https://github.com/DIAGNijmegen/pathology-hooknet/blob/master/hooknet/apply.py. The Python libraries used to develop Multi_Scale_Tools are reported in the Supplementary Materials.
TABLE 3 | Performance of the scale detector, evaluated on three different tissue datasets. The scale detector is evaluated with: coefficient of determination (R²), mean squared error (MSE), balanced accuracy, Cohen's κ-score.

Dataset / Metric    R²                MSE               Balanced accuracy   κ-score
Colon dataset       0.9997 ± 0.0001   0.0250 ± 0.0155   0.9859 ± 0.0086     0.9991 ± 0.0004
Prostate dataset    0.8013 ± 0.0798   19.34 ± 7.78      0.9094 ± 0.0268     0.8515 ± 0.0589
Lung dataset        0.6682 ± 0.1549   32.13 ± 15.01     0.7973 ± 0.0458     0.8743 ± 0.0571
TABLE 4 | Performance of the multi-scale CNN architectures, compared with CNNs trained with patches from only one magnification level, evaluated with κ-score and balanced accuracy. Both multi-scale architectures (combining features and combining predictions from multi-scale branches) and both training variants (one loss function and n + 1 losses) are presented. Bold values highlight the method reaching the highest performance for each metric.

Magnification / Metric        κ-score           Balanced accuracy
Single-scale CNNs
5x                            0.7127 ± 0.0988   0.6558 ± 0.0903
10x                           0.6818 ± 0.0940   0.6200 ± 0.0780
20x                           0.6005 ± 0.1106   0.5744 ± 0.0804
Multi-scale CNNs (combine features)
5x/10x (one loss)             0.6955 ± 0.1013   0.6529 ± 0.0859
5x/10x (n + 1 losses)         0.7167 ± 0.1060   0.6813 ± 0.0942
5x/10x/20x (one loss)         0.6630 ± 0.1090   0.6508 ± 0.1089
5x/10x/20x (n + 1 losses)     0.6871 ± 0.1110   0.6364 ± 0.1046
Multi-scale CNNs (combine probabilities)
5x/10x (one loss)             0.6901 ± 0.1136   0.6582 ± 0.0973
5x/10x (n + 1 losses)         0.7026 ± 0.0988   0.6626 ± 0.0897
5x/10x/20x (one loss)         0.6678 ± 0.0973   0.6239 ± 0.0860
5x/10x/20x (n + 1 losses)     0.6784 ± 0.0995   0.6355 ± 0.0835
4 DISCUSSION AND CONCLUSION
The Multi_Scale_Tools library aims at facilitating the exploitation of the multi-scale structure of WSIs with code that is easy to use and easy to extend with additional functions. The library currently includes four components: a pre-processing tool to extract multi-scale patches, a scale detector, two multi-scale CNNs for classification and a multi-scale CNN for segmentation. The pre-processing component includes two methods to extract patches from several magnification levels; the methods are designed to be scalable on multi-core hardware. The scale detector component includes a CNN that regresses the magnification level of a patch. The CNN obtains high performance on patches that come from the colon (the tissue used to train it) and reaches good performance on other tissues, such as prostate and lung, too. Two multi-scale CNN architectures are developed for fully-supervised classification: the first one combines features from multi-scale branches, while the second one combines predictions from multi-scale branches. The first architecture obtains better performance and outperforms the model trained with patches from only one magnification level. The HookNet architecture for multi-scale segmentation is also included in the library, fostering its usage and making the library more complete. The tests show that HookNet outperforms single-scale U-Nets in the considered tasks.

The presented library allows exploiting the multi-scale structure of WSIs efficiently. In any case, the user remains a fundamental part of the system for several components, for instance when identifying the scales that are most relevant for a specific problem. The comparison between the single-scale CNNs and the multi-scale CNN is an example of this. The CNN is trained to classify between cancer, dysplasia (both high-grade and low-grade), hyperplastic polyp and non-informative tissue. In this classification task, the highest performance is reached using patches at magnifications 5x and 10x, while patches from 20x lead to lower classification performance. This is likely related to the fact that the main feature of the considered classes is the structure of the glands; therefore, high magnifications (e.g. 20x) introduce only limited helpful information into the models. The importance of the user in selecting the proper magnification levels is highlighted even more by the segmentation results. At low magnifications, the models show good performance in ductal carcinoma in situ and invasive ductal carcinoma segmentation, since these tasks need context about the duct structures in the breast use case. At higher magnifications, the models perform well in invasive lobular carcinoma and benign tissue segmentation, where the details are more important.

The methods developed to pair images from several magnification levels can also pave the way to multi-modal combinations of images. Such combinations may increase the information included in the single modality, increasing the performance of the CNNs. Possible applications include the combination of WSIs stained with different reagents, such as H&E and immunohistochemical (IHC) stainings, the application to Raman spectroscopy data (combining information about tissue morphology and architecture with protein biomarkers), and the combination of patches from different focal planes.
DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.
TABLE 5 | Performance of U-Net (above) and HookNet (below) on the breast dataset. The architectures are compared on the F1 score for each tissue type (tissue type descriptions in the Supplementary Material). The overall macro F1 score is also reported. Bold values highlight the method reaching the highest performance for each task.

Magnification                    DCIS   IDC    ILC    Benign   Other   Fat    Overall
Model: U-Net
20x                              0.47   0.55   0.85   0.75     0.95    0.99   0.76
10x                              0.67   0.69   0.79   0.87     0.98    1.00   0.83
5x                               0.79   0.83   0.79   0.84     0.98    1.00   0.87
2.5x                             0.83   0.85   0.63   0.73     0.96    1.00   0.83
1.25x                            0.86   0.81   0.20   0.66     0.96    1.00   0.75
Model: HookNet
20x (target) - 5x (context)      0.62   0.75   0.82   0.82     0.98    1.00   0.83
20x (target) - 1.25x (context)   0.84   0.89   0.91   0.84     0.98    1.00   0.91

TABLE 6 | Performance of U-Net (above) and HookNet (below) on the lung dataset. The architectures are compared on the F1 score for each tissue type (tissue type descriptions in the Supplementary Material). The overall macro F1 score is also reported. Bold values highlight the method reaching the highest performance for each task.

Magnification                 TLS    GC     Tumor   Other   Overall
Model: U-Net
20x                           0.81   0.28   0.75    0.87    0.70
10x                           0.86   0.44   0.71    0.86    0.72
5x                            0.84   0.49   0.67    0.85    0.71
2.5x                          0.80   0.37   0.56    0.80    0.63
1.25x                         0.78   0.35   0.39    0.77    0.57
Model: HookNet
20x (target) - 5x (context)   0.84   0.48   0.72    0.87    0.73

AUTHOR CONTRIBUTIONS
NM: design of the work, software, analysis, original draft. SO: design of the work, revised the work. DP: software, revised the work. MR: software, analysis. JL: revised the work, approval for publication. FC: revised the work, approval for publication. HM: revised the work, approval for publication. MA: revised the work, approval for publication.
FUNDING
This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 825292 (ExaMode, http://www.examode.eu/).
ACKNOWLEDGMENTS
The authors also thank Nvidia for the Titan Xp GPUs used for some of the weakly supervised experiments. SO thanks the Colombian science ministry Minciencias for partially funding his Ph.D. studies through the call 756, Doctorados en el exterior.
SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fcomp.2021.684521/full#supplementary-material
REFERENCES
Aeffner, F., Wilson, K., Martin, N. T., Black, J. C., Hendriks, C. L. L., Bolon, B., et al. (2017). The Gold Standard Paradox in Digital Image Analysis: Manual versus Automated Scoring as Ground Truth. Arch. Pathol. Lab. Med. 141, 1267–1275. doi:10.5858/arpa.2016-0386-ra
Arvaniti, E., and Claassen, M. (2018). Coupling Weak and Strong Supervision for Classification of Prostate Cancer Histopathology Images. Medical Imaging Meets NIPS Workshop, NIPS 2018. arXiv preprint arXiv:1811.07013.
Bozkurt, A., Kose, K., Alessi-Fox, C., Gill, M., Dy, J., Brooks, D., and Rajadhyaksha, M. (2018). A Multiresolution Convolutional Neural Network with Partial Label Training for Annotating Reflectance Confocal Microscopy Images of Skin. In International Conference on Medical Image Computing and Computer-Assisted Intervention, Granada, Spain, 16–20 September 2018 (Springer), 292–299. doi:10.1007/978-3-030-00934-2_33
Campanella, G., Hanna, M. G., Geneslaw, L., Miraflor, A., Werneck Krauss Silva, V., Busam, K. J., et al. (2019). Clinical-grade Computational Pathology Using Weakly Supervised Deep Learning on Whole Slide Images. Nat. Med. 25, 1301–1309. doi:10.1038/s41591-019-0508-1
Coudray, N., Ocampo, P. S., Sakellaropoulos, T., Narula, N., Snuderl, M., Fenyö, D., et al. (2018). Classification and Mutation Prediction from Non-small Cell Lung Cancer Histopathology Images Using Deep Learning. Nat. Med. 24, 1559–1567. doi:10.1038/s41591-018-0177-5
del Toro, O. J., Atzori, M., Otálora, S., Andersson, M., Eurén, K., Hedlund, M., et al. (2017). "Convolutional Neural Networks for an Automatic Classification of Prostate Tissue Slides with High-Grade Gleason Score," in Medical Imaging 2017: Digital Pathology (Bellingham, WA: International Society for Optics and Photonics), 10140, 101400O. doi:10.1117/12.2255710
Gecer, B., Aksoy, S., Mercan, E., Shapiro, L. G., Weaver, D. L., and Elmore, J. G. (2018). Detection and Classification of Cancer in Whole Slide Breast Histopathology Images Using Deep Convolutional Networks. Pattern Recognition 84, 345–356. doi:10.1016/j.patcog.2018.07.022
Han, D., Kim, J., and Kim, J. (2017). Deep Pyramidal Residual Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, 21–26 July 2017 (IEEE), 5927–5935. doi:10.1109/cvpr.2017.668
Hashimoto, N., Fukushima, D., Koga, R., Takagi, Y., Ko, K., Kohno, K., et al. (2020). Multi-scale Domain-Adversarial Multiple-Instance CNN for Cancer Subtype Classification with Unannotated Histopathological Images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 14–19 June 2020 (IEEE), 3852–3861. doi:10.1109/cvpr42600.2020.00391
Iizuka, O., Kanavati, F., Kato, K., Rambeau, M., Arihiro, K., and Tsuneki, M. (2020). Deep Learning Models for Histopathological Classification of Gastric and Colonic Epithelial Tumours. Sci. Rep. 10, 1504–1511. doi:10.1038/s41598-020-58467-9
Jain, M. S., and Massoud, T. F. (2020). Predicting Tumour Mutational Burden from Histopathological Images Using Multiscale Deep Learning. Nat. Mach. Intell. 2, 356–362. doi:10.1038/s42256-020-0190-5
Janowczyk, A., and Madabhushi, A. (2016). Deep Learning for Digital Pathology Image Analysis: A Comprehensive Tutorial with Selected Use Cases. J. Pathol. Inform. 7, 29. doi:10.4103/2153-3539.186902
Janowczyk, A., Zuo, R., Gilmore, H., Feldman, M., and Madabhushi, A. (2019). HistoQC: An Open-Source Quality Control Tool for Digital Pathology Slides. JCO Clin. Cancer Inform. 3, 1–7. doi:10.1200/cci.18.00157
Jimenez-del-Toro, O., Otálora, S., Andersson, M., Eurén, K., Hedlund, M., Rousson, M., et al. (2017). "Analysis of Histopathology Images," in Biomedical Texture Analysis (Cambridge, MA: Elsevier), 281–314. doi:10.1016/b978-0-12-812133-7.00010-7
Komura, D., and Ishikawa, S. (2018). Machine Learning Methods for Histopathological Image Analysis. Comput. Struct. Biotechnol. J. 16, 34–42. doi:10.1016/j.csbj.2018.01.001
Lai, Z., and Deng, H. (2017). Multiscale High-Level Feature Fusion for Histopathological Image Classification. Comput. Math. Methods Med. 2017, 7521846. doi:10.1155/2017/7521846
Lenc, K., and Vedaldi, A. (2015). Understanding Image Representations by Measuring Their Equivariance and Equivalence. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 7–12 June 2015 (IEEE), 991–999. doi:10.1109/cvpr.2015.7298701
Li, J., Sarma, K. V., Chung Ho, K., Gertych, A., Knudsen, B. S., and Arnold, C. W. (2017). A Multi-Scale U-Net for Semantic Segmentation of Histological Images from Radical Prostatectomies. In AMIA Annual Symposium Proceedings, Washington, DC, 4–8 November 2017 (American Medical Informatics Association), vol. 2017, 1140–1148.
Li, S., Liu, Y., Sui, X., Chen, C., Tjio, G., Ting, D. S. W., and Goh, R. S. M. (2019). Multi-instance Multi-Scale CNN for Medical Image Classification. In International Conference on Medical Image Computing and Computer-Assisted Intervention, Shenzhen, China, 13–17 October 2019 (Springer), 531–539. doi:10.1007/978-3-030-32251-9_58
Lu, M. Y., Williamson, D. F., Chen, T. Y., Chen, R. J., Barbieri, M., and Mahmood, F. (2021). Data-efficient and Weakly Supervised Computational Pathology on Whole-Slide Images. Nat. Biomed. Eng., 1–16. doi:10.1038/s41551-020-00682-w
Marcos, D., Kellenberger, B., Lobry, S., and Tuia, D. (2018). Scale Equivariance in CNNs with Vector Fields. ICML/FAIM 2018 Workshop on Towards Learning with Limited Labels: Equivariance, Invariance, and Beyond (oral presentation). arXiv preprint arXiv:1807.11783.
McHugh, M. L. (2012). Interrater Reliability: The Kappa Statistic. Biochem. Med. 22, 276–282. doi:10.11613/bm.2012.031
Molin, J., Bodén, A., Treanor, D., Fjeld, M., and Lundström, C. (2016). Scale Stain: Multi-Resolution Feature Enhancement in Pathology Visualization. arXiv preprint arXiv:1610.04141.
Naylor, P., Laé, M., Reyal, F., and Walter, T. (2018). Segmentation of Nuclei in Histopathology Images by Deep Regression of the Distance Map. IEEE Trans. Med. Imaging 38, 448–459. doi:10.1109/TMI.2018.2865709
Naylor, P., Laé, M., Reyal, F., and Walter, T. (2017). Nuclei Segmentation in Histopathology Images Using Deep Neural Networks. In 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017), Melbourne, Australia, 18–21 April 2017 (IEEE), 933–936. doi:10.1109/isbi.2017.7950669
Otálora, S., Atzori, M., Andrearczyk, V., and Müller, H. (2018a). "Image Magnification Regression Using DenseNet for Exploiting Histopathology Open Access Content," in Computational Pathology and Ophthalmic Medical Image Analysis (New York, USA: Springer), 148–155. doi:10.1007/978-3-030-00949-6_18
Otálora, S., Perdomo, O., Atzori, M., Andersson, M., Jacobsson, L., Hedlund, M., et al. (2018b). Determining the Scale of Image Patches Using a Deep Learning Approach. In 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), Washington, DC, 4–7 April 2018 (IEEE), 843–846. doi:10.1109/isbi.2018.8363703
Paramanandam, M., O'Byrne, M., Ghosh, B., Mammen, J. J., Manipadam, M. T., Thamburaj, R., et al. (2016). Automated Segmentation of Nuclei in Breast Cancer Histopathology Images. PLoS ONE 11, e0162053. doi:10.1371/journal.pone.0162053
Ren, J., Hacihaliloglu, I., Singer, E. A., Foran, D. J., and Qi, X. (2018). Adversarial Domain Adaptation for Classification of Prostate Histopathology Whole-Slide Images. In International Conference on Medical Image Computing and Computer-Assisted Intervention, Granada, Spain, 16–20 September 2018 (Springer), 201–209. doi:10.1007/978-3-030-00934-2_23
Ronneberger, O., Fischer, P., and Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015 (Springer), 234–241. doi:10.1007/978-3-319-24574-4_28
Rorke, L. B. (1997). Pathologic Diagnosis as the Gold Standard. Cancer 79, 665–667. doi:10.1002/(sici)1097-0142(19970215)79:4<665::aid-cncr1>3.0.co;2-d
Roy, K., Banik, D., Bhattacharjee, D., and Nasipuri, M. (2019). Patch-based System for Classification of Breast Histology Images Using Deep Learning. Comput. Med. Imaging Graphics 71, 90–103. doi:10.1016/j.compmedimag.2018.11.003
Salvi, M., and Molinari, F. (2018). Multi-tissue and Multi-Scale Approach for Nuclei Segmentation in H&E Stained Images. Biomed. Eng. Online 17, 89. doi:10.1186/s12938-018-0518-0
Schmitz, R., Madesta, F., Nielsen, M., Krause, J., Werner, R., and Rösch, T. (2019). Multi-scale Fully Convolutional Neural Networks for Histopathology Image Segmentation: From Nuclear Aberrations to the Global Tissue Architecture. Medical Image Analysis 70, 101996.
Sellaro, T. L., Filkins, R., Hoffman, C., Fine, J. L., Ho, J., Parwani, A. V., et al. (2013). Relationship between Magnification and Resolution in Digital Pathology Systems. J. Pathol. Inform. 4, 21. doi:10.4103/2153-3539.116866
Tensmeyer, C., and Martinez, T. (2016). Improving Invariance and Equivariance Properties of Convolutional Neural Networks. ICLR 2017 Conference.
Ullah, I. (2017). A Pyramidal Approach for Designing Deep Neural Network Architectures. PhD thesis. Available at: https://air.unimi.it/handle/2434/466758#.YQEi7FMzYWo
van der Laak, J., Litjens, G., and Ciompi, F. (2021). Deep Learning in Histopathology: The Path to the Clinic. Nat. Med. 27, 775–784. doi:10.1038/s41591-021-01343-4
van Rijthoven, M., Balkenhol, M., Siliņa, K., van der Laak, J., and Ciompi, F. (2020). HookNet: Multi-Resolution Convolutional Neural Networks for Semantic Segmentation in Histopathology Whole-Slide Images. Med. Image Anal. 68, 101890. doi:10.1016/j.media.2020.101890
Wang, S., Yang, D. M., Rong, R., Zhan, X., and Xiao, G. (2019). Pathology Image Analysis Using Segmentation Deep Learning Algorithms. Am. J. Pathol. 189, 1686–1698. doi:10.1016/j.ajpath.2019.05.007
Wu, Q., Merchant, F., and Castleman, K. (2010). Microscope Image Processing. New York, USA: Elsevier.
Yang, Z., Ran, L., Zhang, S., Xia, Y., and Zhang, Y. (2019). EMS-Net: Ensemble of Multiscale Convolutional Neural Networks for Classification of Breast Cancer Histology Images. Neurocomputing 366, 46–53. doi:10.1016/j.neucom.2019.07.080
Zaveri, M., Hemati, S., Shah, S., Damskinos, S., and Tizhoosh, H. (2020). Kimia-5Mag: A Dataset for Learning the Magnification in Histopathology Images. In 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI), 9–11 November 2020 (IEEE), 363–367.
Zeng, N., Li, H., and Peng, Y. (2021a). A New Deep Belief Network-Based Multi-Task Learning for Diagnosis of Alzheimer's Disease. Neural Comput. Appl., 1–12. doi:10.1007/s00521-021-06149-6
Zeng, N., Li, H., Wang, Z., Liu, W., Liu, S., Alsaadi, F. E., et al. (2021b). Deep-reinforcement-learning-based Images Segmentation for Quantitative Analysis of Gold Immunochromatographic Strip. Neurocomputing 425, 173–180. doi:10.1016/j.neucom.2020.04.001
Zeng, N., Wang, Z., Zhang, H., Kim, K.-E., Li, Y., and Liu, X. (2019). An Improved Particle Filter with a Novel Hybrid Proposal Distribution for Quantitative Analysis of Gold Immunochromatographic Strips. IEEE Trans. Nanotechnology 18, 819–829. doi:10.1109/tnano.2019.2932271
Zhang, Q., Heldermon, C. D., and Toler-Franklin, C. (2020). Multiscale Detection of Cancerous Tissue in High Resolution Slide Scans. In International Symposium on Visual Computing, San Diego, USA, 5–7 October 2020 (Springer), 139–153. doi:10.1007/978-3-030-64559-5_11
Zhu, W., Qiu, Q., Calderbank, R., Sapiro, G., and Cheng, X. (2019). Scale-equivariant Neural Networks with Decomposed Convolutional Filters. arXiv preprint arXiv:1909.11193.
Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Copyright © 2021 Marini, Otálora, Podareanu, van Rijthoven, van der Laak,
Ciompi, Müller and Atzori. This is an open-access article distributed under the
terms of the Creative Commons Attribution License (CC BY). The use, distribution
or reproduction in other forums is permitted, provided the original author(s) and the
copyright owner(s) are credited and that the original publication in this journal is
cited, in accordance with accepted academic practice. No use, distribution or
reproduction is permitted which does not comply with these terms.