Conference Paper

Deep Learning Versus Classic Methods for Multi-taxon Diatom Segmentation


Abstract

Diatom identification is a crucial process for estimating water quality, which is essential in biological studies. This process can be automated with machine learning algorithms. For this purpose, a dataset with 10 common taxa was collected, with annotations provided by an expert diatomist. In this work, a comparison of classical state-of-the-art general-purpose methods with two different deep learning approaches is carried out. The classical methods are based on the Viola-Jones and scale- and curvature-invariant ridge object detectors. The deep learning based methods are Semantic Segmentation and YOLO. This is the first time that Viola-Jones and Semantic Segmentation techniques have been applied and compared for diatom segmentation in microscopic images containing several taxon shells. While all methods provide relatively good results for specific species, the deep learning approaches are consistently better in terms of sensitivity and specificity (up to 0.99 for some taxa) and precision (up to 0.86).
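The sensitivity, specificity and precision figures quoted above come from per-taxon confusion counts. A minimal sketch of the standard definitions follows; the function name and the example counts are illustrative assumptions, since the paper does not publish its evaluation code:

```python
def per_taxon_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and precision for one taxon,
    computed from true/false positive/negative detection counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true-positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true-negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0    # fraction of detections that are correct
    return sensitivity, specificity, precision

# Hypothetical counts for one taxon: 90 correct detections,
# 15 false alarms, 10 misses, 990 correct rejections.
sens, spec, prec = per_taxon_metrics(tp=90, fp=15, tn=990, fn=10)
```

Note that precision depends only on detections (tp, fp), which is why a method can reach near-perfect sensitivity and specificity while its precision stays lower.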


Article
While deep learning is a valuable technology for automatic detection systems for medical data and images, the biofouling community still lacks an analytical tool for the detection and counting of diatoms on samples after short-term field exposure. In this work, a fully convolutional neural network was implemented as a fast and simple approach to detect diatoms on two-channel (fluorescence and phase contrast) microscopy images by predicting bounding boxes. The developed approach performs well with only a small number of trainable parameters and an F1 score of 0.82. Diatom counting was evaluated on a dataset of 600 microscopy images of three different surface chemistries (hydrophilic and hydrophobic); the counts are very similar to those obtained by humans, while demanding only a fraction of the analysis time.
Chapter
Full-text available
The quantity of certain types of diatoms is used for determining water quality. Currently, precise identification of the species present in a water sample is conducted by diatomists. However, the differing criteria of diatomists, together with the variety of sizes and shapes diatoms may take in samples, makes identification difficult, and identification is required before diatoms can be classified into the genera to which they belong. Additionally, the chemical processes applied to eliminate unwanted elements in water samples (debris, flocs, etc.) are insufficient, so diatoms have to be differentiated from those structures before being classified into a genus. In fact, researchers have a special interest in looking for ways to perform automated identification and classification of diatoms. Despite its applications, automatic identification of diatoms is highly difficult due to the presence of unwanted elements in water samples. After diatoms have been identified, their classification into genera is an additional problem.
Article
Full-text available
Phytoplankton such as diatoms or desmids are useful for monitoring water quality. Manual image analysis is impractical due to the huge diversity of this group of microalgae and its great morphological plasticity, hence the importance of automating the analysis procedure. High-resolution images of phytoplankton cells can now be acquired by digital microscopes, which facilitates automating the analysis and identification of specimens. Therefore, new image analysis systems are potentially advantageous compared to manual counting methods. Segmentation is an important step in the analysis of phytoplankton images. Many standard techniques, like thresholding and edge detection, are employed in the segmentation of diatoms and other phytoplankton in microscopy images. However, in general, they require several parameters to be fixed beforehand by the user in order to get the best results, usually by comparing results and searching for the best parameters. To automate this process, we propose an automatic tuning method that finds the optimal parameters in an iterative procedure, called Parametric Segmentation Tuning (PST). This technique compares successive segmentation results, choosing the ones with maximal similarity. In this paper, tuning is formulated as an optimization problem using a similarity function within the solution space. This space consists of the set of binary images generated by the segmentation technique to be tuned, where these binary images are seen as a function of the original images and the segmentation parameters. The PST technique was tested with two of the most popular techniques employed to segment phytoplankton images: Canny edge detection and a binarisation method. The results of the thresholding technique were validated by comparing them to those of the Otsu method and the Canny method against a ground truth. They show that PST is effective in finding the best parameters.
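The core idea of PST, preferring the parameter whose segmentation is most similar to that of its neighbouring parameter value, can be sketched for a simple grey-level thresholding segmenter. This is an illustrative reconstruction, not the authors' code; the helper names and the choice of Jaccard similarity are assumptions:

```python
def binarise(pixels, t):
    """Trivial segmenter: threshold a flat list of grey levels at t."""
    return [p > t for p in pixels]

def jaccard(a, b):
    """Jaccard similarity between two binary masks (as flat boolean lists)."""
    inter = sum(x and y for x, y in zip(a, b))
    union = sum(x or y for x, y in zip(a, b))
    return inter / union if union else 1.0

def pst_threshold(pixels, thresholds):
    """Compare segmentations at successive parameter values and return the
    threshold whose result is most similar to its successor's (a stability
    criterion in the spirit of PST)."""
    masks = [binarise(pixels, t) for t in thresholds]
    sims = [jaccard(m1, m2) for m1, m2 in zip(masks, masks[1:])]
    best = max(range(len(sims)), key=sims.__getitem__)
    return thresholds[best]

# Toy image: dark background around 0.1, bright object around 0.9.
t = pst_threshold([0.1, 0.1, 0.2, 0.8, 0.9, 0.9], [0.3, 0.5, 0.7])
```

On this toy input every candidate threshold separates object from background identically, so the segmentation is stable across the whole range and the first candidate is returned.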
Article
Full-text available
We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network and a corresponding decoder network followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to the 13 convolutional layers in the VGG16 network [1]. The role of the decoder network is to map the low resolution encoder feature maps to full input resolution feature maps for pixel-wise classification. The novelty of SegNet lies in the manner in which the decoder upsamples its lower resolution input feature map(s). Specifically, the decoder uses pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear upsampling. This eliminates the need for learning to upsample. The upsampled maps are sparse and are then convolved with trainable filters to produce dense feature maps. We compare our proposed architecture with the widely adopted FCN [2] and also with the well known DeepLab-LargeFOV [3] and DeconvNet [4] architectures. This comparison reveals the memory versus accuracy trade-off involved in achieving good segmentation performance. SegNet was primarily motivated by scene understanding applications. Hence, it is designed to be efficient both in terms of memory and computational time during inference. It is also significantly smaller in the number of trainable parameters than other competing architectures and can be trained end-to-end using stochastic gradient descent. We also performed a controlled benchmark of SegNet and other architectures on both road scenes and SUN RGB-D indoor scene segmentation tasks. These quantitative assessments show that SegNet provides good performance with competitive inference time and the most efficient inference memory-wise compared to other architectures. We also provide a Caffe implementation of SegNet and a web demo at http://mi.eng.cam.ac.uk/projects/segnet/.
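The decoder's index-based upsampling described above can be illustrated with a small single-channel NumPy sketch (stride-2 windows; the function names are ours, not from the SegNet code base, and real implementations operate on batched multi-channel tensors):

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling that also records the argmax position in each window,
    as the SegNet encoder does."""
    h, w = x.shape
    pooled = np.zeros((h // 2, w // 2))
    indices = np.zeros((h // 2, w // 2), dtype=int)  # flat index within each 2x2 window
    for i in range(h // 2):
        for j in range(w // 2):
            win = x[2 * i:2 * i + 2, 2 * j:2 * j + 2]
            k = int(np.argmax(win))
            indices[i, j] = k
            pooled[i, j] = win.flat[k]
    return pooled, indices

def max_unpool_2x2(pooled, indices):
    """SegNet-style non-linear upsampling: route each value back to the
    position its encoder max came from; all other positions stay zero,
    which is why the upsampled maps are sparse."""
    h, w = pooled.shape
    out = np.zeros((2 * h, 2 * w))
    for i in range(h):
        for j in range(w):
            k = indices[i, j]
            out[2 * i + k // 2, 2 * j + k % 2] = pooled[i, j]
    return out
```

Because the indices are stored rather than learned, no parameters are spent on upsampling; the trainable decoder filters then densify the sparse maps.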
Article
Full-text available
Background: Plankton, including phytoplankton and zooplankton, are the main source of food for organisms in the ocean and form the base of the marine food chain. As fundamental components of marine ecosystems, plankton are very sensitive to environmental changes, and the study of plankton abundance and distribution is crucial in order to understand environmental change and protect marine ecosystems. This study was carried out to develop a widely applicable plankton classification system with high accuracy for the increasing number of imaging devices. The literature shows that most plankton image classification systems have been limited to a single imaging device and a relatively narrow taxonomic scope, and a truly practical system for automatic plankton classification is still non-existent; this study is partly intended to fill that gap. Results: Motivated by the analysis of the literature and the development of the technology, we focused on the requirements of practical application and propose an automatic system for plankton image classification combining multiple-view features via multiple kernel learning (MKL). On the one hand, in order to describe the biomorphic characteristics of plankton more completely and comprehensively, we combined general features with robust features, in particular features such as Inner-Distance Shape Context for morphological representation. On the other hand, we divided all the features into different types from multiple views and fed them to multiple classifiers instead of only one, optimally combining the different kernel matrices computed from the different types of features via multiple kernel learning. Moreover, we applied a feature selection method to choose optimal feature subsets from redundant features so as to accommodate different datasets from different imaging devices. We implemented our proposed classification system on three different datasets covering more than 20 categories from phytoplankton to zooplankton. The experimental results validated that our system outperforms state-of-the-art plankton image classification systems in terms of accuracy and robustness. Conclusions: This study demonstrated an automatic plankton image classification system combining multiple-view features using multiple kernel learning. The results indicated that multiple-view features combined by NLMKL using three kernel functions (linear, polynomial and Gaussian) describe and exploit feature information better, thereby achieving higher classification accuracy.
Article
Full-text available
This paper deals with automatic taxa identification based on machine learning methods. The aim is therefore to automatically classify diatoms, in terms of pattern recognition terminology. Diatoms are a kind of algae microorganism with high biodiversity at the species level, which are useful for water quality assessment. The most relevant features for diatom description and classification have been selected using an extensive dataset of 80 taxa with a minimum of 100 samples/taxon augmented to 300 samples/taxon. In addition to published morphological, statistical and textural descriptors, a new textural descriptor, Local Binary Patterns (LBP), to characterize the diatom’s valves, and a log Gabor implementation not tested before for this purpose are introduced in this paper. Results show an overall accuracy of 98.11% using bagging decision trees and combinations of descriptors. Finally, some phycological features of diatoms that are still difficult to integrate in computer systems are discussed for future work.
Article
Full-text available
Diatoms, a kind of algal microorganism with many species, are quite useful for water quality determination, one of the hottest topics in applied biology nowadays. At the same time, deep learning and convolutional neural networks (CNNs) are becoming an extensively used technique for image classification in a variety of problems. This paper approaches diatom classification with this technique, in order to determine whether it is suitable for solving the classification problem. An extensive dataset was specifically collected for this study (80 types, 100 samples/type). The dataset covers different illumination conditions and was computationally augmented to more than 160,000 samples. After that, CNNs were applied over datasets pre-processed with different image processing techniques. An overall accuracy of 99% is obtained for the 80-class problem and for different kinds of images (brightfield, normalized). Results were compared to previously presented classification techniques with different numbers of samples. As far as the authors know, this is the first time that CNNs have been applied to diatom classification.
Article
Chaetoceros is a dominant genus of marine planktonic diatoms with worldwide distribution. Due to the difficulty of extracting setae from Chaetoceros images, automatic segmentation of Chaetoceros is still a challenging task. In this paper, we address this difficult task by treating the whole segmentation process as unsupervised pixel-wise classification without human participation. First, we automatically produce positive (object) and negative (background) samples for subsequent training by combining the advantages of two image processing algorithms: the Grayscale Surface Direction Angle Model (GSDAM) for extracting setae information and Canny for detecting cell edges from low-contrast, noisy microscopic images. Second, we develop pixel-wise training by using the produced samples in the training process of a Deep Convolutional Neural Network (DCNN). Finally, the trained DCNN is used to label the remaining pixels as object or background for the final segmentation. We compare our method with eight mainstream segmentation approaches: Otsu's thresholding, Canny, Watershed, Mean Shift, gPb-owt-ucm, Normalized Cut, the Efficient Graph-based method and GSDAM. To objectively evaluate segmentation results, we apply six well-known evaluation indexes. Experimental results on a new Chaetoceros image dataset with human-labelled ground truth show that our method outperforms the eight mainstream segmentation methods in terms of both quantitative and qualitative evaluation.
We study the question of feature sets for robust visual object recognition, adopting linear SVM based human detection as a test case. After reviewing existing edge and gradient based descriptors, we show experimentally that grids of Histograms of Oriented Gradient (HOG) descriptors significantly outperform existing feature sets for human detection. We study the influence of each stage of the computation on performance, concluding that fine-scale gradients, fine orientation binning, relatively coarse spatial binning, and high-quality local contrast normalization in overlapping descriptor blocks are all important for good results. The new approach gives near-perfect separation on the original MIT pedestrian database, so we introduce a more challenging dataset containing over 1800 annotated human images with a large range of pose variations and backgrounds.
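The first stage of the HOG pipeline described above, a per-cell histogram of gradient orientations weighted by gradient magnitude, can be sketched as follows. This is a simplified single-cell version without the block-level contrast normalization the abstract emphasizes; the function name and the centred-derivative choice are illustrative assumptions:

```python
import numpy as np

def hog_cell_histogram(cell, n_bins=9):
    """Unsigned-orientation histogram for one cell: each pixel votes into an
    orientation bin with its gradient magnitude (a simplified HOG building block)."""
    gx = np.zeros_like(cell)
    gy = np.zeros_like(cell)
    gx[:, 1:-1] = cell[:, 2:] - cell[:, :-2]  # centred [-1, 0, 1] derivative in x
    gy[1:-1, :] = cell[2:, :] - cell[:-2, :]  # centred derivative in y
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned orientation in [0, 180)
    hist = np.zeros(n_bins)
    bin_width = 180.0 / n_bins
    for m, a in zip(mag.ravel(), ang.ravel()):
        hist[int(a // bin_width) % n_bins] += m  # magnitude-weighted vote
    return hist
```

A vertical edge produces purely horizontal gradients, so all of its magnitude lands in the 0-degree bin; the full descriptor then normalizes such histograms over overlapping blocks, which the paper identifies as important for good results.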
Conference Paper
Can a large convolutional neural network trained for whole-image classification on ImageNet be coaxed into detecting objects in PASCAL? We show that the answer is yes, and that the resulting system is simple, scalable, and boosts mean average precision, relative to the venerable deformable part model, by more than 40% (achieving a final mAP of 48% on VOC 2007). Our framework combines powerful computer vision techniques for generating bottom-up region proposals with recent advances in learning high-capacity convolutional neural networks. We call the resulting system R-CNN: Regions with CNN features. The same framework is also competitive with state-of-the-art semantic segmentation methods, demonstrating its flexibility. Beyond these results, we execute a battery of experiments that provide insight into what the network learns to represent, revealing a rich hierarchy of discriminative and often semantically meaningful features.
Article
A saliency-based marker-controlled watershed method was proposed to detect and segment phytoplankton cells from microscopic images of non-setae species. This method first improved the IG saliency detection method by combining a saturation feature with colour and luminance features to detect cells in microscopic images uniformly, and then produced effective internal and external markers by removing various specific noises from the microscopic images, so that the watershed segmentation performs efficiently and automatically. The authors built the first benchmark dataset for cell detection and segmentation, including 240 microscopic images across multiple phytoplankton species with pixel-wise cell regions labelled by a taxonomist, to evaluate their method. They compared their cell detection method with seven popular saliency detection methods and their cell segmentation method with six commonly used segmentation methods. The quantitative comparison validates that their method performs better on cell detection in terms of robustness and uniformity, and on cell segmentation in terms of accuracy and completeness. The qualitative results show that their improved saliency detection method can detect and highlight all cells, and that the subsequent marker selection scheme can remove the corner noise caused by illumination and the small noise caused by specks and debris, as well as deal with blurred edges.
Conference Paper
There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .
Article
Deep learning has shown great potential for curvilinear structure (e.g. retinal blood vessels and neurites) segmentation, as demonstrated by a recent auto-context regression architecture based on filter banks learned by convolutional sparse coding. However, learning such filter banks is very time-consuming, thus limiting the number of filters employed and the adaptation to other data sets (i.e. slow re-training). We address this limitation by proposing a novel acceleration strategy to speed up convolutional sparse coding filter learning for curvilinear structure segmentation. Our approach is based on a novel initialisation strategy (warm start), and is therefore different from recent methods that improve the optimisation itself. Our warm-start strategy is based on carefully designed hand-crafted filters (SCIRD-TS), modelling appearance properties of curvilinear structures, which are then refined by convolutional sparse coding. Experiments on four diverse data sets, including retinal blood vessels and neurites, suggest that the proposed method significantly reduces the time taken to learn convolutional filter banks (i.e. by up to 82%) compared to conventional initialisation strategies. Remarkably, this speed-up does not worsen performance; in fact, filters learned with the proposed strategy often achieve a much lower reconstruction error and match or exceed the segmentation performance of random and DCT-based initialisation when used as input to a random forest classifier.
Article
We propose a novel semantic segmentation algorithm by learning a deconvolution network. We learn the network on top of the convolutional layers adopted from VGG 16-layer net. The deconvolution network is composed of deconvolution and unpooling layers, which identify pixel-wise class labels and predict segmentation masks. We apply the trained network to each proposal in an input image, and construct the final semantic segmentation map by combining the results from all proposals in a simple manner. The proposed algorithm mitigates the limitations of the existing methods based on fully convolutional networks by integrating deep deconvolution network and proposal-wise prediction; our segmentation method typically identifies detailed structures and handles objects in multiple scales naturally. Our network demonstrates outstanding performance in PASCAL VOC 2012 dataset, and we achieve the best accuracy (72.5%) among the methods trained with no external data through ensemble with the fully convolutional network.
Gelzinis, A., Verikas, A., Vaiciukynas, E., Bacauskiene, M.: A novel technique to extract accurate cell contours applied for segmentation of phytoplankton images. Machine Vision and Applications 26(2-3), 305-315 (2015)
Libreros, J.A., Bueno, G., Trujillo, M., Ospina, M.: Automated identification and classification of diatoms from water resources. In: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2018. Lecture Notes in Computer Science. Springer (2019)
Redmon, J., Farhadi, A.: YOLO9000: Better, faster, stronger. arXiv preprint arXiv:1612.08242 (2016)