Article

Hyperparameter optimization for image analysis: application to prostate tissue images and live cell data of virus-infected cells

Authors:
  • Merantix Momentum

Abstract

Purpose: Automated analysis of microscopy image data typically requires complex pipelines that combine multiple methods for different image analysis tasks. To achieve the best results, the method-dependent hyperparameters of such pipelines need to be optimized. However, for complex pipelines the gradient of the loss function is often analytically or computationally infeasible to calculate, so first- or higher-order optimization methods cannot be applied.

Methods: We developed HyperHyper, a new framework for zero-order black-box hyperparameter optimization with a modular architecture that separates hyperparameter sampling from optimization. We also developed a visualization of the loss function based on infimum projection to obtain further insights into the optimization problem.

Results: We applied HyperHyper in three experiments with different imaging modalities and evaluated more than 400,000 hyperparameter combinations in total. HyperHyper was used to optimize two pipelines for cell nuclei segmentation in prostate tissue microscopy images and two pipelines for detection of hepatitis C virus proteins in live cell microscopy data. We evaluated the impact of separating the sampling and optimization strategy using different optimizers and employed an infimum projection for visualizing the hyperparameter space.

Conclusions: Separating the sampling and optimization strategies in the proposed HyperHyper framework improves the results of the investigated image analysis pipelines. Visualization of the loss function based on infimum projection yields further insights into the optimization process.
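The abstract describes a modular separation of hyperparameter sampling and optimization for zero-order, black-box pipelines. The sketch below illustrates that separation in a minimal form; all names (RandomSampler, GreedyOptimizer, run_pipeline) and the toy loss are hypothetical and not taken from the HyperHyper implementation.

```python
# Minimal sketch (assumption-laden, not the authors' code) of separating a
# hyperparameter *sampling* strategy from an *optimization* strategy in a
# zero-order, black-box setting.
import random

def run_pipeline(params):
    # Placeholder for an image analysis pipeline returning a loss (lower is better);
    # in practice this would run e.g. a segmentation and compare against ground truth.
    return (params["threshold"] - 0.4) ** 2 + 0.1 * params["min_size"] / 100.0

class RandomSampler:
    """Proposes candidate hyperparameter sets, independent of the optimizer."""
    def __init__(self, space):
        self.space = space
    def sample(self):
        return {name: random.uniform(lo, hi) for name, (lo, hi) in self.space.items()}

class GreedyOptimizer:
    """Decides which evaluated candidates to keep (here: simple best-so-far)."""
    def __init__(self):
        self.best_params, self.best_loss = None, float("inf")
    def tell(self, params, loss):
        if loss < self.best_loss:
            self.best_params, self.best_loss = params, loss

space = {"threshold": (0.0, 1.0), "min_size": (1.0, 100.0)}
sampler, optimizer = RandomSampler(space), GreedyOptimizer()
for _ in range(200):                      # budget of black-box evaluations
    candidate = sampler.sample()          # sampling strategy
    loss = run_pipeline(candidate)        # zero-order (gradient-free) evaluation
    optimizer.tell(candidate, loss)       # optimization strategy
print(optimizer.best_params, optimizer.best_loss)
```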


... Computational molecular medicine aims to construct a comprehensive understanding of molecular networks by gaining insights into the concentrations of biomolecules. Such assistance leads to more informed clinical decisions [50]. Computational physiological medicine aims to create disease models that integrate information from multiple levels of biological organization and apply these computational models to patient care. ...
... Computational physiological medicine aims to create disease models that integrate information from multiple levels of biological organization and apply these computational models to patient care. Such biological organizations include molecules, cells, tissues, and organ systems [50][51][52]. Computational anatomy aims to utilize mathematical theories to model anatomical structures and their variations in health and disease [53,54]. An example of computational anatomy is the identification of changes in the shape and motion of a heart, and using this data to predict specific cardiac diseases [55,56]. ...
... An essential task in developing a CNN-based solution is the choice of a suitable neural network architecture, its initialization, and (training) hyperparameters [9,10,11,12]. CNN architectures and hyperparameters are usually selected based on experience and manual optimization, where different architectures are tried, multiple trainings are performed, and hyperparameters are adjusted. This process is very time consuming and computationally intensive for a number of reasons: The optimization task is multi-dimensional, involving categorical (e.g., CNN architecture), Boolean (e.g., whether to use certain CNN building blocks), integer (e.g., number of epochs or early stopping patience), and continuous (e.g., learning rate) variables. ...
... For some time, research has been carried out on methods to automate the search for optimal CNN architectures and hyperparameters and make these steps more efficient. Such "AutoML" methods have been published for generic applications [16,17] and medical image analysis [9]. Their main building blocks are automatic neural architecture search [18,19,20] and algorithmic hyperparameter optimization [10,10]. ...
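The citation contexts above note that such search spaces mix categorical, Boolean, integer, and continuous variables. A minimal sketch of how such a mixed space could be encoded and sampled is shown below; the parameter names and ranges are illustrative assumptions, not taken from the cited work.

```python
# Illustrative encoding of a mixed hyperparameter search space
# (categorical, Boolean, integer, continuous) with simple random sampling.
import math, random

search_space = {
    "architecture":  ["resnet18", "resnet50", "densenet121"],   # categorical
    "use_attention": [True, False],                             # Boolean
    "epochs":        (10, 200),                                 # integer range
    "learning_rate": (1e-5, 1e-1),                              # continuous (log scale)
}

def sample_configuration():
    return {
        "architecture":  random.choice(search_space["architecture"]),
        "use_attention": random.choice(search_space["use_attention"]),
        "epochs":        random.randint(*search_space["epochs"]),
        "learning_rate": math.exp(random.uniform(
            math.log(search_space["learning_rate"][0]),
            math.log(search_space["learning_rate"][1]))),
    }

print(sample_configuration())
```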
Preprint
Image analysis tasks in computational pathology are commonly solved using convolutional neural networks (CNNs). The selection of a suitable CNN architecture and hyperparameters is usually done through exploratory iterative optimization, which is computationally expensive and requires substantial manual work. The goal of this article is to evaluate how generic tools for neural network architecture search and hyperparameter optimization perform for common use cases in computational pathology. For this purpose, we evaluated one on-premises and one cloud-based tool for three different classification tasks for histological images: tissue classification, mutation prediction, and grading. We found that the default CNN architectures and parameterizations of the evaluated AutoML tools already yielded classification performance on par with the original publications. Hyperparameter optimization for these tasks did not substantially improve performance, despite the additional computational effort. However, performance varied substantially between classifiers obtained from individual AutoML runs due to non-deterministic effects. Generic CNN architectures and AutoML tools could thus be a viable alternative to manually optimizing CNN architectures and parametrizations. This would allow developers of software solutions for computational pathology to focus efforts on harder-to-automate tasks such as data curation.
... In this chapter, a framework for black-box hyperparameter optimization for biomedical image analysis pipelines called HyperHyper is proposed. The work has been published in Wollmann, Ritter, et al. [11,4]. The HyperHyper framework has several advantages compared to existing hyperparameter optimization frameworks such as Google Vizier [221], Sherpa [222], Auto-WEKA [223], Spearmint [224], and Hyperopt [225]. ...
... PDAE uses multiple measurements from DetNet and an elliptical sampler, and integrates the information using a Kalman filter via combined innovation. Model parameters of PDAE were optimized using CMA-ES in HyperHyper [4]. The DetNet-PDAE method has been benchmarked using data from the Particle Tracking Challenge. ...
Thesis
Full-text available
High-content microscopy led to many advances in biology and medicine. This fast emerging technology is transforming cell biology into a big data driven science. Computer vision methods are used to automate the analysis of microscopy image data. In recent years, deep learning became popular and had major success in computer vision. Most of the available methods are developed to process natural images. Compared to natural images, microscopy images pose domain specific challenges such as small training datasets, clustered objects, and class imbalance. In this thesis, new deep learning methods for object detection and cell segmentation in microscopy images are introduced. For particle detection in fluorescence microscopy images, a deep learning method based on a domain-adapted Deconvolution Network is presented. In addition, a method for mitotic cell detection in heterogeneous histopathology images is proposed, which combines a deep residual network with Hough voting. The method is used for grading of whole-slide histology images of breast carcinoma. Moreover, a method for both particle detection and cell detection based on object centroids is introduced, which is trainable end-to-end. It comprises a novel Centroid Proposal Network, a layer for ensembling detection hypotheses over image scales and anchors, an anchor regularization scheme which favours prior anchors over regressed locations, and an improved algorithm for Non-Maximum Suppression. Furthermore, a novel loss function based on Normalized Mutual Information is proposed which can cope with strong class imbalance and is derived within a Bayesian framework. For cell segmentation, a deep neural network with increased receptive field to capture rich semantic information is introduced. Moreover, a deep neural network which combines both paradigms of multi-scale feature aggregation of Convolutional Neural Networks and iterative refinement of Recurrent Neural Networks is proposed. To increase the robustness of the training and improve segmentation, a novel focal loss function is presented. In addition, a framework for black-box hyperparameter optimization for biomedical image analysis pipelines is proposed. The framework has a modular architecture that separates hyperparameter sampling and hyperparameter optimization. A visualization of the loss function based on infimum projections is suggested to obtain further insights into the optimization problem. Also, a transfer learning approach is presented, which uses only one color channel for pre-training and performs fine-tuning on more color channels. Furthermore, an approach for unsupervised domain adaptation for histopathological slides is presented. Finally, Galaxy Image Analysis is presented, a platform for web-based microscopy image analysis. Galaxy Image Analysis workflows for cell segmentation in cell cultures, particle detection in mice brain tissue, and MALDI/H&E image registration have been developed. The proposed methods were applied to challenging synthetic as well as real microscopy image data from various microscopy modalities. It turned out that the proposed methods yield state-of-the-art or improved results. The methods were benchmarked in international image analysis challenges and used in various cooperation projects with biomedical researchers.
... However, this can be limited by resource constraints in real-world environments such as hospitals. To tackle this challenge, researchers have turned to HPO techniques such as BO, despite their substantial costs, to improve model performance [39,17]. Therefore, in this section, we show the efficacy of BOSS on two critical medical image analysis tasks. ...
Preprint
Bayesian optimization (BO) has contributed greatly to improving model performance by suggesting promising hyperparameter configurations iteratively based on observations from multiple training trials. However, only partial knowledge (i.e., the measured performances of trained models and their hyperparameter configurations) from previous trials is transferred. On the other hand, Self-Distillation (SD) only transfers partial knowledge learned by the task model itself. To fully leverage the various knowledge gained from all training trials, we propose the BOSS framework, which combines BO and SD. BOSS suggests promising hyperparameter configurations through BO and carefully selects pre-trained models from previous trials for SD, which are otherwise abandoned in the conventional BO process. BOSS achieves significantly better performance than both BO and SD in a wide range of tasks including general image classification, learning with noisy labels, semi-supervised learning, and medical image analysis tasks.
... While discussing the effect of dense layers on the network, the number of layers was also studied. One to three layers were simulated, as this is the most common number of dense layers [56] and one of the most influential parameters of the whole network. The ReLU function was designated as the activation function, which plays the role of transforming the neurons of each layer. ...
Article
Full-text available
A comprehensive approach to understand the mechanical behavior of materials involves costly and time-consuming experiments. Recent advances in machine learning and in the field of computational material science could significantly reduce the need for experiments by enabling the prediction of a material’s mechanical behavior. In this paper, a reliable data pipeline consisting of experimentally validated phase field simulations and finite element analysis was created to generate a dataset of dual-phase steel microstructures and mechanical behaviors under different heat treatment conditions. Afterwards, a deep learning-based method was presented, which was the hybridization of two well-known transfer-learning approaches, ResNet50 and VGG16. Hyperparameter optimization (HPO) and fine-tuning were also implemented to train and boost both methods for the hybrid network. By fusing the hybrid model and the feature extractor, the dual-phase steels’ yield stress, ultimate stress, and fracture strain under new treatment conditions were predicted with an error of less than 1%.
... Health & Biomedical (63); Diagnosis (32); Breast Cancer (6) [383,394,456,462,504,505]; Mental Health (6) [172,173,269,338,473,543]; Other (20) [62,99,107,119,134,194,212,275,276,280,389,425,466,468,488,507,517,523,567,573]; Condition (1) [508]; Sport (1) [229]; Media (1) [50]. In total, 169 papers have been included in this survey. They are cited within Table 121; discussion around the listed groupings is deferred to later. ...
Preprint
With most technical fields, there exists a delay between fundamental academic research and practical industrial uptake. Whilst some sciences have robust and well-established processes for commercialisation, such as the pharmaceutical practice of regimented drug trials, other fields face transitory periods in which fundamental academic advancements diffuse gradually into the space of commerce and industry. For the still relatively young field of Automated/Autonomous Machine Learning (AutoML/AutonoML), that transitory period is under way, spurred on by a burgeoning interest from broader society. Yet, to date, little research has been undertaken to assess the current state of this dissemination and its uptake. Thus, this review makes two primary contributions to knowledge around this topic. Firstly, it provides the most up-to-date and comprehensive survey of existing AutoML tools, both open-source and commercial. Secondly, it motivates and outlines a framework for assessing whether an AutoML solution designed for real-world application is 'performant'; this framework extends beyond the limitations of typical academic criteria, considering a variety of stakeholder needs and the human-computer interactions required to service them. Thus, additionally supported by an extensive assessment and comparison of academic and commercial case-studies, this review evaluates mainstream engagement with AutoML in the early 2020s, identifying obstacles and opportunities for accelerating future uptake.
... Thus, analyzing the visible execution of movements can reveal crucial information about the subject's internal condition without requiring invasive procedures such as brain surgery. While deep learning has lately shown tremendous successes in automating several medical studies [133,22,198,147,153,140,168], many neuroscience researchers still perform behavior analyses manually by closely observing and annotating hours of recordings. ...
Thesis
Computer vision intends to provide the human abilities of understanding and interpreting the visual surroundings to computers. An essential element to comprehend the environment is to extract relevant information from complex visual data so that the desired task can be solved. For instance, to distinguish cats from dogs, the feature 'body shape' is more relevant than 'eye color' or the 'number of legs'. In traditional computer vision, it is conventional to develop handcrafted functions that extract specific low-level features such as edges from visual data. However, in order to solve a particular task satisfactorily, we require a combination of several features. Thus, the approach of traditional computer vision has the disadvantage that whenever a new task is addressed, a developer needs to manually specify all the features the computer should look for. For that reason, recent works have primarily focused on developing new algorithms that teach the computer to autonomously detect relevant and task-specific features. Deep learning has been particularly successful for that matter. In deep learning, artificial neural networks automatically learn to extract informative features directly from visual data. The majority of developed deep learning strategies require a dataset with annotations that indicate the solution of the desired task. The main bottleneck is that creating such a dataset is very tedious and time-intensive considering that every sample needs to be annotated manually. This thesis presents new techniques that attempt to keep the amount of human supervision to a minimum while still reaching satisfactory performances on various visual understanding tasks. In particular, this thesis focuses on self-supervised learning algorithms that train a neural network on a surrogate task where no human supervision is required. We create an artificial supervisory signal by breaking the order of visual patterns and asking the network to recover the original structure. Besides demonstrating the abilities of our model on common computer vision tasks such as action recognition, we additionally apply our model to biomedical scenarios. Many research projects in medicine involve profuse manual processes that extend the duration of developing successful treatments. Taking the example of analyzing the motor function of neurologically impaired patients, we show that our self-supervised method can help to automate tedious, visually based processes in medical research. In order to perform a detailed analysis of motor behavior and, thus, provide a suitable treatment, it is important to discover and identify the negatively affected movements. Therefore, we propose a magnification tool that can detect and enhance subtle changes in motor function including motor behavior differences across individuals. In this way, our automatic diagnostic system not only analyzes apparent behavior but also facilitates the perception and discovery of impaired movements. Learning a feature representation without requiring annotations significantly reduces human supervision. However, using annotated datasets generally leads to better performance than self-supervised learning methods. Hence, we additionally examine semi-supervised approaches which efficiently combine a few annotated samples with large unlabeled datasets. Consequently, semi-supervised learning represents a good trade-off between annotation time and accuracy.
... Sherpa is already being used in a wide variety of applications such as machine learning methods [29], solid state physics [30], particle physics [31], medical image analysis [32], and cyber security [33]. Due to the fact that the number of machine learning applications is growing rapidly we can expect there to be a growing need for hyperparameter optimization software such as Sherpa. ...
Article
Full-text available
Sherpa is a hyperparameter optimization library for machine learning models. It is specifically designed for problems with computationally expensive, iterative function evaluations, such as the hyperparameter tuning of deep neural networks. With Sherpa, scientists can quickly optimize hyperparameters using a variety of powerful and interchangeable algorithms. Sherpa can be run on either a single machine or in parallel on a cluster. Finally, an interactive dashboard enables users to view the progress of models as they are trained, cancel trials, and explore which hyperparameter combinations are working best. Sherpa empowers machine learning practitioners by automating the more tedious aspects of model tuning. Its source code and documentation are available at https://github.com/sherpa-ai/sherpa.
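As a rough usage sketch of the Sherpa workflow described above (single-machine mode), the snippet below follows the pattern documented by the project; exact class and method signatures may differ between versions, and train_model is a hypothetical stand-in for an expensive training run.

```python
# Hedged Sherpa usage sketch: random search over two hyperparameters on a
# single machine; details may differ between Sherpa versions.
import sherpa

parameters = [
    sherpa.Continuous(name="lr", range=[1e-4, 1e-1], scale="log"),
    sherpa.Discrete(name="num_units", range=[32, 256]),
]
algorithm = sherpa.algorithms.RandomSearch(max_num_trials=20)
study = sherpa.Study(parameters=parameters, algorithm=algorithm,
                     lower_is_better=True, disable_dashboard=True)

def train_model(lr, num_units):
    # Hypothetical stand-in for training a model and returning a validation loss.
    return (lr - 0.01) ** 2 + 1.0 / num_units

for trial in study:
    loss = train_model(**trial.parameters)
    study.add_observation(trial=trial, iteration=1, objective=loss)
    study.finalize(trial)

print(study.get_best_result())
```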
... Optimization of a CNN during the training process is necessary to achieve high classification accuracies [37,38]. Hyperparameters may be optimized which often includes testing various hyperparameter combinations, a process that may be limited by the computational power available [39,40]. With respect to the available computational power, we decided to optimize regularization, dropout, and learning rate. ...
Article
Full-text available
Background: Gastritis is a prevalent disease and commonly classified into autoimmune (A), bacterial (B), and chemical (C) type gastritis. While the former two subtypes are associated with an increased risk of developing gastric intestinal adenocarcinoma, the latter subtype is not. In this study, we evaluated the capability to classify common gastritis subtypes using convolutional neuronal networks on a small dataset of antrum and corpus biopsies. Methods: 1230 representative 500 × 500 µm images of 135 patients with type A, type B, and type C gastritis were extracted from scanned histological slides. Patients were allocated randomly into a training set (60%), a validation set (20%), and a test set (20%). One classifier for antrum and one classifier for corpus were trained and optimized. After optimization, the test set was analyzed using a joint result from both classifiers. Results: Overall accuracy in the test set was 84% and was particularly high for type B gastritis with a sensitivity of 100% and a specificity of 93%. Conclusions: Classification of gastritis subtypes is possible using convolutional neural networks on a small dataset of histopathological images of antrum and corpus biopsies. Deep learning strategies to support routine diagnostic pathology merit further evaluation.
Article
The potential of deep learning to improve the performance of image classification has been demonstrated. In this paper, we present a study on the optimization of the hyperparameters for the classification of the MNIST dataset. We performed a comparison between a standard two-layer perceptron model and a CNN model with different techniques. The optimized hyperparameters for the CNN include the number of filters, the kernel size, and the number of convolutional layers. The optimized CNN model performed better than the default model on the classification task of the MNIST dataset. The various hyperparameters included the learning rate, batch size, the number of hidden layers, the dropout rate, the activation function, and the optimizer. The optimized CNN model was able to achieve an accuracy of 99% on the test set, which is significantly better than the 96% accuracy of the default model. The difference between the two models is that the former takes longer to train and has a slightly longer time per image. The study demonstrates the importance of optimizing the hyperparameters in deep learning-focused classification tasks. The findings show how CNN architectures perform well in these applications and how optimizing these components can yield superior results. These recommendations can help further develop efficient and accurate models for this field.
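To make such a tuning setup concrete, the sketch below runs a simple grid search over the hyperparameters named in the abstract; train_and_evaluate is a hypothetical placeholder rather than an actual MNIST training run.

```python
# Sketch of a grid search over a few CNN hyperparameters; the evaluation
# function is a placeholder, not a real training loop.
from itertools import product

def train_and_evaluate(learning_rate, batch_size, dropout_rate):
    # Placeholder: would train a CNN on MNIST and return validation accuracy.
    return 0.99 - abs(learning_rate - 1e-3) - abs(dropout_rate - 0.3) * 0.01

grid = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "batch_size": [32, 64, 128],
    "dropout_rate": [0.2, 0.3, 0.5],
}
best = max(
    (dict(zip(grid, values)) for values in product(*grid.values())),
    key=lambda cfg: train_and_evaluate(**cfg),
)
print("best configuration:", best)
```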
Article
Full-text available
Galaxy (homepage: https://galaxyproject.org, main public server: https://usegalaxy.org) is a web-based scientific analysis platform used by tens of thousands of scientists across the world to analyze large biomedical datasets such as those found in genomics, proteomics, metabolomics and imaging. Started in 2005, Galaxy continues to focus on three key challenges of data-driven biomedical science: making analyses accessible to all researchers, ensuring analyses are completely reproducible, and making it simple to communicate analyses so that they can be reused and extended. During the last two years, the Galaxy team and the open-source community around Galaxy have made substantial improvements to Galaxy's core framework, user interface, tools, and training materials. Framework and user interface improvements now enable Galaxy to be used for analyzing tens of thousands of datasets, and >5500 tools are now available from the Galaxy ToolShed. The Galaxy community has led an effort to create numerous high-quality tutorials focused on common types of genomic analyses. The Galaxy developer and user communities continue to grow and be integral to Galaxy's development. The number of Galaxy public servers, developers contributing to the Galaxy framework and its tools, and users of the main Galaxy server have all increased substantially.
Article
Full-text available
We present a combined report on the results of three editions of the Cell Tracking Challenge, an ongoing initiative aimed at promoting the development and objective evaluation of cell segmentation and tracking algorithms. With 21 participating algorithms and a data repository consisting of 13 data sets from various microscopy modalities, the challenge displays today's state-of-the-art methodology in the field. We analyzed the challenge results using performance measures for segmentation and tracking that rank all participating methods. We also analyzed the performance of all of the algorithms in terms of biological measures and practical usability. Although some methods scored high in all technical aspects, none obtained fully correct solutions. We found that methods that either take prior information into account using learning strategies or analyze cells in a global spatiotemporal video context performed better than other methods under the segmentation and tracking scenarios included in the challenge.
Conference Paper
Full-text available
Any sufficiently complex system acts as a black box when it becomes easier to experiment with than to understand. Hence, black-box optimization has become increasingly important as systems have become more complex. In this paper we describe Google Vizier, a Google-internal service for performing black-box optimization that has become the de facto parameter tuning engine at Google. Google Vizier is used to optimize many of our machine learning models and other systems, and also provides core capabilities to Google's Cloud Machine Learning HyperTune subsystem. We discuss our requirements, infrastructure design, underlying algorithms, and advanced features such as transfer learning and automated early stopping that the service provides.
Article
Full-text available
There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .
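The following compact PyTorch sketch illustrates the U-Net idea described above (a contracting path, a symmetric expanding path, and skip connections); it is a simplified stand-in, not the authors' original implementation.

```python
# Compact U-Net-style network: contracting path, expanding path, skip connections.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, in_ch=1, out_ch=2):
        super().__init__()
        self.enc1 = conv_block(in_ch, 64)                      # contracting path
        self.enc2 = conv_block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(128, 256)
        self.up2 = nn.ConvTranspose2d(256, 128, 2, stride=2)   # expanding path
        self.dec2 = conv_block(256, 128)
        self.up1 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec1 = conv_block(128, 64)
        self.head = nn.Conv2d(64, out_ch, 1)                   # per-pixel class scores

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))    # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))   # skip connection
        return self.head(d1)

logits = TinyUNet()(torch.randn(1, 1, 128, 128))   # -> shape (1, 2, 128, 128)
```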
Article
Full-text available
Tracking subcellular structures as well as viral structures displayed as 'particles' in fluorescence microscopy images yields quantitative information on the underlying dynamical processes. We have developed an approach for tracking multiple fluorescent particles based on probabilistic data association. The approach combines a localization scheme that uses a bottom-up strategy based on the spot-enhancing filter as well as a top-down strategy based on an ellipsoidal sampling scheme that uses the Gaussian probability distributions computed by a Kalman filter. The localization scheme yields multiple measurements that are incorporated into the Kalman filter via a combined innovation, where the association probabilities are interpreted as weights calculated using an image likelihood. To track objects in close proximity, we compute the support of each image position relative to the neighboring objects of a tracked object and use this support to re-calculate the weights. To cope with multiple motion models, we integrated the interacting multiple model algorithm. The approach has been successfully applied to synthetic 2D and 3D images as well as to real 2D and 3D microscopy images, and the performance has been quantified. In addition, the approach was successfully applied to the 2D and 3D image data of the recent Particle Tracking Challenge at the IEEE International Symposium on Biomedical Imaging (ISBI) 2012.
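The snippet below sketches the core of the probabilistic data association idea: a constant-velocity Kalman filter whose update combines the residuals of several candidate measurements into a single weighted ("combined") innovation. It is a simplified illustration with hand-picked association weights; the full method also adjusts the covariance for the spread of innovations, which is omitted here.

```python
# Simplified constant-velocity Kalman filter with a combined innovation.
import numpy as np

F = np.array([[1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0], [0, 0, 0, 1]], float)  # motion model
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)                              # observe (x, y)
Q, R = np.eye(4) * 0.01, np.eye(2) * 0.5

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update_combined(x, P, measurements, weights):
    S = H @ P @ H.T + R                        # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
    residuals = [z - H @ x for z in measurements]
    combined = sum(w * r for w, r in zip(weights, residuals))   # combined innovation
    return x + K @ combined, (np.eye(4) - K @ H) @ P

x, P = np.array([0.0, 0.0, 1.0, 1.0]), np.eye(4)
x, P = predict(x, P)
# Two candidate detections near the predicted position, with association weights.
x, P = update_combined(x, P, [np.array([1.1, 0.9]), np.array([0.8, 1.2])], [0.7, 0.3])
print(x)
```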
Article
Full-text available
Human immunodeficiency virus type 1 (HIV-1) particles assemble at the plasma membrane, which is lined by a dense network of filamentous actin (F-actin). Large amounts of actin have been detected in HIV-1 virions, proposed to be incorporated by interactions with the nucleocapsid domain of the viral polyprotein Gag. Previous studies addressing the role of F-actin in HIV-1 particle formation using F-actin-interfering drugs did not yield consistent results. Filamentous structures pointing toward nascent HIV-1 budding sites, detected by cryo-electron tomography and atomic force microscopy, prompted us to revisit the role of F-actin in HIV-1 assembly by live-cell microscopy. HeLa cells coexpressing HIV-1 carrying fluorescently labeled Gag and a labeled F-actin-binding peptide were imaged by live-cell total internal reflection fluorescence microscopy (TIR-FM). Computational analysis of image series did not reveal characteristic patterns of F-actin in the vicinity of viral budding sites. Furthermore, no transient recruitment of F-actin during bud formation was detected by monitoring fluorescence intensity changes at nascent HIV-1 assembly sites. The chosen approach allowed us to measure the effect of F-actin-interfering drugs on the assembly of individual virions in parallel with monitoring changes in the F-actin network of the respective cell. Treatment of cells with latrunculin did not affect the efficiency and dynamics of Gag assembly under conditions resulting in the disruption of F-actin filaments. Normal assembly rates were also observed upon transient stabilization of F-actin by short-term treatment with jasplakinolide. Taken together, these findings indicate that actin filament dynamics are dispensable for HIV-1 Gag assembly at the plasma membrane of HeLa cells.
Article
Full-text available
Particle tracking is of key importance for quantitative analysis of intracellular dynamic processes from time-lapse microscopy image data. Because manually detecting and following large numbers of individual particles is not feasible, automated computational methods have been developed for these tasks by many groups. Aiming to perform an objective comparison of methods, we gathered the community and organized an open competition in which participating teams applied their own methods independently to a commonly defined data set including diverse scenarios. Performance was assessed using commonly defined measures. Although no single method performed best across all scenarios, the results revealed clear differences between the various approaches, leading to notable practical conclusions for users and developers.
Article
Full-text available
Many different machine learning algorithms exist; taking into account each algorithm's hyperparameters, there is a staggeringly large number of possible alternatives overall. We consider the problem of simultaneously selecting a learning algorithm and setting its hyperparameters, going beyond previous work that addresses these issues in isolation. We show that this problem can be addressed by a fully automated approach, leveraging recent innovations in Bayesian optimization. Specifically, we consider a wide range of feature selection techniques (combining 3 search and 8 evaluator methods) and all classification approaches implemented in WEKA, spanning 2 ensemble methods, 10 meta-methods, 27 base classifiers, and hyperparameter settings for each classifier. On each of 21 popular datasets from the UCI repository, the KDD Cup 09, variants of the MNIST dataset and CIFAR-10, we show classification performance often much better than using standard selection/hyperparameter optimization methods. We hope that our approach will help non-expert users to more effectively identify machine learning algorithms and hyperparameter settings appropriate to their applications, and hence to achieve improved performance.
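A minimal illustration of the combined algorithm selection and hyperparameter optimization (CASH) idea is given below, using plain random search over a few scikit-learn classifiers; Auto-WEKA itself applies Bayesian optimization over WEKA learners, so this is only a conceptual sketch with made-up ranges.

```python
# Conceptual CASH sketch: the choice of learning algorithm is itself treated
# as a (categorical) hyperparameter, with conditional hyperparameters per algorithm.
import random
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)

def sample_candidate():
    algo = random.choice(["rf", "svm", "knn"])
    if algo == "rf":
        return RandomForestClassifier(n_estimators=random.choice([50, 100, 200]))
    if algo == "svm":
        return SVC(C=10 ** random.uniform(-2, 2), gamma="scale")
    return KNeighborsClassifier(n_neighbors=random.randint(1, 15))

best_score, best_model = -1.0, None
for _ in range(15):                                   # small random-search budget
    model = sample_candidate()
    score = cross_val_score(model, X, y, cv=3).mean()
    if score > best_score:
        best_score, best_model = score, model
print(best_model, best_score)
```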
Conference Paper
Full-text available
State-of-the-art algorithms for hard computational problems often expose many parameters that can be modified to improve empirical performance. However, manually exploring the resulting combinatorial space of parameter settings is tedious and tends to lead to unsatisfactory outcomes. Recently, automated approaches for solving this algorithm configuration problem have led to substantial improvements in the state of the art for solving various problems. One promising approach constructs explicit regression models to describe the dependence of target algorithm performance on parameter settings; however, this approach has so far been limited to the optimization of few numerical algorithm parameters on single instances. In this paper, we extend this paradigm for the first time to general algorithm configuration problems, allowing many categorical parameters and optimization for sets of instances. We experimentally validate our new algorithm configuration procedure by optimizing a local search and a tree search solver for the propositional satisfiability problem (SAT), as well as the commercial mixed integer programming (MIP) solver CPLEX. In these experiments, our procedure yielded state-of-the-art performance, and in many cases outperformed the previous best configuration approach.
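The sketch below illustrates the model-based configuration loop described above: fit a regression surrogate to the observed (configuration, performance) pairs and evaluate the configuration the surrogate predicts to be best. The target function and parameter ranges are hypothetical; this is not the SMAC implementation.

```python
# Simplified sequential model-based configuration loop with a random-forest surrogate.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def runtime(config):
    # Hypothetical target-algorithm performance (lower is better) over 2 parameters.
    return (config[0] - 0.3) ** 2 + abs(config[1] - 0.7) + rng.normal(0, 0.01)

history_X = rng.uniform(size=(10, 2)).tolist()        # initial random configurations
history_y = [runtime(c) for c in history_X]

for _ in range(30):
    surrogate = RandomForestRegressor(n_estimators=50).fit(history_X, history_y)
    candidates = rng.uniform(size=(500, 2))           # cheap pool of candidate configs
    best_candidate = candidates[np.argmin(surrogate.predict(candidates))]
    history_X.append(best_candidate.tolist())         # evaluate the most promising one
    history_y.append(runtime(best_candidate))

print("best configuration:", history_X[int(np.argmin(history_y))])
```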
Article
Full-text available
We present a new, robust, computational procedure for tracking fluorescent markers in time-lapse microscopy. The algorithm is optimized for finding the time-trajectory of single particles in very noisy dynamic (two- or three-dimensional) image sequences. It proceeds in three steps. First, the images are aligned to compensate for the movement of the biological structure under investigation. Second, the particle's signature is enhanced by applying a Mexican hat filter, which we show to be the optimal detector of a Gaussian-like spot in 1/ω² noise. Finally, the optimal trajectory of the particle is extracted by applying a dynamic programming optimization procedure. We have used this software, which is implemented as a Java plug-in for the public-domain ImageJ software, to track the movement of chromosomal loci within nuclei of budding yeast cells. Besides reducing trajectory analysis time by several hundred-fold, we achieve high reproducibility and accuracy of tracking. The application of the method to yeast chromatin dynamics reveals different classes of constraints on mobility of telomeres, reflecting differences in nuclear envelope association. The generic nature of the software allows application to a variety of similar biological imaging tasks that require the extraction and quantitation of a moving particle's trajectory.
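The spot-enhancement step can be illustrated with a Laplacian-of-Gaussian ("Mexican hat") filter followed by local-maximum detection, as in the sketch below; the synthetic image, sigma, and threshold are illustrative only and not taken from the cited work.

```python
# Spot enhancement with a Laplacian-of-Gaussian (Mexican hat) filter
# and simple local-maximum detection on a synthetic noisy image.
import numpy as np
from scipy.ndimage import gaussian_laplace, maximum_filter

image = np.zeros((64, 64))
image[20, 30] = image[45, 10] = 5.0                      # two synthetic bright spots
image += np.random.normal(0, 0.05, image.shape)          # additive noise

response = -gaussian_laplace(image, sigma=2.0)           # bright spots -> positive peaks
is_peak = (response == maximum_filter(response, size=5)) & (response > 0.05)
print(np.argwhere(is_peak))                              # detected spot coordinates
```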
Conference Paper
Tracking subcellular structures displayed as small spots in fluorescence microscopy images is important to determine quantitative information of biological processes. We have developed an approach for tracking multiple fluorescent particles based on two-filter smoothing and probabilistic data association. Compared to previous work, our approach exploits information from past and future time points, integrates multiple measurements, and combines Kalman filtering and particle filtering. We evaluated our approach based on data from the ISBI Particle Tracking Challenge and found that it yields state-of-the-art results for low signal-to-noise ratios. We also applied our method to live cell fluorescence microscopy image sequences of HIV-1 particles and HCV proteins. It turned out that the new approach generally outperforms existing methods.
Article
We consider the problem of optimizing a high-dimensional convex function using stochastic zeroth-order query oracles. Such problems arise naturally in a variety of practical applications, including optimizing experimental or simulation parameters with many variables. Under sparsity assumptions on the gradients or function values, we present a successive component/feature selection algorithm and a noisy mirror descent algorithm with Lasso gradient estimates and show that both algorithms have convergence rates depending only logarithmically on the ambient problem dimension. Empirical results verify our theoretical findings and suggest that our designed algorithms outperform classical zeroth-order optimization methods in the high-dimensional setting.
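A simplified sketch of the underlying idea is given below: estimate a sparse gradient from zeroth-order queries via random finite differences and a Lasso fit, then take a descent step. It is not the algorithm from the paper, and all constants and dimensions are illustrative.

```python
# Zeroth-order optimization sketch with sparse gradient estimation via Lasso.
import numpy as np
from sklearn.linear_model import Lasso

d, rng = 200, np.random.default_rng(0)
support = np.array([3, 17, 42])                           # objective depends on few coordinates

def f(x):                                                 # sparse convex toy objective
    return np.sum((x[support] - 1.0) ** 2)

x = np.zeros(d)
for _ in range(50):
    directions = rng.standard_normal((80, d))             # random query directions
    delta = 1e-3
    diffs = np.array([(f(x + delta * u) - f(x)) / delta for u in directions])
    # Finite differences satisfy diffs ~ directions @ grad, so a Lasso fit
    # recovers a sparse estimate of the gradient from few queries.
    grad_hat = Lasso(alpha=0.01, fit_intercept=False).fit(directions, diffs).coef_
    x -= 0.1 * grad_hat                                   # gradient-descent step
print(f(x))                                               # should be close to 0
```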
Article
Automated microscopy has given researchers access to great amounts of live cell imaging data from in vitro and in vivo experiments. Much focus has been put on extracting cell tracks from such data using a plethora of segmentation and tracking algorithms, but further analysis is normally required to draw biologically relevant conclusions. Such relevant conclusions may be whether the migration is directed or not, or whether the population has homogeneous or heterogeneous migration patterns. This review focuses on the analysis of cell migration data that are extracted from time-lapse images. We discuss a range of measures and models used to analyze cell tracks independent of the biological system or the way the tracks were obtained. For single-cell migration, we focus on measures and models, giving examples of biological systems where they have been applied, for example, the migration of bacteria, fibroblasts, and immune cells. For collective migration, we describe the model systems wound healing, neural crest migration, and Drosophila gastrulation and discuss methods for cell migration within these systems. We also discuss the role of the extracellular matrix and subsequent differences between track analysis in vitro and in vivo. Besides methods and measures, we put special focus on the need for openly available data and code, as well as a lack of common vocabulary in cell track analysis.
Article
In large scale biological experiments, like high-throughput or high-content cellular screening, the amount and the complexity of images to be analyzed is steadily increasing. To handle and process these images, well defined image processing and analysis steps need to be performed by applying dedicated workflows. Multiple software tools have emerged with the aim to facilitate creation of such workflows by integrating existing methods, tools, and routines, and by adapting them to different applications and questions, as well as making them reusable and interchangeable. In this review, we describe workflow systems for the integration of microscopy image analysis techniques with focus on KNIME and Galaxy.
Article
Purpose: Oncological treatment is becoming increasingly complex, and therefore, decision making in multidisciplinary teams is becoming the key activity in clinical pathways. The increased complexity is related to the number and variability of possible treatment decisions that may be relevant to a patient. In this paper, we describe validation of a multidisciplinary cancer treatment decision in the clinical domain of head and neck oncology.

Method: Probabilistic graphical models and corresponding inference algorithms, in the form of Bayesian networks, can support complex decision-making processes by providing mathematically reproducible and transparent advice. The quality of BN-based advice depends on the quality of the model. Therefore, it is vital to validate the model before it is applied in practice.

Results: For an example BN subnetwork of laryngeal cancer with 303 variables, we evaluated 66 patient records. To validate the model on this dataset, a validation workflow was applied in combination with quantitative and qualitative analyses. In the subsequent analyses, we observed four sources of imprecise predictions: incorrect data, incomplete patient data, outvoting relevant observations, and an incorrect model. Finally, the four problems were solved by modifying the data and the model.

Conclusion: The presented validation effort is related to the model complexity. For simpler models, the validation workflow is the same, although it may require fewer validation methods. The validation success is related to the model’s well-founded knowledge base. The remaining laryngeal cancer model may disclose additional sources of imprecise predictions.
Article
To gain a better understanding of cellular and molecular processes, it is important to quantitatively analyze the motion of subcellular particles in live cell microscopy image sequences. Since the subcellular particles generally move, and the cell nuclei themselves move as well as deform, it is important to decouple the movement of particles from that of the cell nuclei using non-rigid registration methods. We have developed a diffeomorphic multi-frame approach for non-rigid registration of cell nuclei in 2D and 3D live cell fluorescence microscopy images. Our non-rigid registration approach is based on local optic flow estimation, exploits information from multiple consecutive image frames, and determines diffeomorphic transformations in the log-domain, which allows efficient computation of the inverse transformations. To register single images of an image sequence to a reference image, we use a temporally weighted mean image which is constructed based on inverse transformations and multiple consecutive frames. Using multiple consecutive frames improves the registration accuracy compared to pairwise registration, and using a temporally weighted mean image significantly reduces the computation time compared to previous work. In addition, we use a flow boundary preserving method for regularization of computed deformation vector fields, which prevents over-smoothing compared to standard Gaussian filtering. Our approach has been successfully applied to 2D and 3D synthetic as well as real live cell microscopy image sequences, and an experimental comparison with non-rigid pairwise, multi-frame, and temporal groupwise registration has been carried out.
Conference Paper
Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.
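For reference, a minimal XGBoost usage sketch via its scikit-learn wrapper is shown below; the dataset and hyperparameter values are illustrative, not tuned.

```python
# Minimal XGBoost classification example using the scikit-learn wrapper.
from xgboost import XGBClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)
print(accuracy_score(y_test, model.predict(X_test)))
```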
Article
We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions. The method is straightforward to implement and is based on adaptive estimates of lower-order moments of the gradients. The method is computationally efficient, has low memory requirements, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The method exhibits invariance to diagonal rescaling of the gradients by adapting to the geometry of the objective function. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, by which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. We demonstrate that Adam works well in practice when experimentally compared to other stochastic optimization methods.
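The update rule based on the first- and second-moment estimates can be written down compactly; the numpy sketch below applies Adam to a toy quadratic objective, using the commonly cited default values for beta1, beta2, and epsilon.

```python
# Plain numpy sketch of the Adam update rule on a simple quadratic objective.
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad            # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2       # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta = np.array([5.0, -3.0])
m, v = np.zeros_like(theta), np.zeros_like(theta)
for t in range(1, 5001):
    grad = 2 * theta                              # gradient of f(theta) = ||theta||^2
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.01)
print(theta)                                      # approaches the minimum at 0
```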
Book
Contents: Introduction; Sensors; Architecture; Common Representational Format; Spatial Alignment; Temporal Alignment; Semantic Alignment; Radiometric Normalization; Bayesian Inference; Parameter Estimation; Robust Statistics; Sequential Bayesian Inference; Bayesian Decision Theory; Ensemble Learning; Sensor Management.
Article
Assuming that numerical scores are available for the performance of each of n persons on each of n jobs, the "assignment problem" is the quest for an assignment of persons to jobs so that the sum of the n scores so obtained is as large as possible. It is shown that ideas latent in the work of two Hungarian mathematicians may be exploited to yield a new method of solving this problem.
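In practice the assignment problem can be solved with SciPy's implementation of the Hungarian method; since that routine minimizes cost, the scores are negated to maximize the total, as in the sketch below (the score matrix is made up).

```python
# Assignment problem via SciPy's Hungarian-method implementation.
import numpy as np
from scipy.optimize import linear_sum_assignment

scores = np.array([[9, 2, 7],      # scores[i, j]: performance of person i on job j
                   [6, 4, 3],
                   [5, 8, 1]])
rows, cols = linear_sum_assignment(-scores)        # negate to maximize total score
print(list(zip(rows, cols)), scores[rows, cols].sum())
```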
Chapter
The subject of this chapter is image fusion using the methods of ensemble learning. Ensemble learning is a method for constructing accurate predictors or classifiers from an ensemble of weak predictors or classifiers. In the context of image fusion, we use the term ensemble learning to denote the fusion of K input images I_k, k ∈ {1, 2, ..., K}, where the I_k are all derived from the same base image I*. The I_k themselves highlight different features in I*. The theory of ensemble learning suggests that by fusing together the I_k we may obtain a fused image with a substantially improved quality. In the first part of the chapter we consider methods for constructing the I_k. In the second part we consider methods for fusing the I_k.
Article
Machine learning algorithms frequently require careful tuning of model hyperparameters, regularization terms, and optimization parameters. Unfortunately, this tuning is often a "black art" that requires expert experience, unwritten rules of thumb, or sometimes brute-force search. Much more appealing is the idea of developing automatic approaches which can optimize the performance of a given learning algorithm to the task at hand. In this work, we consider the automatic tuning problem within the framework of Bayesian optimization, in which a learning algorithm's generalization performance is modeled as a sample from a Gaussian process (GP). The tractable posterior distribution induced by the GP leads to efficient use of the information gathered by previous experiments, enabling optimal choices about what parameters to try next. Here we show how the effects of the Gaussian process prior and the associated inference procedure can have a large impact on the success or failure of Bayesian optimization. We show that thoughtful choices can lead to results that exceed expert-level performance in tuning machine learning algorithms. We also describe new algorithms that take into account the variable cost (duration) of learning experiments and that can leverage the presence of multiple cores for parallel experimentation. We show that these proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization on a diverse set of contemporary algorithms including latent Dirichlet allocation, structured SVMs and convolutional neural networks.
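The sketch below illustrates GP-based Bayesian optimization with the expected-improvement acquisition function on a 1-D toy objective, using scikit-learn's Gaussian process regressor; it is a conceptual illustration under simplifying assumptions, not the Spearmint implementation.

```python
# Bayesian optimization sketch: GP surrogate + expected improvement (minimization).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):                                   # stand-in for an expensive function
    return np.sin(3 * x) + 0.3 * x ** 2

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(3, 1))                 # initial design points
y = objective(X).ravel()

for _ in range(15):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
    grid = np.linspace(-2, 2, 400).reshape(-1, 1)
    mu, sigma = gp.predict(grid, return_std=True)
    best = y.min()
    z = (best - mu) / np.maximum(sigma, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)   # expected improvement
    x_next = grid[np.argmax(ei)]                    # most promising point to evaluate
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next)[0])

print(X[np.argmin(y)], y.min())
```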
Article
Image-guided interventions are medical procedures that use computer-based systems to provide virtual image overlays to help the physician precisely visualize and target the surgical site. This field has been greatly expanded by the advances in medical imaging and computing power over the past 20 years. This review begins with a historical overview and then describes the component technologies of tracking, registration, visualization, and software. Clinical applications in neurosurgery, orthopedics, and the cardiac and thoracoabdominal areas are discussed, together with a description of an evolving technology named Natural Orifice Transluminal Endoscopic Surgery (NOTES). As the trend toward minimally invasive procedures continues, image-guided interventions will play an important role in enabling new procedures, while improving the accuracy and success of existing approaches. Despite this promise, the role of image-guided systems must be validated by clinical trials facilitated by partnerships between scientists and physicians if this field is to reach its full potential.
Sequential model-based optimization for general algorithm configuration
  • F Hutter
  • H H Hoos
  • K Leyton-Brown