Stephan Winkler

  • University of Applied Sciences Upper Austria

About

Publications: 229
Reads: 28,573
Citations: 3,118
Current institution: University of Applied Sciences Upper Austria

Publications (229)
Chapter
The increasing complexity of machine learning models has motivated the need to ensure that the results are understandable and transparent, enabling trust and accountability. This work provides an extensive overview of methods to measure the explainability and interpretability of machine learning results. This work addresses the challenges posed by...
Chapter
Gradient descent-based local search can dramatically improve solution performance in symbolic regression tasks, at the cost of significantly higher runtime as well as increased risks of overfitting. In this paper, we investigate exactly what amount of local search is really needed within the GP population. We show that low intensity local search is...
Article
Full-text available
Symbolic regression is commonly used in domains where both high accuracy and interpretability of models are required. While symbolic regression is capable of producing highly accurate models, small changes in the training data might cause highly dissimilar solutions. The implications in practice are huge, as interpretability as key-selling feature degr...
Article
Full-text available
Variation in nuclear size and shape is an important criterion of malignancy for many tumor types; however, categorical estimates by pathologists have poor reproducibility. Measurements of nuclear characteristics can improve reproducibility, but current manual methods are time-consuming. The aim of this study was to explore the limitations of estima...
Article
Full-text available
The recently developed high-throughput system for cell spheroid generation (SpheroWell) is a promising technology for cost- and time-efficient in vitro analysis of, for example, chondrogenic differentiation. It is a compartmental growth surface where spheroids develop from a cell monolayer by self-assembly and aggregation. In order to automate...
Article
Full-text available
Background Acute myeloid leukemia (AML) is characterized by the abnormal proliferation of myeloid precursor cells and presents significant challenges in treatment due to its heterogeneity. Recently, the NLRP3 inflammasome has emerged as a potential contributor to AML pathogenesis, although its precise mechanisms remain poorly understood. Methods P...
Article
A novel micro‐channel technique for analyzing the coalescence of bubbles and obtaining relevant information for the creation of a coalescence database is presented. The micro‐channel improves the coalescence investigations by a continuously operated setup, reduces the accumulation of impurities and increases the amount of recorded data. To introduc...
Article
Full-text available
The integration of deep learning-based tools into diagnostic workflows is increasingly prevalent due to their efficiency and reproducibility in various settings. We investigated the utility of automated nuclear morphometry for assessing nuclear pleomorphism (NP), a criterion of malignancy in the current grading system in canine pulmonary carcinoma...
Preprint
Full-text available
Symbolic regression is a machine learning method with the goal of producing interpretable results. Unlike other machine learning methods, such as random forests or neural networks, which are opaque, symbolic regression aims to model and map data in a way that can be understood by scientists. Recent advancements have attempted to bridge the gap...
Article
Full-text available
Background: In genomics, highly sensitive point mutation detection is particularly relevant for cancer diagnosis and early relapse detection. Next-generation sequencing combined with unique molecular identifiers (UMIs) is known to improve the mutation detection sensitivity. Methods: We present an open-source bioinformatics framework named Interface...
Preprint
Full-text available
Vectorial Genetic Programming (Vec-GP) extends GP by allowing vectors as input features alongside regular scalar features, using them by applying arithmetic operations component-wise or aggregating vectors into scalars by some aggregation function. Vec-GP also allows aggregating vectors only over a limited segment of the vector instead of the whole ve...
Chapter
We analyze data of 18,000 patients for identifying models that are able to detect complications in the data of surgeries and other medical treatments. High quality detection models are found using data available for those patients, for whom general data as well as risk factors are available. For identifying these detection models we use explainable...
Chapter
Due to the growing use of machine learning models in many critical domains, ambitions to make the models and their predictions explainable have recently increased significantly as a new research interest. In this paper, we present an extension to the machine learning based data mining technique of variable interaction networks, to improve their struc...
Chapter
Describing dynamic medical systems using machine learning is a challenging topic with a wide range of applications. In this work, the possibility of modeling the blood glucose level of diabetic patients purely on the basis of measured data is described. A combination of the influencing variables insulin and calories is used to find an interpretabl...
Article
Full-text available
Spectral library search can enable more sensitive peptide identification in tandem mass spectrometry experiments. However, its drawbacks are the limited availability of high-quality libraries and the added difficulty of creating decoy spectra for result validation. We describe MS Ana, a new spectral library search engine that enables high sensitivi...
Preprint
Full-text available
Particle-based modeling of materials at atomic scale plays an important role in the development of new materials and understanding of their properties. The accuracy of particle simulations is determined by interatomic potentials, which allow the potential energy of an atomic system to be calculated as a function of atomic coordinates and potentially ot...
Article
In this paper, we present a heterogeneous ensemble modeling approach to learn predictors for yeast contamination in freshly harvested peppermint batches. Our research is based on data about numerous parameters of the harvesting process, such as planting, tillage, fertilization, harvesting, drying, as well as information about microbial contaminati...
Chapter
Vectorial Genetic Programming (GP) is a young branch of GP, where the training data for symbolic models not only include regular, scalar variables, but also allow vector variables. Also, the model’s abilities are extended to allow operations on vectors, where most vector operations are simply performed component-wise. Additionally, new aggregation...
Article
The development of energy management systems that optimize the electrical energy flows of residential buildings has become important nowadays. The optimization is formulated as a symbolic regression problem that is solved by genetic programming, which provides near-optimal results while being highly performant during application. Additionally, the so...
Article
Full-text available
Background Next-generation sequencing (NGS) is nowadays the most used high-throughput technology for DNA sequencing. Among other things, NGS enables the in-depth analysis of immune repertoires. Research in the field of T cell receptor (TCR) and immunoglobulin (IG) repertoires aids in understanding immunological diseases. A main objective is the analysis o...
Article
The traditional approach to modeling the polymer melt flow in single-screw extruders is based on analytical and numerical analyses. Due to increasing computational power, data-driven modeling has grown significantly in popularity in recent years. In this study, we compared and evaluated data-based modeling approaches (i.e., gradient-boosted trees,...
Article
Full-text available
Background Antigen recognition of allo-peptides and HLA molecules leads to the activation of donor-reactive T-cells following transplantation, potentially causing T-cell-mediated rejection (TCMR). Sequencing of the T-cell receptor (TCR) repertoire can be used to track the donor-reactive repertoire in blood and tissue of patients after kidney transp...
Preprint
Full-text available
In this chapter we take a closer look at the distribution of symbolic regression models generated by genetic programming in the search space. The motivation for this work is to improve the search for well-fitting symbolic regression models by using information about the similarity of models that can be precomputed independently from the target func...
Preprint
Symbolic regression is a powerful system identification technique in industrial scenarios where no prior knowledge on model structure is available. Such scenarios often require specific model properties such as interpretability, robustness, trustworthiness and plausibility, that are not easily achievable using standard approaches like genetic progr...
Article
Shaping the optical wavefront utilizing spatial light modulators (SLMs) is a valuable methodology for modification of a focal spot. It has numerous applications in microscopy, particularly in scattering media. Optical lithography techniques, i.e. multiphoton lithography, can also benefit from wavefront shaping. In this contribution, we present focu...
Preprint
Aberrant activation of the NLR family pyrin domain containing 3 (NLRP3) inflammasome mediates numerous inflammatory diseases. Oncogenes can activate the NLRP3 inflammasome and thereby promote myeloproliferative neoplasia, suggesting a crucial role of NLRP3 in the malignant transformation of hematopoietic cells. Here, we show that bone marrow-derive...
Article
Full-text available
Cross-linking mass spectrometry (XL-MS) has become a powerful technique that enables insights into protein structures and protein interactions. The development of cleavable cross-linkers has further promoted XL-MS through search space reduction, thereby allowing for proteome-wide studies. These new analysis possibilities foster the development of n...
Article
Full-text available
Rationale: Database search engines are the preferred method to identify peptides in mass spectrometry data. However, valuable software is in this context not only defined by a powerful algorithm to separate correct from false identifications but is also defined by constant maintenance and continuous improvements. Methods: In 2014, we presented o...
Article
Full-text available
In this paper, we present a new evolution-based algorithm that optimizes cell detection image processing workflows in a self-adaptive fashion. We use evolution strategies to optimize the parameters for all steps of the image processing pipeline and improve cell detection results. The algorithm reliably produces good cell detection results without t...
Article
This study was conducted to examine putative correlations between weather parameters during April-September and the amounts of nutrients, minerals and bioactive compounds in the juices of 16 apple varieties from four harvest years in Lower Austria. For most sugar-parameters, negative correlations were found with the total precipitation (r between -...
Conference Paper
In this paper we present results for the Blood Glucose Level Prediction Challenge for the Ohio2020 dataset. We have used four variants of genetic programming to build white-box models for predicting 30 minutes and 60 minutes ahead. The results are compared to classical methods including multi-variate linear regression, random forests, as well as tw...
Article
Full-text available
In this paper, we analyze the population diversity of grammatical evolution (GE) on multiple levels of genetic information: chromosome diversity, expression diversity, and output diversity. Thereby, we use a tree-similarity metric from tree-based GP literature to determine similarity of expression trees generated in GE. The similarity of outputs is...
Article
Full-text available
Background Allergen‐specific immunotherapy via the skin targets a tissue rich in antigen‐presenting cells, but can be associated with local and systemic side effects. Allergen‐polysaccharide neoglycoconjugates increase immunization efficacy by targeting and activating dendritic cells via C‐type lectin receptors and reduce side effects. Objective W...
Chapter
Symbolic regression is a powerful system identification technique in industrial scenarios where no prior knowledge on model structure is available. Such scenarios often require specific model properties such as interpretability, robustness, trustworthiness and plausibility, that are not easily achievable using standard approaches like genetic progr...
Article
T and B cells are known to play an important role in transplant rejection. Nevertheless, the factors that lead to rejection are not yet fully understood. We have developed a general bioinformatics pipeline for processing T cell receptor (TCR) and immunoglobulin (IG) repertoire next-generation sequencing (NGS) data for comparing immune repertoire pr...
Chapter
In recent years, renewable energy resources have become increasingly important. Due to the fluctuating and changing environment, these energy sources are not permanently available. At certain times, a photovoltaic (PV) power plant, for example, can generate little or no electricity at all. This is why energy management systems (EMS), which store, use a...
Chapter
Black box machine learning techniques are methods that produce models which are functions of the inputs and produce outputs, where the internal functioning of the model is either hidden or too complicated to be analyzed. White box modeling, on the contrary, produces models whose structure is not hidden, but can be analyzed in detail. In this paper...
Preprint
Background Allergen-specific immunotherapy via the skin targets an area rich in antigen-presenting cells, but can be associated with local and systemic side effects. Allergen-polysaccharide neoglycoconjugates can increase immunization efficacy by targeting and activating dendritic cells via C-type lectin receptors and reduce side effects. Objective...
Article
Full-text available
Background: Kidney transplantation is the optimal treatment in end stage renal disease but the allograft survival is still hampered by immune reactions against the allograft. This process is driven by the recognition of allogenic antigens presented to T-cells and their unique T-cell receptor (TCR) via the major histocompatibility complex (MHC), wh...
Article
Full-text available
An increase in adipose tissue is caused by the increased size and number of adipocytes. Lipids accumulate in intracellular stores, known as lipid droplets (LDs). Recent studies suggest that parameters such as LD size, shape and dynamics are closely related to the development of obesity. Berberine (BBR), a natural plant alkaloid, has been demonstrat...
Chapter
In this chapter we take a closer look at the distribution of symbolic regression models generated by genetic programming in the search space. The motivation for this work is to improve the search for well-fitting symbolic regression models by using information about the similarity of models that can be precomputed independently from the target func...
Article
Full-text available
Co-eluting peptides are still a major challenge for the identification and validation of MS/MS spectra but carry great potential. To tackle these problems, we have developed the CharmeRT workflow presented here, combining a chimeric spectra identification strategy implemented as part of the MS Amanda algorithm with the validation system Elutator, wh...
Article
Tribological systems are mechanical systems that rely on friction to transmit forces. The design and dimensioning of such systems require prediction of various characteristics, such as the coefficient of friction. The core contribution of this paper is the analysis of two data-based modeling techniques which can be used to produce accurate and at t...
Chapter
Genetic Programming (GP) schemas are structural templates equivalent to hyperplanes in the search space. Schema theories provide information about the properties of subsets of the population and the behavior of genetic operators. In this paper we propose a practical methodology to identify relevant schemas and measure their frequency in the populat...
Chapter
Patients suffering from diabetes mellitus need to control their sugar levels through a restricted diet, a healthy lifestyle and, in the case of patients who do not produce insulin (or with a severe defect in the action of the insulin they produce), by injecting synthetic insulin before and after meals. The amount of insulin, namely bolu...
Chapter
Full-text available
This paper proposes some algorithmic extensions to the general concept of offspring selection which itself is an algorithmic extension of genetic algorithms and genetic programming. Offspring selection is characterized by the fact that many offspring solution candidates will not participate in the ongoing evolutionary process if they do not achieve...
Chapter
Structure learning is the identification of the structure of graphical models based solely on observational data and is NP-hard. Important components of many structure learning algorithms are heuristics or bounds to reduce the size of the search space. We argue that variable relevance rankings that can be easily calculated for many standard regre...
Chapter
Full-text available
Population diversity plays an important role in the evolutionary dynamics of genetic programming (GP). In this paper we use structural and semantic similarity measures to investigate the evolution of diversity in three GP algorithmic flavors: standard GP, offspring selection GP (OS-GP), and age-layered population structure GP (ALPS-GP). Empirical m...
Chapter
One of the most relevant application areas of artificial intelligence and machine learning in general is medical research. We here focus on research dedicated to diabetes, a disease that affects a high percentage of the population worldwide and that is an increasing threat due to the advance of the sedentary life in the big cities. Most recent studies...
Article
Standard proteomics workflows use tandem mass spectrometry followed by sequence database search to analyse complex biological samples. The identification of proteins carrying post-translational modifications (PTMs), for example phosphorylation, is typically addressed by allowing variable modifications in the searched sequences. Accounting for these...
Article
Naturally acquired immunity against allergens utilizes a broad spectrum of response types besides T regulatory cells. Therapeutic concepts should not be limited to T regulatory cell induction, but take advantage of the diversity found in natural responses.
Article
Full-text available
Predicting glucose values on the basis of insulin and food intakes is a difficult task that people with diabetes need to do daily. This is necessary as it is important to maintain glucose levels at appropriate values to avoid not only short-term, but also long-term complications of the illness. Artificial intelligence in general and machine learnin...
Conference Paper
Understanding the relationship between selection, genotype-phenotype map and loss of population diversity represents an important step towards more effective genetic programming (GP) algorithms. This paper describes an approach to capture dynamic changes in this relationship. We analyze the frequency distribution of points in the diversity plane de...
Chapter
In this chapter we examine how multi-objective genetic programming can be used to perform symbolic regression and compare its performance to single-objective genetic programming. Multi-objective optimization is implemented by using a slightly adapted version of NSGA-II, where the optimization objectives are the model’s prediction accuracy and its c...
Article
In transfusion medicine, the identification of the Rhesus D type is important to prevent anti-D immunisation in Rhesus D negative recipients. In particular, the detection of the very low expressed DEL phenotype is crucial and hence constitutes the bottleneck of standard immunohaematology. The current method of choice, adsorption-elution, does not p...
Conference Paper
Diabetes mellitus is a disease that affects more than three hundred million people worldwide. Maintaining good control of the disease is critical to avoid not only severe long-term complications but also dangerous short-term situations. Diabetics need to decide the appropriate insulin injection, thus they need to be able to estimate the level of...
Article
Full-text available
Induction of GLUT4 translocation in the absence of insulin is considered a key concept to decrease elevated blood glucose levels in diabetics. Due to the lack of pharmaceuticals that specifically increase the uptake of glucose from the blood circuit, application of natural compounds might be an alternative strategy. However, the effects and mechani...
Data
Specificity of GFP-signal increase upon PP60 treatment. (A) CHO-K1 GPI-GFP cells were seeded in 96-well plates (35,000 cells/well), grown over night, and starved for 3 hours in HBSS buffer. Fluorescence intensity was measured before and after 10 minutes of stimulation with PP60 in the same cells (n > 200 cells). Scale bar = 20 μm. (B) Increase of G...
Data
Effects of herbal compounds on actin remodeling. CHO-K1 hIR/GLUT4-myc-GFP cells were transiently transfected with the F-actin marker Lifeact-tdTomato, seeded in 96-well plates (100,000 cells/well), grown over night, and then starved for 3 hours in HBSS buffer. GLUT4-GFP and Lifeact-tdTomato signals recorded at 488 and 561 nm before and after stimul...
Article
Full-text available
Here, we discuss the identification of heterogeneous ensembles for short-term prediction of trends in stock markets. The goal is to predict trends (uptrend, sideways trend, or downtrend) for the next day, the next week, and the next month. A sliding window approach is used; model ensembles are iteratively learned and tested on subsequent data point...
Conference Paper
The behavior and actions of the human adaptive immune system and its key players, namely B and T cells, are often hard to understand in their entirety. We here present a workflow for modelling the states of adaptive immune systems by analyzing B and T cell receptor repertoires using next-generation sequencing data. For our workflow, we have blood a...
Conference Paper
Full-text available
In the optimization of real-world activities the effects of solutions on related activities need to be considered. The use of isolated problem models that do not adequately consider related processes does not allow addressing system-wide consequences. However, sometimes the complexity of the real-world model and its interplay with related activitie...
Conference Paper
In this paper we analyze the dynamics of the predictability and variable interactions in financial data of the years 2007–2014. Using a sliding window approach, we have generated mathematical prediction models for various financial parameters using other available parameters in this data set. For each variable we identify the relevance of other var...
Conference Paper
In this paper we present a method for the definition of characteristics of single molecules as well as of cell structures on fluorescence microscopy images for classifying human disease states. Fluorescence microscopy is one of the most emerging fields in modern laboratory diagnostics and is used in various research areas, for instance in studies o...
Conference Paper
This paper discusses the use of symbolic regression based ensemble modeling for obtaining more sensitive cancer predictors. The ensemble models are generated on the basis of blood parameters acting as model inputs which have been coupled with diagnosis data in order to predict breast cancer. In addition to previous works this contribution focuses o...
Conference Paper
It has been shown that it is possible to differentiate viable amniotic membrane towards osteogenic lineage, i.e. bony tissue. This process of mineralization may take several weeks and can show different manifestations per sample. The tissue can only be used when the mineralization process has advanced to a certain degree. Therefore, a forecast of t...
Conference Paper
Automated synthesis of complex programs is still an unsolved problem even though some successes have been achieved recently for relatively contrived and specialized settings. One possible approach to automated programming is genetic programming; however, a diverse set of alternative techniques is possible, which makes it rather difficult to make ge...
Article
Full-text available
Background: Today's modern research of B and T cell antigen receptors (the immunoglobulins (IG) or antibodies and T cell receptors (TR)) forms the basis for detailed analyses of the human adaptive immune system. For instance, insights into the state of the adaptive immune system provide information that is essentially important in monitoring transpl...
Article
Full-text available
Recent advances in high-throughput sequencing allow for the competitive analysis of the human B and T cell immune repertoire. In this study we compared Immunoglobulin and T cell receptor repertoires of lymphocytes found in kidney and blood samples of 10 patients with various renal diseases based on next-generation sequencing data. We used Biomed-2...
Article
We here present two new methods for the characterization of fluorescent localization microscopy images obtained from immunostained brain tissue sections. Direct stochastic optical reconstruction microscopy images of 5-HT1A serotonin receptors and glial fibrillary acidic proteins in healthy cryopreserved brain tissues are analyzed. In detail, we her...
Conference Paper
Peptide search engines are algorithms that are able to identify peptides (i.e., short proteins or parts of proteins) from mass spectra of biological samples. These identification algorithms report the best matching peptide for a given spectrum and a score that represents the quality of the match; usually, the higher this score, the higher is the re...
Chapter
In this chapter we discuss sliding window symbolic regression and its ability to systematically detect changing dynamics in data streams. The sliding window defines the portion of the data visible to the algorithm during training and is moved over the data. The window is moved regularly based on the generations or on the current selection pressure...
