Gabriel Kronberger's research while affiliated with Fachhochschule Oberösterreich and other places

Publications (130)

Preprint
Full-text available
Particle-based modeling of materials at the atomic scale plays an important role in the development of new materials and understanding of their properties. The accuracy of particle simulations is determined by interatomic potentials, which allow the potential energy of an atomic system to be calculated as a function of atomic coordinates and potentially ot...
Article
In material science, models are derived to predict emergent material properties (e.g. elasticity, strength, conductivity) and their relations to processing conditions. A major drawback is the calibration of model parameters that depend on processing conditions. Currently, these parameters must be optimized to fit measured data since their relations...
Preprint
In this chapter we take a closer look at the distribution of symbolic regression models generated by genetic programming in the search space. The motivation for this work is to improve the search for well-fitting symbolic regression models by using information about the similarity of models that can be precomputed independently from the target func...
Preprint
Symbolic regression is a powerful system identification technique in industrial scenarios where no prior knowledge on model structure is available. Such scenarios often require specific model properties such as interpretability, robustness, trustworthiness and plausibility, that are not easily achievable using standard approaches like genetic progr...
Preprint
Optimization networks are a new methodology for holistically solving interrelated problems that have been developed with combinatorial optimization problems in mind. In this contribution we revisit the core principles of optimization networks and demonstrate their suitability for solving machine learning problems. We use feature selection in combin...
Preprint
Multi-objective symbolic regression has the advantage that while the accuracy of the learned models is maximized, the complexity is automatically adapted and need not be specified a-priori. The result of the optimization is not a single solution anymore, but a whole Pareto-front describing the trade-off between accuracy and complexity. In this cont...
Preprint
The growing volume of data makes the use of computationally intense machine learning techniques such as symbolic regression with genetic programming more and more impractical. This work discusses methods to reduce the training data and thereby also the runtime of genetic programming. The data is aggregated in a preprocessing step before running the...
Preprint
The typical methods for symbolic regression produce rather abrupt changes in solution candidates. In this work, we have tried to transform symbolic regression from an optimization problem, with a landscape that is so rugged that typical analysis methods do not produce meaningful results, to one that can be compared to typical and very smooth real-v...
Preprint
The current development of today's production industry towards seamless sensor-based monitoring is paving the way for concepts such as Predictive Maintenance. By this means, the condition of plants and products in future production lines will be continuously analyzed with the objective to predict any kind of breakdown and trigger preventing actions...
Preprint
In material science, models are derived to describe emergent properties (e.g. elasticity, strength, conductivity) and their relations to the material and processing conditions. Constitutive models describe the behaviour of materials, for instance in deformation processes through applied forces. We describe a general method for the...
Preprint
With the increasing number of created and deployed prediction models and the complexity of machine learning workflows, we require so-called model management systems to support data scientists in their tasks. In this work we describe our technological concept for such a model management system. This concept includes versioned storage of data, support...
Preprint
Full-text available
We introduce in this paper a runtime-efficient tree hashing algorithm for the identification of isomorphic subtrees, with two important applications in genetic programming for symbolic regression: fast, online calculation of population diversity and algebraic simplification of symbolic expression trees. Based on this hashing approach, we propose a...
Preprint
Full-text available
We describe and analyze algorithms for shape-constrained symbolic regression, which allows the inclusion of prior knowledge about the shape of the regression function. This is relevant in many areas of engineering -- in particular whenever a data-driven model obtained from measurements must have certain properties (e.g. positivity, monotonicity or...
Preprint
Friction systems are mechanical systems wherein friction is used for force transmission (e.g. mechanical braking systems or automatic gearboxes). For finding optimal and safe design parameters, engineers have to predict friction system performance. This is especially difficult in real-world applications, because it is affected by many parameters. W...
Preprint
We describe a method for the identification of models for dynamical systems from observational data. The method is based on the concept of symbolic regression and uses genetic programming to evolve a system of ordinary differential equations (ODE). The novelty is that we add a step of gradient-based optimization of the ODE parameters. For this we c...
Article
Full-text available
We investigate the addition of constraints on the function image and its derivatives for the incorporation of prior knowledge in symbolic regression. The approach is called shape-constrained symbolic regression and allows us to enforce e.g. monotonicity of the function over selected inputs. The aim is to find models which conform to expected behavi...
Article
In numerical process simulations, in-depth knowledge about material behavior during processing in the form of trustworthy material models is crucial. Among the different constitutive models used in the literature one can distinguish a physics-based approach (white-box model), which considers the evolution of material internal state variables, such...
Chapter
Residual stresses originate during manufacturing processes of metallic materials, so their study is important to avoid catastrophic accidents during component service. There are two main types of residual stresses, according to the length scale: macroscopic and microscopic. While the determination of macroscopic ones is almost a routine analysi...
Preprint
We investigate the addition of constraints on the function image and its derivatives for the incorporation of prior knowledge in symbolic regression. The approach is called shape-constrained symbolic regression and allows us to enforce e.g. monotonicity of the function over selected inputs. The aim is to find models which conform to expected behavi...
Article
Full-text available
Predictive models are increasingly deployed within smart manufacturing for the control of industrial plants. With this arises the need for long‐term monitoring of model performance and adaptation of models if surrounding conditions change and the desired prediction accuracy is no longer met. The heterogeneous landscape of application scen...
Article
Full-text available
In this paper we analyze the effects of using nonlinear least squares for parameter identification of symbolic regression models and integrate it as a local search mechanism in tree-based genetic programming. We employ the Levenberg–Marquardt algorithm for parameter optimization and calculate gradients via automatic differentiation. We provide exampl...
Article
Full-text available
In this paper, we analyze the population diversity of grammatical evolution (GE) on multiple levels of genetic information: chromosome diversity, expression diversity, and output diversity. Thereby, we use a tree-similarity metric from tree-based GP literature to determine similarity of expression trees generated in GE. The similarity of outputs is...
Chapter
Symbolic regression is a powerful system identification technique in industrial scenarios where no prior knowledge on model structure is available. Such scenarios often require specific model properties such as interpretability, robustness, trustworthiness and plausibility, that are not easily achievable using standard approaches like genetic progr...
Chapter
We introduce in this paper a runtime-efficient tree hashing algorithm for the identification of isomorphic subtrees, with two important applications in genetic programming for symbolic regression: fast, online calculation of population diversity and algebraic simplification of symbolic expression trees. Based on this hashing approach, we propose a...
Chapter
The current development of today’s production industry towards seamless sensor-based monitoring is paving the way for concepts such as Predictive Maintenance. By this means, the condition of plants and products in future production lines will be continuously analyzed with the objective to predict any kind of breakdown and trigger preventing actions...
Chapter
The growing volume of data makes the use of computationally intense machine learning techniques such as symbolic regression with genetic programming more and more impractical. This work discusses methods to reduce the training data and thereby also the runtime of genetic programming. The data is aggregated in a preprocessing step before running the...
Chapter
With the increasing number of created and deployed prediction models and the complexity of machine learning workflows, we require so-called model management systems to support data scientists in their tasks. In this work we describe our technological concept for such a model management system. This concept includes versioned storage of data, support...
Chapter
We describe a method for the identification of models for dynamical systems from observational data. The method is based on the concept of symbolic regression and uses genetic programming to evolve a system of ordinary differential equations (ODE). The novelty is that we add a step of gradient-based optimization of the ODE parameters. For this we c...
Article
Full-text available
Predictive models are an important success factor for smart manufacturing. Accordingly, purely data-driven models as well as hybrid models are increasingly deployed within manufacturing environments for optimal control of plants. However, long-term monitoring and adaptation of predictive models has not been a focus of studies so far but will likely...
Chapter
Ontologies are useful for modeling domains and can be used to capture expert knowledge about a system. Genetic programming can be used to identify statistical relationships or models from data. Combining expert knowledge as well as statistical rules identified solely from data is necessary in application domains where data is scarce and a large bod...
Conference Paper
We investigate in this paper the suitability of multi-objective algorithms for Symbolic Regression (SR), where desired properties of parsimony and diversity are explicitly stated as optimization goals. We evaluate different secondary objectives such as length, complexity and diversity on a selection of symbolic regression benchmark problems. Our ex...
Conference Paper
An in-depth understanding of material flow behaviour is crucial for numerical simulation of plastic deformation processes. In the present work, we use a Symbolic Regression method in combination with Genetic Programming for modelling flow stress curves. In contrast to classical regression methods that fit parameters to an equation of a given form, symb...
Preprint
Diversity represents an important aspect of genetic programming, being directly correlated with search performance. When considered at the genotype level, diversity often requires expensive tree distance measures which have a negative impact on the algorithm's runtime performance. In this work we introduce a fast, hash-based tree distance measure t...
Chapter
In this chapter we take a closer look at the distribution of symbolic regression models generated by genetic programming in the search space. The motivation for this work is to improve the search for well-fitting symbolic regression models by using information about the similarity of models that can be precomputed independently from the target func...
Conference Paper
Friction systems are mechanical systems wherein friction is used for force transmission (e.g. mechanical braking systems or automatic gearboxes). For finding optimal and safe design parameters, engineers have to predict friction system performance. This is especially difficult in real-world applications, because it is affected by many parameters....
Article
Tribological systems are mechanical systems that rely on friction to transmit forces. The design and dimensioning of such systems require prediction of various characteristics, such as the coefficient of friction. The core contribution of this paper is the analysis of two data-based modeling techniques which can be used to produce accurate and at t...
Chapter
Patients suffering from diabetes mellitus need to control their sugar levels through a restricted diet, a healthy lifestyle and, in the case of patients who do not produce insulin (or who have a severe defect in the action of the insulin they produce), by injecting synthetic insulin before and after meals. The amount of insulin, namely bolu...
Chapter
Genetic Programming (GP) schemas are structural templates equivalent to hyperplanes in the search space. Schema theories provide information about the properties of subsets of the population and the behavior of genetic operators. In this paper we propose a practical methodology to identify relevant schemas and measure their frequency in the populat...
Chapter
Full-text available
Optimization networks are a new methodology for holistically solving interrelated problems that have been developed with combinatorial optimization problems in mind. In this contribution we revisit the core principles of optimization networks and demonstrate their suitability for solving machine learning problems. We use feature selection in combin...
Chapter
Structure learning is the identification of the structure of graphical models based solely on observational data and is NP-hard. Important components of many structure learning algorithms are heuristics or bounds to reduce the size of the search space. We argue that variable relevance rankings that can be easily calculated for many standard regre...
Chapter
Full-text available
This paper proposes some algorithmic extensions to the general concept of offspring selection which itself is an algorithmic extension of genetic algorithms and genetic programming. Offspring selection is characterized by the fact that many offspring solution candidates will not participate in the ongoing evolutionary process if they do not achieve...
Chapter
One of the most relevant application areas of artificial intelligence and machine learning in general is medical research. Here we focus on research dedicated to diabetes, a disease that affects a high percentage of the population worldwide and that is an increasing threat due to the spread of sedentary lifestyles in big cities. Most recent studies...
Chapter
Full-text available
Population diversity plays an important role in the evolutionary dynamics of genetic programming (GP). In this paper we use structural and semantic similarity measures to investigate the evolution of diversity in three GP algorithmic flavors: standard GP, offspring selection GP (OS-GP), and age-layered population structure GP (ALPS-GP). Empirical m...
Article
Full-text available
Predicting glucose values on the basis of insulin and food intakes is a difficult task that people with diabetes need to do daily. This is necessary as it is important to maintain glucose levels at appropriate values to avoid not only short-term, but also long-term complications of the illness. Artificial intelligence in general and machine learnin...
Conference Paper
Understanding the relationship between selection, genotype-phenotype map and loss of population diversity represents an important step towards more effective genetic programming (GP) algorithms. This paper describes an approach to capture dynamic changes in this relationship. We analyze the frequency distribution of points in the diversity plane de...
Chapter
In this chapter we examine how multi-objective genetic programming can be used to perform symbolic regression and compare its performance to single-objective genetic programming. Multi-objective optimization is implemented by using a slightly adapted version of NSGA-II, where the optimization objectives are the model’s prediction accuracy and its c...
Conference Paper
Diabetes mellitus is a disease that affects more than three hundred million people worldwide. Maintaining good control of the disease is critical to avoid not only severe long-term complications but also dangerous short-term situations. Diabetics need to decide the appropriate insulin injection, thus they need to be able to estimate the level of...
Article
Full-text available
Here, we discuss the identification of heterogeneous ensembles for short-term prediction of trends in stock markets. The goal is to predict trends (uptrend, sideways trend, or downtrend) for the next day, the next week, and the next month. A sliding window approach is used; model ensembles are iteratively learned and tested on subsequent data point...
Conference Paper
Multi-objective symbolic regression has the advantage that while the accuracy of the learned models is maximized, the complexity is automatically adapted and need not be specified a-priori. The result of the optimization is not a single solution anymore, but a whole Pareto-front describing the trade-off between accuracy and complexity. In this cont...
Conference Paper
In this paper we analyze the dynamics of the predictability and variable interactions in financial data of the years 2007–2014. Using a sliding window approach, we have generated mathematical prediction models for various financial parameters using other available parameters in this data set. For each variable we identify the relevance of other var...
Conference Paper
The typical methods for symbolic regression produce rather abrupt changes in solution candidates. In this work, we have tried to transform symbolic regression from an optimization problem, with a landscape that is so rugged that typical analysis methods do not produce meaningful results, to one that can be compared to typical and very smooth real-v...
Conference Paper
Automated synthesis of complex programs is still an unsolved problem even though some successes have been achieved recently for relatively contrived and specialized settings. One possible approach to automated programming is genetic programming; however, a diverse set of alternative techniques is possible, which makes it rather difficult to make ge...
Chapter
In this chapter we discuss sliding window symbolic regression and its ability to systematically detect changing dynamics in data streams. The sliding window defines the portion of the data visible to the algorithm during training and is moved over the data. The window is moved regularly based on the generations or on the current selection pressure...
Chapter
Dynamic and stochastic problem environments are often difficult to model using standard problem formulations and algorithms. One way to model and then solve them is simulation-based optimization: Simulations are integrated into the optimization process in order to evaluate the quality of solution candidates and to identify optimized system configur...
Chapter
Optimization of supply chains and logistic networks has been largely addressed by simulation-based optimization. The logistics for low-energy biological residues pose a great challenge that has to be tackled at an international scale. For this application, a new simplified model of logistic networks was created that allows not only...
Article
Genetic programming gradually assembles high-level structures from low-level entities or building blocks. This chapter describes methods for investigating emergent phenomena in genetic programming by looking at a population’s collective behavior. It details how these methods can be used to trace genotypic changes across lineages and genealogies. Pa...
Article
In this chapter, we have a closer look at search strategies for optimization problems, where the structure of valid solutions is defined through a formal grammar. These problems frequently occur in the genetic programming (GP) literature, especially in the context of grammar-guided genetic programming [18]. Even though a lot of progress has been ma...
Article
This paper investigates the modeling of brushless permanent-magnet synchronous machines (PMSMs). The focus is on deriving an automatable process for obtaining dynamic motor models that take nonlinear effects, such as saturation, into account. The modeling is based on finite element (FE) simulations for different current vectors in the $dq$ plane ov...
Article
Rapid prototyping and testing of new ideas has been a major argument for evolutionary computation frameworks. These frameworks facilitate the application of evolutionary computation and allow experimenting with new and modified algorithms and problems by building on existing, well-tested code. However, one could argue that, despite the many framewo...
Article
In this publication genetic programming (GP) with data migration for symbolic regression is presented. The motivation for the development of the algorithm is to evolve models which generalize well on previously unseen data. GP with data migration uses multiple subpopulations to maintain the genetic diversity during the algorithm run and a sophistic...
Chapter
A distinguishing feature of symbolic regression using genetic programming is its ability to identify complex nonlinear white-box models. This is especially relevant in practice where models are extensively scrutinized in order to gain knowledge about underlying processes. This potential is often diluted by the ambiguity and complexity of the models...
Chapter
Many optimization problems cannot be solved by classical mathematical optimization techniques due to their complexity and the size of the solution space. In order to achieve solutions of high quality though, heuristic optimization algorithms are frequently used. These algorithms do not claim to find global optimal solutions, but offer a reasonable...
Article
Here we show the application of heterogeneous ensemble modeling for training short-term predictors of trends in stock markets. A sliding window approach is used; model ensembles are iteratively learned and tested on subsequent data points. The goal is to predict trends (positive, neutral, or negative stock changes) for the next day, the next week,...
Chapter
In this chapter we present results of empirical research work done on the data-based identification of estimation models for tumor markers and cancer diagnoses: Based on patients’ data records including standard blood parameters, tumor markers, and information about the diagnosis of tumors, we have trained mathematical models that represent virtual...
Conference Paper
In this publication a constant optimization approach for symbolic regression is introduced to separate the task of finding the correct model structure from the necessity to evolve the correct numerical constants. A gradient-based nonlinear least squares optimization algorithm, the Levenberg-Marquardt (LM) algorithm, is used for adjusting constant v...
Conference Paper
Full-text available
Defining custom problem types in genetic programming (GP) software systems is a tedious task that usually involves the implementation of custom classes and methods including framework-specific code. Users who want to solve a custom problem have to know the details of the targeted framework, for instance cloning semantics, and often have to write a...
Conference Paper
Many studies emphasize the importance of genetic diversity and the need for an appropriate tuning of selection pressure in genetic programming. Additional important aspects are the performance and effects of the genetic operators (crossover and mutation) on the transfer and stabilization of inherited information blocks during the run of the algorit...
Conference Paper
Probabilistic programming allows specification of probabilistic models in a declarative manner. Recently, several new software systems and languages for probabilistic programming have been developed on the basis of newly developed and improved methods for approximate inference in probabilistic models. In this contribution a probabilistic model for...