
Prof. Dr. Michael Affenzeller
Fachhochschule Oberösterreich | fh-ooe · Heuristic and Evolutionary Algorithms Laboratory (HEAL)
366 Publications
55,146 Reads
3,034 Citations
Introduction
Michael Affenzeller is professor for heuristic optimization and machine learning and head of the Heuristic and Evolutionary Algorithms Laboratory. In 2001 he received his PhD in engineering sciences and in 2004 he received his habilitation in applied systems engineering, both from the Johannes Kepler University of Linz, Austria. He serves as vice dean for research and head of the master's degree program for software engineering.
Additional affiliations
November 1999 - October 2005
September 1998 - October 1999
Primetals Technologies (former VAI)
Position
- Engineer
Description
- Simulation-based process modeling
Education
November 2001 - June 2004
November 1999 - November 2001
October 1991 - November 1997
Publications (366)
Particle-based modeling of materials at atomic scale plays an important role in the development of new materials and understanding of their properties. The accuracy of particle simulations is determined by interatomic potentials, which allow the potential energy of an atomic system to be calculated as a function of atomic coordinates and potentially ot...
In this paper, we propose a hybrid algorithm for exact nearest neighbors queries in high-dimensional spaces. Indexing structures typically used for exact nearest neighbors search become less efficient in high-dimensional spaces, effectively requiring brute-force search. Our method uses a massively-parallel approach to brute-force search that effici...
This work provides the exact expression of the probability distribution of the hypervolume improvement (HVI) for bi-objective generalization of Bayesian optimization. Here, instead of a single-objective improvement, we consider the improvement of the hypervolume indicator concerning the current best approximation of the Pareto front. Gaussian proce...
In this study, a modification of the quantum multi-swarm optimization algorithm is proposed for dynamic optimization problems. The modification uses the search operators from the differential evolution algorithm with a certain probability within particle swarm optimization to improve the algorithm’s search capabilities in dynamically changin...
Clean and renewable wind energy has made an outstanding contribution to alleviating the energy crisis. However, the randomness and volatility of wind bring great risk to the integration of wind power into the grid. Therefore, it is essential to obtain reliable and efficient wind speed forecasts. Quantile-based machine learning techniques, which usua...
The development of energy management systems that optimize the electrical energy flows of residential buildings has become important nowadays. The optimization is formulated as a symbolic regression problem that is solved by genetic programming, which provides near-optimal results while being highly performant during application. Additionally, the so...
Many real-world processes are of dynamic nature and therefore subject to change. In this paper, dynamic warehouse operations are addressed, more specifically crane operations that involve moving steel coils between storage locations within a large warehouse. An open-ended optimization approach is employed to create an optimal schedule of crane...
Fitness Landscape Analysis (FLA) denotes the task of analyzing black-box optimization problems and capturing their characteristic features with the goal of providing additional information that may help in algorithm selection, parametrization or guidance. Many real-world optimization tasks require dynamic on-going optimization and a plethora of me...
Artificial intelligence, especially in the form of machine learning methods, has the potential to stabilize and optimize manufacturing processes in terms of productivity, quality and resource efficiency as numerous publications in the recent past verify. To make this possible, information from various sources (machine controller, external sensory,...
Genetic programming (GP) based symbolic regression is a powerful technique for white-box modelling. However, the prediction uncertainties of the symbolic regression are still unknown. This paper proposes to use Kriging to model the residual of a symbolic expression. The residual model follows a normal distribution with parameters of a mean value an...
Many real-world use cases benefit from fast training and prediction times, and much research went into speeding up distance-based outlier detection methods to millions of data points. Contrary to popular belief, our findings suggest that little data is often enough for distance-based outlier detection models. We show that using only a tiny fraction...
In this chapter we take a closer look at the distribution of symbolic regression models generated by genetic programming in the search space. The motivation for this work is to improve the search for well-fitting symbolic regression models by using information about the similarity of models that can be precomputed independently from the target func...
Symbolic regression is a powerful system identification technique in industrial scenarios where no prior knowledge on model structure is available. Such scenarios often require specific model properties such as interpretability, robustness, trustworthiness and plausibility, that are not easily achievable using standard approaches like genetic progr...
Multi-objective symbolic regression has the advantage that while the accuracy of the learned models is maximized, the complexity is automatically adapted and need not be specified a-priori. The result of the optimization is not a single solution anymore, but a whole Pareto-front describing the trade-off between accuracy and complexity. In this cont...
Optimization networks are a new methodology for holistically solving interrelated problems that have been developed with combinatorial optimization problems in mind. In this contribution we revisit the core principles of optimization networks and demonstrate their suitability for solving machine learning problems. We use feature selection in combin...
This paper describes a methodology for analyzing the evolutionary dynamics of genetic programming (GP) using genealogical information, diversity measures and information about the fitness variation from parent to offspring. We introduce a new subtree tracing approach for identifying the origins of genes in the structure of individuals, and we show...
The current development of today's production industry towards seamless sensor-based monitoring is paving the way for concepts such as Predictive Maintenance. By this means, the condition of plants and products in future production lines will be continuously analyzed with the objective to predict any kind of breakdown and trigger preventing actions...
We introduce in this paper a runtime-efficient tree hashing algorithm for the identification of isomorphic subtrees, with two important applications in genetic programming for symbolic regression: fast, online calculation of population diversity and algebraic simplification of symbolic expression trees. Based on this hashing approach, we propose a...
Predictive models are increasingly deployed within smart manufacturing for the control of industrial plants. With this arises the need for long-term monitoring of model performance and adaptation of models if surrounding conditions change and the desired prediction accuracy is no longer met. The heterogeneous landscape of application scen...
Solving manufacturing optimization problems in the context of intelligent production involves the consideration of continuously changing events of the respective enterprise environment in real time. Smart solution methods are needed which are able to cope with such necessary reactions to uncertainty and dynamics. In general, this field of research...
Worker cross-training is a problem arising in many companies that involve human work. To perform certain activities, workers are required to possess certain skills. Cross-trained workers possess multiple skills, which enables more flexible deployment but also incurs higher costs. Thus, companies seek to balance the available skills such tha...
Project scheduling in manufacturing environments often requires flexibility in terms of the selection and the exact length of alternative production activities. Moreover, the simultaneous scheduling of multiple lots is mandatory in many production planning applications. To meet these requirements, a new resource-constrained project scheduling probl...
In this paper we analyze the effects of using nonlinear least squares for parameter identification of symbolic regression models and integrate it as local search mechanism in tree-based genetic programming. We employ the Levenberg–Marquardt algorithm for parameter optimization and calculate gradients via automatic differentiation. We provide exampl...
Incremental evaluation is a big advantage for trajectory-based optimization algorithms. Previously, the application of similar ideas to crossover-based algorithms, such as genetic algorithms, did not seem appealing as the expected benefit would be marginal. We propose the use of an immutable data structure that stores partial evaluation results insi...
Exploratory fitness landscape analysis (FLA) is a category of techniques that try to capture knowledge about a black-box optimization problem. This is achieved by assigning features to a certain problem instance utilizing only information obtained by evaluating the black-box. This knowledge can be used to obtain new domain knowledge but more often...
In the context of real-world optimization problems in the area of production and logistics, multiple objectives have to be considered very often. Precisely such a situation is also regarded in this work. For a resource-constrained project scheduling problem with activity selection and time flexibility, a new bi-objective extension is developed. Mot...
In recent years, renewable energy resources have become increasingly important. Due to the fluctuating and changing environment, these energy sources are not permanently available. At certain times, a photovoltaic (PV) power plant, for example, can generate little or no electricity at all. This is why energy management systems (EMS), which store, use a...
The dynamic block relocation problem is a variant of the BRP where the initial configuration and retrieval priorities are known but are subject to change during the implementation of an optimized solution. This paper investigates two kinds of potential changes: the exchange of assigned priorities between two blocks and the arrival of new blocks. Fo...
Black box machine learning techniques are methods that produce models which are functions of the inputs and produce outputs, where the internal functioning of the model is either hidden or too complicated to be analyzed. White box modeling, on the contrary, produces models whose structure is not hidden, but can be analyzed in detail. In this paper...
The main idea of this paper is to use Simple Symbolic Formulas generated offline with the help of the deterministic function extraction algorithm as building blocks for Genetic Programming. A comparison of this idea to the Automatically Defined Functions approach was considered. A possibility to take into consideration an expert’s knowledge about the proble...
The main idea of this paper is to add model set pre-processing for Genetic Programming based Evolvement of Models of Models. Simple Symbolic Formulas generated offline with the help of the deterministic function extraction algorithm will be used as building blocks for Genetic Programming. In this work, a pre-processing of models set is generated by...
In the steel industry, logistics is very often part of the value chain since storage processes and therefore cooling processes contribute to the product quality to a very large degree. As a result, steel logistics is concerned with the storage and movement of – in our case – work in process (WIP) materials. Thousands of tons of steel are transport...
Predictive models are an important success factor for smart manufacturing. Accordingly, purely data-driven models as well as hybrid models are increasingly deployed within manufacturing environments for optimal control of plants. However, long-term monitoring and adaptation of predictive models has not been a focus of studies so far but will likely...
Simulation-based optimization problems are often an inherent part in engineering design tasks. This paper introduces one such use case, the design of a box-type boom of a crane, which requires a time consuming structural analysis for validation. To overcome high runtimes for optimization approaches with numerous calls to the structural analysis too...
In this work we present a machine learning based approach for detecting drifting behavior – so-called concept drifts – in continuous data streams. The motivation for this contribution originates from the currently intensively investigated topic Predictive Maintenance (PdM), which refers to a proactive way of triggering servicing actions for industr...
In this position paper we describe challenges related to uncertainty handling when solving stacking problems within storage zones in the steel production value chain. Manipulations in those zones are often relocations of materials performed with gantry cranes. Thereby the crane operators themselves or dispatchers constantly solve a complex stacking...
We investigate in this paper the suitability of multi-objective algorithms for Symbolic Regression (SR), where desired properties of parsimony and diversity are explicitly stated as optimization goals. We evaluate different secondary objectives such as length, complexity and diversity on a selection of symbolic regression benchmark problems. Our ex...
Real-world project scheduling often requires flexibility in terms of the selection and the exact length of alternative production activities. Moreover, the simultaneous scheduling of multiple lots is mandatory in many production planning applications. To meet these requirements, a new flexible resource-constrained multi-project scheduling problem i...
Diversity represents an important aspect of genetic programming, being directly correlated with search performance. When considered at the genotype level, diversity often requires expensive tree distance measures which have a negative impact on the algorithm's runtime performance. In this work we introduce a fast, hash-based tree distance measure t...
In the era of commonly available problem-solving tools for, it is especially important to choose the best available method. We use local optima network analysis and machine learning to select appropriate algorithms on an instance-by-instance basis. The preliminary results show that such a method can be successfully applied for sufficiently distinct...
Predictive Maintenance is one of the most intensively investigated topics in the current Industry 4.0 movement. It aims at scheduling maintenance actions based on industrial production plants’ past and current condition and therefore incorporates other trending technological developments such as the Internet of Things, Cyber-Physical Systems or Big...
This paper presents for the first time a reinforcement learning algorithm with function approximation for stacking problems with continuous production and retrieval. The stacking problem is a hard combinatorial optimization problem. It deals with the arrangement of items in a localized area, where they are organized into stacks to allow a delivery...
This paper introduces a new, highly asynchronous method for surrogate-assisted optimization where it is possible to concurrently create surrogate models, evaluate fitness functions and do parameter optimization for the underlying problem, effectively eliminating sequential workflows of other surrogate-assisted algorithms. Using optimization network...
When attempting to improve the non-functional requirements of software, specifically run-time performance of code, an important requirement is to preserve the correctness of the optimized code. Additionally, when attempting to integrate Genetic Improvement into a compiler or interpreter, the large search spaces resulting from the amount of operators...
A recent approach for improving the accuracy of ensemble models is confidence-based modeling. Thereby, confidence measures, which indicate an ensemble prediction's reliability, are used for identifying unreliable predictions in order to improve a model's accuracy among reliable predictions. However, despite promising results in previous work, no co...
Combinatorial optimization problems come in a wide variety of types but five common problem components can be identified. This categorization can aid the selection of an interesting and diverse set of problems for inclusion in the combinatorial black-box problem benchmark. We suggest two real-world problems for inclusion into the benchmark. One is a t...
Algorithm selection is useful in decision situations where among many alternative algorithm instances one has to be chosen. This is often the case in heuristic optimization and is detailed by the well-known no-free-lunch (NFL) theorem. A consequence of the NFL is that a heuristic algorithm may only gain a performance improvement in a subset of the...
In genetic programming (GP), population diversity represents a key aspect of evolutionary search and a major factor in algorithm performance. In this paper we propose a new schema-based approach for observing and steering the loss of diversity in GP populations. We employ a well-known hyperschema definition from the literature to generate tree stru...
Exploratory landscape analysis is a useful method for algorithm selection, parametrization and creating an understanding of how a heuristic optimization algorithm performs on a problem and why. A prominent family of fitness landscape analysis measures is based on random walks through the search space. However, most of these features were only intr...
Tribological systems are mechanical systems that rely on friction to transmit forces. The design and dimensioning of such systems require prediction of various characteristics, such as the coefficient of friction. The core contribution of this paper is the analysis of two data-based modeling techniques which can be used to produce accurate and at t...
Conventional solution methods for logistics optimization problems often have to be adapted when objectives or restrictions of organizations in logistics environments are changing. In this paper, a new, generic solution approach called optimization network (ON) is developed and applied to a logistics optimization problem, the Location Routing Proble...
Genetic Programming (GP) schemas are structural templates equivalent to hyperplanes in the search space. Schema theories provide information about the properties of subsets of the population and the behavior of genetic operators. In this paper we propose a practical methodology to identify relevant schemas and measure their frequency in the populat...
Evolutionary algorithm analysis is often impeded by the large amounts of intermediate data that are usually discarded and have to be painstakingly reconstructed for real-world large-scale applications. In the recent past persistent data structures have been developed which offer extremely compact storage with acceptable runtime penalties. In this wor...
Predictive Maintenance (PdM) is among the trending topics in the current Industry 4.0 movement and hence, intensively investigated. It aims at sophisticated scheduling of maintenance, mostly in the area of industrial production plants. The idea behind PdM is that, instead of following fixed intervals, service actions could be planned based upon the...
Much of the literature found on surrogate models presents new approaches or algorithms trying to solve black-box optimization problems with as few evaluations as possible. The comparisons of these new ideas with other algorithms are often very limited and constrained to non-surrogate algorithms or algorithms following very similar ideas as the pres...
The no free lunch (NFL) theorem puts a limit to the range of problems a certain metaheuristic algorithm can be applied to successfully. For many methods these limits are unknown a priori and have to be discovered by experimentation. With the use of fitness landscape analysis (FLA) it is possible to obtain characteristic data and understand why meth...