Krzysztof Krawiec

Poznan University of Technology · Institute of Computing Science

Professor

About

309 Publications
43,527 Reads
4,012 Citations
Introduction
Primary interests: program synthesis (in particular genetic programming), coevolutionary algorithms, machine learning, and computer vision.
Recent research and results:
  • best-to-date algorithm for segmenting blood vessels in fundus imaging (ophthalmology)
  • behavioral genetic programming
  • semantic genetic programming
Additional affiliations
August 2013 - March 2014
Massachusetts Institute of Technology
Position
  • Visiting Researcher
Description
  • Genetic programming, semantic GP, behavioral program synthesis, evolutionary feature construction for large-scale machine learning.
July 2002 - April 2003
University of California, Riverside
Description
  • Evolutionary synthesis of pattern recognition systems for visible spectrum and synthetic aperture radar (SAR) imagery
September 1993 - present
Poznan University of Technology
Description
  • Genetic programming, coevolutionary learning, evolutionary computation, machine learning

Publications (309)
Book
Genetic programming (GP) is a popular heuristic methodology of program synthesis with origins in evolutionary computation. In this generate-and-test approach, candidate programs are iteratively produced and evaluated. The latter involves running programs on tests, where they exhibit complex behaviors reflected in changes of variables, registers, or...
Conference Paper
We propose Coevolutionary Gradient Search, a blueprint for a family of iterative learning algorithms that combine elements of local search and population-based search. The approach is applied to learning Othello strategies represented as n-tuple networks, using different search operators and modes of learning. We focus on the interplay between the...
Article
In symbolic regression with formal constraints, the conventional formulation of regression problem is extended with desired properties of the target model, like symmetry, monotonicity, or convexity. We present a genetic programming algorithm that solves such problems using a Satisfiability Modulo Theories solver to formally verify the candidate sol...
Article
This study tested the hypothesis that spatially organized links exist between the time series of river runoff and climate variability indices describing the oscillations in the atmosphere–ocean system: ENSO (El Niño–Southern Oscillation), PDO (Pacific Decadal Oscillation), AMO (Atlantic Multidecadal Oscillation), and NAO (North Atlantic O...
Article
The objective of this study is to provide a comprehensive review and characterization of selected climate variability indices. While we discuss many major climate variability mechanisms, we focus on four principal modes of climate variability related to the dynamics of Earth’s oceans and their interactions with the atmosphere: the El Niño–Southern...
Article
Transcriptional analysis and live-cell imaging are powerful tools for investigating the dynamics of complex biological systems. In vitro expanded porcine oral mucosal cells, consisting of populations of epithelial and connective lineages, are interesting and complex systems for study via microarray transcriptomic assays to analyze gene expression pro...
Chapter
Discrete and combinatorial optimization can be notoriously difficult due to complex and rugged characteristics of the objective function. We address this challenge by mapping the search process to a continuous space using recurrent neural networks. Alongside an evolutionary run, we learn three mappings: from the original search space to a cont...
Article
Stochastic synthesis of recursive functions has historically proved difficult, not least due to issues of non-termination and the often ad hoc methods for addressing this. This article presents a general method of implicit recursion which operates via an automatically-derivable decomposition of datatype structure by cases, thereby ensuring well-fou...
Conference Paper
In many applications of symbolic regression, domain knowledge constrains the space of admissible models by requiring them to have certain properties, like monotonicity, convexity, or symmetry. As only a handful of variants of genetic programming methods proposed to date can take such properties into account, we introduce a principled approach capab...
Conference Paper
Stochastic synthesis of recursive functions has historically proved difficult, not least due to issues of non-termination and the often ad hoc methods for addressing this. We propose a general method of implicit recursion which operates via an automatically-derivable decomposition of datatype structure by cases, thereby ensuring well-foundedness. T...
Article
The RNA World is currently the most plausible hypothesis for explaining the origins of life on Earth. The supporting body of evidence is growing and it comes from multiple areas, including astrobiology, chemistry, biology, mathematics, and, in particular, from computer simulations. Such methods frequently assume the existence of a hypothetical spec...
Preprint
Program synthesis from natural language (NL) is practical for humans and, once technically feasible, would significantly facilitate software development and revolutionize end-user programming. We present SAPS, an end-to-end neural network capable of mapping relatively complex, multi-sentence NL specifications to snippets of executable code. The pro...
Conference Paper
When search operators in genetic programming (GP) insert new instructions into programs, they usually draw them uniformly from the available instruction set. Preferring some instructions to others would require additional domain knowledge, which is typically unavailable. However, it has been recently demonstrated that the likelihoods of instructions...
Conference Paper
We propose Neural Estimation of Interaction Outcomes (NEIO), a method that reduces the number of required interactions between candidate solutions and tests in test-based problems. Given the outcomes of a random sample of all solution-test interactions, NEIO uses a neural network to predict the outcomes of remaining interactions and so estimate the...
Conference Paper
Genetic programming is an effective technique for inductive synthesis of programs from tests, i.e. training examples of desired input-output behavior. Programs synthesized in this way are not guaranteed to generalize beyond the training set, which is unacceptable in many applications. We present Counterexample-Driven Genetic Programming (CDGP) that...
Article
Conventional genetic programming (GP) can only guarantee that synthesized programs pass tests given by the provided input-output examples. The alternative to such a test-based approach is synthesizing programs by formal specification, typically realized with exact, non-heuristic algorithms. In this paper, we build on our earlier study on Counterexamp...
Article
Mathematical Programming (MP) models are common in optimization of real-world processes. Models are usually built by optimization experts in an iterative manner: an imperfect model is continuously improved until it approximates the reality well-enough and meets all technical requirements (e.g., linearity). To facilitate this task, we propose a Gene...
Chapter
Design patterns capture the essentials of recurring best practice in an abstract form. Their merits are well established in domains as diverse as architecture and software development. They offer significant benefits, not least a common conceptual vocabulary for designers, enabling greater communication of high-level concerns and increased software...
Article
Genetic programming (GP) is a variant of evolutionary algorithm where the entities undergoing simulated evolution are computer programs. A fitness function in GP is usually based on a set of tests, each of which defines the desired output a correct program should return for an exemplary input. The outcomes of interactions between programs and tests...
Article
This paper shows how big data analysis opens a range of research and technological problems and calls for new approaches. We start with defining the essential properties of big data and discussing the main types of data involved. We then survey the dedicated solutions for storing and processing big data, including a data lake, virtual integration,...
Article
AlphaGo's achievement of a superhuman playing level corroborated the capabilities of convolutional neural architectures (CNNs) for capturing complex spatial patterns. This result was to a great extent due to several analogies between Go board states and the 2D images that CNNs have been designed for, in particular translational invariance and a relatively large b...
Conference Paper
Since its introduction, Geometric Semantic Genetic Programming (GSGP) has inspired ideas on how to reach optimal solutions efficiently. Among these, in 2016 Pawlak showed how to analytically construct optimal programs by means of a linear combination of a set of random programs. Given the simplicity and excellent results of this m...
Conference Paper
Program synthesis via heuristic search often requires a great deal of boilerplate code to adapt program APIs to the search mechanism. In addition, the majority of existing approaches are not type-safe: i.e. they can fail at runtime because the search mechanisms lack the strict type information often available to the compiler. In this article, we de...
Conference Paper
In genetic programming (GP), the outcomes of the evaluation phase can be represented as an interaction matrix, with rows corresponding to programs in a population and columns corresponding to tests that define a program synthesis task. Recent contributions on Discovery of Objectives via Clustering (DOC) and Discovery of Objectives by Factorization...
Conference Paper
Genetic programming is an effective technique for inductive synthesis of programs from training examples of desired input-output behavior (tests). Programs synthesized in this way are not guaranteed to generalize beyond the training set, which is unacceptable in many applications. We present Counterexample-Driven Genetic Programming (CDGP) that emp...
Conference Paper
Geometric Semantic Genetic Programming (GSGP) induces a unimodal fitness landscape for any problem that consists in finding a function fitting given input/output examples. Most of the work around GSGP to date has focused on real-world applications and on improving the originally proposed search operators, rather than on broadening its theoretical f...
Article
Imaging of living cells based on traditional fluorescence and confocal laser scanning microscopy has delivered an enormous amount of information critical for understanding biological processes in single cells. However, the requirement for a high numerical aperture and fluorescent markers still limits researchers’ ability to visualize the cellular a...
Conference Paper
We identify a novel application of Genetic Programming to automatic synthesis of mathematical programming (MP) models for business processes. Given a set of examples of states of a business process, the proposed Genetic Constraint Synthesis (GenetiCS) method constructs well-formed constraints for an MP model. The form of synthesized constraints (e....
Conference Paper
Program synthesis can be posed as a satisfiability problem and approached with generic SAT solvers. However, only short programs can be synthesized in this way. Program sketching by Solar-Lezama assumes that a human provides a partial program (sketch), and that synthesis takes place only within the uncompleted parts of that program. This allows synt...
Article
Attention Deficit Hyperactivity Disorder (ADHD) is associated with altered cerebellar volume, and the cerebellum is associated with cognitive performance. However, there are mixed results regarding the cerebellar volume in young patients with ADHD. To clarify the size and direction of this effect, we conducted the analysis on the large public database of...
Article
Program semantics is a promising recent research thread in Genetic Programming (GP). Over a dozen semantic-aware search, selection, and initialization operators for GP have been proposed to date. Some of those operators are designed to exploit the geometric properties of semantic space, while some others focus on making offspring effective, i.e....
Article
Constraints form an essential part of most practical search and optimization problems, and are usually assumed to be given. However, there are plausible real-world scenarios in which constraints are not known or can be only approximated, for instance when the process in question is complex and/or noisy. To address such problems, we propose a method...
Chapter
Genetic programming (GP) is a stochastic, iterative generate-and-test approach to synthesizing programs from tests, i.e. examples of the desired input-output mapping. The number of passed tests, or the total error in continuous domains, is a natural objective measure of a program’s performance and a common yardstick when experimentally comparing al...
Chapter
In Sect. 1.4, we identified several challenges for program synthesis, among others the vastness of search space and the intricate way in which program code determines the effects of computation. In this chapter, we identify and discuss the consequences of the conventional approach to program evaluation in generative program synthesis. Though focuse...
Chapter
To this point, our attempts to widen the evaluation bottleneck focused on defining alternative evaluation functions, which we conceptualize as search drivers in Chap. 9. However, an analysis of an execution record (Chap. 3), whether conducted with information-theoretic measures (Chap. 6) or machine learning algorithms (Chap. 7), also reveals inform...
Chapter
Previous chapters presented a range of approaches that characterize program behavior in terms of execution record and search drivers. The experiments reported in Chap. 10 demonstrated that these approaches increase the likelihood of synthesizing a correct program. What are the other, not necessarily empirical, implications of behavioral program synth...
Presentation
A running program may exhibit complex behavior, not only in terms of the produced output, but also regarding the execution states it traverses. In program synthesis as practiced with conventional genetic programming, only a fraction of that behavior, usually compressed into a scalar fitness, is used to navigate the search space. This ’evaluation bo...
Conference Paper
In genetic programming (GP), the outcomes of the evaluation phase in an evolutionary loop can be represented as an interaction matrix, with rows corresponding to programs in a population, columns corresponding to tests that define a program synthesis task, and ones and zeroes signaling respectively passing a test and failing to do so. The conventio...
Conference Paper
We consider simultaneous evolutionary synthesis of multiple functions, and verify whether such an approach leads to computational savings compared to conventional synthesis of functions one-by-one. We also extend the proposed synthesis model with scaffolding, a technique originally intended to facilitate evolution of recursive programs, and consisting...
Article
Motivated by the search for a counterexample to the Poincaré conjecture in three and four dimensions, the Andrews-Curtis conjecture was proposed in 1965. It is now generally suspected that the Andrews-Curtis conjecture is false, but small potential counterexamples are not so numerous, and previous work has attempted to eliminate some via combinat...
Chapter
Much recent progress in Genetic Programming (GP) can be ascribed to work in semantic GP, which facilitates program induction by considering program behavior on individual fitness cases. It is therefore interesting to consider whether alternative decompositions of fitness cases might also provide useful information. The one we present here is motiva...
Presentation
In many optimization and learning problems, candidate solutions need to interact with multiple tests or environments in order to be evaluated. When evolving computer programs or controllers, passing a test requires producing the desired output for a given input; when learning game strategies, a test is an opponent strategy to play against. Conventi...
Conference Paper
A common approach in Geometric Semantic Genetic Programming (GSGP) is to seed initial populations using conventional, semantic-unaware methods like Ramped Half-and-Half. We formally demonstrate that this may limit GSGP’s ability to find a program with the sought semantics. To overcome this issue, we determine the desired properties of geometric-awa...
Article
The condition of the vascular network of the human eye is an important diagnostic factor in ophthalmology. Its segmentation in fundus imaging is a nontrivial task due to the variable size of vessels, relatively low contrast, and the potential presence of pathologies like microaneurysms and hemorrhages. Many algorithms, both unsupervised and supervised, have be...
Conference Paper
We propose SFIMX, a method that reduces the number of required interactions between programs and tests in genetic programming. SFIMX performs factorization of the matrix of the outcomes of interactions between the programs in a working population and the tests. Crucially, that factorization is applied to a matrix that is only partially filled with in...
Article
In test-based problems, commonly approached with competitive coevolutionary algorithms, the fitness of a candidate solution is determined by the outcomes of its interactions with multiple tests. Usually, fitness is a scalar aggregate of interaction outcomes, and as such imposes a complete order on the candidate solutions. However, passing different...
Article
Metrics are essential for geometric semantic genetic programming. On one hand, they structure the semantic space and govern the behavior of geometric search operators; on the other, they determine how fitness is calculated. The interactions between these two types of metrics are an important aspect that to date was largely neglected. In this paper,...
Chapter
In the previous chapter, we identified the evaluation bottleneck, and showed that the information lost in aggregation of interaction outcomes can be essential for the success of a program synthesis process. In this chapter, we set out to present behavioral program synthesis, a new paradigm of program synthesis which, in a sense, puts...
Chapter
In this introductory chapter, we characterize and formalize the key concepts of this book, in particular computer programs. We also define the task of program synthesis and determine the main factors that make it challenging. Finally, we delineate several paradigms of program synthesis, among them genetic programming.
Chapter
As argued in Sect. 2.2.2, one of the vices of conventional scalar evaluation is symmetry: the same reward is granted for passing every test. Yet some tests can be objectively more difficult than others in the sense of (2.2), i.e. harder to pass by a randomly generated program. They may vary also with respect to subjective difficulty, i.e. particula...
Chapter
The motivation behind analyzing consistency of execution traces with desired output in Chap. 6 was to identify and promote the programs that contain prospectively useful subprograms. The approach described in this chapter generalizes the trace consistency method in two respects. Firstly, we seek here for a more general relatedness between the inter...
Chapter
This book proposes a new conceptual perspective on generate-and-test program synthesis. The framework of behavioral program synthesis is intended to provide more information on candidate programs on one hand, and to make search algorithms capable of exploiting that information on the other. The core elements of that framework are execution records,...
Chapter
The main motif of this book is providing search algorithms with rich information on solutions’ characteristics. The formalism of execution record, a complete, instruction-by-instruction account of program execution for every test, is a technical means to achieve that goal. In the approaches presented to this point, only the final execution states i...
Chapter
In this chapter, we provide a unified perspective on the methods presented in Chaps. 4-8, the key consequence of which is the concept of search driver detailed in Sect. 9.3.
Chapter
This chapter presents the results of a comparative experiment involving various combinations of search drivers.
Chapter
Representation of input data has an essential influence on the performance of machine learning systems. Evolutionary algorithms can be used to transform data representation by selecting some of the existing features (evolutionary feature selection) or constructing new features from the existing ones (evolutionary feature construction). This entry p...
Code
FUEL is a succinct Scala framework for implementing metaheuristic algorithms, in particular evolutionary algorithms. It originated in my work on the book "Behavioral Program Synthesis with Genetic Programming" (Springer 2016, http://www.cs.put.poznan.pl/kkrawiec/bps/, http://www.springer.com/gp/book/9783319275635). FUEL is written primarily in func...
Article
This paper provides a structured, unified, formal and empirical perspective on all geometric semantic crossover operators proposed so far, including the exact geometric crossover by Moraglio, Krawiec, and Johnson, as well as the approximately geometric operators. We start with presenting the theory of geometric semantic genetic programming, and dis...