# Sven Kosub · Universität Konstanz | Uni-Konstanz · Department of Computer and Information Science

Sven Kosub

Prof. Dr. rer. nat. habil., Dipl.-Math.

## About

Publications: 46

Reads: 10,344

Citations: 586

Introduction

I am an Adjunct Professor at the Department of Computer and Information Science and have led a Theory of Computing group since 2015. After studies in mathematics and computer science, I received a PhD for research in computational complexity theory. After finishing a Habilitation in computer science, I joined the University of Konstanz in 2008 for a lectureship in formal foundations of computer science. My scientific expertise lies in theoretical computer science, i.e., algorithms and complexity, discrete mathematics, and logic. My specific interests are the theoretical foundations of artificial intelligence, data science, and social network analysis. I am slightly biased towards examples from the sports domain.

## Publications

Publications (46)

In this paper, we study the problem of finding a periodic attractor of a Boolean network (BN), which arises in computational systems biology and is known to be NP-hard. Since a general case is quite hard to solve, we consider special but biologically important subclasses of BNs. For finding an attractor of period 2 of a BN consisting of $n$ OR func...

This research uses online football betting odds from a broad variety of matches and bookmakers to identify known biases in odds pricing, namely the favorite-longshot bias and the away-favorite bias. Furthermore, it tries to answer the question of whether a naive strategy of betting against these biases can be profitable. Our findings are consistent with...

Automatic and interactive data analysis is instrumental in making use of increasing amounts of complex data. Owing to novel sensor modalities, analysis of data generated in professional team sport leagues such as soccer, baseball, and basketball has recently become of concern, with potentially high commercial and research interest. The analysis of...

Two simple proofs of the triangle inequality for the Jaccard distance in terms of nonnegative, monotone, submodular functions are given and discussed.
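As a numerical illustration of the statement above (not a reproduction of either proof), the following minimal sketch checks the triangle inequality for the Jaccard distance on randomly generated sets. The function names, the sample size, and the convention that two empty sets are at distance 0 are assumptions made for this example.

```python
import itertools
import random


def jaccard_distance(a, b):
    """Jaccard distance 1 - |a & b| / |a | b|.

    Assumption for this sketch: two empty sets are at distance 0.
    """
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def triangle_violations(sets, tol=1e-12):
    """Return all triples (a, b, c) with d(a, c) > d(a, b) + d(b, c) + tol."""
    return [
        (a, b, c)
        for a, b, c in itertools.permutations(sets, 3)
        if jaccard_distance(a, c)
        > jaccard_distance(a, b) + jaccard_distance(b, c) + tol
    ]


# Random subsets of a small universe; the seed is fixed for reproducibility.
rng = random.Random(0)
universe = range(8)
sample = [frozenset(x for x in universe if rng.random() < 0.5) for _ in range(12)]
violations = triangle_violations(sample)
print(len(violations))
```

A check of this kind can only fail to find counterexamples; it does not replace the proofs.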

We transfer distances on clusterings to the building process of decision trees, and as a consequence extend the classical ID3 algorithm to perform modifications based on the global distance of the tree to the ground truth--instead of considering single leaves. Next, we evaluate this idea in comparison with the original version and discuss occurring...

We prove that the constructive weighted coalitional manipulation problem for the Schulze voting rule can be solved in polynomial time for an unbounded number of candidates and an unbounded number of manipulators.

Tries are general-purpose data structures for information retrieval. The most significant parameter of a trie is its height $H$, which equals the length of the longest common prefix of any two strings in the set $A$ over which the trie is built. Analytical investigations of random tries suggest that ${\bf E}(H)\in O(\log(|A|))$, although $H$ is unb...
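The characterization of the height as the longest common prefix of any two strings can be sketched directly. The example below assumes the strings in the set are pairwise distinct, and all function names are hypothetical. It computes the quantity once over all pairs and once over lexicographically adjacent pairs only, which is enough because the longest common prefix is always attained by two neighbours in sorted order.

```python
from itertools import combinations


def lcp_length(s, t):
    """Length of the longest common prefix of the strings s and t."""
    n = 0
    for a, b in zip(s, t):
        if a != b:
            break
        n += 1
    return n


def height_all_pairs(strings):
    """Maximum common prefix length over all pairs (quadratic in len(strings))."""
    return max((lcp_length(s, t) for s, t in combinations(strings, 2)), default=0)


def height_sorted(strings):
    """Same quantity, comparing only neighbours in lexicographic order."""
    ordered = sorted(strings)
    return max((lcp_length(s, t) for s, t in zip(ordered, ordered[1:])), default=0)


keys = ["0110", "0101", "1100", "0111", "1010"]
print(height_all_pairs(keys), height_sorted(keys))  # 3 3
```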

The following considerations concern the question of to what extent it is possible to regard texts as programs, narratives as algorithms, and stories as computations. In doing so, the understanding of text in the sense of literary narrative theory and concepts of computability and complexity theory from theoretical...

We present dichotomy theorems regarding the computational complexity of counting fixed points in boolean (discrete) dynamical systems, i.e., finite discrete dynamical systems over the domain {0, 1}. For a class F of boolean functions and a class G of graphs, an (F,G)-system is a boolean dynamical system with local transition functions lying in F a...
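To make the counting problem concrete, here is a minimal brute-force sketch. Since the abstract is truncated, the model is an assumption of this example: each vertex of an undirected graph holds a bit, and its local transition function maps the list of its neighbours' bits to its next bit. A fixed point is a state that every local function maps to itself. The names `count_fixed_points`, `neighbors`, `or_system`, and `xor_system` are hypothetical.

```python
from itertools import product


def count_fixed_points(neighbors, local_functions):
    """Count states s in {0,1}^n with f_v(s restricted to N(v)) == s[v] for all v.

    Exhaustive search over all 2^n states, so only suitable for tiny systems.
    """
    n = len(neighbors)
    count = 0
    for state in product((0, 1), repeat=n):
        if all(
            local_functions[v]([state[u] for u in neighbors[v]]) == state[v]
            for v in range(n)
        ):
            count += 1
    return count


# Triangle graph: every vertex is adjacent to the other two.
neighbors = [[1, 2], [0, 2], [0, 1]]

# Every vertex computes the OR (respectively XOR) of its neighbours' bits.
or_system = [lambda xs: int(any(xs))] * 3
xor_system = [lambda xs: sum(xs) % 2] * 3

print(count_fixed_points(neighbors, or_system))   # 2: states 000 and 111
print(count_fixed_points(neighbors, xor_system))  # 4: states 000, 110, 101, 011
```

The exponential enumeration is only meant to pin down what is being counted; the dichotomy theorems concern when this count can or cannot be computed efficiently.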

We investigate the growth of the number $w_k$ of walks of length $k$ in undirected graphs as well as related inequalities. In the first part, we deduce the inequality $w_{2a+c} \cdot w_{2(a+b)+c} \le w_{2a} \cdot w_{2(a+b+c)}$, which we call the Sandwich Theorem. It unifies and generalizes an inequality by Lagarias et al. and an inequality by Dress and Gutman. In the same way...
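The inequality can be checked numerically on a small graph. The sketch below counts walks by repeatedly multiplying a vector with the adjacency matrix and then tests the inequality for all nonnegative integers a, b, c within a chosen length bound. The example graph (a path on three vertices), the bound, and the function names are assumptions for illustration.

```python
def walk_counts(adj, max_len):
    """Return [w_0, ..., w_max_len], where w_k is the number of walks with k edges."""
    n = len(adj)
    ending_at = [1] * n  # one walk of length 0 ends at each vertex
    counts = [sum(ending_at)]
    for _ in range(max_len):
        # Extend every walk by one edge: new[v] = sum over u of adj[u][v] * old[u].
        ending_at = [sum(adj[u][v] * ending_at[u] for u in range(n)) for v in range(n)]
        counts.append(sum(ending_at))
    return counts


def sandwich_holds(w, a, b, c):
    """Check w_{2a+c} * w_{2(a+b)+c} <= w_{2a} * w_{2(a+b+c)}."""
    return w[2 * a + c] * w[2 * (a + b) + c] <= w[2 * a] * w[2 * (a + b + c)]


# Undirected path on three vertices: 0 - 1 - 2.
path = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
max_len = 12
w = walk_counts(path, max_len)
all_hold = all(
    sandwich_holds(w, a, b, c)
    for a in range(max_len)
    for b in range(max_len)
    for c in range(max_len)
    if 2 * (a + b + c) <= max_len
)
print(w[:5], all_hold)  # [3, 4, 6, 8, 12] True
```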

Our focus is on the theoretical foundations of common methods for determining centrality in networks.

This work draws attention to combinatorial network abstraction problems which are specified by a class \(\mathcal{P}\) of pattern graphs and a real-valued similarity measure \(\varrho\) based on certain graph properties. For fixed \(\mathcal{P}\) and \(\varrho\), the optimization task on any graph G is to find a subgraph G′ which belongs to \(\math...

We introduce the boolean hierarchy of k-partitions over NP for k ≥ 3 as a generalization of the boolean hierarchy of sets (i.e., 2-partitions) over NP. Whereas the structure of the latter hierarchy is rather simple, the structure of the boolean hierarchy of k-partitions over NP for k ≥ 3 turns out to be much more complicated. We establish the Embedding...

A complete classification of the computational complexity of the fixed-point existence problem for boolean dynamical systems, i.e., finite discrete dynamical systems over the domain {0, 1}, is presented. For function classes F and graph classes G, an (F, G)-system is a boolean dynamical system such that all local transition functions lie in F and t...

We present dichotomy theorems regarding the computational complexity of counting fixed points in boolean (discrete) dynamical systems, i.e., finite discrete dynamical systems over the domain {0,1}. For a class F of boolean functions and a class G of graphs, an (F,G)-system is a boolean dynamical system with local transition functions lying in F an...

Tries are very simple general-purpose data structures for information retrieval. A crucial parameter of a trie is its height. In the worst case the height is unbounded when the trie is built over a set of $n$ strings. Analytical investigations have shown that the average height under many random sources is logarithmic in $n$. Experimental studies o...

Inferring relevant system parameters from monitorable data is a fundamental requisite for handling large-scale socio-technical systems. In this thesis we address this set of problems from both a theoretical and an application-oriented perspective. We mathematically study discrete ex post models for dynamical systems. A particular focus is on algorithms for identifying s...

An experimental study of the feasibility and accuracy of the acyclicity approach introduced in [14] for the inference of business relationships among autonomous systems (ASes) is provided. We investigate the maximum acyclic type-of-relationship problem: on a given set of AS paths, find a maximum-cardinality subset which allows an acyclic and valley...

We contribute to the study of inferring commercial relationships between autonomous systems (AS relationships) from observable BGP routes. We deduce several forbidden patterns of AS relationships that impose a certain type of acyclicity on the AS graph. We investigate algorithms for solving the acyclic all-paths type-of-relationship problem, i.e.,...

This work studies (lowest) common ancestor problems in (weighted) directed acyclic graphs. We improve previous algorithms for the all-pairs representative LCA problem to $O(n^{2.575})$ by using fast rectangular matrix multiplication. We prove a first non-trivial upper bound of $O(\min\{n^2 m, n^{3.575}\})$ for the all-pairs all lowest common ancestors prob...

We study the complexity of finding a subgraph of a certain size and a certain density, where density is measured by the average degree. Let γ : ℕ → ℚ be any density function, i.e., γ is computable in polynomial time and satisfies γ(k) ≤ k − 1 [...] and has a polynomial-time algorithm for γ = 2 + O(1/k).

Although complexity theory already extensively studies path-cardinality-based restrictions on the power of nondeterminism, this paper is motivated by a more recent goal: To gain insight into how much of a restriction it is of nondeterminism to limit machines to have just one contiguous (with respect to some simple order) interval of accepting paths...

Given a p-order A over a universe of strings (i.e., a transitive, reflexive, antisymmetric relation such that if (x, y) is an element of A then |x| is polynomially bounded by |y|), an interval size function of A returns, for each string x in the universe, the number of strings in the interval between strings b(x) and t(x) (with respect to A), where...

Actors in networks usually do not act alone. By a selective process of establishing relationships with other actors, they form groups. The groups are typically founded by common goals, interests, preferences or other similarities. Standard examples include personal acquaintance relations, collaborative relations in several social domains, and coali...

The boolean hierarchy of k-partitions over NP for k ≥ 2 was introduced as a generalization of the well-known boolean hierarchy of sets. The classes of this hierarchy are exactly those classes of NP-partitions which are generated by finite labeled lattices. We extend the boolean hierarchy of NP-partitions by considering partition classes which are genera...

When studying complexity classes of partitions we often face the situation that different partition classes have the same component classes. The projective closures are the largest classes among these with respect to set inclusion. In this paper we investigate projective closures of classes of boolean NP-partitions, i.e., partitions with components...

In the early nineties of the previous century, leaf languages were introduced as a means for the uniform characterization of many complexity classes, mainly in the range between P (polynomial time) and PSPACE (polynomial space). It was shown that the separability of two complexity classes can be reduced to a combinatorial property of the correspond...

We study the complexity of finding a subgraph of a certain size and a certain density, where density is measured by the average degree. Let γ : ℕ → ℚ+ be any density function, i.e., γ is computable in polynomial time and satisfies γ(k) ≤ k − 1 for all k ∈ ℕ. Then γ-Cluster is the problem of deciding, given an undirected graph G and a natural number k, w...

We study the complexity of finding a subgraph of a certain size and a certain density, where density is measured by the average degree. Let γ : ℕ → ℚ+ be any density function, i.e., γ is computable in polynomial time and satisfies γ(k) ≤ k − 1 for all k ∈ ℕ. Then γ-Cluster is the problem of deciding, given an undirected graph G and a natural number...

We introduce a general framework for the definition of function classes. Our model, which is based on nondeterministic polynomial-time Turing transducers, allows uniform characterizations of FP, $\mathrm{FP}^{\mathrm{NP}}$, $\mathrm{FP}^{\mathrm{NP}}[O(\log n)]$, $\mathrm{FP}^{\mathrm{NP}}_{tt}$, counting classes (#P, #NP, #coNP, GapP, $\mathrm{GapP}^{\mathrm{NP}}$), optimization classes (maxP, minP, maxNP, minNP), promise classes (NPS...

When studying the complexity of partitions one often faces the situation that different partition classes have the same projection classes. The projectively closed classes are the greatest (with respect to set-inclusion) among these. In this paper we determine important partition classes that are projectively closed and we prove the rather surprisi...

We study computational effects of persistent Turing machines, independently introduced by Goldin and Wegner [GW98], and Kosub [Kos98]. Persistence is a mode of interaction which makes it possible to consider the computational behavior of a Turing machine as an infinite sequence of autonomous computations. We investigate different computability conc...

We study the complexity of counting the number of elements in intervals of feasible partial orders. Depending on the properties that partial orders may have, such counting functions have different complexities. If we consider total, polynomial-time decidable orders then we obtain exactly the #P functions. We show that the interval size functions fo...

Computational complexity theory usually investigates the complexity of sets, i.e., the complexity of partitions into two parts. But often it is more appropriate to represent natural problems by partitions into more than two parts. A particularly interesting class of such problems consists of classification problems for relations. For instance, a bi...

In this paper we demonstrate that the studies of structural properties of the boolean hierarchy of NP-partitions are not only worthwhile in their own right, e.g., as a framework for capturing the complexity of classification problems, but also have interesting ties with other research in computational complexity: We discuss the relationships to the study of sepa...

We introduce the boolean hierarchy of k-partitions over NP for k ≥ 3 as a generalization of the boolean hierarchy of sets (i.e., 2-partitions) over NP. Whereas the structure of the latter hierarchy is rather simple the structure of the boolean hierarchy of k-partitions over NP for k ≥ 3 turns out to be much more complicated. We formulate the Embedd...

The boolean hierarchy of k-partitions over NP for k ≥ 2 was introduced as a generalization of the well-known boolean hierarchy of sets. The classes of this hierarchy are exactly those classes of NP-partitions which are generated by finite labeled lattices. We refine the boolean hierarchy of NP-partitions by considering partition classes which are g...

Unambiguous computation according to UP has become a classical notion in computational complexity theory. Unambiguity is also used in a theorem of Wagner [14]. A set L is in P iff there are a set A ∈ NP and a polynomial p such that for all x and y with |y| ≤ p(|x|), if (x, y) ∈ A then (x, y − 1) ∈ A, and x ∈ L iff the maximal y with...

We introduce a general framework for the definition of function classes. Our model, which is based on polynomial-time nondeterministic Turing transducers, allows uniform characterizations of FP, $\mathrm{FP}^{\mathrm{NP}}$, counting classes (#·P, #·NP, #·coNP, GapP, $\mathrm{GapP}^{\mathrm{NP}}$), optimization classes (max·P, min·P, max·NP, min·NP), promise clas...

We consider a special kind of nondeterministic Turing machines. Cluster machines are distinguished by a neighbourhood relationship between accepting paths. Based on a formalization using equivalence relations, some subtle properties of these machines are proven. Moreover, by abstraction we gain the machine-independent concept of cluster sets which...

## Projects

Projects (3)

A unifying long-term project focusing comprehensively on algorithmic methods for network data from mathematical, theoretical, and practical perspectives. Network data (i.e., overlapping dyadic data) is collected in many different domains of empirical research, each of which is equipped with specific methodologies. The challenge is to look at how researchers in their domains work with data computationally and to come up with well-founded algorithmic methods to support them in their daily work. A particular interest is in staggered processes (pipelines) of algorithmic data transformations observable in empirical studies (e.g., sequences of projections of incidence matrices on either side, distance/walk/similarity-based derivations, geometrical embeddings). An exemplary research goal is to establish interpretable pipelines for clusterings in networks.
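One pipeline step mentioned above, the projection of an incidence matrix on either side, can be sketched as follows. The example assumes a 0/1 incidence matrix B whose rows are actors and whose columns are events (an assumption of this illustration, as are all names): the row projection B·Bᵀ counts shared events per pair of actors, and the column projection Bᵀ·B counts shared actors per pair of events.

```python
def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Plain matrix product of two lists-of-rows matrices."""
    y_t = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in y_t] for row in x]


def project(incidence):
    """Return (row projection B * B^T, column projection B^T * B)."""
    b_t = transpose(incidence)
    return matmul(incidence, b_t), matmul(b_t, incidence)


# Hypothetical data: 4 actors (rows) attending 3 events (columns).
incidence = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
    [0, 0, 1],
]
actor_side, event_side = project(incidence)
print(actor_side)  # entry (i, j): number of events shared by actors i and j
print(event_side)  # entry (i, j): number of actors shared by events i and j
```

Diagonal entries give each actor's number of events and each event's number of actors. A later pipeline step would typically decide whether to keep, normalize, or discard them.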

A project based on the conjecture that a formal description of self-organization (notably, communication) can be based on calculi nullifying the difference between operator and operands. The famous, semigraphical calculus of forms (aka calculus of indications) of George Spencer Brown, which identifies distinction as an operation (cross) with distinction as the result of an operation (separated spaces), is such a calculus. Although the calculus has been criticized by the computer science community for this indifference, which apparently contradicts the principles of programming languages, it is nevertheless beneficial to see how it can be used to describe systems, in particular, social (and socio-technical) systems. In the application-oriented part of the project, several tools are designed and implemented to support field work with Spencer Brown’s forms: an automated tool for generating layouts of re-entry forms (in LaTeX) while optimizing the layout according to several criteria (like planarity or minimizing crossings), and apps that recognize forms drawn on a tablet and generate code from them. The theoretical part of the project is devoted to a complete form analysis and form synthesis. Existing studies only consider simple re-entry forms with just one re-entry. There are circular relations between the coding part (apps) and the theory part (theorems) which give rise to several questions involving machine learning and computer vision techniques.

A mission to identify performance indicators, produce forecasts, and support decision making in the area of team sports using descriptive, predictive, or visual analytics and any kind of data available. A particular focus is on soccer, the beautiful game. For instance, on the basis of spatiotemporal data obtained from sensors in shoes, we are interested in methods to recognize, evaluate, and visualize all possible pass options for a ball-possessing player. This allows for the assessment of the quality of actually realized passes. Another idea we follow is the use of betting odds as "ground truth." Coming up with well-founded interaction models in team sports (considered to be the ultimate theoretical goal) is a challenging and as yet unresolved task. The popular approach of inferring team strengths from past match outcomes rests on an information basis that is presumably too thin for both explanation and theory-building. Prediction markets like bookmakers or betting exchanges promise richer signals. Despite the well-studied tendency towards information efficiency in financial markets, there has been, and still is, much discussion on biases in betting markets, e.g., the favourite-longshot bias in general or the draw bias in soccer. A clarification of possible bias structures is required. An opening of the project towards amateur and mass sports, organization of sport events, or fitness & health (quantified-self movement) is planned.