# Frank-Michael Schleif's research while affiliated with Hochschule für angewandte Wissenschaften Würzburg-Schweinfurt and other places

**What is this page?**

This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.

It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.

If you're a ResearchGate member, you can follow this page to keep up with this author's work.

If you are this author, and you don't want us to display this page anymore, please let us know.

## Publications (143)

Deep learning is reaching state-of-the-art performance in many applications. However, the generalization capabilities of the learned networks are limited to the training or source domain. The predictive power decreases when these models are evaluated in a target domain different from the source domain. Joint adversarial domain adaptation networks are currently...

In recent years, social media has become an important part of everyday life for many people. A big challenge of social media is finding posts that are interesting to the user. Many social networks, like Twitter, handle this problem with so-called hashtags. A user can label their own Tweet (post) with a hashtag, while other users can search for posts cont...

Concept drift is a change of the underlying data distribution which occurs especially with streaming data. Besides other challenges in the field of streaming data classification, concept drift has to be addressed to obtain reliable predictions. Robust Soft Learning Vector Quantization as well as Generalized Learning Vector Quantization has already...
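
The abstract above refers to Robust Soft Learning Vector Quantization (RSLVQ). As a rough illustration of the idea, a minimal single-sample update in the spirit of RSLVQ might look as follows (a sketch under a Gaussian-mixture assumption with fixed bandwidth `sigma2`; the function name and hyperparameters are ours, not from the paper):

```python
import numpy as np

def rslvq_step(x, y, protos, labels, sigma2=1.0, lr=0.05):
    """One stochastic RSLVQ-style update for sample x with class label y.
    protos: (k, d) prototype positions, labels: (k,) prototype classes."""
    d2 = ((protos - x) ** 2).sum(axis=1)        # squared distances to x
    g = np.exp(-d2 / (2 * sigma2))              # Gaussian responsibilities
    p_all = g / g.sum()                         # P(j | x) over all prototypes
    mask = labels == y
    p_correct = np.where(mask, g, 0.0)
    p_correct = p_correct / p_correct.sum()     # P_y(j | x), within-class
    # Attract correct-class prototypes, repel the rest.
    coef = np.where(mask, p_correct - p_all, -p_all)
    protos += lr * coef[:, None] * (x - protos)
    return protos
```

Because each update is a cheap local step, the scheme can in principle be run per-sample on a stream, which is why such models are discussed in the concept-drift setting.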

Matrix approximations are a key element in large-scale algebraic machine learning approaches. The recently proposed method MEKA (Si et al., 2014) effectively employs two common assumptions in Hilbert spaces: the low-rank property of an inner product matrix obtained from a shift-invariant kernel function and a data compactness hypothesis by means of...
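
MEKA builds on the low-rank property of kernel matrices. As a simpler illustration of that low-rank idea (not the MEKA algorithm itself), here is a sketch of the classical Nyström approximation; function names and defaults are our own:

```python
import numpy as np

def rbf(X, Y, gamma=0.5):
    """Exact RBF kernel matrix between row sets X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystroem(X, m, gamma=0.5, seed=0):
    """Rank-m Nystroem approximation K ~ C W^+ C^T of the RBF kernel,
    using m randomly chosen landmark points."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)  # landmark indices
    C = rbf(X, X[idx], gamma)                        # (n, m) cross kernel
    W = rbf(X[idx], X[idx], gamma)                   # (m, m) landmark kernel
    return C @ np.linalg.pinv(W) @ C.T
```

With m much smaller than n, storage and downstream algebra drop from O(n^2) to O(nm), which is the effect large-scale methods like MEKA exploit.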

In non-stationary environments, several constraints require algorithms to be fast, memory-efficient, and highly adaptable. While there are several classifiers of the family of lazy learners and tree classifiers in the streaming context, the application of prototype-based classifiers has not found much attention. Prototype-based classifiers however...

Statistical and adversarial adaptation are currently two extensive categories of neural network architectures in unsupervised deep domain adaptation. The latter has become the new standard due to its good theoretical foundation and empirical performance. However, there are two shortcomings. First, recent studies show that these approaches focus too...

Statistical and adversarial adaptation are currently two extensive categories of neural network architectures in unsupervised deep domain adaptation. The latter has become the new standard due to its good theoretical foundation and empirical performance. However, there are two shortcomings. First, recent studies show that these approaches focus too much...

Over the last two decades, kernel learning has attracted enormous interest and led to the development of a variety of successful machine learning models. The selection of an efficient data representation is one of the critical aspects to get high-quality results. In a variety of domains, this is achieved by incorporating expert knowledge in the used do...

Life science data are often encoded in a non-standard way by means of alpha-numeric sequences, graph representations, numerical vectors of variable length, or other formats. Domain-specific or data-driven similarity measures like alignment functions have been employed with great success. The vast majority of more complex data analysis algorithms re...

Similar to traditional algorithms, deep learning networks struggle to generalize across domain boundaries. A current solution is the simultaneous training of the classification model and the minimization of domain differences in the deep network. In this work, we propose a new unsupervised deep domain adaptation architecture, which trains a class...

In static environments, Random Projection (RP) is a popular and efficient technique to preprocess high-dimensional data and to reduce its dimensionality. While RP has been widely used and evaluated in stationary data analysis scenarios, non-stationary environments are not well analyzed. In this paper, we provide an evaluation of RP on streaming data...
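
For reference, the core of Random Projection is a single fixed Gaussian matrix applied to every sample, which is what makes it attractive for streaming data: the same map can be reused on each arriving chunk. A minimal sketch (names and defaults are ours):

```python
import numpy as np

def random_projection(X, k, seed=0):
    """Project rows of X to k dimensions with a fixed Gaussian matrix
    (Johnson-Lindenstrauss style); the 1/sqrt(k) scaling approximately
    preserves pairwise Euclidean distances."""
    rng = np.random.default_rng(seed)
    R = rng.normal(0.0, 1.0 / np.sqrt(k), size=(X.shape[1], k))
    return X @ R
```

Since the projection is linear and determined entirely by the seed, batches of a stream projected separately are consistent with projecting the whole stream at once.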

Current supervised learning models cannot generalize well across domain boundaries, which is a known problem in many applications, such as robotics or visual classification. Domain adaptation methods are used to improve these generalization properties. However, these techniques suffer either from being restricted to a particular task, such as visua...

Proximities are at the heart of almost all machine learning methods. If the input data are given as numerical vectors of equal lengths, the Euclidean distance or a Hilbertian inner product is frequently used in modeling algorithms. In a more generic view, objects are compared by a (symmetric) similarity or dissimilarity measure, which may not obey par...

Transfer learning is focused on the reuse of supervised learning models in a new context. Prominent applications can be found in robotics, image processing or web mining. In these fields, the learning scenarios are naturally changing but often remain related to each other motivating the reuse of existing supervised models. Current transfer learning...

The amount of real-time communication between agents in an information system has increased rapidly since the beginning of the decade. This is because the use of these systems, e.g. social media, has become commonplace in today's society. This requires analytical algorithms to learn and predict this stream of information in real-time. The nature o...

The amount of real-time communication between agents in an information system has increased rapidly since the beginning of the decade. This is because the use of these systems, e.g. social media, has become commonplace in today’s society. This requires analytical algorithms to learn and predict this stream of information in real-time. The nature of...

Proximities are at the heart of almost all machine learning methods. In a more generic view, objects are compared by a (symmetric) similarity or dissimilarity measure, which may not obey particular mathematical properties. This renders many machine learning methods invalid, leading to convergence problems and the loss of generalization behavior. In...
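
A common way to make such methods valid again is an eigenvalue correction of the indefinite similarity matrix. A minimal sketch of the standard "clip" and "flip" corrections (our own helper names; the paper may use different or additional strategies):

```python
import numpy as np

def clip_eigenvalues(S):
    """Make a symmetric similarity matrix PSD by clipping
    negative eigenvalues to zero."""
    S = 0.5 * (S + S.T)                     # enforce symmetry
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(np.clip(vals, 0.0, None)) @ vecs.T

def flip_eigenvalues(S):
    """Alternative correction: replace eigenvalues by their
    absolute values, preserving their magnitude."""
    S = 0.5 * (S + S.T)
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(np.abs(vals)) @ vecs.T
```

Both corrections yield a valid kernel matrix at the cost of altering the original proximities; which distortion is acceptable depends on the data.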

Concept drift is a change of the underlying data distribution which occurs especially with streaming data. Besides other challenges in the field of streaming data classification, concept drift should be addressed to obtain reliable predictions. The Robust Soft Learning Vector Quantization has already shown good performance in traditional settings a...

Supervised learning employing positive semi-definite kernels has gained wide attention and led to a variety of successful machine learning approaches. The restriction to positive semi-definite kernels and a Hilbert space is common to simplify the mathematical derivations of the respective learning methods, but is also limiting, because more recent...

Transfer learning is focused on the reuse of supervised learning models in a new context. Prominent applications can be found in robotics, image processing or web mining. In these fields, the learning scenarios are naturally changing but often remain related to each other motivating the reuse of existing supervised models. Current transfer learning...

Today's datasets, especially in a streaming context, are increasingly non-static and require algorithms to detect and adapt to change. Recent work shows vital research in the field, but most approaches lack stable performance during model adaptation. In this work, a concept drift detection strategy followed by a prototype-based insertion strategy is proposed....

Transfer learning focuses on the reuse of supervised learning models in a new context. Prominent applications can be found in robotics, image processing or web mining. In these areas, learning scenarios change by nature, but often remain related and motivate the reuse of existing supervised models. While the majority of symmetric and asymmetric dom...

The increasing availability of wireless networks inside buildings has opened up numerous opportunities for new innovative smart systems. For many of these systems, the acquisition of context-sensitive information about attendant people has evolved into a key challenge. Especially the position and distribution of attendants significantly influence the sy...

In an era of smart information systems and smart buildings, detecting, tracking and identifying the presence of attendants inside of enclosed rooms have evolved into a key challenge in the research area of smart building systems. Therefore, several types of sensing systems have been proposed over the past decade to tackle this challenge. Depending on the...

Indefinite similarity measures can be frequently found in bio-informatics by means of alignment scores, but are also common in other fields like shape measures in image retrieval. Lacking an underlying vector space, the data are given as pairwise similarities only. The few algorithms available for such data do not scale to larger datasets. Focusing...

Transfer learning is focused on the reuse of supervised learning models in a new context. Prominent applications can be found in robotics, image processing or web mining. In these fields, the learning scenarios are naturally changing but often remain related to each other motivating the reuse of existing supervised models. Current transfer learning...

Existing algorithms for the detection of stellar structures in the Milky Way are most efficient when full phase-space and color information is available. This is rarely the case. Recently, the Gaia satellite began surveying the whole sky and is providing highly accurate positions for more than one billion sources. In this contribution we propose two...

Non-metric proximity measures have gained wide interest in various domains such as the life sciences, robotics, and image processing. The majority of learning algorithms for these data focus on classification problems. Here we derive a regression algorithm for indefinite data representations based on the support vector machine. The approach avoids heuris...

The recently proposed Kreĭn space Support Vector Machine (KSVM) is an efficient classifier for indefinite learning problems, but with quadratic to cubic complexity and a non-sparse decision function. In this paper a Kreĭn space Core Vector Machine (iCVM) solver is derived. A sparse model with linear runtime complexity can be obtained under a low ra...

Kernel-based learning is very popular in machine learning, but many classical methods have at least quadratic runtime complexity. Random Fourier features are very effective to approximate shift-invariant kernels by an explicit kernel expansion. This permits the use of efficient linear models with much lower runtime complexity. As one key approach to ke...
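
For illustration, a compact sketch of random Fourier features for the RBF kernel k(x, y) = exp(-gamma ||x - y||^2), following Bochner's theorem; names and defaults are ours:

```python
import numpy as np

def rff(X, D, gamma=0.5, seed=0):
    """Map rows of X to D random Fourier features whose inner products
    approximate the RBF kernel exp(-gamma ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Spectral samples: w ~ N(0, 2*gamma*I) matches this kernel bandwidth.
    W = rng.normal(0.0, np.sqrt(2 * gamma), size=(d, D))
    b = rng.uniform(0.0, 2 * np.pi, size=D)   # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)
```

The inner product of the feature maps, `Z @ Z.T`, approximates the exact kernel matrix, with Monte Carlo error shrinking roughly as 1/sqrt(D); linear models on Z then stand in for the kernel machine.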

Indefinite similarity measures can be frequently found in bio-informatics by means of alignment scores, but are also common in other fields like shape measures in image retrieval. Lacking an underlying vector space, the data are given as pairwise similarities only. The few algorithms available for such data do not scale to larger datasets. Focusing...

In supervised learning, feature vectors are often implicitly mapped to a high-dimensional space using the kernel trick, with quadratic costs for the learning algorithm. The recently proposed random Fourier features provide an explicit mapping such that classical algorithms with often linear complexity can be applied. Yet, the random Fourier feature a...

Sequence data are widely used to get a deeper insight into biological systems. From a data analysis perspective they are given as a set of sequences of symbols with varying length. In general they are compared using nonmetric score functions. In this form the data are nonstandard, because they do not provide an immediate metric vector space and the...

Indefinite similarity measures can be frequently found in bio-informatics by means of alignment scores. Lacking an underlying vector space, the data are given as pairwise similarities only. Indefinite Kernel Fisher Discriminant (iKFD) is a very effective classifier for this type of data but has cubic complexity and does not scale to larger problems...

Efficient learning of a data analysis task strongly depends on the data representation. Most methods rely on (symmetric) similarity or dissimilarity representations by means of metric inner products or distances, providing easy access to powerful mathematical formalisms like kernel or branch-and-bound approaches. Similarities and dissimilarities ar...

In supervised learning probabilistic models are attractive to define discriminative models in a rigid mathematical framework. More recently, prototype approaches, known for compact and efficient models, were defined in a probabilistic setting, but are limited to metric vectorial spaces. Here we propose a generalization of the discriminative probabi...

Metric learning constitutes a well-investigated field for vectorial data with successful applications, e.g. in computer vision, information retrieval, or bioinformatics. One particularly promising approach is offered by low-rank metric adaptation integrated into modern variants of learning vector quantization (LVQ). This technique is scalable with...

In supervised learning, the parameters of a parametric Euclidean or Mahalanobis distance can be effectively learned by so-called Matrix Relevance Learning. This adaptation is not only useful to improve the discrimination capabilities of the model, but also to identify relevant features or relevant correlated features in the input data. Clas...
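
The adaptive distance at the core of Matrix Relevance Learning has the form d(x, w) = (x - w)^T Lambda (x - w) with Lambda = Omega^T Omega, which keeps it positive semi-definite by construction. A minimal sketch (the function name is ours):

```python
import numpy as np

def relevance_distance(x, w, Omega):
    """Adaptive squared distance (x - w)^T Omega^T Omega (x - w);
    Lambda = Omega^T Omega is the learned relevance matrix."""
    diff = Omega @ (x - w)
    return float(diff @ diff)
```

The diagonal of Lambda can be read as per-feature relevances, and a rectangular Omega with few rows yields the low-rank, visualization-friendly variants discussed elsewhere on this page.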

Odor classification by a robot equipped with an electronic nose (e-nose) is a challenging task for pattern recognition since volatiles have to be classified quickly and reliably even in the case of short measurement sequences, gathered under operation in the field. Signals obtained in these circumstances are characterized by a high-dimensionality,...

Domain-specific (dis-)similarity or proximity measures, used e.g. in alignment algorithms of sequence data, are popular to analyze complex data objects and to cover domain-specific data properties. Without an underlying vector space, these data are given as pairwise (dis-)similarities only. The few available methods for such data focus widely on simi...

Existing semi-supervised learning algorithms focus on vectorial data given in Euclidean space. But many real-life data are non-metric, given as (dis-)similarities, which are not widely addressed. We propose a conformal prototype-based classifier for dissimilarity data in semi-supervised tasks. A 'secure region' of unlabeled data is identified to imp...

Neighbor-preserving embedding of relational data in low-dimensional Euclidean spaces is studied. Contrary to variants of stochastic neighbor embedding that minimize divergence measures between estimated neighborhood probability distributions, the proposed approach fits configurations in the output space by maximizing correlation with potentially as...

Since they represent a model in terms of few typical representatives, prototype based learning such as learning vector quantization (LVQ) constitutes a directly interpretable machine learning technique. Recently, several LVQ schemes have been extended towards a kernelized or dissimilarity based version which can be applied if data are represented b...

Proximity matrices like kernels or dissimilarity matrices provide non-standard data representations common in the life science domain. Here we extend fast soft competitive learning to a discriminative and vector labeled learning algorithm for proximity data. It provides a more stable and consistent integration of label information in the cost funct...

Existing classification algorithms focus on vectorial data given in Euclidean space or representations by means of positive semi-definite kernel matrices. Many real-world data, like biological sequences, are not vectorial, often non-Euclidean, and given only in the form of (dis-)similarities between examples, calling for efficient and interpretabl...

Prototype-based methods often display very intuitive classification and learning rules. However, popular prototype based classifiers such as learning vector quantization (LVQ) are restricted to vectorial data only. In this contribution, we discuss techniques how to extend LVQ algorithms to more general data characterized by pairwise similarities or...

We introduce a generalization of Multivariate Robust Soft Learning Vector Quantization. The approach is a probabilistic classifier and can deal with vectorial class labelings for the training data and the prototypes. It employs t-norms, known from fuzzy learning and fuzzy set theory, in the class label assignments, leading to a more flexible model...
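
For reference, three classical t-norms that could serve in such fuzzy class label assignments (a sketch; the paper may use a different selection):

```python
def t_norm_product(a, b):
    """Product t-norm: T(a, b) = a * b."""
    return a * b

def t_norm_min(a, b):
    """Minimum (Goedel) t-norm: T(a, b) = min(a, b)."""
    return min(a, b)

def t_norm_lukasiewicz(a, b):
    """Lukasiewicz t-norm: T(a, b) = max(0, a + b - 1)."""
    return max(0.0, a + b - 1.0)
```

All t-norms share the boundary condition T(a, 1) = a and commutativity, which is what makes them drop-in generalizations of conjunction in fuzzy membership computations.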

Due to the increasing amount of large data sets, efficient learning algorithms are necessary. Interpretability of the final model is also desirable, to draw meaningful conclusions from the model results. Prototype-based learning algorithms have recently been extended to proximity learners to analyze data given in non-standard data formats. The supe...

Domain specific (dis-)similarity or proximity measures, employed e.g. in alignment algorithms in bio-informatics, are often used to compare complex data objects and to cover domain specific data properties. Lacking an underlying vector space, data are given as pairwise (dis-)similarities. The few available methods for such data do not scale well to...

The amount and complexity of data increase rapidly; however, due to time and cost constraints, only few of them are fully labeled. In this context, non-vectorial relational data given by pairwise (dis-)similarities without explicit vectorial representation, like score values in sequence alignments, are particularly challenging. Existing semi-superv...

Soft competitive learning is an advanced k-means like clustering approach overcoming some severe drawbacks of k-means, like initialization dependence and sticking to local minima. It achieves lower distortion error than k-means and has shown very good performance in the clustering of complex data sets, using various metrics or kernels. While very e...
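
Soft competitive learning in the neural gas sense replaces the hard winner-takes-all step of k-means by a rank-based soft assignment: every prototype moves toward the sample, weighted by its distance rank, which is what mitigates initialization dependence and local minima. A minimal single-sample sketch (names and defaults are ours):

```python
import numpy as np

def neural_gas_step(x, protos, lam=1.0, lr=0.1):
    """One soft competitive (neural gas style) update: each prototype
    moves toward x with weight exp(-rank / lam), rank 0 = closest."""
    d = np.linalg.norm(protos - x, axis=1)
    ranks = np.argsort(np.argsort(d))            # distance ranks
    h = np.exp(-ranks / lam)                     # soft neighborhood weights
    protos += lr * h[:, None] * (x - protos)
    return protos
```

Annealing `lam` toward zero during training recovers plain k-means style updates in the limit.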

Current classification algorithms focus on vectorial data, given in Euclidean or kernel spaces. Many real-world data, like biological sequences, are not vectorial and often non-Euclidean, given by (dis-)similarities only, calling for efficient and interpretable models. Current classifiers for such data require complex transformations and provid...

In the life sciences, short time series with high dimensional entries are becoming more and more popular such as spectrometric data or gene expression profiles taken over time. Data characteristics rule out classical time series analysis due to the few time points, and they prevent a simple vectorial treatment due to the high dimensionality. In thi...

Recently, diverse high quality prototype-based clustering techniques have been developed which can directly deal with data sets given by general pairwise dissimilarities rather than standard Euclidean vectors. Examples include affinity propagation, relational neural gas, or relational generative topographic mapping. Corresponding to the size of the...

Prototype based learning offers an intuitive interface to inspect large quantities of electronic data in supervised or unsupervised settings. Recently, many techniques have been extended to data described by general dissimilarities rather than Euclidean vectors, so-called relational data settings. Unlike the Euclidean counterparts, the techniques h...

Recently, an extension of popular learning vector quantization (LVQ) to general dissimilarity data has been proposed, relational generalized LVQ (RGLVQ) [10, 9]. An intuitive prototype-based classification scheme results which can divide data characterized by pairwise dissimilarities into priorly given categories. However, the technique relies on...
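
Relational LVQ variants represent prototypes implicitly as convex combinations w = sum_i alpha_i x_i and evaluate distances from the dissimilarity matrix alone. A minimal sketch of the underlying identity, assuming D holds squared Euclidean dissimilarities (the function name is ours):

```python
import numpy as np

def relational_distance(D, alpha):
    """Squared distances of all points to a relational prototype.
    D: (n, n) matrix of pairwise SQUARED dissimilarities.
    alpha: (n,) convex coefficients (non-negative, summing to 1)
    defining the implicit prototype w = sum_i alpha_i * x_i.
    Returns d^2(x_j, w) = [D @ alpha]_j - 0.5 * alpha^T D alpha."""
    return D @ alpha - 0.5 * alpha @ D @ alpha
```

Because only D and the coefficient vectors appear, no explicit vectorial embedding is needed; the quadratic dependence on D is also the complexity bottleneck the abstract alludes to.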

We suggest and investigate the use of Generalized Matrix Relevance Learning (GMLVQ) in the context of discriminative visualization. This prototype-based, supervised learning scheme parameterizes an adaptive distance measure in terms of a matrix of relevance factors. By means of a few benchmark problems, we demonstrate that the training process yiel...

While state-of-the-art classifiers such as support vector machines offer efficient classification for kernel data, they suffer from two drawbacks: the underlying classifier acts as a black box which can hardly be inspected by humans, and non-positive definite Gram matrices require additional preprocessing steps to arrive at a valid kernel. In this...

We present an extension of the recently introduced Generalized Matrix Learning Vector Quantization algorithm. In the original scheme, adaptive square matrices of relevance factors parameterize a discriminative distance measure. We extend the scheme to matrices of limited rank corresponding to low-dimensional representations of the data. This allows...

Prototype-based models offer an intuitive interface to given data sets by means of an inspection of the model prototypes. Supervised classification can be achieved by popular techniques such as learning vector quantization (LVQ) and extensions derived from cost functions such as generalized LVQ (GLVQ) and robust soft LVQ (RSLVQ). These methods, how...

Topographic mapping offers an intuitive interface to inspect large quantities of electronic data. Recently, it has been extended to data described by general dissimilarities rather than Euclidean vectors. Unlike its Euclidean counterpart, the technique has quadratic time complexity due to the underlying quadratic dissimilarity matrix. Thus, it is i...

Clustering approaches are very important methods to analyze data sets in an initial unsupervised setting. Traditionally many clustering approaches assume data points to be independent. Here we present a method to make use of local dependencies to improve clustering under guaranteed distortions. Such local dependencies are very common for data gener...

Clustering approaches constitute important methods for unsupervised data analysis. Traditionally, many clustering models focus on spherical or ellipsoidal clusters in Euclidean space. Kernel methods extend these approaches to more complex cluster forms, and they have been recently integrated into several clustering techniques. While leading to very...

Topographic mapping offers a very flexible tool to inspect large quantities of high-dimensional data in an intuitive way. Often, electronic data are inherently non-Euclidean, and modern data formats are connected to dedicated non-Euclidean dissimilarity measures for which classical topographic mapping cannot be used. We give an overview of extens...

The increasing size and complexity of modern data sets turn modern data mining techniques into indispensable tools when inspecting biomedical data sets. Thereby, dedicated data formats and detailed information often cause the need for problem-specific similarities or dissimilarities instead of the standard Euclidean norm. Therefore, a number of clus...

This paper introduces a hierarchical model for the description and deconvolution of composite patterns. The patterns are described in a basis system of spectral basis functions. The mixture coefficients for the composite patterns are determined by solving a linear mixture model with nonnegative coefficients. In life science research, wet-lab mixed s...

In content-based image retrieval (CBIR), relevance feedback has been proven to be a powerful tool for bridging the gap between low level visual features and high level semantic concepts. Traditionally, relevance feedback driven CBIR is often considered ...

We discuss the use of divergences in dissimilarity-based classification. Divergences can be employed whenever vectorial data consists of non-negative, potentially normalized features. This is, for instance, the case in spectral data or histograms. In particular, we introduce and study divergence based learning vector quantization (DLVQ). We derive...
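
As an example of a divergence usable in this setting, the Kullback-Leibler divergence for non-negative, normalized feature vectors such as histograms (a sketch; the helper name and the clipping constant are ours):

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) for normalized
    non-negative vectors; eps guards against log(0)."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float((p * np.log(p / q)).sum())
```

In a divergence-based LVQ, such a measure would replace the squared Euclidean distance in the winner selection and in the derivation of the prototype update rules.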