## About

66

Publications

26,261

Reads

486

Citations

Introduction

My primary research interests: artificial intelligence, machine learning, data modeling, big data, data management, IoT. More info: http://conceptoriented.org

Additional affiliations

January 2022 - January 2022

August 2019 - December 2021

April 2015 - July 2019

**Bosch Software Innovations GmbH**

Position

- Analyst

Description

- Anomaly detection algorithms
- Column-oriented analysis engine for IoT (Python)
- Analytical cloud services (Cloud Foundry, AWS)

Education

November 1990 - November 1993

September 1983 - June 1989

## Publications

Publications (66)

Most of the currently existing query languages and data processing frameworks rely on one or another form of the group-by operation for data aggregation. In this paper, we critically analyze properties of this operation and describe its major drawbacks. We also describe an alternative approach to data aggregation based on accumulate functions and d...

The plethora of existing data models and specific data modeling techniques is not only confusing but leads to complex, eclectic and inefficient designs of systems for data management and analytics. The main goal of this paper is to describe a unified approach to data modeling, called the concept-oriented model (COM), by using functions as a basis f...

In this paper, we describe a novel approach to data integration, transformation and analysis, called DataCommandr. Its main distinguishing feature is that it is based on operations with columns rather than operations with tables in the relational model or operations with cells in spreadsheet applications. This data processing model is free of such...

Since the introduction of the relational model of data, the join operation is part of almost all query languages and data processing engines. Nowadays, it is not only a formal operation but rather a dominating pattern of thought for the concept of data connectivity. In this paper, we critically analyze properties of this operation, its role and use...

In this paper we argue that representing entity properties by tuple attributes, as evangelized in most set-oriented data models, is a controversial method conflicting with the principle of tuple immutability. As a principled solution to this problem of tuple immutability on one hand and the need to modify tuple attributes on the other hand, we prop...

We describe a new logical data model, called the concept-oriented model (COM). It uses mathematical functions as first-class constructs for data representation and data processing as opposed to using exclusively sets in conventional set-oriented models. Functions and function composition are used as primary semantic units for describing data connec...

This paper describes an approach to detecting anomalous behavior of devices by analyzing their event data. Devices from a fleet are supposed to be connected to the Internet by sending log data to the server. The task is to analyze this data by automatically detecting unusual behavioral patterns. Another goal is to provide analysis templates that ar...

For the past several decades, programmers have been modeling things in the world with trees using hierarchies of classes and object-oriented programming (OOP) languages. In this paper, we describe a novel approach to programming, called concept-oriented programming (COP), which generalizes classes and inheritance by introducing concepts and inclusi...

The main goal of concept-oriented programming (COP) is describing how objects are represented and accessed. It makes references (object locations) first-class elements of the program responsible for many important functions which are difficult to model via objects. COP rethinks and generalizes such primary notions of object-orientation as class and...

In spite of its fundamental importance, inference has not been an inherent function of multidimensional models and analytical applications. These models are mainly aimed at numeric (quantitative) analysis where the notions of inference and semantics are not well defined. In this paper we argue that inference can be and should be an integral part of mu...

Data integration as well as other data wrangling tasks account for a great deal of the difficulties in data analysis and frequently constitute the most tedious part of the overall analysis process. We describe a new system, ConceptMix, which radically simplifies analytical data integration for a broad range of non-IT users who do not possess deep k...

Concept-oriented model of data (COM) has been recently defined syntactically by means of the concept-oriented query language (COQL). In this paper we propose a formal embodiment of this model, called nested partially ordered sets (nested posets), and demonstrate how it is connected with its syntactic counterpart. Nested poset is a novel formal cons...

We study properties of the join operation in query languages and describe some of its major drawbacks. We provide strong arguments against using joins as a main construct for retrieving related data elements in general purpose query languages and argue for using references instead. Since conventional references are quite restrictive when applied to...

In spite of its fundamental importance, inference has not been an inherent function of multidimensional models and analytical applications. These models are mainly aimed at numeric analysis where the notion of inference is not well defined. In this paper we define inference using only multidimensional terms like axes and coordinates as opposed to u...

The main goal of concept-oriented programming (COP) is describing how objects are represented and accessed. References (object locations) in COP are made first-class elements responsible for many important functions which are difficult to model via objects. COP rethinks and generalizes such primary notions of object-orientation as class and inherit...

We present the concept-oriented model (COM) and demonstrate how its three main structural principles — duality, inclusion and partial order — naturally account for various typical data modeling issues. We argue that elements should be modeled as identity-entity couples and describe how a novel data modeling construct, called concept, can be used to...

The concept-oriented data model (COM) is an emerging approach to data modeling which is based on three novel principles: duality, inclusion and order. These three structural principles provide a basis for modeling domain-specific identities, object hierarchies and data semantics. In this paper these core principles of COM are presented from the poi...

In the paper we describe a novel query language, called the concept-oriented query language (COQL), and demonstrate how it can be used for data modeling and analysis. The query language is based on a novel construct, called concept, and two relations between concepts, inclusion and partial order. Concepts generalize conventional classes and are use...

We describe a new approach to data modeling, called the concept-oriented model (COM), and a novel concept-oriented query language (COQL). The model is based on three principles: duality principle postulates that any element is a couple consisting of one identity and one entity, inclusion principle postulates that any element has a super-element, an...

In this paper we present a new approach to data modelling, called the concept-oriented model (CoM), and describe its main features and characteristics including data semantics and operations. The distinguishing feature of this model is that it is based on the formalism of nested ordered sets where any element participates in two structures simultan...

Object-oriented programming (OOP) is aimed at describing the structure and behaviour of objects by hiding the mechanism of their representation and access in primitive references. In this article we describe an approach, called concept-oriented programming (COP), which focuses on modelling references assuming that they also possess application-spec...

In the paper we introduce a new programming language construct, called concept, which is defined as a pair of two classes: one reference class and one object class. Instances of the reference class are passed-by-value and are intended to indirectly represent objects. Instances of the object class are passed-by-reference. Each concept has a parent c...

The paper describes a mechanism for indirect object representation and access (ORA) in programming languages. The mechanism is based on using a new programming construct which is referred to as concept. Concept consists of one object class and one reference class both having their fields and methods. The object class is the conventional class as de...

In this paper we describe a new approach to data modelling called the concept-oriented model (CoM). This model is based on the formalism of nested ordered sets which uses inclusion relation to produce hierarchical structure of sets and ordering relation to produce multi-dimensional structure among its elements. Nested ordered set is defined as an o...

In this paper we describe a new approach to programming which generalizes object-oriented programming. It is based on using a new programming construct, called concept, which generalizes classes. Concept is defined as a pair of two classes: one reference class and one object class. Each concept has a parent concept which is specified using inclusio...

In the paper a new programming construct, called concept, is introduced. A concept is a pair of two classes: a reference class and an object class. Instances of the reference class are passed-by-value and are intended to represent objects. Instances of the object class are passed-by-reference. An approach to programming where concepts are used instea...

This paper describes a new approach to programming, called the concept-oriented programming (COP). It is based on using a new programming construct, called concept, which generalizes conventional classes. Concepts describe behaviour of both objects and references. Hence references are completely legalized and made first-class citizens with the same...

The paper describes an approach to query processing in the concept-oriented data model. This approach is based on imposing constraints and specifying the result type. The constraints are then automatically propagated over the model and the result contains all related data items. The simplest constraint propagation strategy consists of two steps: pr...

In the paper we describe the problem of grouping and aggregation in the concept-oriented data model. The model is based on ordering its elements within a hierarchical multidimensional space. This order is then used to define all its main properties and mechanisms. In particular, it is assumed that elements positioned higher are interpreted as group...

In the paper the concept-oriented data model (COM) is described from the point of view of its hierarchical and multidimensional properties. The model consists of two levels: syntactic and semantic. At the syntactic level each element is defined as a combination of its superconcepts. At the semantic level each item is defined as a combination of its...

The paper describes logical navigation in the concept-oriented data model. This model explicitly and formally separates physical structure and logical structure so that each element of the model is simultaneously a collection and a combination of other elements. The physical structure is used to represent and access elements by means of refer...

In the paper we describe a new construct which is referred to as concept and a new concept-oriented approach to programming. Concept generalizes conventional classes and consists of two parts: an object class and a reference class. Each concept has a parent concept specified via inclusion relation. Instances of reference class are passed by value...

In the paper a new approach to data representation and manipulation is described, which is called the concept-oriented data model (CODM). It is supposed that items represent data units, which are stored in concepts. A concept is a combination of superconcepts, which determine the concept's dimensionality or properties. An item is a combination of s...

The SPIN! data mining system has a component-based architecture, where each component encapsulates some specific functionality such as a data source, an analysis algorithm or visualization. Individual components can be visually linked within one workspace for solving different data mining tasks. The SPIN! friendly user interface and flexible underl...

In the paper a new data mining algorithm for finding the most interesting dependence rules is described. Dependence rules are derived from the itemsets with support significantly different from its expected value and therefore considered interesting. Since such itemsets are distributed non-monotonically in the lattice of all itemsets the support mo...

The rapidly expanding market for Spatial Data Mining systems and technologies is driven by pressure from the public sector, environmental agencies and industry to provide innovative solutions to a wide range of different problems. The main objective of the described spatial data mining platform is to provide an open, highly extensible, n-tier syste...

Most rule induction algorithms including those for association rule mining use high support as one of the main measures of interestingness. In this paper we follow an opposite approach and describe an algorithm, called Optimist, which finds all largest empty intervals in data and then transforms them into the form of multiple-valued rules. It is de...

The rapidly expanding market for Spatial Data Mining systems and technologies is driven by pressure from the public sector, environmental agencies and industry to provide innovative solutions to a wide range of different problems. The main objective of the described spatial data mining platform is to provide an open, highly extensible, n-tier syste...

Data Mining and Geographic Information Systems (GIS) have existed so far as separate technologies. The overall objective of the SPIN!-project is to develop a web-based spatial data mining system by integrating state of the art (GIS) and data mining functionality in a closely coupled open and extensible system architecture. The general architecture...

In the article the notion of analytical space is introduced and its application to data representation and interactive analysis is studied. Analytical space is defined through membership relation among its elements where each element is characterised by its extensional and intensional. All data element properties are derived from this fundamental r...

Classification was traditionally used as an instrument for producing choropleth maps. In our study we consider the role of the process of classification in exploratory data analysis. We developed a set of tools for classification that facilitate looking on data from various viewpoints and thereby investigate different aspects of the data. The tools...

We describe the problem of mining set valued rules in large relational tables containing categorical attributes taking a finite number of values. An example of such a rule might be “IF HOUSEHOLDSIZE = {Two OR Tree} AND OCCUPATION={Professional OR Clerical} THEN PAYMENT_METHOD = {CashCheck (Max=249, Sum=4952) OR DebitCard (Max=175, Sum=3021)} WHERE...

We present new methods for analyzing geo-referenced statistical data. These methods combine visualization and direct manipulation techniques of exploratory data analysis and algorithms for data mining. The methods have been implemented by integrating two hitherto separate software tools: Descartes for interactive thematic mapping, and the data mini...

Data Mining and Geographic Information Systems (GIS) have existed so far as separate technologies. The overall objective of the SPIN!-project is to develop a web-based spatial data mining system by integrating state of the art (GIS) and data mining functionality in a closely coupled open and extensible system architecture. The general architecture...

We describe the problem of mining set valued rules in large relational tables containing categorical attributes taking a finite number of values. Such rules allow for an interval of possible values to be selected for each attribute in condition instead of a single value for association rules, while conclusion contains a projection of the data restr...

We present a new algorithm, called Optimist, which generates possibilistic set-valued rules from tables containing categorical attributes taking a finite number of values. An example of such a rule might be "IF HOUSEHOLDSIZE={Two OR Tree} AND OCCUPATION={Professional OR Clerical} THEN PAYMENT_METHOD={CashCheck (Max=249) OR DebitCard (Max=175)}". Th...

An algorithm for finding logical dependencies in tables of data is described. Attributes corresponding to the table columns are supposed to take a finite number of values. An example of such a rule might be "IF HOUSEHOLDSIZE={Two OR Tree} AND OCCUPATION={Professional OR Clerical} THEN PAYMENT_METHOD={CashCheck OR DebitCard}". A formal basis for the...

We describe the problem of mining possibilistic set-valued rules in large relational tables containing categorical attributes taking a finite number of values. An example of such a rule might be “IF HOUSEHOLDSIZE={Two OR Tree} AND OCCUPATION={Professional OR Clerical} THEN PAYMENT_METHOD={CashCheck (Max=249) OR DebitCard (Max=175)}”. The table seman...

The goal of multi-dimensional fuzzy analysis consists in discovering different properties in multi-dimensional fuzzy distributions represented either extensionally (database) or intensionally (knowledge base). In this paper we show how this approach can be applied to such problems as decision making and knowledge discovery in databases. For uniform...

In this paper a new original approach to the analysis of fuzzy multi-dimensional distributions is described. A uniform method for representing fuzzy multi-dimensional distributions by means of sectioned vectors and matrices is proposed. Sectioned matrix is interpreted as fuzzy conjunctive normal form, while its line vectors are interpreted as fuzz...

In the paper a new resolution principle and some of its properties are considered. The new resolution rule is applied to any two disjuncts on some variable and results in a third disjunct called resolvent. Disjuncts in the logic of possibility distributions consist of explicitly represented possibility distributions over sets of values of variables. The...

In the paper a qualitative approach to the problem of aggregation of information is discussed. The method is based on the notion of fuzzy attribute model of the aggregation problem in which information is represented in the form of fuzzy constraints on the values of attributes. The aggregate is built by carrying out logical inference on the knowl...

In the paper the problem of diagnostics in the terms of fuzzy attribute model is considered, where the problem domain is described by a number of attributes and their values. The diagnostic knowledge about the problem domain is thought of as fuzzy constraints on the possible combinations of the values of attributes in the corresponding attribute mo...

The fuzzy propositional logic (FPL) described in the article presents a new alternative approach to thinking about such notions as a proposition, interpretation, inference. This logic generalizes the classical propositional logic in two directions: (i) propositions are considered to be fuzzy and (ii) logical variables are considered to be many-valu...

The paper presents a new alternative approach to thinking about such notions as a proposition, interpretation, inference. Fuzzy propositional logic described generalizes the classical propositional logic in two directions: (i) propositions are considered to be fuzzy, and (ii) logical variables are considered to be many-valued. The formulas are inte...

An approach to the construction of diagnostic expert systems which relies on a special matrix representation of fuzzy predicates in an attribute model of the subject area is proposed in this paper. The sectionalized matrix is the fuzzy analog of the conjunctive normal form of predicate representation. To form a knowledge base, one can use rules, ex...

An approach to the diagnostic type expert systems design based on the special matrix representation of fuzzy predicates in the attribute model of the problem domain is presented. Intensive representation of predicates by means of sectional matrices is an analogue of the conjunctive normal form. Rules, positive examples and negative examples (in g...

## Projects

Projects (2)